Proteome and transcriptome analyses of wheat near isogenic lines identifies key proteins and genes of wheat bread quality

The regulation of wheat protein quality is a highly complex biological process involving multiple metabolic pathways. To reveal new insights into the regulatory pathways of wheat glutenin synthesis, we used the grain-filling period wheat grains of the near-isogenic lines NIL-723 and NIL-1010, which have large differences in quality, to perform a combined transcriptome and proteome analysis. Compared with NIL-1010, NIL-723 had 1287 transcripts and 355 proteins with significantly different abundances. Certain key significantly enriched pathway were identified, and wheat quality was associated with alanine, aspartate and glutamate metabolism, nitrogen metabolism and alpha-linolenic acid metabolism. Differentially expressed proteins (DEPs) or Differentially expressed genes (DEGs) in amino acid synthesis pathways were upregulated primarily in the glycine (Gly), methionine (Met), threonine (Thr), glutamic acid (Glu), proline (proC), cysteine (Cys), and arginine (Arg) synthesis and downregulated in the tryptophan (trpE), leucine (leuC), citrulline (argE), and ornithine (argE) synthesis. Furthermore, to elucidate changes in glutenin in the grain synthesis pathway, we plotted a regulatory pathway map and found that DEGs and DEPs in ribosomes (RPL5) and the ER (HSPA5, HYOU1, PDIA3, PDIA1, Sec24, and Sec31) may play key roles in regulating glutenin synthesis. The transcriptional validation of some of the differentially expressed proteins through real-time quantitative PCR analysis further validated the transcriptome and proteomic results.


Results
Characterization of NIL-723 and NIL-1010. We identified the quality between NIL-723 and NIL-1010 as near-isogenic lines in the quality testing center of crop varieties in Hebei Province, and found that there were only differences in quality traits. (Table 1), and other agronomic traits were similar. In this study, the protein, water absorption, dough development time, extension area and sedimentation value of NIL-723 were slightly higher compared with NIL-1010, but not statistically significant. The dough stability time and the maximum resistance were significantly higher than those of NIL-1010 (P < 0.05). Previous studies have elucidated that the content and type of glutenin have a greater impact on the quality of wheat 6 , while the main differences between the near isogenic lines in this study lie in the dough stability time and the maximum tensile resistance. Screening significant differential transcripts and proteins. The quality analysis confirmed that the near-isogenic lines in this study differed primarily in stabilization time and maximum tensile resistance. Therefore, grains were collected at 25 days after flowering, and combined transcriptomic and proteomic analysis was used to study regulatory mechanisms affecting wheat quality. Through the high-throughput DNA sequencing and filtering of invalid data, 37,762,443 (11.33G) clean reads were obtained, 125,219 genes were aligned to the wheat genome (Table 2), and 57,315 genes in NIL-723 and 57,108 genes in NIL-1010 were detected in Table 1. Quality measurement index of NIL-723 and NIL-1010. *, **Represents significant difference at P < 0.05 and P < 0.01, respectively. www.nature.com/scientificreports/  Accordance represents a consistent changing trend between the transcriptome and proteome; Mono significance represents mono-significant differences between the transcriptome and proteome; opposite represents that the transcriptome and proteome have opposite changing trends. plot generated using the 'ggplot2' packages in R 31,32 .  1A) and proteome (Fig. 1B). A comparative analysis of transcriptome data was performed, which included 1287 DEGs with |log 2 (fold change) |> 2, p < 0.05, of which 869 genes were downregulated and 418 were upregulated in NIL-727 in comparison to NIL-1010 (Table 2, (Table S1)). Label-free technology was used for the quantitative analysis of proteomic changes associated with wheat quality. A total of 13,411 unique peptide fragments were identified from the six samples (Table S2). A total of 2606 proteins from NIL-723 and 2618 proteins from NIL-1010 were detected in the proteome (Table 1), of which 124 proteins were downregulated and 231 proteins were upregulated. The intersection and union of the DEGs and differentially expressed proteins (DEPs) of the experimental materials were analyzed using Venny online software (http:// bioin fogp. cnb. csic. es/ tools/ venny/ index. html). A comparison between the two wheat lines showed that NIL-723 contained 2495 specific genes and 12 specific proteins and that NIL-1010 contained 2288 specific genes and 24 specific proteins (Fig. 2). Among them, one DEP detected in the proteome was the same as the protein encoded by DEG in NIL-723, and two DEPs were identical with the proteins encoded by DEGs in NIL-1010.
Dynamics of transcript-to-protein between NIL-723 and NIL-1010. Data regarding proteins and mRNA with the same changes in expression can be used to further confirm and explain certain crucial mechanisms in organisms. On the basis of protein identification and transcriptome sequencing results, threshold conditions were selected for screening-relevant DEGs and DEPs. Genes with total reads < 10 in the six samples were eliminated to ensure the accuracy of the transcriptome. Threshold conditions for DEGs were as follows: fold difference ≥ 2; threshold condition for DEPs: fold difference ≥ 1.2. The R program was used to study consistency concerning changes in the transcript abundance of the types of proteins. Expression values were normalized by z-scoring, and the Pearson correlation coefficient was calculated for all genes with corresponding proteins identified in 723 and1010. The results asserted that NIL-723 and NIL-1010 had similar changes in most transcripts and protein species (Fig. 1C, Table S3). 1815 members showed a consistent changing trend between the transcriptome and proteome, includes 5 members that were upregulated in both transcriptome and proteome; 350 members had mono-significant differences. Among them, 8 and 35 members were down-regulated and up-regulated in the transcriptome, but there was no significant difference in the proteome. Besides, 111 and 196 members were down-regulated and up-regulated in the proteome, respectively, but there was no significant difference in the transcriptome; 3 members were in opposite directions (Fig. 1D). These analyses indicated that gene expression does not fully represent the abundance of protein species and that there may be strong posttranslational regulation during protein production 28 .
GO and pathway analysis of genes in the transcriptome and proteome. The GO functional annotation of DEGs was performed. These DEGs were categorized into three GO terms, namely, biological process, cellular component, and molecular function. Cellular protein metabolic processes accounted for the largest proportion among biological processes (Fig. 3A, Table S4). Membrane ranked first among cellular components, and transmembrane transporter activity was the largest category among molecular functions. These DEPs were also categorized into three GO terms (Fig. 3B, Table S5). Protein folding accounted for the largest proportion  www.nature.com/scientificreports/ among biological processes. Chloroplast ranked first among cellular components, and ATP binding was the largest category among molecular functions. The KEGG analysis of consistent genes after combined transcriptomic and proteomic analysis revealed that most of the genes were enriched in alpha-linolenic acid metabolism (Q value < 0.05, Fig. 4A, Table S6). Significantly upregulated genes were mostly involved in the alanine, aspartate and glutamate metabolism, nitrogen metabolism and alpha-linolenic acid metabolism (Q value < 0.05, Fig. 4B, Table S7), however, significantly downregulated genes were not enriched into any pathway (Q value < 0.05, Fig. 4C, Table S8).

Protein-protein interaction network.
We constructed a protein-protein interaction network with the top 100 differential genes and proteins (Fig. 5, Table S9). In the network, a third of the significantly downregulated protein species and two-thirds of the significantly upregulated protein species were found to be related to the synthesis and assembly of starch and protein, suggesting that the excellent quality of NIL-723 may be affected by the synthesis and assembly of starch and protein, such as TraesCS6A01G000600.1 (PDIA3), TraesCS3B01G162400.1 (SEC13), TraesCS2D01G600900.1 (BiP3), and TraesCS2D01G546500.2 (CALR), and maybe involved in glutenin development. TraesCS7D01G535600.3 (sbe1), TraesCS7A01G070100.1 (WAXY), and TraesCS7D01G190100.1 are essential enzymes involved in the synthesis of starch, which also affects the structure of gluten. TraesCS4B01G017600.1 (Glo-3A), TraesCS7B01G489800.1, TraesCS4A01G067700.3, TraesCS3D01G119800.1 and TraesCS4D01G171800.1 were related to the synthesis of storage proteins. The figure shows that the expressions of HYOU, BiP3, PDIA1, CLAR, Sec31, TraesCS5D01G492900.3, and TraesCS3D01G359300.1 were all upregulated and made greater contributions to protein interactions, suggesting that these proteins are associated with the excellent quality of NIL-723 (Fig. 5). PER1, Clo1-B, TraesC-S5D01G045800.1, TraesCS5D01G543600.1, TraesCS3B01G178200.3, and TraesCS7B01G254000.1 were all downregulated and made larger contributions to protein interactions, suggesting that the downregulation of these proteins also affects wheat quality. www.nature.com/scientificreports/

Wheat quality-related DEGs and DEPs in the biosynthesis of amino acids.
In the present study, a comprehensive investigation of changes in the amino acid synthesis pathways (www. kegg. jpkegg/ kegg1. html) of DEGs and DEPs involving cultivars NIL-723 and NIL-1010 was conducted (Fig. 6). DEPs and DEGs were mainly located in the serine, cysteine, methionine, glycine, threonine, glutamic acid, arginine, and proline synthetic.
Notably, most DEPs and DEGs are upregulated in the synthetic of glycine (Gly), methionine (Met), threonine (Thr), glutamic acid (Glu), proline (proC), cysteine (Cys), and arginine (Arg). 2,3-Bisphosphoglycerate-dependent phosphoglycerate mutase (PGAM) catalyzes the conversion of glycerate-3P to 2-phospho-D-glycerate, which in turn is converted to phosphoenolpyruvate through catalysis by enolase (ENO). Pyruvate is converted into alanine through catalysis by alanine transaminase (GPT). Enolase (ENO) and alanine transaminase (GPT) were upregulated in transcription and protein levels. L-homoserine inhibits the synthesis of O-phospho-Lhomoserine via homoserine kinase (thrB). Additionally, threonine synthase (thrC) catalyzes the synthesis of threonine, which in turn is used to synthesize glycine through catalysis by threonine aldolase (ItaE), and serine is catalyzed to synthesize glycine through glycine hydroxymethyltransferase (glyA). HMW-GSs with high proline and glycine contents and low lysine levels have abnormally high glutamic acid content. In addition, cysteine is synthesized from O-acetylserine by cysteine synthase (cysK). Cysteine residues form intermolecular disulfide bonds between HMW-GS and LMW-GS, forming protein polymers of different sizes. Furthermore, aconitate hydratase (ACO) catalyzes the conversion of oxaloacetate to 2-oxoglutarate, and aspartate aminotransferase (GOT1) also catalyzes the conversion of 2-oxoglutarate to glutamate. GLUL encodes glutamine synthetase (GS), which catalyzes the conversion of glutamate to glutamine. ProB, ProA, and ProC are upregulated during the synthesis of Phe, and glutamic acid is involved in the biosynthesis of Phe through catalysis by glutamate 5-kinase, glutamate-5-semialdehyde dehydrogenase, and pyrroline-5-carboxylate reductase. The genes and proteins of ProC and GLUL were all upregulated in the process of proline and glutamine synthesis respectively. Besides, some DEPs or DEGs were downregulated during the synthesis of tryptophan (trpE), leucine (leuC), citrulline (argE), and ornithine (argE).
Wheat quality-related DEGs and DEPs in protein synthesis and processing. A total of three and ten DEPs were mapped to the pathways of transcription and RNA splicing, respectively (Fig. 7). Interestingly, they were all up-regulated in the grains of NIL-723, excluding two protein species (PPIH and SFRS7). Moreover, six DEGs were found to be involved in DNA replication, RNA splicing, and transcription. The transcripts of four DEGs (RPA1, RPA2, RPC8, and TAF9B) for DNA replication and transcription had reduced abundances in NIL-723 grains. In the pathway of RNA splicing, SNRPA1 was downregulated, and SF3B4 was upregulated in NIL-723. Five DEPs for RNA transport, namely, EIF3A, EIF3C, EIF3D, DDX39B, and XPO1, were up-regulated in NIL-723 (Fig. 7). Moreover, four DEPs and four differential transcripts were found to be involved in RNA degradation, and they were all up-regulated in NIL-723, excluding two DEGs and one protein species (RCD1, CNOT2, and DCP1B) (Fig. 7). A total of 18 ribosome-related DEPs were detected, including 15 large subunit protein species and 3 small subunit protein species. Significantly, all of these protein species were down-regulated in NIL-723, excluding three protein species (RPL5, RPL8, and RPL17) (Fig. 7). Besides, nine DEGs were also to be involved in ribosome assembly. Similarly, the five DEGs were downregulated in NIL-723, while RPL38, RPL23, RPL5, and RPL10 up-regulated in NIL-723 (Fig. 7). Endoplasmic reticulum (ER) is a multifunctional organelle that plays a vital role in protein synthesis, folding, output, and maturation. In the category ER protein processing, three differentially expressed genes were identified (OST1 and BCAP31 downregulated and HSPA5 upregulated), and nine DEPs were detected (Fig. 7). www.nature.com/scientificreports/ Except for SSR1, the other eight DEPs were significantly upregulated in NIL-723, and the upregulation of the ribosome-anchored subunit SWP1 could promote the glycosylation of proteins and the developmental phenotype of grains. The upregulation of the endoplasmic reticulum (ER) homologs HSPA5 and HYOU1 (BiP) regulates protein synthesis through complex interactions with various substrates and regulatory proteins. The GANAB of the alpha subunit of glucosidase II plays a critical role in protein folding and quality control and is recognized by lectin-chaperones, calnexin, and calreticulin. PDIA3 and PDIA1 were upregulated in NIL-723, indicating that they played a positive regulatory role in the synthesis of gluten. CALR is a CRT family of proteins responsible for regulating calcium homeostasis and promoting the folding and synthesis of gluten with Hsp70 and PDI in the endoplasmic reticulum. Secretory24 (Sec24) and secretory31 (Sec31) upregulated in NIL-723 promote the anterograde transport of newly synthesized proteins from the endoplasmic reticulum (ER) to other compartments in the plant endometrium mediated by shell protein complex II (COPII). Heat shock 70 kDa protein (HSPA1s) and phospholipase D1(PLD1) detected and upregulated in NIL-723 were found to be essential for the ER stress response. Protein ubiquitination plays a crucial role in regulating the post-transcriptional modification of proteins. In the ubiquitin ligase complex, eight DEPs (UBE2G1, UBE2D, HSPA1s, HSP20, HSP90A, HSP90B, SKP1, and DNAJA2) and three DEGs (HSP20, HSP90A, and SKP1) were identified. The protein species HSP20, SKP1, UBE2G1, and UBE2D and the transcripts of SKP1 were down-regulated, whereas the transcripts of HSP20, HSP90A, and DEPs of HSPA1s, HSP90A, HSP90B, and DNAJA2 were up-regulated in NIL-723 (Fig. 7). Moreover, eight DEPs and four DEGs were mapped to phagolysosomes that responded differently to gluten synthesis, six of them were down-regulated, and the rest were up-regulated in NIL-723 (Fig. 7). www.nature.com/scientificreports/ Validation of differentially expressed proteins at the transcription level. The transcript abundance of some of the differentially abundant proteins and genes was validated through the transcript abundance of their corresponding genes using quantitative real-time PCR (RT-qPCR). The RT-qPCR results corroborated that the relative abundance at the transcript level was consistent with the proteomic and transcriptome analysis (Fig. 8, Table S10). However, the transcript level of some of the proteins was found to be completely opposite to that of the differential expression of their genes. The relative expression of PDIA1 was found to be insignificant, whereas the relative abundance of transcription was extremely significant. The relative abundance of RPL5, HYOU1, and HSPA5 increased significantly at the transcriptional and proteomic levels of NIL-723, which were also verified by quantitative real-time PCR. We think that these three candidate genes are related to the quality of wheat and can be used as the basis for transgenic or molecular marker-assisted selection to improve the quality of wheat. We will conduct the genetic transformation of the target gene in wheat for further verification.

Discussion
Wheat quality is substantially significant to the standard living of individuals 1 . We previously reported many HMW and LMW glutenin alleles, but the mechanisms by which these genes regulate the wheat glutenin synthesis pathway is unclear. This study aimed to investigate the potential regulatory mechanisms involved in the superior cultivar NIL-723. To this end, we collected grains of the near-isogenic lines NIL-723 and NIL-1010 during the grain filling stage for transcriptomic and proteomic analyses. Compared with NIL-1010, a total of 1287 transcripts and 355 protein species in NIL-723 had significantly different abundance in the transcriptome and proteome, respectively. Theoretically, these changes in abundance led to the higher quality of NIL-723 compared with NIL-1010. Generally, the process from mRNA to protein is highly complicated. The abundance of protein species is usually affected by complex regulatory processes, such as DNA methylation, non-coding RNA interactions, post-translational modification, mutation, degradation, and protein interactions 28 . Therefore, the R www.nature.com/scientificreports/ programming language was used to calculate the Pearson correlation coefficient, and the consistency between changes in transcript and protein species abundance between NIL-723 and NIL-1010 was studied. Changes in 2128 genes in the transcriptome were consistent with the proteome, while 384 and 3 molecules exhibited significant differences and opposite changes in the transcriptome and proteome, respectively, indicating that strong post-translational regulation in the synthetic process from mRNA to protein may be involved. We also analyzed the protein interaction network of the top 100 DEGs and DEPs and found that approximately a third of the proteins were significantly downregulated and that two-thirds were significantly upregulated. Most of these  www.nature.com/scientificreports/ proteins are associated with starch and protein synthesis and assembly, indicating that the excellent quality of NIL-723 may be affected by starch and protein synthesis. Studies have confirmed that amino acids are closely associated with glutenin synthesis 5 . DEPs and DEGs in amino acid synthesis pathways are mainly located in the synthetic pathways of serine, cysteine, methionine, glycine, threonine, glutamic acid, arginine, and proline. Most DEPs or DEGs were upregulated in the synthetic pathways of glycine, methionine, threonine, glutamic acid, proline, cysteine, and arginine. This is consistent with the finding of high glutamic acid, proline, and glycine content and low lysine levels in HMW-GS 33 . The genes and proteins of the GLUL and ProC family proteases were upregulated to catalyze the biosynthesis of glutamine and phenylalanine, respectively. In addition, the accumulation of cysteine residues promotes the formation of intermolecular disulfide bonds between HMW-GS and LMW-GS and folding into the protein polymers of different sizes 11 .
In this study, we plotted a regulatory pathway map to elucidate changes in glutenin in the grain synthesis pathway. Eleven DEPs in transcription and RNA splicing pathways were upregulated in NIL-723 grains, and one DEG (SF3B4) in RNA splicing pathways was upregulated. SF3B4 is a transcription factor for RNA splicing and a cofactor that binds to proteins on the endoplasmic reticulum (ER) that plays a key role in splicing and enhancing the translation of specific proteins in plants 34 . A total of 18 DEPs and 9 DEGs associated with ribosomes were detected, of which only 3 proteins (RPL5, RPL8, and RPL17) and 4 genes (RPL5, RPL10, RPL23, and RPL38) were upregulated in NIL-723. The RPL5 transcript and protein were upregulated in the ribosomes of NIL-723, and RP protein mediates selective translation 35 , suggesting that RPL5 may be necessary for the translation of specific glutenins. EIF3 is the most complex translation initiation factor and plays a role in translation elongation and enhances the synthesis of proteins associated with membrane function in protein synthesis 36 . EIF3. A total of nine DEPs and three DEGs associated with ER protein processing were detected. SWP1, an anchor subunit of ribosomes, is mutated in Arabidopsis with reduced protein glycosylation and severe dysplasia 37 . HSPA5 and HYOU1 (BiP) are the homologous proteins of the Hsp70 family of chaperone proteins in the ER, which are involved in most functions of protein synthesis by complex interactions with multiple substrates and regulatory proteins, such as assisting folding, assembly and disassembly of protein complexes, pulling polypeptides from ribosomes and transmembrane pores, activating and inactivating signaling proteins and controlling their degradation 38 . Zhu et al. 39 found that the expression level of BiP is associated with the subunit type and synthesis of HMW-GS. GANAB, the alpha subunit of glucosidase II, plays a critical role in protein folding and quality control and is recognized by lectin-chaperones, calnexin, and calreticulin 40 . Kimura et al. 41 argued that protein disulfide isomerase (PDI) family proteins in the ER of wheat endosperm cells play an important role in gluten synthesis and folding during grain filling. The upregulation of PDIA3 and PDIA1 in NIL-723 indicates that they play a positive regulatory role in gluten protein synthesis. At the same time, CALR is a CRT family protein that regulates calcium homeostasis and can promote gluten protein folding and synthesis in the ER via Hsp70 and PDI. Additionally, Wang et al. 19 validated that Sec24 is associated with the accumulation of glutenin precursors and plays an essential role in mediating COPII vesicle formation and promoting the transport of secreted proteins in plant cells. The present study found that Sec24 and Sec31 are upregulated in NIL-723.
The genes associated with wheat quality found in the present study provide a reference for the artificial regulation of wheat quality using genetic engineering techniques, which may have potential applications in agriculture. However, the functions of most DEPs or DEGs found in the present study are still largely unknown, and we will continue to investigate the function of these genes.

Conclusions
The combined analysis of the transcriptome and proteome produced a more complete regulatory map for the wheat glutenin synthesis pathway, which has deepened our understanding of wheat glutenin synthesis and the networks involved. In addition, genes related to wheat quality identified in this study provide a reference for future studies, and genetic engineering techniques could be used to artificially regulate wheat quality, which may have potential applications in agriculture.

Materials and methods
Plant materials. The near-isogenic lines (NILs) were generated by backcrossing Taigu genomic sterile population and high-quality cultivar Shiluan 02-1 for six generations, followed by the three generations of selfing to homozygosity. The near-isogenic lines NIL-723 and NIL-1010, which had the same agronomic traits and gliadins, HMW-GS except for the difference in LMW-GS, were provided by the Institute of Cereal and Oil Crops, Hebei Academy of Agriculture and Forestry Sciences, Shijiazhuang, China. The NIL-723 has superior bread quality, with longer stability time and greater the maximum tensile resistance. In June 2017, the seeds were planted in randomized complete blocks with three replicates at an experimental base of Shijiazhuang, Hebei Province, China (114.84°, 37.59° E N, 50 m). The management conditions and techniques of the field were consistent. At 25 days after flowering, five wheat plants with the same growth vigor were selected from each material in each repetition, and the central part grains of the main spike were collected for transcriptomic and proteomic analyses. The prepared samples and seeds were crushed in liquid nitrogen to avoid tissue oxidation. All experiments were repeated thrice. The present study complies with relevant institutional, national, and international guidelines and legislation.   42,43 . The gene expression level of RNA-Seq was estimated by the reads per kilobase per million mapped read (RPKM) 44 . DESeq2 (v1.26) was employed to analyze the differential gene expression 45 , and genes with p < 0.05 and |log2 ratio|> 1 were identified as differentially expressed genes.
Data processing for proteomics. The samples were ground into powder using a mortar and transferred to a 1.5-ml screw-capped tube. Each sample, after adding 100 μl of lysis buffer (7 M urea, 2 M sulfourea, 0.1% CHAPS, 1 × protease inhibitors), was ultrasonically crushed to extract the total protein and centrifuged at 14,000 × g at 4 °C for 30 min. The supernatant was used for further experiments. With a Pierce BCA protein assay kit (Thermo Fisher Scientific Inc.) using bovine serum albumin (BSA) as a standard, protein concentration was determined at a wavelength of 595 nm. The standard proteins were sequentially added to 96-well microtiter plates, followed by the addition of pure water, and 180 μl of Coomassie Brilliant Blue to calculate the protein concentration of each sample. Protein (350 ug) was loaded on 24 cm IPG strip with a linear gradient (pH 4-7), and 2D gel electrophoresis of SDS-PAGE was performed with 12.5% gel 46 . After SDS-PAGE electrophoresis, the proteins were stained for 2 h with 0.05% Coomassie Brilliant Blue (Sigma, USA) and then with a decoloring solution. After completely destaining and washing, Imagemaster 2D Platinum Software Version 5.0 (GE Healthcare, USA) was used to detect spots with the parameters smooth, minimum area and saliency set to 2, 15 and 8, respectively, followed by manual spot editing, such as artificial spot deletion, splitting and merging. The tryptic digestion and in-gel digestion of gel pieces were performed according to the method previously described by Shevchenko et al. 47 . The digested peptides were separated using Thermo Scientific EASY-nLC 1000 system (Nano HPLC) and detected by the Orbitrap Fusion mass spectrometer (Thermo Scientific) 46 . The full-scan mass spectrometry was obtained in the mass spectrometry over 350-1800 m/z with a resolution of 700,000 and a spectral resolution of 17,500, and the normalized collision energy was set to 29%. The MS raw data were processed using the MaxQuant software (Max-Planck-Institute of Biochemistry, Martinsried, Germany, https:// www. maxqu ant. org). The MS data were searched against the wheat genome database (http:// plants. ensem bl. org/ Triti cum_ aesti vum/ Info/ Index). The parameters of MaxQuant searches were set as follows: carbamidomethylation (C) and oxidization (M) were set as static and dynamic modifications, respectively; precursor and fragment ion mass tolerance were 15 ppm and 20 mmu, respectively; and max missed cleavages with ≤ 2 were permitted. The threshold of the global false discovery rate (FDR) for peptides and proteins was set to 0.01. After protein identification, the peptide signal intensities were used to calculate the strength of each identified protein. According to the MaxLFQ label-free quantification method, retention time alignment, labelfree quantification, and MaxLFQ normalization were performed as previously described 48 . The identification transfer protocol was performed in the experimental replication to extract quantitative information across the replication. The peak intensity from the complete set of measurements was compared by Perseus to obtain quantitative data for all peptides in the sample (version 1.4.1.3). The LFQ (label-free quantification) protein intensity obtained through MaxQuant analysis was introduced and converted to a two-based logarithmic scale. The lowest intensity value is used to replace the missing value to compensate for the low signal of low abundance protein. A two-way Student's t-test was used for protein quantitative and statistical significance analysis. Protein species with fold-change > 1.5 and FDR-adjusted (p-value < 0.05) were identified as differentially expressed protein (DEPs) between the experimental groups. FDR-adjusted (p < 0.05) was as corrected by the Benjamini-Hochberg procedure, and the mass spectrometry proteomics data were deposited to the ProteomeXchange Consortium (http:// prote omece ntral. prote omexc hange. org) via the iProX partner repository with the dataset identifier PXD020200.

Bioinformatics analysis.
The bioinformatics analysis of protein species was performed according to Lin et al. 27 , and the biological processes, molecular functions, and cellular components of DEGs and DEPs were analyzed using Blast2GO (version 5.2) (http:// www. blast 2go. com) 49 . Moreover, DEGs and DEPs were investigated using the Kyoto Encyclopedia of Genes and Genomes (KEGG) to map to the possible KEGG pathway maps for the biological interpretation of systemic functions (https:// www. kegg. jp/) 50 . Protein-protein interactions were obtained from the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database (http:// stringdb. org/) containing known and predicted physical and functional protein-protein interactions 51 . The multiple sequence alignment was performed using MAFFT (v7.450) 52 .

Quantitative real-time PCR (RT-qPCR).
According to the manufacturers' instructions, total RNA was isolated using the Trizol reagent (Invitrogen, Grand Island, NY, USA). The first cDNA was synthesized by the M-MLV reverse transcriptase (Takara, Tokyo, Japan). The Primer Premier 6.0 (Premier Biosoft International, Palo Alto, CA, USA) was used to design specific primers. The reaction was carried out in 20 µL containing 1 µL cDNA, 10 mM Tris-HCl (pH 8.5), 50  com/ produ cts/ spss-stati stics). One-way analysis of variance(ANOVA) was used to evaluate the significance of differences between NIL-723 and NIL-1010. Experimental data was represented by the mean ± standard deviation (SD). The Student's t-test was used to evaluate the significance of the differences among NILs. In all cases, p < 0.05 was indicated statistically significant. Unsupervised PCA (principal component analysis) was performed by statistics function prcomp within R 3.5.0 (www.r-proje ct. org) 31,32 . The data was unit variance scaled before unsupervised PCA. www.nature.com/scientificreports/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.