Introduction

Transgenic technology has a significant influence on the development of our society1,2. Since the commercialization of genetically modified crops (GMCs) 20 years ago, GMCs have delivered substantial benefits to farmers and consumers at the agronomic, environmental, economic, health and social levels3. The area under cultivation of GMCs is increasing worldwide every year, with the area totaling more than 189.8 million hectares in 20174. This increase is occurring because more consumers are willing to accept GM food5. However, GMCs are still at the center of intense debates and have become a source of anxiety in developing countries6,7, probably due to worries about their unintended, unexpected and uncontrolled negative effects8. To address this controversy, most people largely rely on scientific risk assessments of GMCs from the government.

To detect the potential unintended effects of GMCs, a nontargeted approach is needed to survey the plants more broadly. Omics-based global profiling, such as transcriptomics, proteomics and metabolomics, is one of the more informative and cost-efficient analytical methods and thus may be a useful technique9,10. Among these profiling approaches, proteomic analysis is a direct method for investigating unintended effects at the protein level. Proteins are key players in gene function and can act as toxins, antinutrients or allergens; therefore, they are of special concern in safety assessments of GM crops. By comparing the entire proteomes of GM crops and control lines, unintended effects can be evaluated at the protein level. Comparative proteomic analysis has been widely used to evaluate the unintended effects of GMCs11,12,13,14,15. Over the past 20 years, proteomics has been substantially improved in many aspects16. With the rapid development of proteomic technology, comparative proteomic approaches coupled directly with tandem mass spectrometry (MS) technology have been widely used to detect the unintended effects of GMCs8. The two-dimensional electrophoresis (2-DE) technique has been widely used in proteomics research for decades17. Recently, the second-generation proteomic technique iTRAQ has been widely used in comparative proteomic analyses because of its accuracy and reliability18,19,20. However, 2-DE still has advantages, and many highly abundant unique proteins can be easily detected by both 2-DE and iTRAQ techniques21. Therefore, the 2-DE and iTRAQ approaches currently represent two major techniques used in comparative proteomics22.

Maize is an important crop worldwide, and many maize biotechnologies have been approved; as a result, GM maize is the second largest transgenic crop in terms of planting area, reaching 59.7 million hectares in3. The evaluation of unintended effects in transgenic maize through proteomics is mainly performed on MON810 maize varieties because of their potential commercial value8. Studies have shown some differences between GMCs and their control lines, but the observed differences are not substantial13,23,24,25,26. In recent years, the number of stacked biotechnology events has increased, and the cultivation of transgenic crops with stacked traits has also rapidly increased27. Unintended effects of a stacked commercial maize hybrid were examined at the proteomics level. Compared to single-event hybrids in the same genetic background, stacking two transgenic inserts may impact the overall expression of endogenous genes and may have relevance for safety assessments28.

Phytase-over-expressing maize has been approved as a potential biosafe material. The transgenic maize line BVLA430101 specifically expresses the 60 kDa phyA2 protein in its seeds29. We previously compared the leaf proteomes of the phytase-transgenic (PT) maize and a nontransgenic (NT) isogenic variety using a routine 2-DE and MS-based method30. Recently, we also used both 2-DE-MS/MS and iTRAQ-based methods to identify the quantitative proteomic differences between PT and NT maize seeds grown in a greenhouse21. Some differentially accumulated proteins (DAPs) were detected, but the proteomic patterns were not substantially different between PT maize and the NT type. In the present study, we used 2-DE with MS to compare the proteomes of PT and NT maize seeds grown in the field and under greenhouse conditions. Our results may provide more insights into the unintended effects of environmental factors on protein profiles.

Results

Comparison of protein profiles between field-grown PT and NT maize

The 2-DE maps of total proteins from field-grown PT and NT maize seeds were obtained as previously described30. Analysis of the protein profiles of PT and NT maize seeds revealed a total of 1027 ± 121 spots in NT maize seed gel maps and 1228 ± 284 spots in PT maize seed gel maps (Figs 1; S1). There were approximately 1079 matched spots between NT and PT maize seed gel profiles. Only those spots showing changes of >1.5-fold or <0.67-fold and detected in all replicates were determined to be DAPs30. The 2-DE image analysis revealed 37 DAPs (5 higher abundance spots and 32 lower abundance spots compared with those in NT maize) between PT and NT maize seed samples grown in the field (Table S2).

Figure 1

Typical 2-DE gels of total proteins from maize seeds. The identified 30 DAPs between PT and NT maize seeds, including 3 increased ones (A) and 27 decreased ones (B) in PT, are indicated with arrows in the 2-DE gels.

Protein identification via MALDI TOF/TOF MS

A total of 37 DAPs were manually excised from colloidal Coomassie Blue (CCB)-stained 2-DE gels for MS/MS analysis, and 30 protein spots were successfully identified (Fig. S2). Among these identified DAPs, 3 were up-regulated proteins, and 27 were down-regulated proteins (Fig. 1). The averaged volume% ratios of the identified protein spots are shown in Tables 1 and S2. The database search for protein identification was based on homology to Zea mays proteins. If one spot was identified as containing more than one protein via MS/MS, then the protein with the highest score was chosen for further functional analysis30. There were 29 unique proteins among the 30 identified protein species, since one protein (glucose-1-phosphate adenylyltransferase large subunit 1) was represented by two spots (Tables 1, S3). The protein that was indicated as an unknown protein was subjected to BlastP (protein-protein Blast) against the National Center for Biotechnology Information (NCBI) database (http://blast.ncbi.nlm.nih.gov/Blast.cgi) to determine its identity.

Table 1 DAPs of maize seeds planted in the field.

A radial chart was used to evaluate the quality of the identified protein spots. The theoretical ratios and experimental ratios of the molecular mass (Mr) were presented in the radial chart as the radial axis labels, and the theoretical ratios and experimental ratios of the isoelectric point (pI) are presented as the annular radial axis labels (Fig. 2A). Approximately 91% of the identified proteins exhibited a relative Mr ratio in the range of 1.0 ± 0.2, and 94.3% of the identified proteins exhibited a relative pI ratio in the range of 1.0 ± 0.2, which suggested that most identified proteins’ experimental Mr and pI values were similar to their theoretical values.
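The quality check described above can be illustrated with a minimal sketch, assuming that each ratio is the experimental value divided by the theoretical value and that a spot passes when the ratio lies within 1.0 ± 0.2. The function names and the example values are hypothetical and are not taken from the study.

```python
def ratio_within_tolerance(experimental, theoretical, tolerance=0.2):
    """Return True if experimental/theoretical lies within 1.0 +/- tolerance."""
    ratio = experimental / theoretical
    return abs(ratio - 1.0) <= tolerance


def fraction_within_tolerance(pairs, tolerance=0.2):
    """Fraction of (experimental, theoretical) pairs whose ratio is in range."""
    hits = sum(ratio_within_tolerance(e, t, tolerance) for e, t in pairs)
    return hits / len(pairs)


# Hypothetical (experimental, theoretical) Mr values in kDa, not from the paper.
mr_pairs = [(70.0, 71.2), (48.5, 46.9), (30.0, 52.0), (25.1, 24.0)]
print(f"{fraction_within_tolerance(mr_pairs):.0%} of spots within 1.0 +/- 0.2")
```

The same helper could be applied to pI values; a spot whose ratio falls far outside the range (such as the third hypothetical pair) may indicate a protein fragment or a modified form rather than a misidentification.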

Figure 2

Classification and protein-protein interaction analysis of the identified DAPs. The theoretical and experimental ratios of the molecular mass (Mr) and isoelectric points (pI) of the 30 identified DAPs are presented in the radial chart (A). Functional classification was produced by COG, and the results are provided as the percent proportion (%) of each functional category in all identified DAPs (B). The abbreviations in the figures are as follows: CTM, carbohydrate transport and metabolism; RPM, RNA processing and modification; PTM, posttranslational modification, protein turnover, chaperones; TRB, translation, ribosomal structure and biogenesis; LTM, lipid transport and metabolism; CMB, cell wall/membrane/envelope biogenesis; EPC, energy production and conversion; ATM, amino acid transport and metabolism; NTM, nucleotide transport and metabolism; SMC, secondary metabolites biosynthesis, transport, and catabolism; FUK, function unknown. The hidden disconnected nodes in the protein-protein interaction networks are shown in the five tightly connected clusters after MCL clustering (C).

Bioinformatics analysis of the identified DAPs

The identified DAPs were grouped according to their main biological activities as defined by the functional catalogue of Clusters of Orthologous Groups (COG) of proteins. COG functional analysis classified 25 of the identified proteins into 10 major categories, among which “posttranslational modification, protein turnover, chaperones” was the largest group (group PTM, containing 23% of the DAPs), followed by “energy production and conversion” (group EPC, 13.5% DAPs), and “amino acid transport and metabolism” (group ATM, 13.5% DAPs). The remaining categories were “RNA processing and modification” (group RPM, 7% DAPs), “carbohydrate transport and metabolism” (group CTM, 7% DAPs), “cell wall/membrane/envelope biogenesis” (group CMB, 7% DAPs), “nucleotide transport and metabolism” (group NTM, 3% DAPs), “secondary metabolites biosynthesis, transport, and catabolism” (group SMC, 3% DAPs), “lipid transport and metabolism” (group LTM, 3% DAPs), and “translation, ribosomal structure and biogenesis” (group TRB, 3% DAPs); 5 proteins could not be classified through COG classification (Fig. 2B, Table 1).

To predict protein-protein interaction networks, the 29 identified unique proteins were subjected to STRING (v10.5) analysis online (http://string-db.org) at the high confidence setting. Among these proteins, 13 (3 up-regulated and 10 down-regulated) were involved in protein-protein interactions. With the disconnected nodes hidden, the network contained five tightly connected clusters after MCL clustering (Fig. 2C). The red cluster contained 4 proteins: HSP70, EIF4A, EF2, and RuBisco β. HSP70 and EF2 were the most interactive proteins in these interaction networks, associating with three other proteins. The yellow cluster was the next largest, with 3 proteins, and each of the 3 remaining clusters contained two proteins that interacted with each other. Among these proteins, four interacting proteins were mainly related to “posttranslational modification, protein turnover, chaperones”, while three interacting proteins were related to “energy production and conversion” among the COG categories.

To identify the significantly enriched Gene Ontology (GO) functional groups of the identified DAPs in the cellular component, biological process, and molecular function categories, GO annotation was further conducted through an online search using WEGO software (http://wego.genomics.org.cn/cgi-bin/wego/index.pl). GO information was obtained with BLAST2GO. The results showed that 30 proteins were successfully mapped with GO annotations, which were classified into three ontologies containing 43 functional groups (Fig. 3A). In the cellular component ontology, 11 GO terms were obtained, including the cell part category (GO: 0044464), which contained 38.7% of the proteins. For the molecular function ontology, 11 GO terms were found, and the major functional groups were binding (GO: 0005488), containing 44.7% of the proteins, and catalytic activity (GO: 0003824), containing 35% of the proteins. In the biological process ontology, 21 GO terms were assigned. The major functional group was metabolic process (GO: 0008152), including 53.6% of the proteins, followed by cellular process (GO: 0009987) with 51.2% of the proteins.

Figure 3

GO annotation and pathway analysis of the identified DAPs. The identified 30 DAPs between the PT and NT maize seeds planted in the field were subjected to GO (A) and KEGG (B) analyses. The abbreviations for the KEGG pathways are as follows: PU, purine metabolism; CF, carbon fixation; TM, thiamine metabolism; GM, glutathione metabolism; PM, pyruvate metabolism; CM, cysteine and methionine metabolism; SM, starch and sucrose metabolism; SB, streptomycin biosynthesis; AG, alanine, aspartate and glutamate metabolism; AN, amino sugar and nucleotide sugar metabolism.

A Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis of the identified DAPs was performed using the BLAST2GO 4.0 program to investigate their biological functions. The results showed that a total of 18 proteins (58%) were mapped to 28 pathways in the KEGG database. The most represented pathway was “purine metabolism”, which contained five sequences (spots 3, 5, 11, 15 and 27). The other major pathway was “carbon fixation pathways” which contained three sequences (spots 4, 13 and 30). Two proteins were involved in each of the following pathways: “thiamine metabolism”, “starch and sucrose metabolism”, “glutathione metabolism”, “pyruvate metabolism”, “cysteine and methionine metabolism”, “streptomycin biosynthesis”, “alanine, aspartate and glutamate metabolism”, and “amino sugar and nucleotide sugar metabolism”. The remaining pathways contained only one protein sequence (Fig. 3B, Table S4).

Comparison of the protein accumulation and gene expression patterns

We selected ten identified proteins for qRT-PCR analysis to validate the expression patterns of their corresponding genes. To obtain the PT/NT fold-change ratios, the transcript level of the NT maize template was set to 1.0. The changes in the protein accumulation and mRNA expression levels of the selected identified proteins are shown in Fig. 4. Most of the proteins exhibited similar changes at the translational and transcriptional levels; only one down-regulated protein (glutathione transferase 41, spot 20) showed no difference at the transcriptional level. Such inconsistency between the patterns of change in protein accumulation and mRNA expression levels was described in our previous studies21,30; this phenomenon probably resulted from the presence of various posttranslational modifications31.

Figure 4

Comparison of the changes in the identified DAPs at protein abundance and gene expression levels. The selected protein spots in the 2-DE gel profiles of NT and PT maize seeds are highlighted (A). The mean abundance values (Vol%) of these spots were calculated (B). Results of qRT-PCR analysis of the corresponding gene expression patterns of the identified proteins are shown in column (C). The gray dotted line in each qRT-PCR bar chart represents a 1.0 ratio value. Error bars represent the standard deviation (SD) among three replicates. The comparison showed that almost all the examined genes and proteins exhibited a similar pattern in the maize seeds.

Comparison of protein profiles in maize seeds from different environments

We previously identified quantitative differences in the protein profiles between greenhouse-planted PT and NT maize seeds using both the traditional 2-DE and the newly developed high-throughput iTRAQ-based approaches21. Here, we compared the protein profiles between field-planted PT and NT maize seeds using the traditional 2-DE approach. To analyze the effects of different planting environments on the PT maize seeds and the control, we further compared the 2-DE gel profiles of maize seeds planted in the field or in a greenhouse (Figs 5, S1, Table 2). The protein spots with changes >1.5-fold were termed DAPs. There were 76 DAPs between the NT maize seeds grown in the two different environments, including 45 up-regulated protein spots in the greenhouse and 31 up-regulated protein spots in the field (Fig. 5A,B, Table S5). Seventy-seven DAPs were detected in the PT maize seeds, with 32 up-regulated protein spots in the greenhouse and 45 up-regulated ones in the field (Fig. 5C,D, Table S6). However, as mentioned above, after comparing the 2-DE profiles of PT and NT maize seeds in the same planting environment, there were only 43 DAPs (PT/NT, planted in the greenhouse)21 or 37 DAPs (PT/NT, planted in the field). These results indicated that the growth environment was more important than the gene modification itself for the protein profiles in maize seeds.

Table 2 Comparison of the DAPs of maize seeds planted under different conditions.
Figure 5

Typical 2-DE gels of the proteins from maize seeds under different growth environments. The proteins from seeds of NT plants grown under greenhouse (A) and field (B) conditions, as well as PT plants in the greenhouse (C) and the field (D), were subjected to 2-DE, and the DAPs in typical 2-DE gels are highlighted. Arrows indicate the protein spots with an increased abundance in each sample.

Discussion

Many DAPs in the field-grown maize seeds were posttranslational modification-related chaperone proteins

The 30 identified DAPs were obtained by comparing PT and NT maize seeds collected from the field. COG functional classification showed that the largest group (23% of the DAPs) was associated with “posttranslational modification, protein turnover, chaperones”, and included HSP70, ubiquitin carboxyl-terminal hydrolase, glutathione transferase, and ubiquitin-conjugating enzyme. Under field conditions, plants are vulnerable to various stresses, such as drought, disease and insect pests. Posttranslational modification proteins may play important roles in the response to abiotic stresses32. As a chaperone protein, HSP70 promotes the degradation of aberrant proteins, prevents the aggregation of denatured proteins and promotes proper folding of denatured proteins33,34. Ubiquitination is an important process in all eukaryotic cells, and the ubiquitin proteasome pathway participates in many aspects of the regulation of eukaryotic cells through the degradation of proteins in such cells35,36. Ubiquitin-conjugating enzyme E2 catalyzes the transfer of ubiquitin to substrate proteins destined for proteolysis37.

Environmental influence is more important than gene insertion

In evaluating the unintended effects in GMCs, an important factor to consider is the impact of environmental conditions during maize planting38. We previously compared the proteomes of PT maize seeds and a control planted in the same greenhouse, which eliminated variation related to environmental effects21. In contrast, comparing the proteomic profiles of the same variety (NT/NT, PT/PT) grown under different environmental conditions eliminates any variation related to the genome alteration, so that the remaining differences in maize seed proteomic profiles reflect environmental effects26.

In a comparison of the seed proteome profiles of the same variety grown under different environmental conditions, e.g., in a greenhouse or the field, DAPs would be related to the environmental impact, because the genomes of NT or PT maize seeds did not differ between the greenhouse and the field. The 2-DE gel maps of NT maize seeds revealed 76 DAPs between the greenhouse- and field-planted seeds, and similarly, there were 77 DAPs in PT maize seeds between greenhouse- and field-planted samples. However, under the same growth conditions, there were only 43 DAPs (greenhouse) or 37 DAPs (field) when PT maize was compared with NT maize. These data indicate that the insertion of exogenous genes can lead to plant genomic changes causing DAPs, but that the influence of the environment on protein profiles (numbers of DAPs) is stronger than the influence of exogenous genes. In addition, comparative proteomics of NT maize seeds planted in a greenhouse vs. in the field also revealed that the occurrence of unintended effects is not specific to GM crops. This is a common inherent phenomenon, as it often occurs in the traditional breeding of crops. The stronger impact of the environment relative to gene insertion is consistent with previous reports26,39. Previous observations also indicated that transgenes have very limited unintended effects, while large differences were observed between lines produced by conventional breeding40,41,42,43.

To clearly understand whether PT maize causes unintended effects, we systematically compared the proteomes of seedling leaves and seeds between PT and NT maize grown under controlled conditions21,30. We detected insubstantial differences between the seeds of PT maize and those of NT maize. In this study, we further compared the proteomes of PT and NT maize seeds planted under field conditions, and 30 DAPs were successfully identified in these samples. COG functional classification showed that the largest group was associated with “posttranslational modification, protein turnover, chaperones”. In addition, we compared the seed proteome profiles of the same maize variety grown in different environments. Our results revealed that the number of DAPs caused by the environment was much greater than that caused by the insertion of exogenous genes. Thus, the environment had more important effects on seed protein profiles than exogenous gene insertion. The occurrence of unintended effects is not specific to GM crops, and it often occurs in traditional breeding. Our comparative proteomics techniques serve as an exploratory method to assess the safety of GM maize seeds. A limitation of this study is that the proteomic comparison of maize seeds was carried out for only one season of field planting. The proteome is highly dynamic and can be changed by the cell cycle, environmental influences, and tissue/cell types44. Therefore, the proteomes of long-term- and multi-season-planted maize seeds need to be further compared. In conclusion, the proteomics data of PT maize seeds provide additional information that will be beneficial for the biosafety assessment of PT maize in the future.

Materials and Methods

Plant materials and growth conditions

The phytase-transgenic maize variety is 10TPY006 (PT maize), and the corresponding near-isogenic variety is the conventional hybrid LIYU16 (NT maize). PT and NT maize seeds were provided by Beijing Origin Seed Technology, Inc. The genetic background of the materials was as previously described30. First, conventional maize (LIYU91158 and LIYU953) was crossed with the phyA2 transgenic maize line BVLA430101, and a phyA2-insertion event was introduced into the LIYU91158 and LIYU953 backgrounds. Then, the LIYU91158 and LIYU953 transgenic lines were backcrossed six times to their recurrent parents to minimize genetic background mixing, and two self-pollinations were performed to obtain homozygous plants (OSL931 and OSL930) of each inbred line. Because its DNA was similar to that of LIYU16, the GM line LIYU006 was further derived by crossing OSL931 and OSL930. In the same manner, the NT line of LIYU16 (used as a non-GM control) was derived by crossing the LIYU91158 and LIYU953 inbred lines as described in our previous study30. Materials were planted at the experimental base of the Institute of Tropical Biosciences and Biotechnology (E: 110°45′42″; N: 19°32′18″). These PT and NT seeds were planted side-by-side in the field, and each line was planted in three microplots to represent three replicates. These maize seeds were planted in the same experimental sites as those grown in a greenhouse21. After sowing, the plants were treated according to local agricultural practices. Ears of each microplot were harvested at the same time on the same day when physiologically mature and immediately stored at −80 °C for further study.

Protein extraction

For comparative proteomic analysis, central seeds of each ear were ground into fine powders in liquid nitrogen using a mortar and pestle. Semiquantitative RT-PCR and western blotting analysis were conducted to detect the expression of exogenous genes and the accumulation of target proteins as described previously21,30.

Proteins from three biological replicates of PT and NT seeds were extracted using a modified Borax/PVPP/Phenol (BPP) protein extraction method as described previously21,45. Approximately 3 g of frozen maize seed fine powder was resuspended in precooled extraction buffer. After adding an equal volume of Tris-saturated phenol (pH 8.0), the mixtures were centrifuged. Then the upper phase was transferred into a new centrifuge tube and clarified twice. Protein precipitates were obtained after adding ammonium sulfate-saturated methanol. The proteins were quantified according to the Bradford method for the following experiments or were stored at −20 °C.

2-DE

2-DE was performed on an Ettan IPGphor isoelectric focusing system according to the manufacturer’s instructions (2-DE Manual, GE Healthcare, Uppsala, Sweden). Immobilized pH gradient (IPG) strips (24 cm) with a linear pH gradient of 4–7 (GE Healthcare) were used, approximately 1,300 µg of protein sample was loaded on each strip, and 12.5% sodium dodecyl sulfate (SDS) polyacrylamide gels were used for SDS-polyacrylamide gel electrophoresis (SDS-PAGE). Each protein extract was run on 2-DE gels in triplicate as technical replicates. The experimental procedures were as previously described30.

Gels were stained using a GAP staining method46 and scanned with the ImageMaster Labscan V3.0 (GE Healthcare). Image analysis was conducted using the ImageMaster 2D Platinum software package (GE Healthcare). Only the spots that were present in all replicate gels and showed a Student’s t test p-value < 0.05 and a relative change in quantity of at least 1.5-fold were considered DAPs for further analysis30.
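The three selection criteria can be illustrated with a minimal sketch for a single spot. It assumes three replicate Vol% values per group, an equal-variance Student's t test, and a tabulated two-sided critical value (2.776 for p = 0.05 with 4 degrees of freedom) in place of an exact p-value; the function names and data layout are hypothetical, and the actual analysis was performed in ImageMaster 2D Platinum.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided critical t value for p = 0.05 with 4 degrees of freedom
# (3 + 3 replicates). Assumption: equal-variance Student's t test.
T_CRIT_DF4 = 2.776


def student_t(a, b):
    """Pooled-variance Student's t statistic for two samples."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled == 0:
        # No within-group spread: any difference in means is unbounded.
        return float("inf") if mean(a) != mean(b) else 0.0
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def is_dap(pt, nt, fold=1.5, t_crit=T_CRIT_DF4):
    """Apply the three criteria to one spot; None marks an undetected replicate."""
    if None in pt or None in nt:
        return False  # spot must be present in all replicate gels
    ratio = mean(pt) / mean(nt)
    if not (ratio >= fold or ratio <= 1 / fold):
        return False  # change must be at least 1.5-fold in either direction
    return abs(student_t(pt, nt)) > t_crit  # equivalent to p < 0.05 at df = 4


# Hypothetical Vol% values for three replicates of one spot.
print(is_dap([0.30, 0.32, 0.31], [0.15, 0.16, 0.14]))  # consistent 2-fold change
print(is_dap([0.10, 0.50, 0.90], [0.20, 0.30, 0.10]))  # large but noisy change
```

The second call shows why the t test matters: a 2.5-fold difference in means is rejected because the replicate values are too variable.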

Protein identification in 2-DE Gels via MALDI TOF MS

DAPs were manually excised from 2-DE gels, washed with MilliQ water, and then destained using a destaining solution containing 50 mM NH4HCO3 and 50% acetonitrile (ACN). After air drying, in-gel digestion with bovine trypsin (Trypsin, Roche, Cat. 11418025001) was performed as previously described47.

The digested peptides were analyzed by peptide mass fingerprinting (PMF) using an AB SCIEX matrix-assisted laser desorption/ionization time-of-flight (MALDI TOF) 5800 system (AB SCIEX, Shanghai, China) equipped with a neodymium laser (wavelength of 349 nm). Mass spectra were obtained as previously described48 and searched against the Zea mays amino acid sequence database (including 87,603 sequences) using in-house MASCOT software for protein identification. The search parameters were set as described30. If peptides matched multiple proteins, the protein with the highest score was selected for bioinformatics analysis. For unknown proteins, a BLAST search was performed in NCBI (http://www.ncbi.nlm) to identify homologous proteins.

Bioinformatics analysis

Functional annotation of the identified DAPs was performed as follows. COG analysis of DAPs was conducted for functional classification through an online search (http://eggnogdb.embl.de/). GO classification was carried out online using WEGO software according to GO terms as described (http://wego.genomics.org.cn)49. In addition, KEGG pathways were analyzed using Blast2GO 4.0 software to predict the main reaction networks in which the DAPs were involved. Finally, protein-protein interactions were analyzed using the STRING database (version 10.5) online (http://string-db.org), and the network was clustered using a specified MCL inflation parameter.

qRT-PCR analysis

Total RNA was isolated from maize seeds with TRIzol reagent (CWBIO, Beijing, China), and cDNA was generated with a reverse transcriptase kit (TaKaRa, Tokyo, Japan) for quantitative real-time RT-PCR. Approximately 20 μL of mixed solution was prepared for qRT-PCR reaction using SYBR Green PCR Master Mix (TaKaRa, Tokyo, Japan), and the reactions were performed on an Mx3005P sequence detection system according to the manufacturer’s instructions. The maize endogenous gene zSSIIb was used as an internal control to normalize the amount of template cDNA. qRT-PCR primer pairs were designed with Primer 5.0 software (Table S1). Data were analyzed with MxPro software (version 4.10).
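The text does not state the formula used to convert threshold cycle (Ct) values into PT/NT ratios. As an illustration only, the sketch below assumes the commonly used 2^(-ΔΔCt) method with 100% amplification efficiency, with zSSIIb as the reference gene and NT maize as the calibrator (ratio 1.0). The function name and the Ct values are hypothetical.

```python
def fold_change(ct_target_pt, ct_ref_pt, ct_target_nt, ct_ref_nt):
    """PT/NT expression ratio by the 2^(-ddCt) method (assumed, not from the paper).

    ct_target_*: Ct of the gene of interest; ct_ref_*: Ct of the reference
    gene (zSSIIb in this study) measured in the same sample.
    """
    delta_pt = ct_target_pt - ct_ref_pt  # normalize PT to the reference gene
    delta_nt = ct_target_nt - ct_ref_nt  # normalize NT to the reference gene
    return 2 ** -(delta_pt - delta_nt)   # NT calibrator corresponds to 1.0


# Hypothetical mean Ct values: the target amplifies one cycle later in PT.
print(fold_change(24.0, 20.0, 23.0, 20.0))
```

With identical Ct values in both samples the function returns 1.0, which corresponds to the dotted reference line in the qRT-PCR bar charts.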