Introduction

Embryo abortion is very common in plants and has attracted considerable attention in plant biology, as it not only seriously reduces the seed and fruit production of many crops but also negatively affects the efficiency of plant cross breeding1,2. To solve this important problem, many researchers have carried out a vast number of studies in the past several decades to investigate factors that cause plant embryo abortion3,4. However, most of these studies have mainly focused on morphological, anatomical and physiological levels due to the limitations of traditional biological techniques. Few studies have been reported that examine the factors contributing to plant embryo abortion at a molecular level. Therefore, the mechanisms underlying plant embryo abortion are poorly understood.

The recent development of modern molecular biological techniques, such as novel high-throughput sequencing technologies including Solexa/Illumina RNA sequencing (RNA-seq) and a new proteomics approach (i.e., isobaric tags for relative and absolute quantitation, iTRAQ) may provide a promising means to address the genes and proteins associated with plant embryo abortion5,6,7. In particular, transcriptome sequencing is a very powerful method that produces functional genomic data for non-model organisms without prior information about the entire genome8,9. This novel technique has been extensively applied to many plant species, but surprisingly, few studies on embryo abortion have been reported to date10,11.

The chrysanthemum (Chrysanthemum morifolium) is the second most important flower in the world and plays a vital role in the flower production industry12. More importantly, this species is also grown as a medicinal tea and nutritious vegetable13. Thus, chrysanthemum is an economically important crop both in China and the rest of the world. However, a major problem exists in chrysanthemum cultivation: most chrysanthemum cultivars are only weakly resistant to biotic and abiotic stresses13,14. To improve chrysanthemum resistance, breeders often carry out wide crosses between chrysanthemum cultivars and wild chrysanthemum species with high levels of tolerance to biotic and abiotic stresses. However, reproductive barriers often exist in these crosses and, as a consequence, breeding efficiency is considerably reduced15. Our previous investigation indicated that embryo abortion is the main reproductive barrier responsible for the failure of these crosses15,16. So far, studies have focused on causes of embryo abortion such as maternal genotype, endosperm nutrition17 and the sensitivity of the endosperm to parental genome dosage imbalance, which can enhance or repress seed development18,19,20. To our knowledge, no study has investigated the molecular events occurring in embryo abortion during cross breeding of chrysanthemum or any other crop, and thus little information is available on the genes and proteins involved. Therefore, it is necessary to examine the components implicated in chrysanthemum embryo abortion at the molecular level, which will increase our understanding of the molecular events involved in embryo abortion of chrysanthemum and other crops.

Transcriptome analysis has been instrumental in identifying differentially expressed genes (DEGs) implicated in various biological processes21,22. However, it cannot capture the key proteins implicated in the same processes. Therefore, in addition to transcriptome analysis by RNA-seq, we also investigated proteins associated with chrysanthemum embryo abortion using iTRAQ, which has mainly been used in proteome analyses of animals and microorganisms23,24,25. The objectives of this study were to identify functional genes and proteins involved in embryo abortion and to highlight molecular mechanisms related to chrysanthemum embryo abortion.

Results

Embryo development and abortion

Nearly all ovaries were normal in morphology just before pollination (Fig. 1a, e). At 12 days after pollination (DAP), most chrysanthemum embryos had reached the globular embryo stage and 89.7 ± 3.1% of embryos were normal at this developmental stage (Fig. 1b, f, i). Subsequently, many embryos began to degenerate. At 18 DAP, only 43.3 ± 1.8% of embryos were normal, most of which had developed to the heart-shaped embryo stage (Fig. 1c, g, j), while the remaining embryos underwent degeneration (Fig. 1d, h, k). These results indicate that the period from the globular to the heart-shaped embryo stage was the key developmental period for chrysanthemum embryo abortion in the cross.

Figure 1

Morphological and anatomical features of chrysanthemum ovaries and embryos.

Chrysanthemum ovaries just before pollination (a, e). Normal chrysanthemum ovaries (b, f) and embryo (i) at 12 days after pollination (DAP). Normal chrysanthemum ovaries (c, g) and embryo (j) at 18 DAP. Abortive chrysanthemum ovaries (d, h) and embryo (k) at 18 DAP. Bars = 400 μm (a–d), 200 μm (e–h), 100 μm (i–k).

High-throughput transcriptome sequencing and read assembly

To study transcriptomic gene expression during embryonic development, three cDNA libraries from normal embryos at 12 DAP (NE12), normal embryos at 18 DAP (NE18) and abortive embryos at 18 DAP (AE18) were subjected to Illumina sequencing (Table 1). After removing invalid reads and data cleaning, we obtained 152,347,304 clean reads containing a total of 13,711,257,360 nucleotides (13.7 Gb) with a mean length of 90 bp. The Q20 percentage (proportion of nucleotides with quality value larger than 20 in reads) and GC (guanine and cytosine) percentage were 97.6% and 43.6%, respectively. After clustering the high-quality reads, we finally obtained 615,312 contigs and 116,697 unigenes. The unigenes with an average length of 743 bp included 52,860 clusters and 63,837 singletons.
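The totals quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch that uses only the figures reported in this paragraph; the variable names are illustrative and are not part of the original analysis pipeline.

```python
# Figures quoted in the text above; variable names are illustrative only.
clean_reads = 152_347_304
mean_read_length_bp = 90
clusters = 52_860
singletons = 63_837

# Total sequenced nucleotides = number of clean reads x mean read length.
total_nt = clean_reads * mean_read_length_bp

# Unigenes are reported as the sum of clusters and singletons.
total_unigenes = clusters + singletons

print(f"total nucleotides: {total_nt:,} ({total_nt / 1e9:.1f} Gb)")
print(f"total unigenes: {total_unigenes:,}")
```

Both derived values agree with the reported 13,711,257,360 nucleotides (13.7 Gb) and 116,697 unigenes.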

Table 1 Summary for chrysanthemum embryo transcriptome

In addition, we compared the RNA-Seq dataset with the EST dataset produced from vegetative tissues of Chrysanthemum nankingense26. The results indicate that 31,674 genes have identical coding sequences (CDS) in both libraries, accounting for 27.14% of the 116,697 unigenes in the chrysanthemum embryo transcriptome.

Gene annotation and functional classification

A total of 116,697 unigenes were annotated using Blastx against protein databases and Blastn against the nucleotide database Nt (E-value < 0.00001). In the Nr, Nt, SwissProt, KEGG, COG and GO databases, 65,327, 43,427, 43,359, 40,363, 25,297 and 49,535 unigenes were aligned, respectively (Table 2).

Table 2 Annotation of unigene sequences in chrysanthemum embryos

To understand the function of unigenes intuitively and effectively, we used the COGs functional annotation system for the assembled unigenes. As a result, 25,297 unigenes have specific protein function definition, accounting for 37.12% of the total annotated unigenes and involving 25 COG functional classes (Fig. S1). The five largest categories were ‘general function’ (8,167, 16.1%); ‘transcription’ (4,729, 9.3%); ‘Replication, recombination and repair’ (4,147, 8.2%); ‘Posttranslational modification, protein turnover, chaperones’ (3,899, 7.7%); ‘translation, ribosomal structure and biogenesis’ (3,756, 7.4%). The following two categories, ‘nuclear structure’ (14, 0.028%) and ‘extracellular structures’ (35, 0.069%) accounted for the lowest percentages.

We used GO assignments to classify the functions of the predicted C. morifolium genes according to the annotation from the Nr database. The results showed that 49,535 unigenes were categorised into 55 functional groups, which could be classified into three main categories (biological process, cellular component and molecular function) containing 24, 15 and 16 functional subcategories, respectively (Fig. S2). In the biological process category, the major subcategories were ‘cellular process’, ‘metabolic process’ and ‘response to stimulus’. In the cellular component category, ‘cell’ and ‘cell part’ were the most prominent subcategories, and in the molecular function category, ‘catalytic activity’ and ‘binding’ were the most prominent. This classification suggests that these biological processes may play a significant role during embryonic development. In contrast, the terms ‘channel regulator activity’, ‘metallochaperone activity’, ‘translation regulator activity’, ‘extracellular matrix part’ and ‘protein tag’ were rare (Fig. S2).

To identify biological pathways activated in the embryos of C. morifolium, the assembled unigenes were annotated against the KEGG database (E-value<0.00001). A total of 40,363 unigenes were assigned to the 129 KEGG pathways (Table S1). The majority of these pathways were ‘metabolic pathways [ko01100]’ (9,241 genes), ‘biosynthesis of secondary metabolites [ko01110]’ (4,736 genes), ‘plant-pathogen interaction [ko04626]’ (2,346 genes) and ‘plant hormone signal transduction [ko04075]’ (1,894 genes). In addition, 824 genes were related to ‘RNA degradation [ko03018]’, 765 genes to ‘ubiquitin mediated proteolysis [ko04120]’ and 260 genes to ‘proteasome [ko03050]’, which were closely related to protein degradation and cell death.

DEGs in three samples

Gene expression levels at the different embryonic developmental stages were calculated using FPKM values27, and DEGs were identified with FDR ≤ 0.001 and |log2(ratio)| larger than 2. To explore gene expression in AE18, which served as the reference point, variations in gene expression were identified in two comparisons: between AE18 and NE12 and between AE18 and NE18. The numbers of up- and down-regulated genes are shown in Venn diagrams28 (http://bioinfogp.cnb.csic.es/tools/venny/index.html). Compared with NE12, 4,399 and 3,868 genes were up- and down-regulated in AE18, respectively. Compared with NE18, 1,310 and 1,010 genes were up- and down-regulated in AE18, respectively. In AE18, 306 and 178 genes were up- and down-regulated, respectively, relative to both NE12 and NE18 (Fig. S3; Tables S2 & S3). These changes in DEGs indicate that transcript abundance differed between developmental stages of the embryo.
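As an illustration of the selection criteria stated above, the following minimal sketch classifies a gene as up-regulated, down-regulated or unchanged from its FPKM values and FDR. The `Gene` record, the `floor` pseudo-value used to avoid division by zero and the function names are assumptions made for this example; they do not describe the software actually used in the study.

```python
import math
from typing import NamedTuple


class Gene(NamedTuple):
    gene_id: str
    fpkm_ref: float   # FPKM in the comparison sample (e.g. NE12 or NE18)
    fpkm_test: float  # FPKM in AE18
    fdr: float        # false discovery rate of the differential test


def log2_ratio(fpkm_test: float, fpkm_ref: float, floor: float = 0.001) -> float:
    # `floor` is a hypothetical pseudo-value that prevents division by zero
    # when a gene is not detected in one of the samples.
    return math.log2(max(fpkm_test, floor) / max(fpkm_ref, floor))


def classify(gene: Gene, fdr_max: float = 0.001, min_abs_log2: float = 2.0) -> str:
    # Thresholds follow the text: FDR <= 0.001 and |log2(ratio)| larger than 2.
    ratio = log2_ratio(gene.fpkm_test, gene.fpkm_ref)
    if gene.fdr > fdr_max or abs(ratio) <= min_abs_log2:
        return "unchanged"
    return "up" if ratio > 0 else "down"


if __name__ == "__main__":
    for g in (Gene("g1", 1.0, 8.0, 1e-4), Gene("g2", 8.0, 1.0, 1e-4)):
        print(g.gene_id, classify(g))
```

Keeping the two thresholds as parameters makes it easy to see how the DEG counts would respond to a stricter or looser cut-off.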

DEGs were selected based on their possible function in the Nr annotation database and their expression fold-change, and DEGs with similar functions were clustered in the same category. A total of 143 DEGs in 10 categories are shown in one heat map (Fig. 2; Table S4). Genes related to cell senescence and death, energy metabolism and protein synthesis accounted for a high proportion of these categories. Compared with NE12, expression in NE18 and AE18 changed significantly in the following categories: cytoskeleton, cell division, phytohormone signalling, reactive oxygen species (ROS) scavenger and protein degradation. In addition, transcription regulators, which are key factors in regulating gene expression, were changed, including AP2/ERF, WRKY, zinc finger protein and others. Across the three samples NE12, NE18 and AE18, AP2/ERF, WRKY, MYB and ethylene-responsive transcription factor trended upwards. Conversely, zinc finger protein and leucine-rich repeat receptor protein kinase trended downwards.

Figure 2

Heatmap of 143 differentially expressed genes in normal and abortive chrysanthemum embryos.

The 143 genes are grouped into 10 main categories. The bar represents the scale of the expression levels for each gene (log10FPKM (fragments per kb per million fragments)) in NE12, NE18 and AE18, as indicated by red/green rectangles. Red rectangles indicate up-regulation of genes and green rectangles indicate down-regulation. All genes in this list have a P-value for differential expression < 10⁻⁵ (P-values are presented in Tables S2 & S3). All information for each gene can be found in Table S4.

Validation of gene expression profiles by qRT-PCR

The quality of the RNA-Seq libraries was further validated experimentally using qRT-PCR. Expression was quantified for 14 genes randomly selected from the DEGs and related to cell death and division, phytohormone signalling, late embryogenesis abundant protein, plasma membrane H+ ATPase and transcription factors. The qRT-PCR results agreed with the RNA-Seq data (Fig. 3), confirming the reliability of our sequencing data.

Figure 3

Validation of RNA-seq results by qRT-PCR.

Correlation between the expression profiles of the 14 transcripts was determined by RNA-Seq (red bars) and qRT-PCR (blue bars). The 14 points (A–N) from left to right on the x-axis represent genes encoding caspase putative, senescence-associated protein, beta expansin 2 precursor, cytoskeletal protein, plasma membrane H+ ATPase, cyclic nucleotide-gated ion channel 4-like, gibberellin 20-oxidase No3, ethylene-responsive transcriptional coactivator-like protein, ethylene-responsive transcription factor putative, proteasome component (PCI) domain protein, late embryogenesis abundant protein-like, glutathione peroxidase, leucine-rich repeat receptor protein kinase EXS precursor putative and WRKY transcription factor 2, respectively.

Protein profiles of normal and abortive embryos

Based on the iTRAQ-nano-HPLC-MS/MS proteomic investigation, we obtained 715 proteins in the protein profiles of the three samples. When AE18 was the reference point, there were 41 proteins with significant changes in abundance, 21 of which were up-regulated and 20 of which were down-regulated in NE12 and NE18 (Table 3). Proteins associated with cell senescence and death, embryo sac development arrest, reactive oxygen species scavenging and protein degradation were significantly up-regulated in AE18 relative to NE12 during embryo development. These proteins were also up-regulated in AE18 compared with NE18, a comparison between normal and abortive embryos at the same developmental stage. For example, the abundance of two proteins related to cell senescence and death and three proteins associated with protein degradation was significantly higher in AE18 than in NE12 and also higher than in NE18 (Table 3). In contrast, proteins associated with cytoskeleton and ribosome constituents, energy metabolism, and protein and amino-acid biosynthesis were significantly down-regulated in AE18 relative to NE12 and NE18. For instance, five proteins associated with cytoskeleton and ribosome constituents, eight proteins implicated in energy metabolism (two in the tricarboxylic acid cycle and three in glycolysis) and five proteins involved in protein and amino-acid biosynthesis were significantly down-regulated in AE18 (Table 3).

Table 3 Differentially expressed proteins in normal and abortive chrysanthemum embryos

Correlation analysis of transcriptomic and proteomic data

We carried out a correlation analysis between differentially expressed proteins (DEPs) and the transcriptomic results to examine the congruence between the transcriptomic and proteomic data. Forty-one DEPs were identified in the proteomic analysis, 39 of which (95.1%) had corresponding genes in the transcriptomic data. Only the genes corresponding to ‘Cysteine synthase-like [Glycine max]’ and ‘Omega-3 fatty acid desaturase, endoplasmic reticulum [Arabidopsis thaliana]’ were not found in the transcriptomic data (Table 3), indicating a high degree of matching between the transcriptomic and proteomic data. The corresponding genes of three DEPs, ‘Pathogenesis-related protein [Zinnia violacea]’, ‘Pyruvate kinase [A. thaliana]’ and ‘Eukaryotic initiation factor 4A [Oryza sativa Japonica Group]’, were also present in the list of DEGs (Fig. 2; Table S4). However, the corresponding genes of the other 36 DEPs were not listed in Fig. 2 because the change in their expression ratios was less than two fold. In the comparison between AE18 and NE12, the expression trend of 21 DEPs (53.8%) was consistent with that of the corresponding genes, whereas the expression trend of the remaining 18 DEPs was not. Similar results were found in the comparison between AE18 and NE18, where 24 of 39 DEPs (61.5%) had an expression trend consistent with that of the corresponding genes, whereas the expression trend of the remaining 15 DEPs was inconsistent with that of the corresponding genes. Such results demonstrate only a moderate correlation between the mRNA and protein abundance ratios, as transcriptional and translational control appeared rather uncoupled in this biological system.
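The proportions reported in this paragraph follow directly from the counts. The sketch below recomputes them from the numbers given in the text; the dictionary keys and the helper `pct` are illustrative names only.

```python
# Counts taken from the text above.
total_deps = 41           # differentially expressed proteins
deps_with_gene = 39       # DEPs with a corresponding gene in the transcriptome
consistent = {"AE18 vs NE12": 21, "AE18 vs NE18": 24}


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


matched_pct = pct(deps_with_gene, total_deps)
print(f"DEPs with a corresponding gene: {matched_pct}%")

for comparison, n_consistent in consistent.items():
    n_inconsistent = deps_with_gene - n_consistent
    print(f"{comparison}: {n_consistent} consistent "
          f"({pct(n_consistent, deps_with_gene)}%), {n_inconsistent} inconsistent")
```

The recomputed values (95.1%, 53.8% and 61.5%, with 18 and 15 inconsistent DEPs) match those stated in the text.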

Caspase-like activities in normal and abortive embryos

AE18 (abortive embryos) had the highest cleavage activities for Ac-LEVD-AMC, Ac-VEID-AMC and Ac-LEHD-AMC compared with NE12 and NE18 (Fig. 4). These results indicate that chrysanthemum embryo abortion may be attributed to programmed cell death and that the three corresponding executioner caspase-like activities (LEVDase, VEIDase and LEHDase) played a vital role in embryo abortion. Interestingly, NE12 had the highest DEVDase activity among the three samples. A possible reason is that programmed cell death often occurs in developing embryos to eliminate unwanted cells, such as suspensor cells, and DEVDase plays a key role in this process.

Figure 4

Caspase-like activities in normal and abortive chrysanthemum embryos.

Embryo extracts were measured for substrate specificity using different caspase substrates: Ac-YVAD-AMC (YVADase), Ac-DEVD-AMC (DEVDase), Ac-LEVD-AMC (LEVDase), Ac-VEID-AMC (VEIDase) and Ac-LEHD-AMC (LEHDase). Relative fluorescence units were calculated and expressed as a percentage of the LEHDase activity. AE18 (abortive embryos) had the highest cleavage activities for Ac-LEVD-AMC, Ac-VEID-AMC and Ac-LEHD-AMC compared with NE12 and NE18, suggesting that chrysanthemum embryo abortion is a result of programmed cell death. Different letters indicate significant differences at alpha = 0.05 by the Bonferroni t-test (error bars represent standard error).

Discussion

Genes and proteins associated with programmed cell death during embryo abortion

Plant embryo abortion is a very complex bioprocess during which many events work collectively11,29. Among those events, cell senescence and death are often considered the most important. In plants, cell death is a physiological process required to remove unwanted or damaged cells and is usually of two types, programmed cell death (PCD) and necrosis30. PCD commonly occurs in flowering and reproduction, seed and embryo development and the development of leaves, stems and roots29,30,31. It has several key diagnostic features, including leakage of cytochrome c from mitochondria into the cytosol, nuclear DNA fragmentation, an increase in caspase-like and metacaspase activities and an increase in ROS30,32,33,34. In this study, we found that many genes or proteins associated with PCD features, such as caspase-putative, aspartic proteinase, cysteine protease, cysteine synthase-like and cytochrome c, were significantly up-regulated in abortive embryos relative to normal embryos (NE12 and NE18). In addition, the expression levels of some ROS scavengers were much higher in abortive embryos than in normal embryos, suggesting that higher levels of ROS were produced in abortive embryos. Moreover, several genes encoding senescence-related proteins were also dramatically up-regulated in abortive embryos. Similarly, two senescence-associated genes were previously identified and considered to potentially contribute to peanut embryo abortion11. Furthermore, caspase activity assays showed that, among the three samples NE12, NE18 and AE18, AE18 had the highest activities of three executioner caspases (caspase 4, caspase 6 and caspase 9). The results presented here suggest that chrysanthemum embryo abortion during cross breeding is possibly a result of PCD and that senescence-related genes are a key trigger of PCD.

Genes and proteins related to phytohormone are implicated in embryo abortion

Change in phytohormone concentration is an important event associated with embryo abortion. For example, a previous study found that the level of IAA was much lower in abortive watermelon embryos than in normal embryos, whereas ABA concentration was significantly higher in abortive embryos35. Similar results were also found in Chinese white poplar and Litchi chinensis36. In addition to IAA and ABA, ethylene production also plays an important role in embryo abortion, as increased ethylene production has been related to accelerated senescence, reduced growth and responses to environmental stresses. For instance, Hays et al. reported that high temperature treatment resulted in kernel abortion of heat-susceptible winter wheat (Triticum aestivum). Enhanced ethylene production induced by heat stress was responsible for this kernel abortion, as treatment with the ethylene receptor inhibitor 1-methylcyclopropene (1-MCP) prior to exposure to heat stress blocked kernel abortion37. Those previous studies established a link between phytohormones and embryo abortion at a physiological level, but the underlying mechanism was not clear. In the current work, some auxin-related genes or proteins, such as auxin-induced protein and auxin response factor, were significantly down-regulated in abortive chrysanthemum embryos, which is consistent with the down-regulation of several genes related to cell division and expansion. In contrast, some ethylene-associated genes or proteins (e.g., ethylene responsive factor and ERF transcription factor) were significantly up-regulated in abortive embryos. Such results suggest that phytohormones were implicated in chrysanthemum embryo abortion and that ethylene may also be a factor causing embryo abortion.

Genes and proteins associated with energy metabolism

Energy metabolism is the process of energy release, transfer and utilisation that occurs in the mitochondrion, the key function of which is the generation of adenosine triphosphate (ATP), the major energy source of the cell38. The rate of living theory, a classic theory of aging, proposed the existence of a link between energy metabolism and aging. The central idea of the theory is that the life span of an organism is determined by the rate at which it produces and uses energy at the cellular level39. In Pinus bungeana, the abundance of several proteins associated with energy production was found to be down-regulated in pollen tubes inhibited by nifedipine, and the authors thus concluded that a decrease in energy production may be a factor contributing to the inhibition of pollen tubes40. In the present study, transcriptomic and proteomic analyses showed that several energy-related genes, such as ATP synthase subunit, ATPase subunit and plasma membrane H+ ATPase, and eight proteins (e.g., ATP synthase beta chain, UDP-glucose 6-dehydrogenase and pyruvate kinase) implicated in the tricarboxylic acid cycle or in glycolysis were significantly down-regulated in abortive chrysanthemum embryos compared with normal embryos, indicating that the rate of energy metabolism in abortive embryos was significantly lower than that in normal embryos. In other words, chrysanthemum embryo abortion was accompanied by a significant reduction in energy production.

Abnormal protein metabolism was also related to chrysanthemum embryo abortion. Protein is the basic material of life; it is involved in cell growth and cell cycle progression and is a key factor for robust plant growth. Protein synthesis is affected by many factors, for example, initiation, elongation and release factors. Modification of some of these factors affects the rate of translation and protein synthesis41. In the current study, we found that some genes and proteins associated with protein synthesis, such as initiation factor, elongation factor and ribosomal protein, were down-regulated in abnormal embryos. In addition, the abundance of cytoskeleton constituents, including actin and tubulin, was higher in NE12 and NE18 than in AE18. Because the cytoskeleton is crucial for cell structure, cell division and expansion, and signal transduction42, the significant down-regulation of actin and tubulin in AE18 may contribute to reduced protein synthesis and may be associated with chrysanthemum embryo abortion.

Meanwhile, the mRNA levels of several genes encoding ubiquitin, 26S proteasome and ubiquitin-conjugating enzyme were significantly higher in abortive embryos relative to normal embryos. The ubiquitin-proteasome system is the main protein-degradation pathway; more than 80% of protein degradation is dependent on this pathway in plants43,44,45. Therefore, the higher abundance of several components in ubiquitin-proteasome system suggests that the process of protein degradation is accelerated in chrysanthemum abortion. Consequently, our findings suggest that reduced protein synthesis and enhanced protein degradation were closely related to chrysanthemum embryo abortion.

Transcription factors during embryo development

Transcription factors are essential for the regulation of gene expression during flower development, embryogenesis and seed maturation, as well as in responses to biotic and abiotic stresses such as pathogens, drought, high temperature and salt46,47. In this study, the abundance of many transcription factors, such as AP2/ERF, WRKY, Myb and zinc fingers, was changed in abortive chrysanthemum embryos compared with normal embryos. Interestingly, most genes encoding stress-responsive transcription factors, including AP2/ERF, WRKY and Myb, were significantly up-regulated in abortive embryos. Conversely, some others encoding zinc finger protein and leucine-rich repeat receptor protein kinase were significantly down-regulated in abortive embryos relative to normal embryos. Zinc finger proteins and leucine zipper protein have been reported to be required to balance cell proliferation and maintain specification of undifferentiated cells during active growth, including during embryogenesis and seed germination46,48. A possible explanation is that chrysanthemum embryos may receive incompatibility signalling when they reach the globular embryo stage. This signalling may resemble the high expression of stress-responsive transcription factors induced by biotic and abiotic stresses. Therefore, transcription factors play an important role in the complex network of chrysanthemum embryo abortion.

In conclusion, we performed transcriptomic and proteomic analyses as complementary approaches to investigate embryo abortion during chrysanthemum cross breeding. Many functional genes and proteins that potentially contribute to chrysanthemum embryo abortion were identified, especially those associated with cell senescence and death. In addition, a new and important finding is that abortive embryos have higher caspase-like activities than normal embryos. Caspase-like activity is a key factor in PCD, so we presume that embryo abortion is a result of PCD. Taken together, this study advances our understanding of the mechanisms related to chrysanthemum embryo abortion and may be useful in the future for overcoming embryo abortion during cross breeding of chrysanthemum and other crops.

Methods

Plant material and cross

An interspecific cross was performed as described in the previous report15. C. morifolium and C. nankingense were the female and male parents, respectively. Because chrysanthemum embryos are too small to be sampled and endosperm development also influences embryo development, ovules including embryo, endosperm and nucellus were collected by manual dissection with the naked eye. At 12 DAP, when nearly all the embryos had reached the globular embryo stage and were normal, ovules were collected. At 18 DAP, when normal embryos had developed to the heart-shaped embryo stage and abnormal embryos were undergoing degeneration, normal and abnormal ovules were collected (Fig. 1a–k). We had three planting areas, with 20 chrysanthemum plants in each area. Each chrysanthemum plant had more than 50 inflorescences, so approximately 1000 inflorescences were available in each planting area. The interspecific cross was performed and at least 500 inflorescences (each inflorescence with 18–23 female ligulate florets or ray florets containing one ovule) were hand-emasculated and bagged in each planting area. We collected samples at two stages after pollination. At 12 DAP, we collected approximately 2300 normal ovules (>0.7 g) from each planting area, giving nearly 7000 ovules (>2.1 g) across the three planting areas. At 18 DAP, we similarly collected >2.1 g each of normal and abnormal ovules. Then, 0.5 g, 0.5 g and 0.6 g of ovules were used for RNA-Seq, proteomics and caspase activity assays, respectively. After collection, the three samples, normal embryos at 12 DAP (NE12), normal embryos at 18 DAP (NE18) and abortive embryos at 18 DAP (AE18), were immediately frozen in liquid nitrogen and stored at −80°C.

Total RNA and protein extraction

Total RNA was extracted using Trizol reagent according to the manufacturer's protocol (Takara Bio Inc., Otsu, Japan). The integrity and purity of the RNA were determined using the Agilent 2100 RNA 6000 Kit and by electrophoresis on a 1% agarose gel.

cDNA preparation and Illumina deep-sequencing

Illumina sequencing was performed at the Beijing Genomics Institute (BGI)-Shenzhen, Shenzhen, China (http://www.genomics.cn/index.php) according to the manufacturer's instructions (Illumina, San Diego, CA, USA)49.

De novo assembly and sequence clustering

Clean reads were obtained from the raw data, and Trinity51 was used for de novo assembly of the transcriptome into unigenes. The assembled unigenes were then further processed by sequence splicing and redundancy removal with sequence clustering software to obtain non-redundant unigenes. Finally, the unigenes were aligned by Blastx (E-value < 0.00001) to protein databases including NCBI non-redundant protein (Nr), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups (COG), and the sequence direction of each unigene was determined from the best alignment. ESTScan50 was used to predict the sequence direction when a unigene did not align to any of the databases mentioned above.

Unigene function annotation

Unigene sequences were aligned by Blastx to protein databases such as Nr, Swiss-Prot, KEGG and COG and by Blastn to the nucleotide database Nt (E-value < 0.00001). In this way, we obtained the best functional annotations. Based on the Nr annotation, we used the Blast2GO program51 to obtain gene ontology (GO) annotation of unigenes. GO has three ontologies describing molecular function, cellular component and biological process51. We then used WEGO software52 to perform GO functional classification for all unigenes and to understand the distribution of gene functions at the macro level. KEGG mapping, which helps in studying complicated biological behaviours of genes, was used to determine the metabolic pathway annotation of unigenes.

Analysis of difference in unigenes expression

The expression levels of unigenes were calculated using the fragments per kb per million fragments (FPKM) method27. This method eliminates the influence of differences in gene length and sequencing depth on the calculation of gene expression. Therefore, the calculated gene expression can be used directly to compare gene expression between samples. We analysed the differential expression of unigenes following the sequencing-based statistical method of Audic and Claverie53. For GO enrichment, we first mapped all differentially expressed genes (DEGs) to each term of the GO database and calculated the number of genes for each GO term, which yields a gene list and a gene count for every GO term. Next, we used a hypergeometric test to find GO terms significantly enriched in DEGs compared to the genome background.
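The two calculations described above can be sketched as follows. The FPKM expression uses the commonly cited form (fragments mapped to a unigene, scaled per kilobase of unigene length and per million mapped fragments), and the enrichment P-value is the upper tail of the hypergeometric distribution. The function and parameter names are assumptions made for this illustration and do not reproduce the exact software used in the study.

```python
from math import comb


def fpkm(fragments: int, total_mapped_fragments: int, length_bp: int) -> float:
    """Fragments per kb of unigene length per million mapped fragments."""
    return fragments * 1e9 / (total_mapped_fragments * length_bp)


def hypergeom_enrichment_p(n_background: int, n_term: int,
                           n_degs: int, n_degs_in_term: int) -> float:
    """Upper-tail hypergeometric P-value, P(X >= n_degs_in_term).

    n_background   : genes with GO annotation (the background)
    n_term         : background genes annotated to the GO term
    n_degs         : DEGs in the background
    n_degs_in_term : DEGs annotated to the GO term
    """
    upper = min(n_term, n_degs)
    favourable = sum(
        comb(n_term, i) * comb(n_background - n_term, n_degs - i)
        for i in range(n_degs_in_term, upper + 1)
    )
    return favourable / comb(n_background, n_degs)


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not data from this study.
    print(fpkm(fragments=100, total_mapped_fragments=1_000_000, length_bp=1000))
    print(hypergeom_enrichment_p(n_background=10, n_term=5,
                                 n_degs=5, n_degs_in_term=5))
```

Exact integer binomial coefficients are used here for clarity; for genome-scale backgrounds a statistics library would normally be preferred, and P-values across GO terms would usually be corrected for multiple testing.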

Quantitative real-time PCR (qRT-PCR) validation

To verify the reliability of the libraries, we randomly selected 14 differentially expressed genes, about one-tenth of the 143 DEGs, and validated the data by qRT-PCR using three biological replicate samples. qRT-PCR assays were conducted as described by Song et al.54. Specific primers for thirty genes were designed using PRIMER3 RELEASE 2.3.455 and the reference gene for quantitative expression analysis was Elongation Factor 1α (EF1α) (Table S5). All reactions were performed in biological triplicate. The workflow for transcriptomic analysis is illustrated in Fig. S4.

Protein extraction, digestion and iTRAQ labelling

Protein extraction was performed using the trichloroacetic acid (TCA)/acetone method40. The protein was redissolved in dissolution buffer (0.5 M triethylammonium bicarbonate, 0.1% (w/v) SDS). Protein concentration was determined using the Bradford assay. From each sample, 100 μg of protein (~30 μL) was denatured and alkylated and the cysteines blocked according to the manufacturer's instructions for the 8-plex iTRAQ reagent kit (Applied Biosystems, California, USA). Protein was digested with trypsin (Promega, Madison, WI) at a trypsin/protein ratio of 1:50 for 16 h at 37°C. For labelling, each iTRAQ reagent was dissolved in 80 μL of isopropanol and added to the respective peptide mixture. Proteins were labelled with the iTRAQ tags as follows: 1–118 isobaric tag, 2–119 isobaric tag, 3–121 isobaric tag. The labelled samples were combined and dried in vacuo. A Sep-Pak™ C18 cartridge (1 cm3/50 mg, Waters Corporation, Milford, MA) was used to remove the salt buffer. The peptide mixture was re-dissolved in low salt buffer (5 mM K3PO4 with 25% (v/v) acetonitrile, pH = 2.5, adjusted with H3PO4) and then subjected to strong cation exchange chromatography to remove the hydrolysed unbound iTRAQ reagents using a PolySULFOETHYL A (2.1 × 100 mm i.d.) HPLC column (PolyLC Inc, Columbia, MD, USA). The peptide mixture was collected after elution with high salt buffer (10 mM K3PO4, 500 mM KCl, 25% (v/v) acetonitrile, pH = 2.5, adjusted with H3PO4). After the acetonitrile was removed by volatilisation, the Sep-Pak™ C18 cartridge was used to remove excess KCl, and the sample was then dried in a vacuum concentrator.

High pH reverse-phase separation

The peptide mixture was re-dissolved in buffer A (20 mM NH4HCO2, pH 10.0, adjusted with NH4OH) and then fractionated by high pH separation using a Shimadzu UFLC system (Shimadzu, Kyoto, Japan) connected to a reverse phase column (PolyRP-300 column, 2.1 mm × 150 mm, 5 μm, 300 Å, Sepax Technologies, Inc., USA). High pH separation was performed using a four-step linear gradient: from 5% B (20 mM NH4HCO2 in 90% acetonitrile, pH 10.0, adjusted with NH4OH) to 35% B in 30 min, then to 80% B in 2 min and back to 5% B in 2 min, after which the column was re-equilibrated at initial conditions for 15 min. The column flow rate was maintained at 200 μL/min and the column was kept at room temperature. Fifteen fractions were collected and each fraction was dried in a vacuum concentrator for the next step.
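
The gradient programme described above can be written as a set of time points and interpolated linearly, as in the following sketch. It assumes that each step starts exactly where the previous one ends and that re-equilibration is a 15 min hold at 5% B; the variable and function names are hypothetical.

```python
# (time in min, % buffer B) breakpoints taken from the description above.
# Assumption: steps are contiguous and re-equilibration holds 5% B for 15 min.
HIGH_PH_GRADIENT = [
    (0.0, 5.0),    # start
    (30.0, 35.0),  # 5% -> 35% B in 30 min
    (32.0, 80.0),  # 35% -> 80% B in 2 min
    (34.0, 5.0),   # 80% -> 5% B in 2 min
    (49.0, 5.0),   # re-equilibration at initial conditions for 15 min
]


def percent_b(t_min, program=HIGH_PH_GRADIENT):
    """Return % B at time t_min by linear interpolation between breakpoints."""
    if t_min < program[0][0] or t_min > program[-1][0]:
        raise ValueError("time is outside the gradient programme")
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


if __name__ == "__main__":
    for t in (0, 15, 30, 31, 34, 40):
        print(f"{t:>4} min: {percent_b(t):.1f}% B")
```

Under these assumptions one run lasts 49 min, which at 200 μL/min corresponds to 9.8 mL of mobile phase.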

Low pH nano-HPLC-MS/MS analysis

The peptides were resuspended in 20 μL of solvent A (0.1% (v/v) formic acid in water, pH 2.5), separated by nanoLC and analysed by on-line electrospray tandem mass spectrometry. The experiments were performed on a nanoACQUITY UPLC system (Waters Corporation, Milford, MA) connected to an LTQ Orbitrap XL mass spectrometer (Thermo Electron Corp., Bremen, Germany) equipped with an online nanoelectrospray ion source (Michrom Bioresources, Auburn, USA). An 18 μL peptide sample was loaded onto the Captrap Peptide column (Michrom Bioresources, Auburn, USA) at a flow of 20 μL/min for 5 min and subsequently eluted with a three-step linear gradient: from 5% to 45% solvent B (0.1% (v/v) formic acid in 90% acetonitrile, pH 2.5) in 100 min, then to 80% B in 1 min, followed by a hold at 80% B for 4 min. The column was re-equilibrated at initial conditions for 15 min. The column flow rate was maintained at 500 nL/min and the column temperature at 35°C. An electrospray voltage of 1.6 kV versus the inlet of the mass spectrometer was used. The LTQ Orbitrap XL mass spectrometer was operated in data-dependent mode to switch automatically between MS and MS/MS acquisition. Survey full-scan MS spectra (m/z 400–1800) were acquired in the Orbitrap with a mass resolution of 30,000 at m/z 400, followed by five sequential HCD-MS/MS scans. The automatic gain control was set to 500,000 ions to prevent over-filling of the ion trap. The minimum MS signal for triggering MS/MS was set to 1000. In all cases, one microscan was recorded. MS/MS scans were acquired in the Orbitrap with a mass resolution of 7500. The dissociation mode was HCD (higher energy C-trap dissociation). Dynamic exclusion was used with a repeat count of two and a repeat duration of 10 s, and the m/z values triggering MS/MS were placed on an exclusion list for 120 s. For MS/MS, precursor ions were activated using 40% normalised collision energy and an activation time of 30 ms.

Database searching and criteria

Protein database searching was carried out using the methods and software described in a previous article56. The proteomics experiments were repeated three times, and the workflow for the proteomics investigation is illustrated in Fig. S5.

Caspase activity assays

Embryo protein extraction and caspase activity assays were performed as described by Bosch and Franklin-Tong30. Caspase-like activities were measured using the AMC substrates Ac-YVAD-AMC, Ac-DEVD-AMC, Ac-LEVD-AMC, Ac-VEID-AMC and Ac-LEHD-AMC (Sigma, USA), which are selective fluorogenic substrates for caspase 1, caspase 3, caspase 4, caspase 6 and caspase 9, respectively. Each 200 μL reaction mixture contained 20 μg of embryo protein extract and 100 μM fluorogenic substrate. A microplate reader (Tecan Infinite M200, Switzerland) was used to measure the release of the fluorophore by cleavage (excitation 380 nm, emission 460 nm) at 27°C for 2 h. All analyses were repeated three times.