Stress-induced and epigenetic-mediated maize transcriptome regulation study by means of transcriptome reannotation and differential expression analysis

Plant’s response and adaptation to abiotic stresses involve sophisticated genetic and epigenetic regulatory systems. To obtain a global view of molecular response to osmotic stresses, including the non-coding portion of genome, we conducted a total leaf transcriptome analysis on maize plants subjected to prolonged drought and salt stresses. Stress application to both B73 wild type and the epiregulator mutant rpd1-1/rmr6 allowed dissection of the epigenetic component of stress response. Coupling total RNA-Seq and transcriptome re-assembly we annotated thousands of new maize transcripts, together with 13,387 lncRNAs that may play critical roles in regulating gene expression. Differential expression analysis revealed hundreds of genes modulated by long-term stress application, including also many lncRNAs and transposons specifically induced by stresses. The amplitude and dynamic of the stress-modulated gene sets are very different between B73 and rpd1-1/rmr6 mutant plants, as result of stress-like effect on genome regulation caused by the mutation itself, which activates many stress-related genes even in control condition. The analyzed extensive set of total RNA-Seq data, together with the improvement of the transcriptome and the identification of the non-coding portion of the transcriptome give a revealing insight into the genetic and epigenetic mechanism responsible for maize molecular response to abiotic stresses.


Supplementary Table S1: Summary of differential expression analyses results
Number of genes differentially expressed (log2 fold change ratio ≥ |1| and FDR-adjusted p value ≤ 0.05) in each pairwise comparison divided by their annotation class.  Figure S1: Flow-chart of the overall experimental design and RNA-Seq data analysis.
The scheme summarize the whole experimental design, starting from plant growing condition, stress application and samples collection to NGS Illumina sequencing and software employed for transcriptome re-annotation, characterization and differential expression analysis.

Supplementary Figure S2: Histogram comparing GO terms newly assigned to Z. mays transcripts to EnsemblPlants/Biomart reference annotation
The results obtained with WEGO annotation plotting tool are summarized for biological process, cellular component and molecular function categories. The right Y-axis represents the number of transcripts and the left Y-axis shows the percentage of total transcripts. Red bars represent values corresponding to new GO annotation while blue ones stand for the reference annotation.

Supplementary Figure S3: Comparison of DEG profiles under different osmotic stresses
Expression profile and clusters of DEGs obtained by the STEM clustering of B73 (red) and rmr6 (blue) subjected to the three different stress conditions. Each box corresponds to one of the model expression profiles and the numbers indicated profile unique ID (upper-left corner) and number of genes falling into the cluster (bottom-left corner). Clusters were ordered according to the number of genes, while significantly enriched profiles (that have a statistically significant number of genes assigned compared to the number of genes expected based on the permutation test) were represented by different background colors (see also Figure 5a-b).

Supplementary Figure S4: Comparison of DEG profiles under different osmotic stresses and recovery stage
Expression profile and clusters of DEGs obtained by the STEM clustering of B73 (red) and rmr6 (blue) subjected to the three different stress conditions and the recovery stage, plotting the to the log2 T7/T0 expression ratio of each DE gene for the three analyzed stresses. Each box corresponds to one of the model expression profiles and the numbers indicated profile unique ID (upper-left corner) and number of genes falling into the cluster (bottom-left corner). Clusters were ordered according to the number of genes, while significantly enriched profiles (that have a statistically significant number of genes assigned compared to the number of genes expected based on the permutation test) were represented by different background colors (see also Figure 7a-b).

Supplementary Figure S5: Multiple alignment between maize and Arabidopsis terpene synthase proteins.
The N-terminal domain, partially missing in GRMZM2G046615_T01_j_3, presents high variability in terms of length and composition between the TPS proteins of maize and Arabidopsis, compared to the more conserved central and C-terminal regions.

Supplementary Figure S6: Sequence and expression validation of newly identified transcripts a)
Multiple alignment between GRMZM2G046615 genomic sequence, GRMZM2G046615_T01 reference annotated transcripts and GRMZM2G046615_T01_j_1 and GRMZM2G046615_T01_j_3 newly identified new isoforms, showing primers used for sequence amplification and sequencing (yellow and blue arrows) and sequence-validated features. Arrowhead in GRMZM2G046615_T01_j_2 marks the premature stop codon. b) Expression validation by Q-PCR for the selected stress-induced genes.

Supplementary Data S1: RNA-Seq summary statistics relative to the 32 sequenced libraries
Number of total sequenced reads, clean reads (after quality trimming and contaminant filtering) and filtered reads (read mapped to the maize B73 AGPv3 reference genome with a Mapping quality score -MAPQ-of at least 1, after the removal of PCR duplicates) and related statistics for each of the sequenced libraries.

Supplementary Data S2: Blast2GO GO annotation file
Comma separated Annot file reporting the Blast2GO summary: specific GO terms were successfully assigned to 77,403 transcripts.

Supplementary Data S3: Functional enrichment of GO-terms analysis in newly identified transcripts
Overrepresentation of Gene Ontology terms was analyzed in each newly annotated Cufflinks transcript class with Ontologizer software. Raw and Revigo summarized results are reported for each transcript class. Statistically enriched terms (adjusted p-value < 0.1) are reported in bold format.

Supplementary Data S4: Newly identified antisense/sense loci pairs
Each newly antisense Class X locus was coupled with its sense reference locus. Transcript information and putative annotation are reported for each sense/antisense pair.

Supplementary Data S5: ncRNAs classification summary
Characteristics of all the putative ncRNAs identified.
Expression levels (expresses as FPKM) retrieved from a recently published expression atlas (Stelpflug et al., 2016) and from the qTeller tool (http://www.qteller.com/) for the subset of lncRNAs and non-coding siRNA precursors enclosed these databases. Supplementary Data S7: Complete Cuffdiff results of the three differential expression tests at

T0
Statistically differentially expressed genes and transcripts identified by Cuffdiff for the three differential expression analyses performed at T0.

Supplementary Data S8: FPKM values in B73 and rmr6 samples
Summary of FPKM values in the different samples for subset of transcripts differentially expressed in B73 in response to osmotic stresses.

Supplementary Data S9: Complete Cuffdiff results of the two differential expression tests at T7
Statistically differentially expressed genes and transcripts identified by Cuffdiff for the three differential expression analyses performed at T7.