Cells responds to diverse stimuli by changing the levels of specific effector proteins. These changes are usually examined using high throughput RNA sequencing data (RNA-Seq); transcriptional regulation is generally assumed to directly influence protein abundances. However, the correlation between RNA-Seq and proteomics data is in general quite limited owing to differences in protein stability and translational regulation. Here we perform RNA-Seq, ribosome profiling and proteomics analyses in baker’s yeast cells grown in rich media and oxidative stress conditions to examine gene expression regulation at various levels. With the exception of a small set of genes involved in the maintenance of the redox state, which are regulated at the transcriptional level, modulation of protein expression is largely driven by changes in the relative ribosome density across conditions. The majority of shifts in mRNA abundance are compensated by changes in the opposite direction in the number of translating ribosomes and are predicted to result in no net change at the protein level. We also identify a subset of mRNAs which is likely to undergo specific translational repression during stress and which includes cell cycle control genes. The study suggests that post-transcriptional buffering of gene expression may be more common than previously anticipated.
In recent years high throughput RNA sequencing (RNA-Seq) has become the method of choice for measuring shifts in gene expression between cells grown in different conditions1. However, diverse studies have shown that mRNA levels only partially explain protein levels in the cell2,3,4,5. In yeast, the correlation between mRNA and protein abundance is typically in the range 0.6–0.72. In addition, the ratio between protein and mRNA levels may vary across different conditions3. For instance, substantial differences in this ratio have been observed during osmotic stress in yeast6,7 or after the treatment of human cells with epidermal growth factor8.
In contrast to RNA-Seq, which measures the total amount of mRNA in the cell, ribosome profiling (Ribo-Seq) only captures those mRNAs that are being actively translated9. Each Ribo-Seq read corresponds to one translating ribosome, providing a quantitative view of the amount of protein produced by the cell at any given time. Although this remains an indirect estimate of protein abundance, it has several advantages over proteomics, such as the fact that with Ribo-Seq virtually all translated sequences can be captured, and that one can apply the same pipelines and statistical methods as for RNA-Seq to identify differentially expressed genes.
The response to oxidative stress in the yeast Saccharomyces cerevisiae involves a general decrease in mRNA translation initiation as well as the selective transcriptional activation of a set of proteins involved in the maintenance of the redox state of the cell9,10,11. A previous study reported changes in the ratio between the normalized number of Ribo-Seq and RNA-Seq reads, or translational efficiency (TE), of hundreds of genes upon oxidative stress10, suggesting extensive translational regulation. However, changes in TE alone do not necessarily imply changes in the abundance of the translated proteins. By performing a separate analysis of Ribo-Seq and RNA-Seq data, we show that the majority of genes that show statistically significant differences at the RNA-Seq level do not correspond to similar differences at the Ribo-Seq level, suggesting that, in most cases, changes in mRNA abundance are compensated by changes in ribosome density and are not expected to be propagated to the protein level. Our approach also uncovers a subset of differentially expressed genes in which regulation appears to be mainly exerted at the translational level.
Ribosome profiling experiments in normal and stress conditions
We extracted ribosome-protected RNA fragments, as well as complete polyadenylated RNAs, from Saccharomyces cerevisiae grown in rich media (normal) and in H2O2-induced oxidative stress conditions (stress)(Fig. 1). We added 1.5 mM H2O2 to the media 30 minutes prior to harvesting the cells; these conditions induced severe oxidative stress and approximately halved the growth rate of the cells11,12. We then sequenced the ribosome-protected RNA fragments (Ribo-Seq) as well as complete mRNAs (RNA-Seq) using a strand-specific protocol. The Ribo-Seq data provided a snapshot of the translatome, each read corresponding to one translating ribosome, whereas the number of RNA-Seq reads mapping to a gene was used to quantify the relative abundance of the transcript.
After quality control of the sequencing reads we obtained 31–36 million Ribo-Seq reads and 12–15 million RNA-Seq reads per sample (Supplementary Table S1). We mapped the reads to the genome and generated a table of read counts per gene for each of the samples. After filtering out non-expressed genes (see Methods), the table contained data for 5,419S. cerevisiae annotated genes (ORFs).
To be able to compare the relative abundances of mRNAs and ribosome-bound mRNAs between different samples we normalized the RNA-Seq and Ribo-Seq table of counts by calculating normalized counts per million (CPM) in logarithmic scale, or log2CPM (Supplementary Fig. S1). The correlation coefficient between the average Ribo-Seq and RNA-Seq log2CPM expression values was 0.84 in normal conditions and 0.87 in stress conditions (Fig. 2A,B, respectively). As the differences in log2CPM between RNA-Seq or Ribo-Seq replicates were negligible (Fig. 2C,D, Supplementary Table S2), these values reflect the amount of disagreement between total mRNA and translated protein abundances.
Ribo-Seq shows a higher correlation with proteomics than RNA-Seq
The next step was to compare the quantification of gene expression by RNA-Seq and Ribo-Seq to that obtained using proteomics. For proteomics-based quantification we extracted the protein fraction from yeast grown in normal and stress conditions and estimated the abundance of different yeast proteins, i.e. the proteome, using mass spectrometry (Fig. 1). We could reliably quantify the protein products of 2,200 genes (see Methods), representing about 40% of the genes quantified by RNA-Seq or Ribo-Seq. Normalized protein abundances between pairs of proteomics replicates showed correlation coefficients in the range 0.83–0.93 (Supplementary Table S3), lower than for RNA-Seq or Ribo-Seq replicates (>0.99).
In normal conditions the correlation coefficient between the transcriptome (RNA-Seq) and the proteome normalized abundances was 0.46. This increased to 0.71 when comparing the translatome (Ribo-Seq) and the proteome normalized abundances (Fig. 3), indicating that Ribo-Seq-based quantification of gene expression provides a more accurate picture of protein abundance than RNA-Seq data. The average correlation coefficient between the three pairs of proteome replicates was 0.91, setting up a maximum value for any correlation with the other types of data. Differences between RNA-Seq and proteomics quantification estimates may arise because of differences in the half life of the proteins with respect to their cognate mRNAs as well as variations in the translation rate or ribosome density across the transcripts. As the value of 0.71 (Ribo-Seq versus proteomics) is intermediate between 0.46 (RNA-Seq versus proteomics) and 0.91 (proteomics replicates), the two above mentioned factors appear to be relevant to explain the strong uncoupling between mRNA and protein abundance in this system.
In stress conditions the correlation coefficient between the transcriptome and proteome measurements was 0.62, somewhat higher than in normal conditions. Whereas the correlation coefficient between the translatome and the proteome was 0.67, again higher than the same value between the transcriptome and the proteome but lower than the correlation between the proteome stress replicates (0.86). Taken together, these results are consistent with the hypothesis that differences in ribosome density play a role in modulating protein expression under severe stress conditions.
Analysis of three nucleotide periodicity
In actively translated regions mapped Ribo-Seq reads exhibit a characteristic three nucleotide periodicity that results from the codon-to-codon ratcheting movement of the ribosome along the coding sequence9. We used the program RibORF13 to assess the nucleotide periodicity and homogeneity of the Ribo-Seq reads in the annotated coding sequences. According to this analysis, in the vast majority of genes (98%, 5198 out of 5304 analyzed genes) the annotated ORF showed a clear signature of translation in both normal and stress conditions, validating our approach of considering all the reads that mapped to the annotated ORFs for the quantification of protein translation.
In a small fraction of genes, however, we found evidence of alternative translated ORFs (Supplementary Table S4). One example was TOS8, which encodes a homeodomain-containing transcription factor. In this gene active translation of the canonical 831 amino acid long protein by RibORF was only detected in stress conditions; in contrast, a protein of only 81 amino acids was the main translated polypeptide in rich media. The shorter alternative ORF was on a different reading frame to the main protein product and showed no homology to any previously characterized protein. These cases illustrate how detailed examination of the distribution of the Ribo-Seq reads may help uncover proteins that have remained hidden within longer ORFs.
Ribo-Seq estimates of changes in gene expression are more conservative
We next calculated the gene expression level fold change (FC) between the two conditions, using RNA-Seq and Ribo-Seq data separately. The log2FC distribution based on the Ribo-Seq data had a lower variance than the log2FC distribution using RNA-Seq data (Fig. 4A). This indicated a higher range of variation in the mRNA levels, as estimated by RNA-Seq, than in the ribosome-protected fragments. This finding was consistent with the existence of post-transcriptional buffering of gene expression, as also reported for inter-specific gene expression comparisons of S.cerevisiae and S.paradoxus14.
We considered the possibility that the about 2.5 times higher number of Ribo-Seq reads than RNA-Seq reads in the original datasets biased the comparison of log2FC distributions. In order to test it we subsampled the mapped reads so as to have a similar number of reads in all the RNA-Seq and Ribo-Seq samples (Supplementary Tables S5 and S6). The results were very similar to those observed without subsampling (Supplementary Fig. S2), indicating that the differences in sequencing coverage between samples do not affect the results.
We also used an alternative method, multidimensional scaling (MDS)15, to quantify the distance between Ribo-Seq and RNA-Seq gene expression measurements (Fig. 4B). We found that the distance between Ribo-Seq normal and stress conditions was shorter that the distance between RNA-Seq normal and stress conditions, which was consistent with the previous observation that log2FC variance was lower for Ribo-Seq than for RNA-Seq. In conclusion, the variation in gene expression between normal and stress conditions was attenuated in the Ribo-Seq data when compared to RNA-Seq data.
Extensive post-transcriptional buffering of gene expression
We performed differential gene expression analysis, separately for Ribo-Seq and RNA-Seq data, using the package EdgeR16 for normalization of the data and Limma17 for multivariable linear regression and identification of differentially expressed genes. We selected genes with an adjusted p-value < 0.05 and a log2FC larger than one standard deviation; the latter corresponded to a minimum FC of 1.49 for RNA-Seq data and 1.36 for Ribo-Seq data. We used the standard deviation instead of a fixed value to accommodate for the differences in the width of the log2FC distributions. The number of genes that were differentially expressed was 1,530 for RNA-Seq and 536 for Ribo-Seq. A similarly high number of yeast genes were previously found to be regulated by hydrogen peroxide at the level of transcription using microarray technology18 and sequencing data10.
The correlation between RNA-Seq and Ribo-Seq gene log2FC values was quite low (0.18), indicating an important disconnect between the two kinds of data (Fig. 4C). Only 127 genes showed a significant change in the same direction i.e. homodirectional changes. Genes that were up-regulated during stress according to both RNA-Seq and Ribo-Seq included protein functions known to be activated at the transcriptional level in response to stress, such as hexoquinases or heat shock proteins19. The number of genes annotated with the Gene Ontology (GO) term ‘oxidation reduction process’ was similar for RNA-Seq or Ribo-Seq up-regulated genes (17 and 15, respectively), supporting that these genes are essentially regulated at the level of transcription and can be effectively detected with both kinds of sequencing data.
However, the vast majority of genes were only significant at the transcriptome or the translatome levels (1,413 and 409 genes, respectively; Fig. 4C). The largest group was formed by genes that showed significant changes in relative transcript abundance but not in the relative number of ribosome-protected fragments, supporting extensive post-transcriptional buffering of gene expression. The data indicated that about a quarter of the genes in the genome may be undergoing compensatory changes: when mRNA levels increase ribosome density per transcript decreases and the other way round.
The second group, translatome-only differentially expressed genes, represented cases in which mRNA levels did not change but the density of ribosomes per transcript showed a significant increase or decrease in stress relative to normal. This would be consistent with the expression of these genes being primarily modulated at the level of translation. We identified many more genes under differential translational repression than activation (360 versus 49, Fig. 4C), suggesting that the former mechanism may be more prevalent that the first one in response to severe oxidative stress.
We also identified a subset of cases showing opposite changes in RNA-Seq and Ribo-Seq data. The main group was formed by 70 genes showing increased mRNA levels but decreased translation in stress versus normal. One simple explanation would be that, for these genes, there is an mRNA fraction that is stored in a translational inactive highly stable form, whereas the rest is translated at the usual level. More complex scenarios could involve a combination of transcriptional and translational regulatory events.
We applied the same differential gene expression pipeline to the normalized proteomics data but did not recover any significant hits. This is not unexpected given the lower number of proteins that could be quantified and, above all, the high variability in the protein abundance estimates between replicates.
Dissecting differential regulation by functional class
To better understand the biological relevance of our observations, we investigated if certain functional classes were significantly enriched among the sets of differentially expressed genes. We used DAVID20 to identify significantly over-represented functional clusters (Fig. 4D). Only one class, ‘oxidation-reduction process’, was enriched among genes up-regulated during stress both using RNA-Seq and Ribo-Seq data. This is consistent with transcriptional activation of this set of genes upon stress, increasing the signal for both total mRNA and the translated fraction.
Three other classes – ‘translation’, ‘ATPase’ and ‘proteasome’ – showed increased mRNA levels during stress; conversely this was not reflected in an increase
in for the translated fraction. Although translation-related proteins have been reported to be transcriptionally repressed during oxidative stress18,21, there are also indications that this is reverted when H2O2 concentration is high11, as is the case in our study.
Among genes that were differentially expressed only when we used Ribo-Seq data ‘cell wall’, ‘mitochondrial intermembrane space’ and ‘catalytic activity’ were enriched among up-regulated genes, whereas ‘cell cycle’ was enriched among down-regulated genes (Fig. 4D).
Translational efficiency and protein level changes
To obtain further insights into the regulatory mechanisms of gene expression during oxidative stress in yeast we also compared the translational efficiency (TE; Ribo-Seq normalized counts divided by RNA-Seq normalized counts) of the different genes in the two conditions using the program Ribodiff22. We detected 470 genes that showed significantly increased TE during stress (adjusted p-value < 0.05; see Methods); about 82% of them were cases in which the relative mRNA levels had decreased during stress but this change had been compensated by an increase in ribosome density so that no significant changes in the amount of translated protein would be expected (transcriptome downregulated, Table 1). In only about 3% of cases increased TE was associated with translational activation and increased protein production (translatome upregulated, Table 1).
In the case of genes with significantly lower TE in stress than in normal conditions the percentage of compensatory cases was also the predominant scenario, accounting for 50% of the genes in the class (356 out of 714, Table 1). The second most numerous group were genes likely to be actively repressed at the level of translation, accounting for 29% of the genes with significantly decreased TE (29%). The latter genes showed no change in mRNA levels but the relative number of associated ribosomes was lower in stress than in normal conditions, which would be expected to lead to a decrease in the protein levels. This group included 12 genes from the cell cycle functional category (Supplementary Table S7).
The adaptation of organisms to variations in different environmental conditions is associated with the activation or repression of gene expression. These changes are usually studied at the level of complete mRNA molecules using microarrays or next generation sequencing. However, changes in mRNA concentration do not necessarily reflect changes in their encoded protein products8,11.
Here we have explored the usefulness of ribosome profiling data to close the gap between mRNA and protein abundance estimates using oxidative stress in baker’s yeast as a model system. Each ribosome profiling read corresponds to one translating ribosome and thus the number of reads that map to a gene reflects the amount of translated protein9,23. Numerous recent studies have used ribosome profiling to gain insights into novel translation regulatory mechanisms24,25 or to discover new translated RNA sequences20,21,22,23. However, there is a lack of studies addressing how ribosome profiling can be used to improve the estimates of protein abundance changes over RNA-Seq-based estimates. Our study shows that Ribo-Seq provides better estimates of protein abundance than RNA-Seq and that the results of differential gene expression analyses are drastically altered if we use Ribo-Seq of RNA-Seq as the source sequencing data.
The abundance of different proteins in the cell is usually estimated using mass spectrometry proteomics data26,27. This provides a direct measurement of protein abundance that can account for the variations in the stability of different proteins; however, proteomics methods are much less sensitive than current RNA sequencing approaches and not all proteins can be detected in routine analyses28. In addition, the results of high throughput sequencing techniques are more reproducible across biological replicates than those obtained with mass spec proteomics; this confers the former studies increased power to perform differential gene expression analyses.
While it lacks the specificity of Ribo-Seq, earlier studies in yeast indicated that polysomal mRNA abundances provided a better approximation to protein level changes than total mRNA6. It was also reported that Ribo-Seq showed a higher correlation with proteomics data than RNA-Seq, but these conclusions were drawn after comparing data obtained from different laboratories9. Here we generated RNA-Seq, Ribo-Seq and proteomics data for yeast grown in identical conditions, leading to less biased comparisons. Our results support the hypothesis that the Ribo-Seq read counts provide a better approximation to protein levels than RNA-Seq read counts.
We observed that many of the genes that were detected as significantly up- or down-regulated in stress by RNA-Seq did not show any significant changes using the Ribo-Seq data, and that many genes were likely to be regulated at the level of translation only. Important differences between the transcriptome and the translatome were reported in previous studies that compared total mRNA and polysomal mRNA changes upon different environmental stresses6,11,29. One of these studies used two different H2O2 concentrations to induce oxidative stress −0.2 and 2 mM H2O2 – observing large differences in the translated versus total mRNA fractions depending on the severity of the stress11. In the mentioned study, the addition of 2 mM H2O2 resulted in the up-regulation of genes involved in ribosome biogenesis and rRNA processing; similar to our findings using 1.5 mM H2O2. Additionally, we could observe that this increase in mRNA levels is compensated by a decrease in the density of ribosomes on the transcripts, consistent with post-transcriptional buffering of gene expression. This affected hundreds of genes, including many ribosomal proteins but also members of the proteasome and ATPase complexes.
In general, changes in genes expression identified by RNA-Seq data showed higher variability than those identified by Ribo-Seq data. Intriguingly, studies comparing the expression of orthologous genes from closely related species have also reported that gene expression is in general more variable when measured by RNA-Seq than Ribo-Seq14,30. In the case of oxidative stress we also have to consider that some mRNAs could be transiently stored in P-bodies or stress granules31,32,33, becoming inaccessible to the translation machinery. Translation of these transcript could be rapidly reactivated when the stress disappears.
We identified signatures of translational repression, beyond any general effects induced by the stress, in hundreds of proteins, including proteins involved in the cell cycle. In this case there was no apparent change in the number of mRNA molecules but ribosome density decreased during stress, presumably reflecting lower translation rates in this subset of proteins than in the rest. Repression of cell cycle proteins would be consistent with the observed marked slow down of cell division under stress. Genes potentially up-regulated at the translational level were enriched in cell wall and mitochondrial membrane proteins. Translation up-regulation of a selected set of genes may allow the cells to rapidly react to adverse conditions, and has also been observed in response to hyperosmotic stress6.
The results of this study illustrate the importance of performing ribosome profiling experiments to differentiate between changes in mRNA that are likely to result in changes in the protein levels to those that are not. Although obtaining Ribo-Seq data is more labour-intensive than RNA-Seq, the protocols are being simplified and its use is rapidly growing34,35,36. The methodological framework we have developed can be applied to other datasets and help advance our understanding of gene regulation in other conditions.
We grew S. cerevisiae (S288C) in 500 ml of rich media in a 2 L Erlenmeyer flask; two replicates were grown in parallel for each condition- a total of four flasks37. In order to induce oxidative stress in two stress condition replicates, 30 minutes before harvesting the cells we added diluted H2O2 to the media for a final concentration of 1.5 mM. In both conditions, the cells were harvested in log growth phase via vacuum filtration and frozen with liquid nitrogen when they reached an OD600 of approximately 0.25. In order to capture ribosome protected mRNAs, cyclohexamide was added to the flask one minute before the cells were harvested. Cyclohexamide is commonly used as a protein synthesis inhibitor in order to prevent ribosome run-off and the subsequent loss of ribosome-transcript complexes. One third of each culture was used for ribosome profiling (Ribo-Seq); the rest was reserved for RNA-Seq.
The cells harvested for RNA-Seq were processed with the Monarch® Total RNA Miniprep Kit protocol. After verifying the quality of the total RNA from each sample with the BioAnalyzer, polyA mRNAs were selected and libraries were prepared using NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina®. Subsequently the samples were indexed, pooled, and sequenced (35 × 2 stranded) on the Illumina NextSeq platform.
Cells were lysed using the freezer/mill method (SPEX SamplePrep); after preliminary preparations, lysates were treated with RNaseI (Ambion), and subsequently with SUPERaseIn (Ambion). Monosomal fractions were collected; SDS was added to stop any possible RNAse activity, then samples were flash-frozen with N2(l). Digested extracts were loaded in 7–47% sucrose gradients. RNA was isolated from monosomal fractions using the hot acid phenol method. Ribosome-Protected Fragments (RPFs) were selected by isolating RNA fragments of 28–32 nucleotides (nt) using gel electrophoresis. The preparation of sequencing libraries for Ribo-Seq and RNA-Seq was based on a previously described protocol38. Pair-end sequencing reads of size 35 nucleotides were produced for Ribo-Seq on the Illumina MiSeq platform. The data has been deposited at NCBI Bioproject PRJNA435567 (https://www.ncbi.nlm.nih.gov/bioproject/435567).
Processing of the sequencing data
The RNA-Seq data was filtered using Trimmomatic with default parameters (version 0.36)39. In the Ribo-Seq data we discarded the second read pair as it was redundant and of poorer quality than the first read, and then used Cutadapt40 to eliminate the adapters and to trim five and four nucleotides at 5’ and 3’ edges, respectively. Ribosomal RNA was depleted from the Ribo-Seq data in silico by removing all reads which mapped to annotated rRNAs. Ribo-Seq reads shorter than 25 nucleotides were not used.
After quality check and read trimming, the reads were aligned against the S. cerevisiae genome (S288C R64-2-1) using Bowtie 241. For annotation we used a previously generated S. cerevisiae transcriptome containing 6,184 annotated coding sequences plus 1,009 non-annotated assembled transcripts (see Supplementary data). SAMtools42 was used to filter out unmapped reads.
We counted the number of reads that mapped to each gene with HTSeq-count43. We used the mode ‘intersection strict’ to generate a table of counts from the data; the procedure removed about 5% of the reads in the case of RNA-Seq, and 8% in the case of Ribo-Seq. Only genes in which the average read count of the two replicates was larger than 10 in all conditions (normal and stress, for RNA-Seq and for Ribo-Seq) were kept. The filtered table of counts contained data for 5,419 genes; nearly all of them corresponded to annotated genes (5,312 genes).
For subsampling the number of mapped reads we used SAMtools42. We used the function ‘samtools view’ with option ‘-s 0.X’, where X is the percentage of reads that we wish to keep.
Analysis of three nucleotide periodicity in the mapped Ribo-Seq reads
We used RibORF13 to analyze the mapped Ribo-Seq. We analyzed all possible ORFs with a minium length of 9 amino acids and at least 10 mapped reads. We analyzed 5,304 annotated ORFs. RibORF counts the number of reads that fall in each frame and calculates the distribution of reads along the length of the ORF. We used the original proposed cutoff (score > 0.7) to predict translated ORFs.
Quantification of protein abundance by mass spectrometry
For our proteomics experiment, we analysed 3 replicates per condition by LCMSMS using a 90-min gradient in the Orbitrap Fusion Lumos. These samples were not treated with cyclohexamide. The proteins were extracted by milling the cells with glass beads in a denaturing buffer in a vortex at 4 °C. After dialysis with a 3KDa filter, protein concentration was measured using a BCA kit; subsequently the proteins were digested with sequenec-grade trypsin in urea. As a quality control measure, BSA controls were digested in parallel and ran between each sample to avoid carry-over and assess the instrument performance. The peptides were searched against SwissProt Yeast database, using the Mascot v2.5.1 search algorithm. The search was performed with the following parameters: peptide mass tolerance MS1 7 ppm and peptide mass tolerance MS2 0.5 Da; three maximum missed cleavages; trypsin digestion after K or R except KP or KR; dynamic modifications oxidation (M) and acetyl (N-term), static modification carbamidomethyl (C). Protein areas were obtained from the average area of the three most intense unique peptides per protein group. Considering the data from all 6 samples, we detected proteins from 3,336 genes. We limited our quantitative analysis to a subset of 2,200 proteins which had proteomics hits for at least 3 unique peptides; this filter eliminates noise arising from technical challenges of quantifying lowly abundant proteins with LCMSMS.
Differential gene expression analysis
The table of counts was normalized to log2 Counts per Million (log2CPM) using the function ‘cpm’ in the R package edgeR16. Before performing differential gene expression analysis, we normalized the data using Trimmed Mean of M-values (TMM) from the same package. Finally, we applied the Limma voom method17 to identify differentially expressed genes, separately for RNA-Seq and Ribo-Seq data (adjusted p-value < 0.05 and |log2FC| > 1 SD(log2FC)).
We applied the same pipeline to the proteomics data using normalized area values as a quantitative measure of protein abundance. To ensure robustness of the differential expression analysis we used genes which had at least 3 unique peptides and could be quantified in all 6 replicates (1,580 genes); the procedure did not identify any significantly up or down regulated genes, using an adjusted p-value < 0.05. Low sensitivity of this procedure is expected considering the relatively poor correlation of the mass spec replicates (r between 0.83 and 0.93).
Analysis of functional clusters
We identified significantly enriched functional clusters in differentially expressed genes using DAVID20. The analysis was done separately for over- and under-expressed genes and for RNA-Seq and Ribo-Seq derived data. Only clusters with enrichment score ≥ 1.5 and adjusted p-val < 0.05 were retained. In each cluster we chose a representative Gene Ontology (GO) term44, with the highest number of genes inside the cluster. Figure 4 integrates the results obtained with the Ribo-Seq and the RNA-Seq data, the log10 fold enrichment of the significant GO terms is plotted.
Analysis of translational efficiency
We searched for genes with significantly increased or decreased translational efficiency (TE)9 using the RiboDiff program22. We selected genes significant at an adjusted p-value < 0.05 and showing log2(TEstress/TEnormal) higher than 0.67 or lower than −0.67 (plus or minus one standard deviation of the log2(TEstress/TEnormal) distribution.
Supplementary data files have been uploaded to Figshare and can be accessed at https://doi.org/10.6084/m9.figshare.5809812. This includes the following files: Blevins_Tavella_etal_Scer_transcriptome.gtf: Genomic coordinates of S.cerevisiae (S288C) transcriptome including annotated features as well as de novo assembled transcripts. Blevins_Tavella_etal_tableofcounts.txt contains the number of mapped reads per gene in Ribo-Seq (RF) and RNA-Seq (RNA) for control and oxidative stress conditions; Blevins_Tavella_etal_RNA_up.csv contains a list of genes significantly up-regulated by RNA-Seq; Blevins_Tavella_etal_RNA_down.csv contains a list of genes significantly down-regulated by RNA-Seq; Blevins_Tavella_etal_RP_up.csv contains a list of genes significantly up-regulated by Ribo-Seq; Blevins_Tavella_etal_RP_down.csv contains a list of genes significantly down-regulated by Ribo-Seq; Blevins_Tavella_gene_lists.xlsx contains gene expression values obtained with RNA-Seq and Ribo-Seq as well as information from over-represented functional clusters. The original sequencing data is at https://www.ncbi.nlm.nih.gov/bioproject/435567 (NCBI Bioproject PRJNA435567).
Rapaport, F. et al. Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data. Genome Biol. 14, R95 (2013).
de Sousa Abreu, R., Penalva, L. O., Marcotte, E. M. & Vogel, C. Global signatures of protein and mRNA expression levels. Mol. Biosyst. 5, 1512–26 (2009).
Schwanhäusser, B. et al. Global quantification of mammalian gene expression control. Nature 473, 337–342 (2011).
Payne, S. H. The utility of protein and mRNA correlation. Trends Biochem. Sci. 40, 1–3 (2015).
Ponnala, L., Wang, Y., Sun, Q. & van Wijk, K. J. Correlation of mRNA and protein abundance in the developing maize leaf. Plant J. 78, 424–440 (2014).
Warringer, J., Hult, M., Regot, S., Posas, F. & Sunnerhagen, P. The HOG Pathway Dictates the Short-Term Translational Response after Hyperosmotic Shock. Mol. Biol. Cell 21, 3080–3092 (2010).
Lee, M. V. et al. A dynamic model of proteome changes reveals new roles for transcript alteration in yeast. Mol. Syst. Biol. 7, 514 (2011).
Tebaldi, T. et al. Widespread uncoupling between transcriptome and translatome variations after a stimulus in mammalian cells. BMC Genomics 13, 220 (2012).
Ingolia, N. T., Ghaemmaghami, S., Newman, J. R. S. & Weissman, J. S. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–23 (2009).
Gerashchenko, M. V., Lobanov, A. V. & Gladyshev, V. N. Genome-wide ribosome profiling reveals complex translational regulation in response to oxidative stress. Proc. Natl. Acad. Sci. 109, 17394–17399 (2012).
Shenton, D. et al. Global Translational Responses to Oxidative Stress Impact upon Multiple Levels of Protein Synthesis. J. Biol. Chem. 281, 29011–29021 (2006).
Blevins, W. R., Carey, L. B. & Albà, M. M. Transcriptomics data of 11 species of yeast identically grown in rich media and oxidative stress conditions. BMC Res. Notes 12, 250 (2019).
Ji, Z., Song, R., Regev, A. & Struhl, K. Many lncRNAs, 5′UTRs, and pseudogenes are translated and some are likely to express functional proteins. Elife 4, e08890 (2015).
Mcmanus, C. J., May, G. E., Spealman, P. & Shteyman, A. Ribosome profiling reveals post-transcriptional buffering of divergent gene expression in yeast. Genome Res. 24, 422–430 (2014).
Borg, I. & Groenen, P. J. F. Modern multidimensional scaling. (Springer, 1997).
Robinson, M. D., McCarthy, D. J. & Smyth, G. K. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics 26, 139–140 (2010).
Law, C. W., Chen, Y., Shi, W. & Smyth, G. K. voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biol. 15, R29 (2014).
Gasch, A. P. et al. Genomic Expression Programs in the Response of Yeast Cells to Environmental Changes. Mol. Biol. Cell 11, 4241–4257 (2000).
Morano, K. A., Grant, C. M. & Moye-Rowley, W. S. The response to heat shock and oxidative stress in Saccharomyces cerevisiae. Genetics 190, 1157–95 (2012).
Huang, D. W., Sherman, B. T. & Lempicki, R. A. Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nat. Protoc. 4, 44–57 (2009).
Vogel, C., Silva, G. M. & Marcotte, E. M. Protein Expression Regulation under Oxidative Stress. Mol. Cell. Proteomics 10, M111.009217 (2011).
Zhong, Y. et al. RiboDiff: detecting changes of mRNA translation efficiency from ribosome footprints. Bioinformatics 33, 139–141 (2017).
Ingolia, N. T. Ribosome Footprint Profiling of Translation throughout the Genome. Cell 165, 22–33 (2016).
Jungfleisch, J. et al. A novel translational control mechanism involving RNA structures within coding sequences. Genome Res. 27, 95–106 (2017).
Yordanova, M. M. et al. AMD1 mRNA employs ribosome stalling as a mechanism for molecular memory formation. Nature 553, 356–360 (2018).
Gerber, S. A., Rush, J., Stemman, O., Kirschner, M. W. & Gygi, S. P. Absolute quantification of proteins and phosphoproteins from cell lysates by tandem MS. Proc. Natl. Acad. Sci. 100, 6940–6945 (2003).
Edfors, F. et al. Gene-specific correlation of RNA and protein levels in human cells and tissues. Mol. Syst. Biol. 12, 883 (2016).
Slavoff, S. A. et al. Peptidomic discovery of short open reading frame-encoded peptides in human cells. Nat. Chem. Biol. 9, 59–64 (2013).
Smirnova, J. B. et al. Global Gene Expression Profiling Reveals Widespread yet Distinctive Translational Responses to Different Eukaryotic Translation Initiation Factor 2B-Targeting Stress Pathways. Mol. Cell. Biol. 25, 9340–9349 (2005).
Hubbard, T. J. P. et al. Ensembl 2009. Nucleic Acids Res. 37, D690–7 (2009).
Zid, B. M. & O’Shea, E. K. Promoter sequences direct cytoplasmic localization and translation of mRNAs during starvation in yeast. Nature 514, 117–121 (2014).
Khong, A. et al. The Stress Granule Transcriptome Reveals Principles of mRNA Accumulation in Stress Granules. Mol. Cell 68, 808–820.e5 (2017).
Luo, Y., Na, Z. & Slavoff, S. A. P-Bodies: Composition, Properties, and Functions. Biochemistry 57, 2424–2431 (2018).
Reid, D. W., Shenolikar, S. & Nicchitta, C. V. Simple and inexpensive ribosome profiling analysis of mRNA translation. Methods 91, 69–74 (2015).
Xie, S.-Q. et al. RPFdb: a database for genome wide information of translated mRNA generated from ribosome profiling. Nucleic Acids Res. 44, D254–D258 (2016).
Liu, W., Xiang, L., Zheng, T., Jin, J. & Zhang, G. TranslatomeDB: a comprehensive database and cloud-based analysis platform for translatome sequencing data. Nucleic Acids Res. 46, D206–D212 (2018).
Tsankov, A. M., Thompson, D. A., Socha, A., Regev, A. & Rando, O. J. The Role of Nucleosome Positioning in the Evolution of Gene Regulation. PLoS Biol. 8, e1000414 (2010).
Ingolia, N. T., Brar, Ga, Rouskin, S., McGeachy, A. M. & Weissman, J. S. The ribosome profiling strategy for monitoring translation in vivo by deep sequencing of ribosome-protected mRNA fragments. Nat. Protoc. 7, 1534–50 (2012).
Bolger, A. M., Lohse, M. & Usadel, B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114–20 (2014).
Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 1 (2011).
Langmead, B., Trapnell, C., Pop, M. & Salzberg, S. L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 10, R25 (2009).
Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
Anders, S., Pyl, P. T. & Huber, W. HTSeq–a Python framework to work with high-throughput sequencing data. Bioinformatics 31, 166–9 (2015).
Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat. Genet. 25, 25–9 (2000).
We acknowledge the Proteomics Unit of Center for Regulatory Genomics and Universitat Pompeu Fabra for the isolation of proteins from yeast cultures. We are also grateful to Robert Castelo for advice during this project. The work was funded by grants BFU2015–65235-P, BFU2015-68351-P and BFU2016-80039-R, from Ministerio de Economía e Innovación (Spanish Government) - FEDER (EU), and from grant PT17/0009/0014 from Instituto de Salud Carlos III – FEDER. We also received funding from the “Maria de Maeztu” Programme for Units of Excellence in R&D (MDM-2014-0370) and from Agència de Gestió d’Ajuts Universitaris i de Recerca Generalitat de Catalunya (AGAUR), grant number 2014SGR1121, 2014SGR0974, 2017SGR01020 and, predoctoral fellowship (FI) to W.B. We also acknowledge support from the EU Erasmus Programme to T.T.
The authors declare no competing interests.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Blevins, W.R., Tavella, T., Moro, S.G. et al. Extensive post-transcriptional buffering of gene expression in the response to severe oxidative stress in baker’s yeast. Sci Rep 9, 11005 (2019) doi:10.1038/s41598-019-47424-w