Introduction

In recent years high throughput RNA sequencing (RNA-Seq) has become the method of choice for measuring shifts in gene expression between cells grown in different conditions1. However, diverse studies have shown that mRNA levels only partially explain protein levels in the cell2,3,4,5. In yeast, the correlation between mRNA and protein abundance is typically in the range 0.6–0.72. In addition, the ratio between protein and mRNA levels may vary across different conditions3. For instance, substantial differences in this ratio have been observed during osmotic stress in yeast6,7 or after the treatment of human cells with epidermal growth factor8.

In contrast to RNA-Seq, which measures the total amount of mRNA in the cell, ribosome profiling (Ribo-Seq) only captures those mRNAs that are being actively translated9. Each Ribo-Seq read corresponds to one translating ribosome, providing a quantitative view of the amount of protein produced by the cell at any given time. Although this remains an indirect estimate of protein abundance, it has several advantages over proteomics, such as the fact that with Ribo-Seq virtually all translated sequences can be captured, and that one can apply the same pipelines and statistical methods as for RNA-Seq to identify differentially expressed genes.

The response to oxidative stress in the yeast Saccharomyces cerevisiae involves a general decrease in mRNA translation initiation as well as the selective transcriptional activation of a set of proteins involved in the maintenance of the redox state of the cell9,10,11. A previous study reported changes in the ratio between the normalized number of Ribo-Seq and RNA-Seq reads, or translational efficiency (TE), of hundreds of genes upon oxidative stress10, suggesting extensive translational regulation. However, changes in TE alone do not necessarily imply changes in the abundance of the translated proteins. By performing a separate analysis of Ribo-Seq and RNA-Seq data, we show that the majority of genes that show statistically significant differences at the RNA-Seq level do not correspond to similar differences at the Ribo-Seq level, suggesting that, in most cases, changes in mRNA abundance are compensated by changes in ribosome density and are not expected to be propagated to the protein level. Our approach also uncovers a subset of differentially expressed genes in which regulation appears to be mainly exerted at the translational level.

Results

Ribosome profiling experiments in normal and stress conditions

We extracted ribosome-protected RNA fragments, as well as complete polyadenylated RNAs, from Saccharomyces cerevisiae grown in rich media (normal) and in H2O2-induced oxidative stress conditions (stress)(Fig. 1). We added 1.5 mM H2O2 to the media 30 minutes prior to harvesting the cells; these conditions induced severe oxidative stress and approximately halved the growth rate of the cells11,12. We then sequenced the ribosome-protected RNA fragments (Ribo-Seq) as well as complete mRNAs (RNA-Seq) using a strand-specific protocol. The Ribo-Seq data provided a snapshot of the translatome, each read corresponding to one translating ribosome, whereas the number of RNA-Seq reads mapping to a gene was used to quantify the relative abundance of the transcript.

Figure 1
figure 1

Experimental design. Baker’s yeast (S. cerevisiae) was grown in rich media and oxidative stress conditions in parallel. The cultures were used to extract total RNA, ribosome-protected RNA fragments and proteins.

After quality control of the sequencing reads we obtained 31–36 million Ribo-Seq reads and 12–15 million RNA-Seq reads per sample (Supplementary Table S1). We mapped the reads to the genome and generated a table of read counts per gene for each of the samples. After filtering out non-expressed genes (see Methods), the table contained data for 5,419S. cerevisiae annotated genes (ORFs).

To be able to compare the relative abundances of mRNAs and ribosome-bound mRNAs between different samples we normalized the RNA-Seq and Ribo-Seq table of counts by calculating normalized counts per million (CPM) in logarithmic scale, or log2CPM (Supplementary Fig. S1). The correlation coefficient between the average Ribo-Seq and RNA-Seq log2CPM expression values was 0.84 in normal conditions and 0.87 in stress conditions (Fig. 2A,B, respectively). As the differences in log2CPM between RNA-Seq or Ribo-Seq replicates were negligible (Fig. 2C,D, Supplementary Table S2), these values reflect the amount of disagreement between total mRNA and translated protein abundances.

Figure 2
figure 2

Representative gene expression correlations between RNA sequencing samples. (A) RNA-Seq normal replicate 1 versus Ribo-Seq normal replicate 1. (B) RNA-Seq stress replicate 1 versus Ribo-Seq stress replicate 1. (C) RNA-Seq normal replicate 1 versus RNA-Seq normal replicate 2. (D) Ribo-Seq normal replicate 1 versus Ribo-Seq normal replicate 2. Expression units are CPM in logarithm scale; R: Spearman correlation value. N: normal growth conditions (two replicates N1 and N2); S: stress conditions (two replicates S1 and S2).

Ribo-Seq shows a higher correlation with proteomics than RNA-Seq

The next step was to compare the quantification of gene expression by RNA-Seq and Ribo-Seq to that obtained using proteomics. For proteomics-based quantification we extracted the protein fraction from yeast grown in normal and stress conditions and estimated the abundance of different yeast proteins, i.e. the proteome, using mass spectrometry (Fig. 1). We could reliably quantify the protein products of 2,200 genes (see Methods), representing about 40% of the genes quantified by RNA-Seq or Ribo-Seq. Normalized protein abundances between pairs of proteomics replicates showed correlation coefficients in the range 0.83–0.93 (Supplementary Table S3), lower than for RNA-Seq or Ribo-Seq replicates (>0.99).

In normal conditions the correlation coefficient between the transcriptome (RNA-Seq) and the proteome normalized abundances was 0.46. This increased to 0.71 when comparing the translatome (Ribo-Seq) and the proteome normalized abundances (Fig. 3), indicating that Ribo-Seq-based quantification of gene expression provides a more accurate picture of protein abundance than RNA-Seq data. The average correlation coefficient between the three pairs of proteome replicates was 0.91, setting up a maximum value for any correlation with the other types of data. Differences between RNA-Seq and proteomics quantification estimates may arise because of differences in the half life of the proteins with respect to their cognate mRNAs as well as variations in the translation rate or ribosome density across the transcripts. As the value of 0.71 (Ribo-Seq versus proteomics) is intermediate between 0.46 (RNA-Seq versus proteomics) and 0.91 (proteomics replicates), the two above mentioned factors appear to be relevant to explain the strong uncoupling between mRNA and protein abundance in this system.

Figure 3
figure 3

Proteomics shows a stronger correlation with Ribo-Seq than with RNA-Seq data. (A) RNA-Seq versus proteomics, normal growth conditions. (B) RNA-Seq versus proteomics, oxidative stress. (C) Ribo-Seq versus proteomics, normal growth conditions. (D) Ribo-Seq versus proteomics, oxidative stress. CPM: counts per million for RNA-Seq and RNA-Seq data (represented in logarithmic scale, average between replicates). log2 normalized area: relative abundance for proteomics data (average between replicates). R: Spearman correlation value. Plot and correlations comprise 2200 genes for which ≥3 unique peptides were detected by LCMSMS.

In stress conditions the correlation coefficient between the transcriptome and proteome measurements was 0.62, somewhat higher than in normal conditions. Whereas the correlation coefficient between the translatome and the proteome was 0.67, again higher than the same value between the transcriptome and the proteome but lower than the correlation between the proteome stress replicates (0.86). Taken together, these results are consistent with the hypothesis that differences in ribosome density play a role in modulating protein expression under severe stress conditions.

Analysis of three nucleotide periodicity

In actively translated regions mapped Ribo-Seq reads exhibit a characteristic three nucleotide periodicity that results from the codon-to-codon ratcheting movement of the ribosome along the coding sequence9. We used the program RibORF13 to assess the nucleotide periodicity and homogeneity of the Ribo-Seq reads in the annotated coding sequences. According to this analysis, in the vast majority of genes (98%, 5198 out of 5304 analyzed genes) the annotated ORF showed a clear signature of translation in both normal and stress conditions, validating our approach of considering all the reads that mapped to the annotated ORFs for the quantification of protein translation.

In a small fraction of genes, however, we found evidence of alternative translated ORFs (Supplementary Table S4). One example was TOS8, which encodes a homeodomain-containing transcription factor. In this gene active translation of the canonical 831 amino acid long protein by RibORF was only detected in stress conditions; in contrast, a protein of only 81 amino acids was the main translated polypeptide in rich media. The shorter alternative ORF was on a different reading frame to the main protein product and showed no homology to any previously characterized protein. These cases illustrate how detailed examination of the distribution of the Ribo-Seq reads may help uncover proteins that have remained hidden within longer ORFs.

Ribo-Seq estimates of changes in gene expression are more conservative

We next calculated the gene expression level fold change (FC) between the two conditions, using RNA-Seq and Ribo-Seq data separately. The log2FC distribution based on the Ribo-Seq data had a lower variance than the log2FC distribution using RNA-Seq data (Fig. 4A). This indicated a higher range of variation in the mRNA levels, as estimated by RNA-Seq, than in the ribosome-protected fragments. This finding was consistent with the existence of post-transcriptional buffering of gene expression, as also reported for inter-specific gene expression comparisons of S.cerevisiae and S.paradoxus14.

Figure 4
figure 4

Integrated analysis of RNA sequencing and ribosome profiling data. (A) Distribution of gene expression fold change (FC) values. FC was calculated as the ratio between the number of reads in oxidative stress and normal conditions. We took the average number of reads per gene among the replicates. The standard deviation of log2FC was 0.44 for Ribo-Seq (RP) and 0.57 for RNA-Seq (RNA). (B) Multidimensional scaling (MDS) plot using the gene expression values of each sample. MDS was based on the log2CPM values for each gene. Data was for 5,419S. cerevisiae genes. RP: Ribo-Seq data; RNA: RNA-Seq data; N: normal growth conditions; S: stress conditions. Two sequencing replicates were generated per condition. (C) Correlation between log fold change (FC) gene expression values. The X axis corresponds to the RNA-Seq data, or transcriptome, the Y axis to the Ribo-Seq data, or translatome. Coloured dots correspond to differentially expressed genes. In the legend homodirectional means up-regulated, or down-regulated, both at the transcriptome and translatome levels; opposite_change is up-regulated at one level and down-regulated at the other one; translatome means significant differences in Ribo-Seq only; transcriptome means significant differences in RNA-Seq only. (D) Significant gene functional classes among differentially expressed genes. Shown is a 2-D plot of the enrichment score values, in logarithmic scale, provided by the software DAVID for differentially expressed genes using RNA-Seq (transcriptome) or Ribo-Seq (translatome) data. Significant enrichment scores are associated with a p-val < 0.05. Functional classes associated with positive values are significantly enriched among up-regulated genes, and functional classes with negative values are significantly enriched among down-regulated genes. Non-significant enrichment scores are given a value of 0 in the plot.

We considered the possibility that the about 2.5 times higher number of Ribo-Seq reads than RNA-Seq reads in the original datasets biased the comparison of log2FC distributions. In order to test it we subsampled the mapped reads so as to have a similar number of reads in all the RNA-Seq and Ribo-Seq samples (Supplementary Tables S5 and S6). The results were very similar to those observed without subsampling (Supplementary Fig. S2), indicating that the differences in sequencing coverage between samples do not affect the results.

We also used an alternative method, multidimensional scaling (MDS)15, to quantify the distance between Ribo-Seq and RNA-Seq gene expression measurements (Fig. 4B). We found that the distance between Ribo-Seq normal and stress conditions was shorter that the distance between RNA-Seq normal and stress conditions, which was consistent with the previous observation that log2FC variance was lower for Ribo-Seq than for RNA-Seq. In conclusion, the variation in gene expression between normal and stress conditions was attenuated in the Ribo-Seq data when compared to RNA-Seq data.

Extensive post-transcriptional buffering of gene expression

We performed differential gene expression analysis, separately for Ribo-Seq and RNA-Seq data, using the package EdgeR16 for normalization of the data and Limma17 for multivariable linear regression and identification of differentially expressed genes. We selected genes with an adjusted p-value < 0.05 and a log2FC larger than one standard deviation; the latter corresponded to a minimum FC of 1.49 for RNA-Seq data and 1.36 for Ribo-Seq data. We used the standard deviation instead of a fixed value to accommodate for the differences in the width of the log2FC distributions. The number of genes that were differentially expressed was 1,530 for RNA-Seq and 536 for Ribo-Seq. A similarly high number of yeast genes were previously found to be regulated by hydrogen peroxide at the level of transcription using microarray technology18 and sequencing data10.

The correlation between RNA-Seq and Ribo-Seq gene log2FC values was quite low (0.18), indicating an important disconnect between the two kinds of data (Fig. 4C). Only 127 genes showed a significant change in the same direction i.e. homodirectional changes. Genes that were up-regulated during stress according to both RNA-Seq and Ribo-Seq included protein functions known to be activated at the transcriptional level in response to stress, such as hexoquinases or heat shock proteins19. The number of genes annotated with the Gene Ontology (GO) term ‘oxidation reduction process’ was similar for RNA-Seq or Ribo-Seq up-regulated genes (17 and 15, respectively), supporting that these genes are essentially regulated at the level of transcription and can be effectively detected with both kinds of sequencing data.

However, the vast majority of genes were only significant at the transcriptome or the translatome levels (1,413 and 409 genes, respectively; Fig. 4C). The largest group was formed by genes that showed significant changes in relative transcript abundance but not in the relative number of ribosome-protected fragments, supporting extensive post-transcriptional buffering of gene expression. The data indicated that about a quarter of the genes in the genome may be undergoing compensatory changes: when mRNA levels increase ribosome density per transcript decreases and the other way round.

The second group, translatome-only differentially expressed genes, represented cases in which mRNA levels did not change but the density of ribosomes per transcript showed a significant increase or decrease in stress relative to normal. This would be consistent with the expression of these genes being primarily modulated at the level of translation. We identified many more genes under differential translational repression than activation (360 versus 49, Fig. 4C), suggesting that the former mechanism may be more prevalent that the first one in response to severe oxidative stress.

We also identified a subset of cases showing opposite changes in RNA-Seq and Ribo-Seq data. The main group was formed by 70 genes showing increased mRNA levels but decreased translation in stress versus normal. One simple explanation would be that, for these genes, there is an mRNA fraction that is stored in a translational inactive highly stable form, whereas the rest is translated at the usual level. More complex scenarios could involve a combination of transcriptional and translational regulatory events.

We applied the same differential gene expression pipeline to the normalized proteomics data but did not recover any significant hits. This is not unexpected given the lower number of proteins that could be quantified and, above all, the high variability in the protein abundance estimates between replicates.

Dissecting differential regulation by functional class

To better understand the biological relevance of our observations, we investigated if certain functional classes were significantly enriched among the sets of differentially expressed genes. We used DAVID20 to identify significantly over-represented functional clusters (Fig. 4D). Only one class, ‘oxidation-reduction process’, was enriched among genes up-regulated during stress both using RNA-Seq and Ribo-Seq data. This is consistent with transcriptional activation of this set of genes upon stress, increasing the signal for both total mRNA and the translated fraction.

Three other classes – ‘translation’, ‘ATPase’ and ‘proteasome’ – showed increased mRNA levels during stress; conversely this was not reflected in an increase in for the translated fraction. Although translation-related proteins have been reported to be transcriptionally repressed during oxidative stress18,21, there are also indications that this is reverted when H2O2 concentration is high11, as is the case in our study.

Among genes that were differentially expressed only when we used Ribo-Seq data ‘cell wall’, ‘mitochondrial intermembrane space’ and ‘catalytic activity’ were enriched among up-regulated genes, whereas ‘cell cycle’ was enriched among down-regulated genes (Fig. 4D).

Translational efficiency and protein level changes

To obtain further insights into the regulatory mechanisms of gene expression during oxidative stress in yeast we also compared the translational efficiency (TE; Ribo-Seq normalized counts divided by RNA-Seq normalized counts) of the different genes in the two conditions using the program Ribodiff22. We detected 470 genes that showed significantly increased TE during stress (adjusted p-value < 0.05; see Methods); about 82% of them were cases in which the relative mRNA levels had decreased during stress but this change had been compensated by an increase in ribosome density so that no significant changes in the amount of translated protein would be expected (transcriptome downregulated, Table 1). In only about 3% of cases increased TE was associated with translational activation and increased protein production (translatome upregulated, Table 1).

Table 1 Genes with significantly increased or decrease translational efficiency during oxidative stress.

In the case of genes with significantly lower TE in stress than in normal conditions the percentage of compensatory cases was also the predominant scenario, accounting for 50% of the genes in the class (356 out of 714, Table 1). The second most numerous group were genes likely to be actively repressed at the level of translation, accounting for 29% of the genes with significantly decreased TE (29%). The latter genes showed no change in mRNA levels but the relative number of associated ribosomes was lower in stress than in normal conditions, which would be expected to lead to a decrease in the protein levels. This group included 12 genes from the cell cycle functional category (Supplementary Table S7).

Discussion

The adaptation of organisms to variations in different environmental conditions is associated with the activation or repression of gene expression. These changes are usually studied at the level of complete mRNA molecules using microarrays or next generation sequencing. However, changes in mRNA concentration do not necessarily reflect changes in their encoded protein products8,11.

Here we have explored the usefulness of ribosome profiling data to close the gap between mRNA and protein abundance estimates using oxidative stress in baker’s yeast as a model system. Each ribosome profiling read corresponds to one translating ribosome and thus the number of reads that map to a gene reflects the amount of translated protein9,23. Numerous recent studies have used ribosome profiling to gain insights into novel translation regulatory mechanisms24,25 or to discover new translated RNA sequences20,21,22,23. However, there is a lack of studies addressing how ribosome profiling can be used to improve the estimates of protein abundance changes over RNA-Seq-based estimates. Our study shows that Ribo-Seq provides better estimates of protein abundance than RNA-Seq and that the results of differential gene expression analyses are drastically altered if we use Ribo-Seq of RNA-Seq as the source sequencing data.

The abundance of different proteins in the cell is usually estimated using mass spectrometry proteomics data26,27. This provides a direct measurement of protein abundance that can account for the variations in the stability of different proteins; however, proteomics methods are much less sensitive than current RNA sequencing approaches and not all proteins can be detected in routine analyses28. In addition, the results of high throughput sequencing techniques are more reproducible across biological replicates than those obtained with mass spec proteomics; this confers the former studies increased power to perform differential gene expression analyses.

While it lacks the specificity of Ribo-Seq, earlier studies in yeast indicated that polysomal mRNA abundances provided a better approximation to protein level changes than total mRNA6. It was also reported that Ribo-Seq showed a higher correlation with proteomics data than RNA-Seq, but these conclusions were drawn after comparing data obtained from different laboratories9. Here we generated RNA-Seq, Ribo-Seq and proteomics data for yeast grown in identical conditions, leading to less biased comparisons. Our results support the hypothesis that the Ribo-Seq read counts provide a better approximation to protein levels than RNA-Seq read counts.

We observed that many of the genes that were detected as significantly up- or down-regulated in stress by RNA-Seq did not show any significant changes using the Ribo-Seq data, and that many genes were likely to be regulated at the level of translation only. Important differences between the transcriptome and the translatome were reported in previous studies that compared total mRNA and polysomal mRNA changes upon different environmental stresses6,11,29. One of these studies used two different H2O2 concentrations to induce oxidative stress −0.2 and 2 mM H2O2 – observing large differences in the translated versus total mRNA fractions depending on the severity of the stress11. In the mentioned study, the addition of 2 mM H2O2 resulted in the up-regulation of genes involved in ribosome biogenesis and rRNA processing; similar to our findings using 1.5 mM H2O2. Additionally, we could observe that this increase in mRNA levels is compensated by a decrease in the density of ribosomes on the transcripts, consistent with post-transcriptional buffering of gene expression. This affected hundreds of genes, including many ribosomal proteins but also members of the proteasome and ATPase complexes.

In general, changes in genes expression identified by RNA-Seq data showed higher variability than those identified by Ribo-Seq data. Intriguingly, studies comparing the expression of orthologous genes from closely related species have also reported that gene expression is in general more variable when measured by RNA-Seq than Ribo-Seq14,30. In the case of oxidative stress we also have to consider that some mRNAs could be transiently stored in P-bodies or stress granules31,32,33, becoming inaccessible to the translation machinery. Translation of these transcript could be rapidly reactivated when the stress disappears.

We identified signatures of translational repression, beyond any general effects induced by the stress, in hundreds of proteins, including proteins involved in the cell cycle. In this case there was no apparent change in the number of mRNA molecules but ribosome density decreased during stress, presumably reflecting lower translation rates in this subset of proteins than in the rest. Repression of cell cycle proteins would be consistent with the observed marked slow down of cell division under stress. Genes potentially up-regulated at the translational level were enriched in cell wall and mitochondrial membrane proteins. Translation up-regulation of a selected set of genes may allow the cells to rapidly react to adverse conditions, and has also been observed in response to hyperosmotic stress6.

The results of this study illustrate the importance of performing ribosome profiling experiments to differentiate between changes in mRNA that are likely to result in changes in the protein levels to those that are not. Although obtaining Ribo-Seq data is more labour-intensive than RNA-Seq, the protocols are being simplified and its use is rapidly growing34,35,36. The methodological framework we have developed can be applied to other datasets and help advance our understanding of gene regulation in other conditions.

Methods

Biological material

We grew S. cerevisiae (S288C) in 500 ml of rich media in a 2 L Erlenmeyer flask; two replicates were grown in parallel for each condition- a total of four flasks37. In order to induce oxidative stress in two stress condition replicates, 30 minutes before harvesting the cells we added diluted H2O2 to the media for a final concentration of 1.5 mM. In both conditions, the cells were harvested in log growth phase via vacuum filtration and frozen with liquid nitrogen when they reached an OD600 of approximately 0.25. In order to capture ribosome protected mRNAs, cyclohexamide was added to the flask one minute before the cells were harvested. Cyclohexamide is commonly used as a protein synthesis inhibitor in order to prevent ribosome run-off and the subsequent loss of ribosome-transcript complexes. One third of each culture was used for ribosome profiling (Ribo-Seq); the rest was reserved for RNA-Seq.

RNA-Seq

The cells harvested for RNA-Seq were processed with the Monarch® Total RNA Miniprep Kit protocol. After verifying the quality of the total RNA from each sample with the BioAnalyzer, polyA mRNAs were selected and libraries were prepared using NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina®. Subsequently the samples were indexed, pooled, and sequenced (35 × 2 stranded) on the Illumina NextSeq platform.

Ribosome profiling

Cells were lysed using the freezer/mill method (SPEX SamplePrep); after preliminary preparations, lysates were treated with RNaseI (Ambion), and subsequently with SUPERaseIn (Ambion). Monosomal fractions were collected; SDS was added to stop any possible RNAse activity, then samples were flash-frozen with N2(l). Digested extracts were loaded in 7–47% sucrose gradients. RNA was isolated from monosomal fractions using the hot acid phenol method. Ribosome-Protected Fragments (RPFs) were selected by isolating RNA fragments of 28–32 nucleotides (nt) using gel electrophoresis. The preparation of sequencing libraries for Ribo-Seq and RNA-Seq was based on a previously described protocol38. Pair-end sequencing reads of size 35 nucleotides were produced for Ribo-Seq on the Illumina MiSeq platform. The data has been deposited at NCBI Bioproject PRJNA435567 (https://www.ncbi.nlm.nih.gov/bioproject/435567).

Processing of the sequencing data

The RNA-Seq data was filtered using Trimmomatic with default parameters (version 0.36)39. In the Ribo-Seq data we discarded the second read pair as it was redundant and of poorer quality than the first read, and then used Cutadapt40 to eliminate the adapters and to trim five and four nucleotides at 5’ and 3’ edges, respectively. Ribosomal RNA was depleted from the Ribo-Seq data in silico by removing all reads which mapped to annotated rRNAs. Ribo-Seq reads shorter than 25 nucleotides were not used.

After quality check and read trimming, the reads were aligned against the S. cerevisiae genome (S288C R64-2-1) using Bowtie 241. For annotation we used a previously generated S. cerevisiae transcriptome containing 6,184 annotated coding sequences plus 1,009 non-annotated assembled transcripts (see Supplementary data). SAMtools42 was used to filter out unmapped reads.

We counted the number of reads that mapped to each gene with HTSeq-count43. We used the mode ‘intersection strict’ to generate a table of counts from the data; the procedure removed about 5% of the reads in the case of RNA-Seq, and 8% in the case of Ribo-Seq. Only genes in which the average read count of the two replicates was larger than 10 in all conditions (normal and stress, for RNA-Seq and for Ribo-Seq) were kept. The filtered table of counts contained data for 5,419 genes; nearly all of them corresponded to annotated genes (5,312 genes).

For subsampling the number of mapped reads we used SAMtools42. We used the function ‘samtools view’ with option ‘-s 0.X’, where X is the percentage of reads that we wish to keep.

Analysis of three nucleotide periodicity in the mapped Ribo-Seq reads

We used RibORF13 to analyze the mapped Ribo-Seq. We analyzed all possible ORFs with a minium length of 9 amino acids and at least 10 mapped reads. We analyzed 5,304 annotated ORFs. RibORF counts the number of reads that fall in each frame and calculates the distribution of reads along the length of the ORF. We used the original proposed cutoff (score > 0.7) to predict translated ORFs.

Quantification of protein abundance by mass spectrometry

For our proteomics experiment, we analysed 3 replicates per condition by LCMSMS using a 90-min gradient in the Orbitrap Fusion Lumos. These samples were not treated with cyclohexamide. The proteins were extracted by milling the cells with glass beads in a denaturing buffer in a vortex at 4 °C. After dialysis with a 3KDa filter, protein concentration was measured using a BCA kit; subsequently the proteins were digested with sequenec-grade trypsin in urea. As a quality control measure, BSA controls were digested in parallel and ran between each sample to avoid carry-over and assess the instrument performance. The peptides were searched against SwissProt Yeast database, using the Mascot v2.5.1 search algorithm. The search was performed with the following parameters: peptide mass tolerance MS1 7 ppm and peptide mass tolerance MS2 0.5 Da; three maximum missed cleavages; trypsin digestion after K or R except KP or KR; dynamic modifications oxidation (M) and acetyl (N-term), static modification carbamidomethyl (C). Protein areas were obtained from the average area of the three most intense unique peptides per protein group. Considering the data from all 6 samples, we detected proteins from 3,336 genes. We limited our quantitative analysis to a subset of 2,200 proteins which had proteomics hits for at least 3 unique peptides; this filter eliminates noise arising from technical challenges of quantifying lowly abundant proteins with LCMSMS.

Differential gene expression analysis

The table of counts was normalized to log2 Counts per Million (log2CPM) using the function ‘cpm’ in the R package edgeR16. Before performing differential gene expression analysis, we normalized the data using Trimmed Mean of M-values (TMM) from the same package. Finally, we applied the Limma voom method17 to identify differentially expressed genes, separately for RNA-Seq and Ribo-Seq data (adjusted p-value < 0.05 and |log2FC| > 1 SD(log2FC)).

We applied the same pipeline to the proteomics data using normalized area values as a quantitative measure of protein abundance. To ensure robustness of the differential expression analysis we used genes which had at least 3 unique peptides and could be quantified in all 6 replicates (1,580 genes); the procedure did not identify any significantly up or down regulated genes, using an adjusted p-value < 0.05. Low sensitivity of this procedure is expected considering the relatively poor correlation of the mass spec replicates (r between 0.83 and 0.93).

Analysis of functional clusters

We identified significantly enriched functional clusters in differentially expressed genes using DAVID20. The analysis was done separately for over- and under-expressed genes and for RNA-Seq and Ribo-Seq derived data. Only clusters with enrichment score ≥ 1.5 and adjusted p-val < 0.05 were retained. In each cluster we chose a representative Gene Ontology (GO) term44, with the highest number of genes inside the cluster. Figure 4 integrates the results obtained with the Ribo-Seq and the RNA-Seq data, the log10 fold enrichment of the significant GO terms is plotted.

Analysis of translational efficiency

We searched for genes with significantly increased or decreased translational efficiency (TE)9 using the RiboDiff program22. We selected genes significant at an adjusted p-value < 0.05 and showing log2(TEstress/TEnormal) higher than 0.67 or lower than −0.67 (plus or minus one standard deviation of the log2(TEstress/TEnormal) distribution.