Integrated multi-omics analyses reveal the pleiotropic nature of the control of gene expression by Puf3p

The PUF family of RNA-binding proteins regulates gene expression post-transcriptionally. Saccharomyces cerevisiae Puf3p is characterised as binding nuclear-encoded mRNAs specifying mitochondrial proteins. Extensive studies of its regulation of COX17 demonstrate its role in mRNA decay. Using integrated genome-wide approaches, we defined an expanded set of Puf3p target mRNAs and quantitatively assessed the global impact of loss of PUF3 on gene expression using mRNA and polysome profiling and quantitative proteomics. In agreement with prior studies, our sequencing of affinity-purified Puf3-TAP-associated mRNAs (RIP-seq) identified mRNAs encoding mitochondrially targeted proteins. We also found 720 new mRNA targets that predominantly encode proteins that enter the nucleus. Comparing transcript levels in wild-type and puf3∆ cells revealed that the levels of only a small fraction of mRNAs change, suggesting Puf3p determines mRNA stability for only a limited subset of its target mRNAs. Finally, proteomic and translatomic studies suggest that loss of Puf3p has a widespread, but modest, impact on mRNA translation. Taken together, our integrated multi-omics data point to multiple classes of Puf3p targets, which display coherent post-transcriptional regulatory properties, and suggest Puf3p plays a broad, but nuanced, role in the fine-tuning of gene expression.


Index
Supplementary Dataset S1
Supplementary Results Text S1

Supplementary Results Text S1
We investigated whether the 720 RSU targets represented bona fide Puf3 candidates or were enriched in our data for other reasons. Three possible sources of indirect positive interactions were considered. First, to address non-specific binding by the IgG-coupled beads, we performed additional control TAP-IPs using an untagged strain, Puf3-TAP, Puf5-TAP and eIF4E-TAP, and amplified specific mRNAs using an endpoint RT-PCR approach. Puf5p is a related PUF protein that binds to distinct mRNAs through a motif related to the Puf3-binding motif 8,9, while eIF4E is a general mRNA 5' cap-binding protein that is important for translation initiation 40. PCR of the Puf3-TAP IP amplified both the prototypical Core Puf3 target RNA COX17 and CBP3, a novel target identified only by our RIP-seq study, but did not amplify PGK1, an example mRNA identified by PAR-CLIP, or other control mRNAs. Similarly, Puf5-TAP bound only its target ORC2 and the untagged strain failed to amplify any products, whereas all mRNAs tested were found to bind to eIF4E-TAP, as expected (Supplemental Figure 2A). This analysis confirms that our experimental approach can isolate specific mRNAs.
In addition, we were unable to purify sufficient RNA from an untagged strain to perform sequencing. As a further test for non-specific binding to our affinity matrix,  Figure 2C), since most are directly reported in BioGRID from previous studies 9,23. In contrast, the novel RSU targets are, at best, potential third-order or higher-order interactions, similar to non-Puf3p targets. The absence of second-order interactions suggests that the RSU targets cannot be explained simply as a result of indirect binding via other known protein partners and their associated RNAs.
Additionally, we checked whether protein-protein interactions with other RBPs might cause the misidentification of Puf3p targets. We found that RBPs that bind any of the RSU targets also bind many non-Puf3p targets. On the basis of these independent measures, we suggest that the 720 mRNAs comprise novel Puf3p targets.

Processing of SOLiD Sequencing data
Reads were mapped to the S. cerevisiae genome (genome assembly EF4 downloaded from Ensembl) using Bowtie; sequences were then assigned to genomic features using htseq-count (mapping against the corresponding EF4 GTF file). Sequencing data are publicly available at ArrayExpress under accessions E-MTAB-3406, E-MTAB-3407, and E-MTAB-3413.
Transcript enrichment/depletion analyses were performed using different tests implemented in the edgeR package 44. Enrichments were tested using Fisher's exact test, with the Benjamini and Hochberg correction applied to the calculated P-values. The contrasts between the transcriptome and the monosome or polysome fractions were performed using the exact test of the classical approach. In addition, we compared the transcriptome counts to the average of monosome and polysome counts (translatome counts), using the generalized linear model (GLM) approach for this analysis. We also used the GLM approach when comparing the monosome and polysome fractions, as this experimental design had paired samples. Functional enrichment analyses were performed in-house. GO-Slim mapping annotations were downloaded from the Saccharomyces Genome Database (www.yeastgenome.org).
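The enrichment testing and multiple-testing correction described above can be illustrated with a minimal Python sketch. It is not the in-house code used for this study: it assumes a one-sided Fisher's exact test (the upper tail of the hypergeometric distribution) on category membership, and the function names and example counts are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test P-value for over-representation.

    k: target genes annotated with the category
    n: total number of target genes
    K: background genes annotated with the category
    N: total number of background genes
    Returns P(X >= k) under the hypergeometric distribution.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical counts for three GO-Slim categories: (k, n, K, N).
categories = {
    "category_A": (8, 20, 50, 1000),
    "category_B": (2, 20, 100, 1000),
    "category_C": (1, 20, 10, 1000),
}
raw = [fisher_enrichment_p(*counts) for counts in categories.values()]
for name, p_adj in zip(categories, benjamini_hochberg(raw)):
    print(name, p_adj)
```

The correction is applied across all categories tested together, so the adjusted value for any one category depends on how many categories were examined.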

RNA-Protein Network Analyses
Physical and genetic interactions were downloaded from the BioGRID database (version 3.2.111). To study whether indirect binding could cause the pull-down of some mRNAs, we performed graph analyses in which we counted the number of Puf3p targets that could be explained by first, second, third or higher order interactions according to current knowledge.
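A minimal sketch of this kind of graph analysis follows, assuming that the interaction order of a target is the shortest-path distance from Puf3p in an undirected interaction network. The breadth-first search, the node names and the toy edge list are illustrative assumptions, not the BioGRID data or the code used in the study.

```python
from collections import deque


def interaction_order(edges, source):
    """Shortest-path distance from `source` to every reachable node."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def count_by_order(dist, targets, max_order=3):
    """Count targets by order; orders >= max_order are pooled together."""
    counts = {}
    for target in targets:
        d = dist.get(target)
        if d is None:
            key = "unreachable"
        elif d >= max_order:
            key = f"{max_order}+"
        else:
            key = str(d)
        counts[key] = counts.get(key, 0) + 1
    return counts


# Hypothetical network: PUF3 binds mRNA_A directly and reaches the others
# only through intermediate RNA-binding proteins.
toy_edges = [
    ("PUF3", "mRNA_A"),
    ("PUF3", "RBP_X"),
    ("RBP_X", "mRNA_B"),
    ("RBP_X", "RBP_Y"),
    ("RBP_Y", "mRNA_C"),
]
dist = interaction_order(toy_edges, "PUF3")
print(count_by_order(dist, ["mRNA_A", "mRNA_B", "mRNA_C", "mRNA_D"]))
```

In this toy network mRNA_B is a second-order interaction, the kind that indirect binding via a known protein partner would produce.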
Additionally, we analysed the importance of unreported Puf3p-RBP-RNA interactions. For each RBP with known RNA targets, but not known to bind Puf3p, we assumed that an interaction could be identified in the future. Then, we compared the number of Puf3p target RNAs that could be explained by indirect binding this way with the number of non-targets that would conflict with the indirect binding hypothesis.
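The comparison described in this paragraph can be sketched with simple set operations. The sketch assumes that, for each RBP, the "explained" count is the overlap between its known RNA targets and the Puf3p targets, and the "conflicting" count is its overlap with the non-targets. The data structures and names are hypothetical.

```python
def indirect_binding_support(rbp_targets, puf3_targets, non_targets):
    """For each RBP, count Puf3p targets it could explain and non-targets it binds.

    rbp_targets: dict mapping an RBP name to the set of RNAs it is known to bind
    puf3_targets: set of RNAs identified as Puf3p targets
    non_targets: set of RNAs not identified as Puf3p targets
    Returns a dict mapping each RBP to (explained, conflicting).
    """
    results = {}
    for rbp, bound in rbp_targets.items():
        explained = len(bound & puf3_targets)
        conflicting = len(bound & non_targets)
        results[rbp] = (explained, conflicting)
    return results


# Hypothetical example: RBP_X binds two Puf3p targets but also three
# non-targets, which argues against indirect pull-down via RBP_X.
toy_rbp_targets = {
    "RBP_X": {"t1", "t2", "n1", "n2", "n3"},
    "RBP_Y": {"t3", "n4"},
}
toy_puf3_targets = {"t1", "t2", "t3", "t4"}
toy_non_targets = {"n1", "n2", "n3", "n4", "n5"}
print(indirect_binding_support(toy_rbp_targets, toy_puf3_targets, toy_non_targets))
```

An RBP that binds many non-targets is a poor explanation for indirect pull-down, because those non-targets would be expected to co-purify as well.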

Motif discovery
MEME (version 4.10.0) was run locally to identify common motifs 24. To increase the discriminative power of the tool, we used the set of non-targets as a negative set for calculating position-specific priors. We used UTR sequences reported in RNA-Seq experiments 45. For Core targets, the motif was found in 201 of the 204 available 3' UTRs, with a reported motif E-value of 2.3 × 10^-187. 3' UTR sequences were available for 183 RSU targets and the motif was found in all 183 3' UTRs, with a corresponding motif E-value of 4.0 × 10^-11. During an exploratory phase more than one motif was considered, but none of the additional motifs returned was significant. We also looked for motifs in 5' UTRs and a selection of ORF sequences, but did not find any motif with a low E-value and/or present in most of the input sequences.
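Coverage figures such as "found in 201 of 204 3' UTRs" can be illustrated by scanning sequences for a motif. The sketch below swaps in a plain regular-expression scan, which is much simpler than MEME's probabilistic motif model. The pattern is a placeholder consensus chosen for illustration, not the motif reported by MEME in this study, and the sequences are invented.

```python
import re


def motif_coverage(utrs, pattern):
    """Count sequences containing at least one match to `pattern`.

    utrs: dict mapping a gene name to its UTR sequence (DNA or RNA alphabet)
    pattern: regular expression written in the DNA alphabet
    Returns (number of sequences with a hit, total sequences, names with a hit).
    """
    regex = re.compile(pattern)
    hits = set()
    for name, seq in utrs.items():
        # Normalise case and convert RNA (U) to DNA (T) before scanning.
        normalised = seq.upper().replace("U", "T")
        if regex.search(normalised):
            hits.add(name)
    return len(hits), len(utrs), hits


# Placeholder consensus (assumption): TGTA, one of A/C/T, then ATA.
PLACEHOLDER_MOTIF = "TGTA[ACT]ATA"

# Invented 3' UTR fragments for illustration only.
toy_utrs = {
    "geneA": "AAACTTGTAAATATTT",
    "geneB": "uguacauagg",
    "geneC": "GGGGCCCC",
}
n_hits, n_total, names = motif_coverage(toy_utrs, PLACEHOLDER_MOTIF)
print(f"motif found in {n_hits} of {n_total} sequences: {sorted(names)}")
```

A regular expression reports only presence or absence of an exact pattern, whereas MEME also assigns a statistical significance (the E-value) to the discovered motif.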