High-throughput interrogation of programmed ribosomal frameshifting in human cells

Programmed ribosomal frameshifting (PRF) is the controlled slippage of the translating ribosome to an alternative frame. This process is widely employed by human viruses such as HIV and SARS coronavirus and is critical for their replication. Here, we developed a high-throughput approach to assess the frameshifting potential of a sequence. We designed and tested >12,000 sequences based on 15 viral and human PRF events, allowing us to systematically dissect the rules governing ribosomal frameshifting and discover novel regulatory inputs based on amino acid properties and tRNA availability. We assessed the natural variation in HIV gag-pol frameshifting rates by testing >500 clinical isolates and identified subtype-specific differences and associations between viral load in patients and the optimality of PRF rates. We devised computational models that accurately predict frameshifting potential and frameshifting rates, including subtle differences between HIV isolates. This approach can contribute to the development of antiviral agents targeting PRF.

Page 2. Lines 45-46: "Most human cases known to date were found serendipitously or through homologous genes." This statement is a bit problematic. Indeed, in general, some cases of frameshifting were discovered serendipitously, but not most. Most were predicted based on sequence analysis and then confirmed experimentally; this includes even the first cellular example, bacterial release factor 2. First, it was predicted (Craigen et al 1985) and only then confirmed (Craigen & Caskey 1986). If we limit ourselves to human genes only, then we have only a handful of examples: oaz1 frameshifting was first characterized in rat and then found in humans, while oaz2 and oaz3 were found as paralogs of oaz1. Frameshifting in PNMA3 was discovered as a result of a systematic bioinformatics search. Even the proclaimed frameshifting in CCR5 stemmed from an attempt to identify novel cases of ribosomal frameshifting systematically. Perhaps only Edr/PEG10 might classify as a serendipitous discovery, and even that is a bit of an overstretch because frameshifting was already suspected because of the two long overlapping ORFs in the reconstructed mRNA sequence. Besides, strictly speaking, it was first characterized in mice. Outside of humans, most of the recently discovered frameshifting cases (or other recoding events) follow the same trajectory: phylogenetic analysis followed by experimental validation; see publications by the Andrew Firth and Manolis Kellis labs. To me, one potential advantage of the Mikl et al approach is that it may enable discovery of frameshifting cases whose identification is not possible using phylogenetic analysis, e.g. recently evolved cases.
In relation to the description of the human cases the authors correctly cite the discovery of PNMA3, oaz1, and oaz2. Oaz3 was discovered by two labs independently, but the authors omitted Ivanov et al (2000). The reference on frameshifting discovery in Edr/PEG10 (Shigemoto et al 2001) is also missing.
Some of the observations made by the authors (including frameshifting efficiencies) differ from what was published earlier. This is not surprising because experimental methods have their own unique limitations, and considerable discrepancies in measured frameshifting efficiencies have often been observed in the past. The authors' description of the method limitations is far better now than what was in the original manuscript; however, many factors that may be responsible for inaccurate frameshifting measurements are not mentioned. I am sure that we don't even know many such factors, but perhaps the authors will find some of the following useful and worth mentioning in the manuscript while discussing the limitations of their approach.
1. The effect of protein sequence encoded in the tested cassette on reporter activity or its stability, see Loughran et al (2017).
2. Protein factors involved in modulation of frameshifting, see Napthine et al (2019). Such proteins may not be present in the cell line where frameshifting is being tested.
3. Concentrations of specific metabolites, e.g. the ribosomal frameshifting in testis-specific antizyme 3 is very low in other cells, see Howard et al (2001). This is probably because polyamine levels are far higher in germ cells, where endogenous levels of oaz3 would also be expected to be higher.
4. Distance between the ribosomes, see Smith et al (2019).
5. Expression levels, see Gurvich et al (2005).
While the last two examples relate to frameshifting in bacteria, there is no reason to believe that such factors would be irrelevant in human cells. Speaking of yet unknown factors, we could speculate that co- and posttranscriptional RNA modifications may affect frameshifting but may not be reproduced in the reporter constructs, e.g. it has been shown that inosines cause ribosome pauses, see Licht et al (2019).
In relation to , I think it could be helpful to mention that the reported frameshifting in CCR5 is simply an artifact of a dual-luciferase reporter, as has been shown recently by Khan et al (2019). I suggest that the authors should decide on whether to cite Khan et al (2019) in consultation with the editor, whom I will provide with additional confidential information that I cannot mention here.
In this manuscript, Mikl et al describe a new fluorescence-based reporter for high-throughput assessment of sequence variations introduced into frameshifting cassettes (frameshift site and stimulatory elements). They have also demonstrated the applicability of this system for testing the sequence constraints of many frameshifting cassettes operational in human cells (of viral and cellular origins). The manuscript has been transferred from another journal and I reviewed two previous versions of the manuscript. The current version is substantially different from the previous version, as the authors dropped the claim regarding the discovery of novel cases of ribosomal frameshifting and related parts of this work. I respect and support this decision. This claim requires more substantial experimental validation than what was presented in the previous version, and if the authors do pursue this endeavor, I sincerely hope that my previous review will be of help. As for the current manuscript, I think the method described here will be very useful for studying ribosomal frameshifting (as well as other recoding mechanisms) and indeed could be used as a "fishing" tool for identification of a pool of potential low-confidence candidate frameshifting events. My further suggestions for improving the current manuscript are comparatively minor.

References
Page 2. Lines 45-46: "Most human cases known to date were found serendipitously or through homologous genes." This statement is a bit problematic. Indeed, in general, some cases of frameshifting were discovered serendipitously, but not most. Most were predicted based on sequence analysis and then confirmed experimentally; this includes even the first cellular example, bacterial release factor 2. First, it was predicted (Craigen et al 1985) and only then confirmed (Craigen & Caskey 1986). If we limit ourselves to human genes only, then we have only a handful of examples: oaz1 frameshifting was first characterized in rat and then found in humans, while oaz2 and oaz3 were found as paralogs of oaz1. Frameshifting in PNMA3 was discovered as a result of a systematic bioinformatics search. Even the proclaimed frameshifting in CCR5 stemmed from an attempt to identify novel cases of ribosomal frameshifting systematically. Perhaps only Edr/PEG10 might classify as a serendipitous discovery, and even that is a bit of an overstretch because frameshifting was already suspected because of the two long overlapping ORFs in the reconstructed mRNA sequence. Besides, strictly speaking, it was first characterized in mice. Outside of humans, most of the recently discovered frameshifting cases (or other recoding events) follow the same trajectory: phylogenetic analysis followed by experimental validation; see publications by the Andrew Firth and Manolis Kellis labs. To me, one potential advantage of the Mikl et al approach is that it may enable discovery of frameshifting cases whose identification is not possible using phylogenetic analysis, e.g. recently evolved cases.
We changed this sentence according to the reviewer's suggestions to "Many of the human PRF events were found through homology." (lines 41-42)
In relation to the description of the human cases the authors correctly cite the discovery of PNMA3, oaz1, and oaz2. Oaz3 was discovered by two labs independently, but the authors omitted Ivanov et al (2000). The reference on frameshifting discovery in Edr/PEG10 (Shigemoto et al 2001) is also missing.
We apologize for this omission and added the missing references in the revised manuscript (line 37).
Some of the observations made by the authors (including frameshifting efficiencies) differ from what was published earlier. This is not surprising because experimental methods have their own unique limitations, and considerable discrepancies in measured frameshifting efficiencies have often been observed in the past. The authors' description of the method limitations is far better now than what was in the original manuscript; however, many factors that may be responsible for inaccurate frameshifting measurements are not mentioned. I am sure that we don't even know many such factors, but perhaps the authors will find some of the following useful and worth mentioning in the manuscript while discussing the limitations of their approach.
1. The effect of protein sequence encoded in the tested cassette on reporter activity or its stability, see Loughran et al (2017).
2. Protein factors involved in modulation of frameshifting, see Napthine et al (2019). Such proteins may not be present in the cell line where frameshifting is being tested.
3. Concentrations of specific metabolites, e.g. the ribosomal frameshifting in testis-specific antizyme 3 is very low in other cells, see Howard et al (2001). This is probably because polyamine levels are far higher in germ cells, where endogenous levels of oaz3 would also be expected to be higher.
4. Distance between the ribosomes, see Smith et al (2019).
5. Expression levels, see Gurvich et al (2005).
While the last two examples relate to frameshifting in bacteria, there is no reason to believe that such factors would be irrelevant in human cells. Speaking of yet unknown factors, we could speculate that co- and posttranscriptional RNA modifications may affect frameshifting but may not be reproduced in the reporter constructs, e.g. it has been shown that inosines cause ribosome pauses, see Licht et al (2019).
We thank the reviewer for their comments and mention these points in the revised manuscript: "Moreover, many of the general caveats in using reporter systems apply also here, such as the non-native sequence context and expression levels, the effect of the tested sequence on fluorescent readout and protein stability and the concentration of potential trans-acting proteins or metabolites in the particular cell type used in the experiment." (lines 521-525)
In relation to , I think it could be helpful to mention that the reported frameshifting in CCR5 is simply an artifact of a dual-luciferase reporter, as has been shown recently by Khan et al (2019). I suggest that the authors should decide on whether to cite Khan et al (2019) in consultation with the editor, whom I will provide with additional confidential information that I cannot mention here.
We mention in the revised manuscript that CCR5 frameshifting has been contested by Khan et al. (2019).