CUG initiation and frameshifting enable production of dipeptide repeat proteins from ALS/FTD C9ORF72 transcripts

Expansion of G4C2 repeats in the C9ORF72 gene is the most prevalent inherited form of amyotrophic lateral sclerosis and frontotemporal dementia. Expanded transcripts undergo repeat-associated non-AUG (RAN) translation producing dipeptide repeat proteins from all reading frames. We determined cis-factors and trans-factors influencing translation of the human C9ORF72 transcripts. G4C2 translation operates through a 5′–3′ cap-dependent scanning mechanism, requiring a CUG codon located upstream of the repeats and an initiator Met-tRNAMeti. Production of poly-GA, poly-GP, and poly-GR proteins from the three frames is influenced by mutation of the same CUG start codon supporting a frameshifting mechanism. RAN translation is also regulated by an upstream open reading frame (uORF) present in mis-spliced C9ORF72 transcripts. Inhibitors of the pre-initiation ribosomal complex and RNA antisense oligonucleotides selectively targeting the 5′-flanking G4C2 sequence block ribosomal scanning and prevent translation. Finally, we identified an unexpected affinity of expanded transcripts for the ribosomal subunits independently from translation.


(General). It is very puzzling that the products of RAN translation could be labelled with [35S]Methionine in an RRL (Figs. 2b,e and 3c). The lysate contains methionine aminopeptidase that cotranslationally cleaves the initiator methionine when the nascent polypeptides are ~20 residues long. Usually, the initiator N-formyl[35S]Met-tRNAi, rather than [35S]Methionine, is used in this type of experiment. This tRNA labels only the NH2-terminus of proteins, and the formyl group prevents the action of the aminopeptidase. Unless incorporation from N-formyl[35S]Met-tRNAi is shown, they could not conclude unequivocally about the role of Met-tRNAi in translation initiation.
2. Figure 1a-c. What is the ~14 kDa translation product? (Legend) It is not clear what concentrations of mRNAs were used.
3. Figure 2B. The mRNA minus control is missing. In addition, their system exhibits very high translation of endogenous globin mRNA. Wasn't the lysate treated with micrococcal-nuclease?
4. Figure 3. It would be interesting to investigate the role of the optimal -3/+4 nucleotide context in RAN translation. In addition, one wonders whether the substitution of the canonical initiator AUG for CUG in construct 4 can enhance translation. The authors might have these data already.
5. Figure 4b. It is bothersome that mutating the CUG codon to CCG does not inhibit the production of poly-GP in HEK293 cells, as it does in RRL (Figure 3d). The authors suggest that in cells the poly-GP translation occurs from an alternative start codon and is influenced by trans-acting factors that are absent in RRL. This raises questions about the relevance of the use of the RRL system to study the translational mechanism. Could they rescue poly-GP synthesis from the mutated construct in RRL by adding an extract from HEK293 cells?
6. Figure 7. It is concluded that the G4C2 containing transcripts sequester ribosomal subunits, and presumably inhibit global translation. This result is rather preliminary. Could they test this prediction by exploring the effect of these transcripts' sequestration, in trans, on the translation of a reporter mRNA in RRL?
7. Figure 7e shows the position of RNP. This makes no sense as their assays were done only with mRNA and ribosomal subunits. Do I miss something?
Minor comments:
This manuscript examines the ALS/FTD C9orf72 gene and its expression as RAN translation. From the studies conducted, the authors conclude that RAN translation of this mRNA occurs in a cap-dependent manner utilizing a CUG codon for initiation. They also find that these transcripts are sticky/bind to either 40S or 60S subunits.
Major concerns
1. The authors use the IGR from the Cricket paralysis virus as a control for "efficient translation". However, it would seem more appropriate to use a normal, cap-dependent reporter to see just how efficiently the repeat transcripts are expressed (i.e. a globin mRNA derivative). Secondly, it is curious that there appears to be little dependency on mRNA input with only the GR product showing an increase with increased RNA (66) although there is a decrease with added RNA for the 30 repeats. Third, as relates to the relative synthesis of either GA, GP or GR, is it possible that this reflects the tRNA populations present in RRL? An examination of the rabbit beta globin chain mRNA indicates the following use of codons that might arise from the G4C2 repeat: arginine -no CGG codons used; alanine -half of the codons used are GCC; proline -no CCG codons used; glycine -more than half of the codons used are either GGG or GGC. If one assumes that in the reticulocytes, which are synthesizing 95-98% hemoglobin, that the tRNA population is a match to the amino acids in hemoglobin, might this then be reflected in the synthesis seen in Figure 1, panels A, B and C?
2. A more convincing proof that RAN translation initiates with methionine would be to add poly(IC) to activate PKR and show that phosphorylation of eIF2 reduces expression of the peptides. It is noted even for globin synthesis that the N-terminal, initiating methionine is removed and thus the only methionine registered is from an internal methionine. Based upon the N-end rule, the amino acid coded for following the CUG codon would be glutamic acid (GAA) and this should result in the removal of the N-terminal methionine (see Huang et al. Biochemistry 1987).
3. The authors do show that the translation of their transcripts is favored when the mRNA is capped. However, what is the evidence that in vivo the mRNA responsible for RAN translation is capped?
4. What is the evidence that the CUG initiating codon does in fact direct the binding and use of Met-tRNAi (see above concern)? The use of CUG codons and leucyl-tRNA have shown up in several recent publications.
5. Figure 7 -binding of GGGGCC transcripts to ribosomal subunits. This experiment is uncontrolled. It would appear that the transcripts are being bound non-specifically. Controls such as globin mRNA or the IGR segment used in Figure 1 should be used to ensure that the observed binding is of some relevance. This is especially worrisome for the apparent "polysome-like" aggregates seen in Panel E.
Minor concerns
1. The authors would benefit from reviewing their manuscript for better use of English and to remove some technical errors (i.e. Introduction -"… that requires numerous elongation initiation factors (eIFs)…" The e in eIF stands for eukaryotic, not elongation.).
2. Why is the level of cap inhibitor (m7GpppG) so high to affect inhibition (1.5 mM)? Often in other studies, the level used was in the 100 micromolar range. This level represents a 15,000 to 1 ratio of analog to mRNA.
Revision of the manuscript "ALS/FTD C9ORF72 transcripts initiate translation at a CUG codon and sequester ribosomal subunits" (NCOMMS-17-10398) submitted by Tabet et al.

General response to the Editor and Reviewers
We are grateful for the overall positive feedback from the referees and have addressed all editorial and reviewers' concerns to improve the quality of our study. In summary, we have provided the following new pieces of evidence to strengthen our initial findings:
• The main concerns raised were related to the methionine incorporation in DPR products and were in part due to imprecisions in our initial description of the constructs. We have now clarified in the text that the G4C2 repeat constructs used to monitor RAN translation and incorporation of 35S-methionine do not contain an AUG codon in any of the three frames, so the DPR products should not incorporate any methionine other than at the initiation codon. Consistently, we observed incorporation of radiolabelled methionine in DPR products translated from constructs harboring a near-cognate CUG start codon located in the +1 frame (poly-GA). The translation of 35S-methionine-labelled DPR products was abolished by a single mutation of the CUG codon into CCG, indicating that the methionine is incorporated at the N-terminus of the DPR proteins. We reinforced our finding by replacing the CUG with a canonical AUG codon and by mutating the surrounding Kozak sequence, as suggested by Reviewer 1 (new Figure 3e,f). As expected, an AUG start codon instead of CUG increases the level of methionine incorporation and the levels of all three DPRs. In contrast, mutation of the Kozak sequence inhibits the production of 35S-methionine-labelled proteins, poly-GA, poly-GP and poly-GR, supporting a frameshifting mechanism where translation of all DPR proteins starts at the CUG codon in the +1 frame and undergoes frameshifting to produce poly-GP and poly-GR. Furthermore, inhibiting the eIF2/methionylated initiator tRNA ternary complex with poly(I:C), as suggested by Reviewer 2, blocks 35S-methionine incorporation, demonstrating the role of methionylated initiator tRNA in DPR translation initiation (new Supplementary Figure 4a,b).
• We strengthened our findings by demonstrating that the CUG near-cognate start codon plays a crucial role in C9ORF72 RAN translation in vitro (RRL) as well as in human cells, including HEK293 and human neural progenitors (new Figure 4). Indeed, mutating the CUG codon into CCG altered RAN translation in all three systems (RRL, HEK293 and human neural progenitors), confirming that RAN translation starts at the CUG codon and undergoes frameshifting in vivo. Interestingly, translation in the poly-GP frame was differently regulated in HEK293 cells compared to RRL and neural progenitors, suggesting cell type-specific mechanisms of translation regulation.
• We also confirmed that cis-acting elements in the 5' flanking sequence are important for RAN translation control in C9ORF72 patient fibroblasts. Indeed, in our initial manuscript we had demonstrated that the translation of an upstream open reading frame (uORF) inhibits RAN translation of the downstream G4C2 repeat.
• We complemented our study with several additional experiments suggested by the reviewers, such as comparing RAN translation efficiency in our RRL system not only to IRES-dependent translation but also to the canonical cap scanning mechanism (new Supplementary Figure 2). We also compared the profile of G4C2 repeat RNA in polyribosome purification to that of the antisense C4G2 repeat RNA, and we observed that only G4C2 RNA is capable of sequestering ribosomal subunits independently of RAN translation (new Figure 7a-c).
We provide below a point by point response to the Reviewers and hope that the revised manuscript can now be recommended for publication in Nature Communications.
Reviewer #1: Hexanucleotide expansions (G4C2)exp in the C9ORF72 gene are the major cause of two fatal neurodegenerative disorders, ALS and FTD. In this study, Tabet et al. recapitulated and investigated repeat-associated non-AUG (RAN) translation of the human C9ORF72 expansion transcripts in a rabbit reticulocyte lysate (RRL). This translation occurs from all reading frames and requires a CUG start codon and an initiator Met-tRNAi. Mutations of the CUG start codon affected synthesis of proteins from all the three reading frames. An upstream open reading frame in mis-spliced C9ORF72 transcript is shown to be inhibitory for RAN translation. Surprisingly, and in contrast to conventional mRNAs, the expanded transcripts bind ribosomal subunits independent from their translation.
Overall, this paper presents a detailed and comprehensive analysis of RAN translation directed by C9ORF repeat expansion, which is a frequent cause of ALS/FTD neurodegenerative disorders. However, there are several questions, some of them should be experimentally addressed.
We agree with the reviewer that the N-terminal methionine is usually processed. However, using the histone H4 mRNA (which contains the initiator methionine and a single internal methionine), we have previously determined that in our self-made rabbit reticulocyte lysates, only ~60-70% of the N-terminal methionine is processed, leaving a residual N-terminal 35S-methionine (Martin et al., 2011; Martin et al., 2016). Incomplete processing of the N-terminal methionine was also found in vivo, with ~20% of the N-terminal methionine being acetylated instead of removed (Giglione et al., 2015). In this manuscript, none of the constructs #3 to #11 used to study G4C2 translation contains an AUG codon in any of the three frames (Supplementary Fig. 1; Table S1). Hence, the DPR products do not incorporate any internal methionine, and the radiolabelled DPR products that we observe at the expected size (Figure 2) derive from the incorporation of N-terminal 35S-methionine. This has now been clarified in the text and we have included a control without G4C2 repeats to confirm that the 35S-labelled product is dependent on the translation of the G4C2 repeat transcripts (new Figure 2b). Consistently, 35S-labelled peptides are immunoprecipitated by an antibody against the HA tag, which is in frame with the poly-GA dipeptide repeat proteins (Figure 2c). In addition, it is increasingly recognized that processing of the N-terminus is influenced by the nature of the second amino acid (Frottin et al., 2006; Martinez et al., 2008). We used the "Terminator" software available online (https://bioweb.i2bc.paris-saclay.fr/terminator3/) to predict the N-terminus of mature poly-GA and control Renilla luciferase proteins. The poly-GA N-terminus is predicted with a likelihood of 100% to be a methionine, meaning unprocessed, due to the presence of a glutamic acid residue in the second position, while the methionine of the Renilla luciferase N-terminus is predicted to be processed with a likelihood of 77%.
Finally, following the suggestions from both reviewers, we have performed two new experiments further confirming that RAN translation of G4C2 initiates with a methionine. First, we generated two additional mutants of our G4C2 repeat transcripts (see also comment #3 of Reviewer 1). In the first mutant construct, replacement of the CUG by a genuine AUG start codon leads to the production of DPR proteins of a similar size to those from the native CUG transcript, indicating that radiolabeled DPR proteins result from initiation at this position with incorporation of 35S-methionine. In the second one, the Kozak sequence of the CUG has been mutated, which dramatically reduced the labeling of the DPRs, providing further evidence that initiation takes place at this codon with a 35S-methionine residue (new Figure 3e,f). Second, we inhibited ternary complex formation by inducing the phosphorylation of the eIF2 alpha subunit (poly(I:C)/salubrinal treatment), thereby inhibiting the recruitment of methionylated initiator tRNA (see also comment #2 of Reviewer 2). Such a treatment inhibits 35S-methionine incorporation in canonical cap-dependent translation but not in IGR-driven translation, which does not use the initiator tRNA to start translation. This treatment also inhibits 35S-methionine incorporation in DPR products, confirming the methionine incorporation by methionylated initiator tRNA and the involvement of eIF2 in RAN translation (new Supplementary Fig. 4).
Altogether, these results demonstrate that synthesis of the detected peptides is indeed starting with a methionine residue.
2. Figure 1a-c. What is the ~14 kDa translation product? (Legend) It is not clear what concentrations of mRNAs were used.
The ~14 kDa translation product corresponds to a polypeptide expressed from a transcript containing 30 G4C2 repeats. We confirmed that the band corresponds to the expected product by immunoprecipitation with an antibody specific for the HA tag (in frame with poly-GA).
The concentrations of RNA used in the translation experiments range from 100 to 200 nM. This information has now been included in the legend.
3. Figure 2B. The mRNA minus control is missing. In addition, their system exhibits very high translation of endogenous globin mRNA. Wasn't the lysate treated with micrococcal-nuclease?
We thank the reviewer for pointing out this omission. We have now included a 35S-Met autoradiograph corresponding to a translation experiment in RRL with and without G4C2 RNA (negative control) (new Figure 2b). As expected, the specific 35S-labelled band immunoprecipitated by anti-HA antibody (Fig. 2c) is not observed in the absence of G4C2 RNA (66 repeats, construct #4). Our self-made rabbit reticulocyte lysates are indeed not treated with micrococcal nuclease. Therefore, we can observe on 35S-Met autoradiographs the synthesis of endogenous globin and lipoxygenase, which are the most prevalent mRNAs in reticulocyte lysates. This method enables us to monitor in our extracts the translation efficiency of an internal endogenous control. In addition, the lack of nuclease treatment provides more physiologically relevant cell-free translation extracts that recapitulate more faithfully in vivo translation features such as cap-dependency, as previously described (Ricci et al., 2011).
Reference related to this comment: Ricci EP, Limousin T, Soto-Rifo R, Allison R, Pöyry T, Decimo D, Jackson RJ, Ohlmann T (2011) Activation of a microRNA response in trans reveals a new role for poly(A) in translational repression. Nucleic Acids Res. 39, 5215-31.
4. Figure 3. it would be interesting to investigate the role of the optimal -3/+4 nucleotide context in RAN translation. In addition, one wonders whether the substitution of the canonical initiator AUG for CUG in construct 4 can enhance translation. The authors might have these data already.
We are grateful for this suggestion and, as described in the first comment, we have performed these experiments. Mutating the CUG near-cognate start codon into a genuine AUG significantly increased RAN translation in all reading frames (new Figure 3b,e,f and Supplementary Fig. 1 construct #10), confirming that RAN translation starts at the CUG codon and undergoes frameshifting to produce poly-GP and poly-GR in the +2 and +3 frames, respectively. The double mutation -3/+4 GCUCUGG>UCUCUGC in the Kozak sequence (new Figure 3b,e,f and Supplementary Fig. 1 construct #11) severely reduced the level of poly-GA and prevented the production of poly-GP and poly-GR. These data are corroborated by 35S-Met incorporation experiments (new Figure 3e). Overall, these new experiments confirm that G4C2 RAN translation shares similar mechanisms with canonical translation, including very efficient translation when the start codon is in a perfect Kozak sequence context. They also confirm that CUG is indeed the start codon for DPR production using an initiator Met-tRNA.
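To make the -3/+4 notation concrete, here is a minimal Python sketch that reads the two flanking positions out of the native (GCUCUGG) and mutated (UCUCUGC) sequences quoted in the response. The helper names are hypothetical, and the "strong context" test (purine at -3 and G at +4) is the general textbook Kozak rule, used here as an assumption rather than as the authors' own scoring method.

```python
def kozak_context(seq, start):
    """Return the (-3, +4) nucleotides around a start codon.

    `start` is the 0-based index of the first base of the start codon (+1).
    There is no position 0, so -3 sits three bases upstream of +1 and
    +4 is the base immediately after the codon.
    """
    return seq[start - 3], seq[start + 3]


def is_strong_context(seq, start):
    """Textbook rule (assumption): purine at -3 and G at +4."""
    minus3, plus4 = kozak_context(seq, start)
    return minus3 in "AG" and plus4 == "G"


native = "GCUCUGG"   # sequence quoted in the response, CUG at index 3
mutant = "UCUCUGC"   # -3/+4 double mutant quoted in the response
cug_index = 3

for name, seq in (("native", native), ("mutant", mutant)):
    print(name, kozak_context(seq, cug_index), is_strong_context(seq, cug_index))
```

Under this rule the native context has G at both positions, while the double mutant replaces them with U and C, which fits the reduced poly-GA signal the authors describe.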
5. Figure 4b. It is bothersome that mutating the CUG codon to CCG does not inhibit the production of poly-GP in HEK293 cells, as it does in RRL (Figure 3d). The authors suggest that in cells the poly-GP translation occurs from an alternative start codon and is influenced by trans-acting factors that are absent in RRL. This raises questions about the relevance of the use of the RRL system to study the translational mechanism. Could they rescue poly-GP synthesis from the mutated construct in RRL by adding an extract from HEK293 cells?
We performed the suggested experiment by supplementing the RRL system with HEK293 lysates and measuring 35S-Met incorporation and DPR levels. We mainly observed that HEK293 extracts inhibit global translation in our system (Figure 1 inserted below). Indeed, the level of beta-globin is strongly reduced with an increased concentration of HEK293 lysates. Choosing a low concentration of HEK293 lysate that does not affect overall translation did not increase the level of GP. This experiment is not conclusive, and we think that HEK293 and RRL extracts might contain other translational activators/inhibitors that mask the effect of the specific trans-acting factor.
Interestingly, during the revision of this manuscript, DDX21 was identified in HEK293 cells as an RNA helicase able to bind and unwind RNA containing G-quadruplexes (McRae et al., 2017). Since G-quadruplex structures are formed by C9ORF72 G4C2 repeats, DDX21 is a very attractive candidate modifier of RAN translation in different systems. We now discuss this new finding in our manuscript. In addition, to further explore whether cell type-specific factors influence RAN translation, we have tested G4C2 RAN translation and the impact of the CUG mutation in human neural progenitors. RAN translation of both poly-GA and poly-GP was prevented by mutating the near-cognate codon, recapitulating the results observed in RRL (new Figure 4b,c) and confirming that cell type-specific factors intervene in the production of poly-GP in HEK293 cells.
6. Figure 7. It is concluded that the G4C2 containing transcripts sequester ribosomal subunits, and presumably inhibit global translation. This result is rather preliminary. Could they test this prediction by exploring the effect of these transcripts' sequestration, in trans, on the translation of a reporter mRNA in RRL?
As stated in point #3, we used untreated RRL that still contains lipoxygenase and beta-globin mRNA. In all the experiments presented in the manuscript, we do not see a concomitant reduction of the translation of these two mRNAs when G4C2 mRNA is translated, suggesting that G4C2 RNA does not show a global translation inhibitory effect at 100 and 200 nM. This can be explained by the large excess of ribosomes in RRL, so that a very high amount of G4C2 repeats, or very long repeats as observed in patients, would probably be required to induce translation inhibition. However, we agree that our experiments do not demonstrate that global translation is inhibited in C9ORF72 patient cells and we have toned down this statement.
Notably, we now have more evidence that ribosomal sequestration is dependent on G4C2 repeats, as the antisense C4G2 repeat transcripts, which undergo RAN translation in C9ORF72 patients, do not sequester ribosomes (new Figure 7a-c). Importantly, we have also included an additional control with G4C2 transcripts alone on sucrose gradients (new Figure 7b). Indeed, repeat expansion-containing RNAs were recently shown to undergo abnormal phase transition leading to the formation of gel-like structures in vitro (Jain and Vale, 2017). With this new experiment (Figure 7b), and the use of purified ribosomal subunits (Figure 7e), we demonstrate that migration of G4C2 RNAs to the heavy fractions of the sucrose gradient is due to sequestration of ribosomal subunits rather than a phase transition phenomenon.
7. Figure 7e shows the position of RNP. This makes no sense as their assays were done only with mRNA and ribosomal subunits. Do I miss something?
The reviewer is right and we have now replaced "RNP" by "free RNA" on Figure 7e.
Minor comments: 1. Page 3 "…process that requires numerous elongation initiation factors (eIFs)." should read "…process that requires numerous eukaryotic initiation factors (eIFs)." The correction has been made.
Reviewer #2 (Remarks to the Author): This manuscript examines the ALS/FTD C9orf72 gene and its expression as RAN translation. From the studies conducted, the authors conclude that RAN translation of this mRNA occurs in a cap-dependent manner utilizing a CUG codon for initiation. They also find that these transcripts are sticky/bind to either 40S or 60S subunits.
Major concerns 1. The authors use the IGR from the Cricket paralysis virus as a control for "efficient translation". However, it would seem more appropriate to use a normal, cap-dependent reporter to see just how efficiently the repeat transcripts are expressed (i.e. a globin mRNA derivative). Secondly, it is curious that there appears to be little dependency on mRNA input with only the GR product showing an increase with increased RNA (66) although there is a decrease with added RNA for the 30 repeats. Third, as relates to the relative synthesis of either GA, GP or GR, is it possible that this reflects the tRNA populations present in RRL? An examination of the rabbit beta globin chain mRNA indicates the following use of codons that might arise from the G4C2 repeat: arginine -no CGG codons used; alanine -half of the codons used are GCC; proline -no CCG codons used; glycine -more than half of the codons used are either GGG or GGC. If one assumes that in the reticulocytes, which are synthesizing 95-98% hemoglobin, that the tRNA population is a match to the amino acids in hemoglobin, might this then be reflected in the synthesis seen in Figure 1, panels A, B and C?
We have followed the reviewer's recommendation and compared RAN translation to the translation of both a cap-dependent reporter and an IRES-driven reporter. Indeed, we have compared the translation efficiency of the Renilla luciferase gene under the control of either the beta-globin 5'UTR or the IGR (new Supplementary Fig. 2). Efficiency was measured by 35S-methionine incorporation and luminescence. Both showed that translation driven by the beta-globin 5'UTR is two times more efficient than IGR. We showed in Fig. 1 of the manuscript that poly-GA RAN translation is eighteen times more efficient than IGR, hence approximately 9 times more efficient than the beta-globin 5'UTR. Poly-GP and poly-GR translation is equivalent to IGR, and half as efficient as beta-globin. Overall, G4C2 RAN translation is a very efficient mechanism, considering that beta-globin is highly translated in RRL.
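The comparison above is a simple change of reference: values quoted relative to IGR are divided by the beta-globin value to express them relative to beta-globin. A minimal Python sketch follows, assuming that "N times more efficient than IGR" means a ratio of N to IGR; the dictionary and function names are illustrative only.

```python
# Translation efficiencies relative to IGR-driven translation (IGR = 1),
# using the fold values quoted in the response.
rel_to_igr = {
    "IGR": 1.0,
    "beta-globin 5'UTR": 2.0,
    "poly-GA": 18.0,
    "poly-GP": 1.0,
    "poly-GR": 1.0,
}


def relative_to(reference, table):
    """Re-express every entry as a ratio to the chosen reference entry."""
    ref_value = table[reference]
    return {name: value / ref_value for name, value in table.items()}


rel_to_globin = relative_to("beta-globin 5'UTR", rel_to_igr)
for name, ratio in rel_to_globin.items():
    print(f"{name}: {ratio:g}x beta-globin")
```

With these inputs poly-GA comes out at 9 times beta-globin, and poly-GP and poly-GR at one half, matching the figures stated in the response.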
Concerning the RNA concentration dependency, the DPR level increases with the concentration of transcripts containing 66 repeats. Translation efficiency of poly-GA is very high when transcripts are capped and reaches saturation, but when the efficiency is lower we can observe that poly-GA levels increase with RNA concentration (Figure 1a; uncapped versus capped 66 repeats). We agree that the translation efficiency of poly-GR with 30 repeats is less sensitive to the RNA concentration. This experiment was repeated several times and we do not have an explanation so far for why increasing RNA concentration is inhibitory to GR expression.
Finally, concerning the tRNA concentration in rabbit reticulocyte lysates, we agree that translation in RRL could be influenced by globin expression and that this could be responsible for different rates of expression. However, it is noteworthy that the RRL used in our study are self-made extracts, supplemented with total tRNAs purified from rabbit liver to activate the system, therefore erasing any potential reticulocyte-specific adaptation of tRNA content. In addition, the general codon usage of Oryctolagus cuniculus indicates that the codons used for DPR synthesis (highlighted in yellow) are not considered rare codons except for CCG (Pro) (see below, from http://www.kazusa.or.jp). RRL also contain lipoxygenase mRNA, and this mRNA contains 4 CCG codons. The fact that we do not see any trans-inhibitory effect of G4C2 mRNA on either globin or lipoxygenase synthesis (see response to point 3 of Reviewer 1) demonstrates that the corresponding tRNAs are indeed not limiting.
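The codons under discussion (GGG, GGC, GCC, CCG, CGG) follow directly from reading a GGGGCC repeat in its three frames. A minimal Python sketch illustrates this using only the five standard genetic code assignments needed. The offsets 0, 1 and 2 are positions within the repeat string itself and are an assumption of this example; they are not the +1/+2/+3 frame names of the manuscript, which are defined relative to the upstream CUG codon.

```python
# Standard genetic code entries for the only codons a pure G4C2 repeat can yield.
CODON_TABLE = {
    "GGG": "G",  # glycine
    "GGC": "G",  # glycine
    "GCC": "A",  # alanine
    "CCG": "P",  # proline
    "CGG": "R",  # arginine
}


def translate_frame(seq, offset):
    """Translate `seq` from `offset`, ignoring any trailing partial codon."""
    codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
    return "".join(CODON_TABLE[codon] for codon in codons)


repeat = "GGGGCC" * 4  # four repeats, enough to show the periodicity

for offset in range(3):
    print(offset, translate_frame(repeat, offset))
```

The three offsets give Gly-Ala, Gly-Pro and Gly-Arg dipeptide repeats, which is why arginine is always decoded from CGG and proline from CCG in these products.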

2. A more convincing proof that RAN translation initiates with methionine would be to add poly(IC) to activate PKR and show that phosphorylation of eIF2 reduces expression of the peptides. It is noted even for globin synthesis that the N-terminal, initiating methionine is removed and thus the only methionine registered is from an internal methionine. Based upon the N-end rule, the amino acid coded for following the CUG codon would be glutamic acid (GAA) and this should result in the removal of the N-terminal methionine (see Huang et al. Biochemistry 1987).
We agree with the reviewer that the N-terminal methionine is usually processed. However, using the histone H4 mRNA (which contains the initiator methionine and a single internal methionine), we have previously determined that in our self-made rabbit reticulocyte lysates, only ~60-70% of the N-terminal methionine is processed, leaving a residual N-terminal 35S-methionine (Martin et al., 2011; Martin et al., 2016). Incomplete processing of the N-terminal methionine was also found in vivo, with ~20% of the N-terminal methionine being acetylated instead of removed (Giglione et al., 2015). In this manuscript, none of the constructs #3 to #11 used to study G4C2 translation contains an AUG codon in any of the three frames (Supplementary Fig. 1; Table S1). Hence, the DPR products do not incorporate any internal methionine, and the radiolabelled DPR products that we observe at the expected size (Figure 2) derive from the incorporation of N-terminal 35S-methionine. This has now been clarified in the text and we have included a control without G4C2 repeats to confirm that the 35S-labelled product is dependent on the translation of the G4C2 repeat transcripts (new Figure 2b). Consistently, 35S-labelled peptides are immunoprecipitated by an antibody against the HA tag, which is in frame with the poly-GA dipeptide repeat proteins (Figure 2c). In addition, it is increasingly recognized that processing of the N-terminus is influenced by the nature of the second amino acid (Frottin et al., 2006; Martinez et al., 2008). We used the "Terminator" software available online (https://bioweb.i2bc.paris-saclay.fr/terminator3/) to predict the N-terminus of mature poly-GA and control Renilla luciferase proteins. The poly-GA N-terminus is predicted with a likelihood of 100% to be a methionine, meaning unprocessed, due to the presence of a glutamic acid residue in the second position, while the methionine of the Renilla luciferase N-terminus is predicted to be processed with a likelihood of 77% (see Table 1 in response to Reviewer 1).
As suggested, we inhibited ternary complex formation by inducing the phosphorylation of the eIF2 alpha subunit (poly(I:C)/salubrinal treatment), thereby inhibiting the recruitment of methionylated initiator tRNA. Such a treatment inhibits 35S-methionine incorporation in canonical cap-dependent translation but not in IGR-driven translation, which does not use the initiator tRNA to start translation. This treatment also inhibits 35S-methionine incorporation in DPR products, confirming the methionine incorporation by methionylated initiator tRNA and the involvement of eIF2 (new Supplementary Fig. 4). Along with other experiments described in the answer to Reviewer 1 (comment #1), these results demonstrate that RAN translation of G4C2 is indeed initiated with a methionine residue.
3. The authors do show that the translation of their transcripts is favored when the mRNA is capped. However, what is the evidence that in vivo the mRNA responsible for RAN translation is capped?
The cap is recognized by the eIF4F complex (eIF4E, eIF4G and eIF4A), allowing the recruitment of the 40S ribosomal subunit at the 5' end of the mRNA. We showed in RRL that treatment with FL3, a specific inhibitor of the RNA helicase eIF4A, inhibits G4C2 RAN translation (Figure 6b and Supplementary Fig. 6b,c). We have now tested the impact of FL3 treatment on RAN translation in vivo by treating HEK293 cells transfected with construct #4. RAN translation of poly-GA, poly-GP and poly-GR was severely reduced, supporting the importance of eIF4F and a cap-dependent scanning mechanism for RAN translation in human cells (new Figure 6c-e).
4. What is the evidence that the CUG initiating codon does in fact direct the binding and use of Met-tRNAi (see above concern)? The use of CUG codons and leucyl-tRNA have shown up in several recent publications.
5. Figure 7 -binding of GGGGCC transcripts to ribosomal subunits. This experiment is uncontrolled. It would appear that the transcripts are being bound non-specifically. Controls such as globin mRNA or the IGR segment used in Figure 1 should be used to ensure that the observed binding is of some relevance. This is especially worrisome for the apparent "polysome-like" aggregates seen in Panel E.
We do agree with the reviewer that it was unexpected to observe that G4C2 expanded transcripts can bind ribosomes independently of RAN translation. We have now included several controls to reinforce this result. First, we migrated free G4C2 repeat RNAs on a sucrose gradient without any extracts or factors to ensure that the RNA itself was not undergoing gel formation mimicking a polysome profile. Indeed, repeat expansion-containing RNAs were recently shown to undergo abnormal phase transition leading to the formation of gel-like structures in vitro (Jain and Vale, 2017). G4C2 RNAs do not sediment with the heavy fractions but remain in the light fractions (new Figure 6b). Second, we determined that, contrary to the sense G4C2 transcripts, the antisense C4G2 transcripts with 66 repeats migrate in the light fractions on a sucrose gradient (new Figure 7b). Treating the RRL extract with edeine prevents ribosome association for antisense but not sense transcripts, confirming the binding specificity of the G4C2-containing RNAs (new Figure 7c). We also show that ribosome assembly with capped beta-globin transcripts, but not G4C2 repeats, is affected by cycloheximide, edeine and GMP-PNP (new Figure 7a-d and Supplementary Fig. 6e-h).
Finally, none of the constructs tested in our laboratory other than G4C2 transcripts, including the histone H4, showed a "polysome-like profile" when using purified ribosomal subunits (Figure 7e). This error has been corrected.

2. Why is the level of cap inhibitor (m7GpppG) required for inhibition so high (1.5 mM)? Often in other studies, the level used was in the 100 micromolar range. This level represents a 15,000 to 1 ratio of analog to mRNA.
As shown in Figure 2d, the principle of this competition assay is to saturate the endogenous eIF4E factor by blocking its cap-binding pocket with a cap analog. The whole pool of eIF4E has to be neutralized, hence the inhibiting concentration of cap analog is not related to the mRNA concentration but rather to the amount of eIF4E present in the RRL. In addition, an excess of cap analog is required to prevent its dissociation from eIF4E. To support our initial finding, we have performed the same competition assay using Wheat Germ Extracts that are fully cap-dependent (new Supplementary Fig. 4c,d). The results are comparable in Wheat Germ Extracts and RRL, supporting a 5' scanning mechanism for RAN translation of G4C2 transcripts.
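The numbers in this exchange can be set side by side with a short calculation. The Python sketch below uses only the two values quoted in the reviewer's comment (1.5 mM cap analog and a 15,000:1 analog-to-mRNA ratio) plus the 100 micromolar reference level; the implied mRNA concentration is derived from those values and is not stated in the manuscript, and the function names are hypothetical.

```python
# Illustrative arithmetic only: inputs are the figures quoted in the comment,
# not measured values from the manuscript.

def implied_mrna_molar(analog_molar: float, analog_to_mrna_ratio: float) -> float:
    """mRNA concentration implied by an analog concentration and a molar ratio."""
    if analog_to_mrna_ratio <= 0:
        raise ValueError("ratio must be positive")
    return analog_molar / analog_to_mrna_ratio


def fold_over_reference(analog_molar: float, reference_molar: float) -> float:
    """How many times higher the analog level is than a reference level."""
    if reference_molar <= 0:
        raise ValueError("reference must be positive")
    return analog_molar / reference_molar


if __name__ == "__main__":
    analog = 1.5e-3      # 1.5 mM m7GpppG (quoted by the reviewer)
    ratio = 15_000       # analog:mRNA ratio (quoted by the reviewer)
    reference = 100e-6   # "100 micromolar range" (quoted by the reviewer)

    mrna = implied_mrna_molar(analog, ratio)
    print(f"implied mRNA concentration: {mrna * 1e9:.0f} nM")
    print(f"fold over 100 uM reference: {fold_over_reference(analog, reference):.0f}x")
```

Under these inputs the mRNA would be at roughly 100 nM, which illustrates the authors' point that the analog is titrated against the eIF4E pool rather than against the mRNA.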
3. Figure 2 - it is not clear from the transcripts whether there are any internal methionines in the coding region. This should be checked and reported.
We apologize for the confusion and have now clarified in the manuscript that "the sequence of the transcripts #3 and #4 do not contain any AUG codon and the presence of [35S]methionine in RAN products cannot derive from the incorporation of an internal methionine (Supplementary Figure 1, Table 1)".
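A check of this kind is easy to reproduce for any transcript sequence. The Python sketch below is a minimal illustration, assuming a bare (GGGGCC)66 repeat as input; the real flanking sequences of constructs #3 and #4 (Supplementary Figure 1) are not reproduced here and would have to be substituted. The function names and the reduced codon table, which covers only the codons that occur in the repeat, are illustrative.

```python
# Minimal sketch: scan an RNA for AUG triplets and read a G4C2 repeat in all
# three frames. The input below is a bare repeat, NOT the actual construct.

# Subset of the standard genetic code covering the codons found in (GGGGCC)n.
CODON_TABLE = {"GGG": "G", "GGC": "G", "GCC": "A", "CCG": "P", "CGG": "R"}


def to_rna(seq: str) -> str:
    """Normalise a DNA or RNA string to upper-case RNA."""
    return seq.upper().replace("T", "U")


def has_aug(seq: str) -> bool:
    """True if an AUG triplet occurs anywhere, in any reading frame."""
    return "AUG" in to_rna(seq)


def translate_frame(seq: str, frame: int) -> str:
    """Translate complete codons starting at offset `frame` (0, 1 or 2).

    Codons missing from CODON_TABLE are reported as 'X'.
    """
    rna = to_rna(seq)
    codons = (rna[i:i + 3] for i in range(frame, len(rna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    repeat = "GGGGCC" * 66  # hypothetical stand-in for the repeat region
    print("contains AUG:", has_aug(repeat))
    for frame in range(3):
        print(f"frame {frame}:", translate_frame(repeat, frame)[:8])
```

On the bare repeat, the three frames read as poly-GA, poly-GP and poly-GR, and no AUG triplet is found in any frame.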
The revised manuscript by Tabet and co-authors and their point by point response satisfactorily addresses my major concerns.
They now show that mutating the CUG codon to AUG in the GA reading frame significantly enhances expression of GA-NLuc while decreasing expression in the GP frame both in vitro and in vivo. This is consistent with competition for initiation between GA and GP reading frames. However, evidence for competition between GA and GR reading frames is less clear given the discrepant results with HEK293 cells and RRL (new Supplementary Fig. 2d, f).
I agree with their reasoning that less efficient binding of eIF4F to cap analogs as compared to capped mRNAs could be responsible for the relatively small inhibition of the control AUG mRNA by the m7GpppG cap analogue.
An intriguing observation is that the loss of the upstream near-cognate CUG start codon significantly stimulates RAN translation in the GP reading frame in RRL. The authors speculate that the scanning PIC can initiate translation within the repeat itself, and that the removal of CUG increases the fraction of PICs reaching the repeat. Would it be possible to provide any experimental support for such a mechanism?
Reviewer #2 (Remarks to the Author): This is an improved manuscript that has addressed most, but not all of previous concerns. The positive feature is the identification of a non-canonical start codon CUG. The unique feature is the analysis that suggests that this codon is the only start codon and that only one of the three dipeptide repeats is in frame with this codon. The required frame shift (proposed) for the other two reading frames has not been seen by others.
Concerns 1. Figure 2 - The ratio of globin synthesis to the peptide repeats would suggest that the dipeptide repeats are going via a minor route for initiation and one that is less efficient. This may also be evident for the proteins synthesized as marked by an asterisk (*). Previously a protein of 41,000 Da was found to be radiolabelled using RRL and this was independent of either added mRNA or added puromycin. This was called the "Kaji system" and represented the addition of methionine to the N-terminus of an existing protein. What is the absolute quantitation of synthesized globin to the dipeptide repeat? And since the synthesis of the GA dipeptide is about 18 times greater than that of the others, what is the chance that these may represent a different initiation mechanism? Additionally, why is lipoxygenase synthesis inhibited by the 30 repeats but not by the 60 repeats in panel b?
2. Figure 3 - why is there a decrease in lane 7 of panel c or is this just variable results day to day? Second, is there anything unusual about construct 8 which has only 9 nucleotides upstream of the initiating CUG codon?
3. What is the quantitative reduction in globin synthesis in the presence of poly(I:C) (in Figure S4b)? By eye, the reduction appears to be quite small, perhaps indicating that a non-optimal level of poly(I:C) has been added.
4. The results seen in Figure 4d are inconsistent with a frame shifting event causing the synthesis of the GP dipeptide repeat.
5. It is likely that the results in Figure 6b are only valid if there is an equal reduction in total protein synthesis with each of the inhibitors. It would appear that the level of global inhibition of protein synthesis under the conditions used is greater with CHX or Edeine.
6. The results in Figure 7 appear inconsistent. In panels d and e, why is there such a spread with the (G4C2)30 RNA? Why are there polysomes with the (G4C2)66 RNA when there is none with beta-globin mRNA? How many (G4C2) repeats does it take to obtain polysomes? Is the binding of the repeats sensitive to salt (it is noted that 50 mM is rather low salt)?
7. There is no description as to how or why frame shifting might occur. Second, a brief examination of the sequence in the coding region between the CUG start and the repeat does not provide evidence for a slippery sequence (i.e. a homopolymeric stretch, especially A's or U's). Third, frameshifting is usually a very rare event occurring at most at the 1% level unless a slippery sequence is present (i.e. RF2 can frameshift with a frequency approaching 50%). Thus, it is unclear to this reviewer that frameshifting is the cause for the synthesis of the GP and GR dipeptide repeats.
8. This reviewer is not convinced that the loading of ribosomal subunits on the nucleotide repeat represents a significant biological finding (and is more likely an artifact). Proof of functionality would require a correlation of extent of nucleotide repeat binding with increased protein synthesis of the dipeptide repeats.
This is an improved manuscript that has addressed most, but not all of previous concerns. The positive feature is the identification of a non-canonical start codon CUG. The unique feature is the analysis that suggests that this codon is the only start codon and that only one of the three dipeptide repeats is in frame with this codon. The required frame shift (proposed) for the other two reading frames has not been seen by others.
Concerns 1. Figure 2 - The ratio of globin synthesis to the peptide repeats would suggest that the dipeptide repeats are going via a minor route for initiation and one that is less efficient. This may also be evident for the proteins synthesized as marked by an asterisk (*). Previously a protein of 41,000 Da was found to be radiolabelled using RRL and this was independent of either added mRNA or added puromycin. This was called the "Kaji system" and represented the addition of methionine to the N-terminus of an existing protein. What is the absolute quantitation of synthesized globin to the dipeptide repeat? And since the synthesis of the GA dipeptide is about 18 times greater than that of the others, what is the chance that these may represent a different initiation mechanism? Additionally, why is lipoxygenase synthesis inhibited by the 30 repeats but not by the 60 repeats in panel b?
We chose to use self-made untreated RRL because RNase treatment (used in commercially available extracts) has been described as detrimental to translation efficiency, especially in terms of cap-dependency (Soto Rifo et al., 2007). In this system, globin is translated from a large pool of endogenous mRNA encoding β-globin. However, we use sub-saturating G4C2 mRNA concentrations for DPR synthesis to avoid titration effects (Fig. 2b; DPR synthesis increases with higher G4C2 RNA concentrations), which is not the case for the β-globin mRNA present in the lysates. In addition, it is well established that globin translation is extremely high in RRL (Nienhuis and Benz N, 1977; Mills et al., 2017). Lastly, globin and lipoxygenase contain two and seventeen methionines, respectively, while G4C2 mRNA does not encode any methionine, so 35S-Met radiolabelled DPR products contain only the methionine incorporated at the CUG start codon. In summary, the RNA concentrations and the number of 35S-methionines incorporated into the proteins differ between DPR, β-globin and lipoxygenase. Therefore, we quantitatively compared DPR to IRES-dependent translation, using the same tags for immunoblot experiments and equimolar RNA concentrations (Fig. 1). Then, following the reviewer's recommendations, we used the Renilla luciferase coding sequence as a reporter to compare translation driven by the capped β-globin 5'UTR with IRES-dependent translation and to determine the IRES/cap-dependent translation ratio in our system (Fig. S2). Translation triggered by the 5'UTR of β-globin is two times higher than IRES-dependent translation. Thus, poly-GA translation is 16 times more efficient than IRES-mediated translation when compared using the HA tag (Fig. 1), and 8 times more efficient than translation triggered by the capped β-globin 5'UTR.
We would also like to stress that G4C2 repeat RNA constructs do not harbor any AUG codon encoding methionine. Thus, incorporation of 35S-methionine is only due to the N-terminal incorporation of a methionine in DPR products. We provide several pieces of evidence demonstrating that addition of methionine at the N-terminus of a pre-existing DPR protein is not occurring in our system and that 35S-Met products migrating at the expected size correspond to the translation of either 30 or 66 G4C2 repeats. Indeed, a single point mutation CUG>CCG, as well as mutations removing the CUG codon, abolish the incorporation of 35S-methionine (Fig. 3c). In addition, 35S-Met-DPRs are successfully immunoprecipitated with an antibody against the HA-tag in frame with poly-GA (Fig. 2c). Furthermore, translation of 35S-Met-DPRs is enhanced when the CUG codon is mutated to a canonical AUG, and decreased when the Kozak sequence is mutated (Fig. 3e). Finally, synthesis of 35S-radiolabelled DPR is inhibited by an existing uORF (Fig. 5c), by ASOs targeting the 5' flanking sequence (Fig. 6f,g) and by translation inhibitors targeting the eIF2-Met-tRNAMeti complex (Supplementary Fig. 4).
We apologize for the confusion, but lipoxygenase is not inhibited in Figure 2b. Indeed, the band at the top of the gel that was marked by an asterisk actually corresponded to the stacking of the gel. We thank the reviewer for pointing out this error and have now replaced the figure to show the entire gel and correctly annotate the band corresponding to lipoxygenase.
2. Figure 3 - why is there a decrease in lane 7 of panel c or is this just variable results day to day? Second, is there anything unusual about construct 8 which has only 9 nucleotides upstream of the initiating CUG codon?
We agree with the reviewer that RAN translation is moderately decreased in lane 7 of Figure 3c, corresponding to construct #7. These results are reproducible (the gel provided in Figure 3 is representative of results obtained from 3 independent experiments) and consistent with the immunoblot results in panel d of the same figure, which show a moderate decrease of translation in all reading frames. We have now clearly stated this result in the text. A slight reduction of RAN translation is also observed with construct #8, harboring only 9 nucleotides upstream of the CUG near-cognate start codon (Fig. 3c,d). As noted by the reviewer, the efficiency of small subunit scanning is expected to be affected by a short 5'UTR sequence (Kozak 1991; Martin et al., 2011; Elfakess et al., 2011). Indeed, the ribosome ideally covers 10 to 15 nucleotides upstream of the start codon. Nevertheless, we have previously shown that translation can still be efficient with a 9-nt 5'UTR when a stable structure is located downstream of the start codon (Martin et al., 2011; Martin et al., 2016). Consistently, construct #8 is still efficiently translated in all three frames despite the short 5'UTR sequence (Fig. 3c,d).
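6. The results in Figure 7 appear inconsistent. In panels d and e, why is there such a spread with the (G4C2)30 RNA? Why are there polysomes with the (G4C2)66 RNA when there is none with beta-globin mRNA? How many (G4C2) repeats does it take to obtain polysomes? Is the binding of the repeats sensitive to salt (it is noted that 50 mM is rather low salt)?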
To analyze ribosome assembly on each mRNA (Fig. 7a), we radiolabelled the mRNAs at their 5' end by capping. The radioactive mRNAs were incubated in cell-free translation extracts or with purified ribosomal subunits and then loaded onto sucrose gradients for sedimentation. After gradient collection, we monitored the position of the mRNA by measuring the radioactivity of all the sucrose gradient fractions. The profiles presented in Figure 7 and Supplementary Figure 6e-i represent the radioactive counts, which correspond to the radioactive mRNA throughout the whole gradients. The major finding of this experiment is the abnormal sedimentation of both (G4C2)30 and (G4C2)66 RNAs into heavy polyribosome fractions even when translation was blocked by different inhibitors. Translation was efficiently inhibited by addition of CHX 5 minutes prior to the addition of radiolabelled RNAs to RRL, as demonstrated by the migration of control β-globin RNA in light fractions with a high peak corresponding to the monosome fraction (Fig. 7d). Similar observations were obtained with other inhibitors such as edeine, which blocks the 43S complex (Fig. 7c), or GMP-PNP, which blocks the assembly of the 60S subunit (Supplementary Fig. 6g).
Additional evidence supporting the sequestration of ribosomal subunits onto G4C2 RNA independently of translation is the observation that G4C2 RNAs also migrate in heavy fractions when incubated with purified 40S or 60S ribosomal subunits (Fig. 7e).
We provide several controls in this experiment, including the demonstration that antisense (C4G2)66 RNAs do not sequester ribosomal subunits (Fig. 7b,c). In addition, G4C2 repeat RNAs were recently shown to undergo liquid droplet formation. We verified that G4C2 RNA itself, without RRL, is not found in the heavy fractions, demonstrating that G4C2 RNAs migrate in heavy fractions only in the presence of ribosomal components. Overall, we provide evidence that sequestration of ribosomal components is independent from RAN translation.
The reviewer is right that salt concentration may influence RNA structure and the association of RNAs with ribosomal subunits. Following his/her suggestion, we have performed a new experiment showing that migration of G4C2 transcripts in the heavy fraction is further increased when they are folded in the presence of K+ ions, which stabilize G-quadruplex structures, compared with Na+ and Li+ ions (new Supplementary Fig. 6i; 195 mM).
7. There is no description as to how or why frame shifting might occur. Second, a brief examination of the sequence in the coding region between the CUG start and the repeat does not provide evidence for a slippery sequence (i.e. a homopolymeric stretch, especially A's or U's). Third, frameshifting is usually a very rare event occurring at most at the 1% level unless a slippery sequence is present (i.e. RF2 can frameshift with a frequency approaching 50%). Thus, it is unclear to this reviewer that frameshifting is the cause for the synthesis of the GP and GR dipeptide repeats.
There is compelling evidence from the literature that G-quadruplex structures induce translational frameshifting (Endoh and Sugimoto, 2013; Yu et al., 2014; reviewed in Kapur et al., 2017). This phenomenon can occur with only 1 G-quadruplex structure and is increased by stabilizing the G-quadruplex with specific molecules. We agree with the reviewer that there is no canonical slippery sequence. However, the G-quadruplex structures are repeated 16 times in the 66 G4C2 constructs, and we provide several pieces of evidence supporting frameshifting events for RAN translation of G4C2 (please see comment #4).
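The figure of 16 G-quadruplexes follows from simple counting, assuming that each G4C2 repeat contributes one GGGG tract and that a canonical intramolecular G-quadruplex is built from four consecutive G-tracts. A minimal Python sketch of that count is given below; the function name and the four-tract default are illustrative assumptions, not taken from the manuscript, and actual folding in the transcript may differ.

```python
# Upper bound on tandem intramolecular G-quadruplexes in a (G4C2)n repeat,
# assuming one G-tract per repeat and four G-tracts per quadruplex.

def max_tandem_quadruplexes(n_repeats: int, tracts_per_quadruplex: int = 4) -> int:
    """Number of complete, non-overlapping quadruplexes that n repeats allow."""
    if n_repeats < 0:
        raise ValueError("n_repeats must be non-negative")
    if tracts_per_quadruplex <= 0:
        raise ValueError("tracts_per_quadruplex must be positive")
    return n_repeats // tracts_per_quadruplex


if __name__ == "__main__":
    for n in (30, 66):  # repeat lengths used in the constructs discussed here
        print(f"(G4C2){n}: up to {max_tandem_quadruplexes(n)} G-quadruplexes")
```

For 66 repeats this gives 16 complete quadruplexes with two repeats left over; for 30 repeats it gives 7.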