Ribosomal 18S rRNA base pairs with mRNA during eukaryotic translation initiation

Eukaryotic mRNAs often contain a Kozak sequence that helps tether the ribosome to the AUG start codon. The mRNA of histone H4 (h4) does not undergo classical ribosome scanning but has evolved a specific tethering mechanism. The cryo-EM structure of the rabbit ribosome complex with mouse h4 shows that the mRNA forms a folded, repressive structure at the mRNA entry site on the 40S subunit next to the tip of helix 16 of 18S ribosomal RNA (rRNA). Toe-printing and mutational assays reveal that an interaction exists between a purine-rich sequence in h4 mRNA and a complementary UUUC sequence of helix h16. Together the present data establish that the h4 mRNA harbours a sequence complementary to an 18S rRNA sequence which tethers the mRNA to the ribosome to promote proper start codon positioning, complementing the interactions of the 40S subunit with the Kozak sequence that flanks the AUG start codon.

In eukaryotes, the start codon is identified through base-triplet scanning by the initiator-tRNA-bound 40S ribosomal subunit (43S complex), starting from the usually m7G-capped 5′ end until the correct AUG start codon is found and the 48S initiation complex is formed. At least 13 initiation factors are involved in translation initiation, which results in the formation of the 80S initiation complex on joining of the 60S ribosomal subunit [1][2][3][4][5]. To ensure the fidelity of translation initiation, the start codon is usually located in the context of a Kozak sequence (A/G)CCAUGG (ref. 6), which contains a purine in position −3 and a G in position +4. Variations of the Kozak sequence can lead to initiation at downstream AUG triplets by leaky scanning 7. However, deviations from this classical model exist; for example, viral mRNAs that contain 5′ untranslated region (UTR) internal ribosomal entry sites (IRES) often require only a subset of the initiation factors to hijack the ribosome, as visualized by several cryo-EM structures [8][9][10][11][12][13]. Histone H4 mRNA (h4) combines canonical features (cap-dependent translation) with a viral strategy (lack of scanning). It contains a three-way junction (TWJ) with the unusual property of stalling engaged 80S ribosomes when cap-dependent translation is repressed 14. The TWJ is located 19 nucleotides downstream from the AUG codon, and is flanked by a weak Kozak sequence (with a U in position +4) and a double stem-loop structure called the eIF4E-sensitive element (4E-SE) (Supplementary Fig. 1) 14. These specific RNA structures tether the translation machinery directly on the first AUG initiation codon of h4 mRNA, regardless of the presence of a second in-frame initiation codon. The lack of scanning appears to favour high expression levels of histone H4 protein during S-phase of the cell cycle, which is relevant for chromatin organization, but the regulatory mechanism is unknown.
Here we localize the folded h4 mRNA TWJ domain on the rabbit ribosome using cryo-EM and show by toe-printing and mutational analysis that h4 mRNA carries, shortly after the start codon, a sequence complementary to an 18S rRNA sequence that helps mRNA binding and proper AUG positioning.
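The Kozak context described above (a purine at position −3 and a G at position +4, counting the A of the AUG as +1) can be checked with a few lines of Python. This is only a minimal sketch: the function name, the returned keys and both example sequences are hypothetical, and the sequences are toy strings rather than the actual h4 sequence.

```python
PURINES = {"A", "G"}


def kozak_context(mrna, aug_index):
    """Inspect positions -3 and +4 around an AUG (the A of the AUG is +1).

    `aug_index` is the 0-based index of the A of the AUG in `mrna`.
    """
    seq = mrna.upper().replace("T", "U")
    if seq[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    if aug_index < 3 or aug_index + 3 >= len(seq):
        raise ValueError("sequence too short to read positions -3 and +4")
    minus3 = seq[aug_index - 3]  # there is no position 0, so -3 is 3 nt upstream of +1
    plus4 = seq[aug_index + 3]   # first nucleotide after the AUG
    return {
        "minus3": minus3,
        "plus4": plus4,
        "purine_at_minus3": minus3 in PURINES,
        "g_at_plus4": plus4 == "G",
    }


# Toy sequences (assumptions for illustration only).
consensus_like = kozak_context("GCCACCAUGGCG", 6)
u_at_plus4 = kozak_context("GCCACCAUGUCG", 6)
print(consensus_like)
print(u_at_plus4)
```

The second toy sequence carries a U at +4, the feature that the text gives as the reason the h4 Kozak context is weak.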

Results
Structure of the 80S ribosome assembled on histone h4 mRNA. Mouse h4 mRNA/rabbit 80S complexes were assembled in rabbit reticulocyte lysate and stalled in the initiation state by cycloheximide and hygromycin B, which block elongation at the translocation step. Complexes were pulled from the extracts by affinity purification 15 and analysed by cryo-EM. Rabbit reticulocyte lysates mimic the full complexity of the in vivo environment, and provide all required tRNAs as well as the translation factors needed for efficient assembly. However, this approach also limited the resolution of the structure to some extent because of greater sample heterogeneity, which could only partly be addressed by particle sorting. The cryo-EM structure of the predominant subpopulation nevertheless reached ~10 Å resolution, which allowed the h4 TWJ to be localized on the 80S ribosome. Further high-resolution refinement provided better features on the ribosome but not on the h4 region, probably because of multiple conformations (see Methods). The structure shows that h4 forms a folded, repressive structure bound to the 40S subunit at the mRNA entry site (Fig. 1a). The cryo-EM map was interpreted by fitting the atomic model of the human ribosome derived from high-resolution cryo-EM 16. The structure contains an initiator tRNA accommodated in the peptidyl (P) site and a ternary complex of eEF1A-tRNA localized in the factor-binding site (Fig. 1a), reminiscent of a late 80S translation initiation complex in which codon recognition has occurred. A separate sub-class also shows the post-translocation complex with eEF2 (see Methods and Supplementary Fig. 2). The 5′ extremity of h4 is positioned close to ribosomal proteins eS26 and eS28, as confirmed by chemical crosslinking experiments performed with h4 harbouring a periodate-oxidized cap (Supplementary Fig. 3).
The role of these proteins is supported by a recent study that showed how the IRES of hepatitis C virus (HCV) mimics a bacterial Shine-Dalgarno (SD)-anti-SD structure and interacts with eS26 and eS28 to facilitate mRNA loading and tRNA binding into the P-site 17. The mRNA extends towards the mRNA entry site at position 26. A large additional density (by comparison with the empty ribosome, Supplementary Fig. 4) is located in this region, reminiscent of the folded TWJ RNA element. It is embedded between helix h16 (18S rRNA) and ribosomal proteins uS2, uS3 and eS10 of the 40S beak (Fig. 1b). The structure of the 80S ribosome complex with a deletion mutant of h4 comprising nucleotides 1-142 (h4 1-142, rather than 377 nt; Supplementary Fig. 4) confirms that the density corresponds to the 5′ region of h4. Its size accounts for the ribosome-interacting TWJ part, while the 3′ region of the mRNA is disordered.
Interaction between ribosomal helix 16 and h4 mRNA. The binding of the h4 mRNA at the tip of helix h16 (18S rRNA) suggests that a direct interaction between h4 mRNA and the rRNA exists. This 18S rRNA region comprises an apical (540 UUUC 543) tetraloop in which the four nucleotides are often found to be flipped out in various ribosomal structures. To identify the nucleotides that might interact with the 18S rRNA, we probed the mRNA structure by nucleotide substitution and monitored binding to the ribosome by reverse transcriptase assays ('toe-printing'). A toe-print was detected at position +17 (numbering starting on the A of the AUG codon, or h4 nt 27), 3 nt upstream of the TWJ domain (Fig. 2a). Interestingly, nts 26 to 30 (26 AAGGG 30) could base pair with nts of the h16 tetraloop (540 UUUC 543) to form a putative interaction site at the entrance of the mRNA channel. Such an interaction would be consistent with the distance between the mRNA on the AUG in the P site and the TWJ on helix h16, as shown by mRNA modelling (Fig. 1c). This prompted us to mutate these nts of h4 and check whether ribosome positioning was modified. Single mutants of nts 26 to 30 were constructed and tested. They all exhibited a toe-print at position +17, but in addition also one at position +26 with an intensity inversely proportional to the one at position +17, suggesting that these nucleotides critically influence mRNA positioning and ribosome assembly (Supplementary Fig. 5). In fact, toe-prints at position +26 indicate slippage of the ribosomes towards an out-of-frame AUG-like codon (G21 U22 G23). We further combined mutations in double mutants and confirmed ribosome slippage on the AUG codon, especially with mutants (28-29) and (29-30) (Supplementary Fig. 5). A triple mutant (27-28-29) exhibited a more drastic effect, with toe-prints being spread over positions +17, +18 and +19 (Fig. 2a).
These new shifts indicate that the mutated mRNA strand is less constrained in the mRNA channel and up to 2 extra nts can enter the mRNA cleft to give rise to the +18 and +19 stops (Fig. 2a,b). These results show that interactions between the initiator region of the mRNA and the 18S rRNA are required to avoid ribosome slippage over the AUG start codon. Along the same lines, deletion of one nucleotide downstream of the AUG induces a shift of the toe-print to position +16 (Supplementary Fig. 5). This shows that the interaction with h16 is strong enough to stretch the mRNA by one nt in the mRNA channel. However, when 2 nts are deleted, the toe-print moves back to +17, suggesting loss of the h16-h4 interaction. Consistently, a 1-nt deletion in the triple mutant (27-28-29) did not induce any shift of the toe-print position that would indicate mRNA stretching. This further confirms that the interaction of h16 with residues (27-28-29) contributes to mRNA binding (Supplementary Fig. 5). Together, these experiments establish that interactions between h16 and h4 mRNA exist and are critical for optimally positioning the mRNA on the ribosome (Fig. 1d). To evaluate the significance of this interaction for poly-ribosome formation, polysome profiles of translation extracts programmed with wild-type h4 and the triple mutant (27-28-29) were examined. Compared with the triple mutant, the wild-type h4 was more efficient in ribosome assembly and translation. Indeed, 42.5% of the mRNA of the triple mutant was not assembled with ribosomes versus 31.5% for the wild-type h4. In addition, wild-type h4 exhibited more polysomes (+11%), suggesting that it is more efficiently translated (Fig. 2c).
Binding of yeast 40S by a compensatory mutant of h4. To address the role of the h16 tetraloop residues, we performed additional binding assays of h4 mRNA with 40S ribosomes. To demonstrate the base pairing between the mRNA and rRNA, we set out to identify compensatory mutations that restore the binding of h4 mRNA to a mutated h16 tetraloop. As the production of mutated rabbit ribosomes is very challenging, we focused the experiment on purified yeast 40S subunits, which naturally exhibit a variation of the tetraloop of h16 and do not bind h4 mRNA (Fig. 3). We then set about finding new h4 mRNA mutants that would generate yeast 40S binding. Significant binding of the 40S particles was obtained with the (G29U) mutant, which allows formation of an additional U29:A540 pair instead of the G29:A540 pair (Fig. 3). To check whether the newly formed U29:A540 pair is the essential element of the ribosome:mRNA interaction, we tested the binding of two additional mRNA mutants. The first was a triple mutant (U27 U28 U29) that exhibited the restoring U29:A540 pair. The second was a triple mutant (U26 U27 U28), displaced by one nt, that kept the non-functional G29:A540 pair (Fig. 3). Neither mutant bound yeast ribosomes. This result shows that the U29:A540 pair cannot lead to ribosome binding in the absence of the pairings on the 5′ side. We cannot exclude the possibility that the conformation of the yeast tetraloop is quite different from that of the rabbit tetraloop. Indeed, tetraloops starting with A are typically not well structured, in contrast to those starting with U. This is the case of the (AUUC) tetraloop of yeast h16 (ref. 18), in contrast to the (UUUC) tetraloop of rabbit h16 (ref. 19). Therefore, formation of an extra U29:A pair could rearrange the yeast tetraloop structure and favour binding of 40S subunits.
Altogether, these results validate the importance of the h16 interaction site also for the yeast ribosome, and suggest that this binding mode may be widely used in the eukaryotic kingdom.
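The pairing logic behind the compensatory-mutant experiment can be illustrated with a small Python sketch that counts Watson-Crick and G:U wobble pairs between h4 nts 26-30 and the h16 tetraloop. The antiparallel register used here (mRNA position + rRNA position = 569) is an assumption derived from the pairs named in the text (A27:U542, G28:U541 and nt 29 opposite position 540); the function names are hypothetical, and simple pair counting ignores loop geometry and energetics.

```python
WATSON_CRICK = {frozenset("AU"), frozenset("GC")}
WOBBLE = {frozenset("GU")}

# Assumed register: 27 pairs with 542, 28 with 541, 29 with 540.
REGISTER_SUM = 569


def pair_type(a, b):
    """Return 'WC', 'wobble' or None for two RNA bases."""
    pair = frozenset((a, b))
    if pair in WATSON_CRICK:
        return "WC"
    if pair in WOBBLE:
        return "wobble"
    return None


def find_pairs(mrna, loop):
    """List (mRNA position, rRNA position, pair type) in the assumed register."""
    found = []
    for m_pos in sorted(mrna):
        r_pos = REGISTER_SUM - m_pos
        if r_pos in loop:
            kind = pair_type(mrna[m_pos], loop[r_pos])
            if kind is not None:
                found.append((m_pos, r_pos, kind))
    return found


def mutate(mrna, changes):
    """Return a copy of the mRNA segment with the given substitutions."""
    mutated = dict(mrna)
    mutated.update(changes)
    return mutated


H4_WT = {26: "A", 27: "A", 28: "G", 29: "G", 30: "G"}    # 26 AAGGG 30
RABBIT_H16 = {540: "U", 541: "U", 542: "U", 543: "C"}    # UUUC tetraloop
YEAST_H16 = {540: "A", 541: "U", 542: "U", 543: "C"}     # AUUC, rabbit numbering

for label, mrna, loop in [
    ("wild type / rabbit", H4_WT, RABBIT_H16),
    ("wild type / yeast", H4_WT, YEAST_H16),
    ("G29U / yeast", mutate(H4_WT, {29: "U"}), YEAST_H16),
    ("U27U28U29 / yeast", mutate(H4_WT, {27: "U", 28: "U", 29: "U"}), YEAST_H16),
    ("U26U27U28 / yeast", mutate(H4_WT, {26: "U", 27: "U", 28: "U"}), YEAST_H16),
]:
    print(label, find_pairs(mrna, loop))
```

Under these assumptions the G29U substitution brings the yeast tetraloop back to three possible pairs, whereas both triple mutants lose the pairs on the 5′ side, in line with the binding results described above.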

Discussion
Taken together, the present data uncover the concept of base pairing between an 18S rRNA sequence and eukaryotic mRNAs to facilitate ribosome positioning on the start codon, complementing the stabilizing role of the Kozak consensus sequence that flanks the AUG start codon. Structural and functional data reveal that the key regulatory site for this is the tip of eukaryotic helix h16, which can base pair with the h4 sequence preceding the TWJ. This additional interaction may compensate for weak or deficient Kozak consensus sequences at +4. This tethering mechanism provides specificity for the formation of translation initiation complexes on the first start codon of h4 and explains why slippage on a second start codon does not occur. By directly forming base-pair interactions with the tip of the ribosomal h16, it increases the general affinity for the small subunit and correctly localizes the ribosome on the h4 start codon, thus preventing scanning. According to the wobble rules, residue U has the ability to base pair with A and G residues. Therefore, the complexity of base pairing with the UUUC sequence is increased, suggesting that many other mRNAs may be assisted by the interaction with h16 in a similar way. In addition, the presence of the TWJ-folded domain locks the ribosome in a pre-translocation conformation to stabilize the base pairing interactions. In that position, the TWJ of h4 also competes with DHX29 (ref. 20), a critical helicase for the scanning mechanism 21. This observation is consistent with the absence of scanning of the short h4 5′ UTR (ref. 14). The N-terminal domain of Hbs1 protein (part of the no-go decay complex 22) also binds at this particular place 23. The discovery of mRNA interactions with specific bases of the 18S rRNA appears to be a mechanism reminiscent of that observed in bacteria at the level of the SD interactions at the 3′ end of the 16S rRNA that help recruit mRNAs to the 30S initiation complex.
However, the interaction site observed in the eukaryotic complex is completely different because it corresponds to a eukaryote-specific sequence insertion in the 18S rRNA (tip of h16), which is oriented differently and extends by ~50 Å as compared with bacterial ribosomes (Fig. 1d) to create a landing platform for pre-binding the mRNA at the entry site of the mRNA channel (Fig. 1b). This allows formation of stabilizing interactions of the mRNA with the ribosome that promote the formation of the 48S initiation complex, illustrating how temporarily repressive folded elements of cellular mRNAs can guide the ribosome to favour their own translation. The study thus brings in a new concept regarding the mode of interaction of mRNAs with specific structural elements of a eukaryote-specific site on the 40S subunit, the general significance being comparable to that of Kozak and Shine-Dalgarno sequences. An interesting question to address in future studies is whether this specific 18S rRNA interaction exists with other eukaryotic mRNAs. mRNA:rRNA interactions are better documented in viruses. For instance, sequences in the adenovirus mRNA complementary to 18S rRNA facilitate shunting by base pairing to the 40S ribosomal subunit 24. Base pairing between hepatitis C virus and 18S rRNA is also required for IRES-dependent translation initiation 25. Several studies reported similar interactions with cellular mRNAs. These include reports of mRNA interactions between a plant ribosomal protein mRNA (RPS18) and the 18S rRNA 26
Binding was studied on sucrose gradients with radiolabelled m7G-capped h4 mRNA. Samples were separated on 7-47% sucrose gradients, and complexes with 40S particles were counted in Cerenkov mode. Binding values were normalized to wild-type h4 binding with rabbit 40S particles. Values represent the average of three technical replicates. Error bars representing the variability of the data are shown.
(b) Secondary structure of the first 142 nucleotides of murine histone h4 mRNA. The structure contains three helices connected by a TWJ followed by a stem-loop structure. The initiation codon is boxed. The black star indicates the location of the +17 ribosome toe-print. Partial helix 16 (h16) sequences from yeast and mammals (rabbit, human and mouse) are drawn in blue; nt numbering corresponds to the rabbit sequence (rabbit 540 = yeast 452). Mutated h4 mRNAs tested with 40S subunits from yeast are shown in the grey insets.
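The wobble argument made in the Discussion (a U can pair with either A or G, so a U-rich loop accepts more than one partner sequence) can be made concrete by enumerating the mRNA segments able to pair at every position with a given rRNA stretch. This is a minimal sketch assuming only Watson-Crick and G:U wobble pairs in an antiparallel register; the function name is hypothetical, and structural constraints of the tetraloop are ignored.

```python
from itertools import product

# Partners allowed for each rRNA base under Watson-Crick plus G:U wobble rules.
PARTNERS = {
    "A": ("U",),
    "C": ("G",),
    "G": ("C", "U"),
    "U": ("A", "G"),
}


def complementary_segments(rrna_5_to_3):
    """Return all mRNA segments (5'->3') that pair with every rRNA base.

    Pairing is antiparallel, so the first mRNA base faces the last rRNA base.
    """
    options = [PARTNERS[base] for base in reversed(rrna_5_to_3)]
    return ["".join(combo) for combo in product(*options)]


full_loop = complementary_segments("UUUC")
uuu_part = complementary_segments("UUU")
print(len(full_loop), full_loop)
print(len(uuu_part), uuu_part)
```

With these rules the three U residues each accept two partners, so the UUU stretch alone is compatible with eight purine triplets, including the AGG found at h4 nts 27-29.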
30°C in the presence of a mix of 1 mg ml−1 cycloheximide and 0.5 mg ml−1 hygromycin B blocking the translocation of the peptidyl-tRNA from the A to the P site of the ribosome 30. Finally, formation of initiation complexes was obtained by adding histone h4 mRNA at a final concentration of 500 nM and incubating for 5 min at 30°C. Then, ribosome complexes (15 μl) were mixed with an equal volume of ice-cold buffer A containing 20 mM Tris-HCl (pH 7.5), 100 mM KAc, 2.5 mM Mg[Ac]2, 2 mM DTT, 1 mM ATP and 0.25 mM spermidine. Toe-print experiments were adapted from refs 14,31. An ultracentrifugation step of the reaction mixture was performed at 337,000g in a S100AT3 rotor (Sorvall-Hitachi) at 4°C for 1 h to separate ribosomal complexes from the non-ribosomal fraction. Then, ribosomal pellets were dissolved in 30 μl buffer A complemented with the same translation inhibitors and analysed by primer extension using AMV reverse transcriptase and a primer complementary to nts 91-110 of h4 (ref. 14).
Sample preparation for cryo-EM. 80S/h4 and 80S/h4 1-142 ribosome complexes were prepared as described previously 15. Briefly, mouse h4 mRNA was ligated to a biotinylated DNA oligonucleotide and bound to streptavidin-coated beads. Then, rabbit 80S ribosomes were assembled on the beads coated with the bait, stalled at the post-initiation step, washed, and released from the beads by enzymatic DNase I cleavage of the DNA moiety 15. First, the chimeric mRNA-DNA bait harbouring a biotin molecule at its 3′ end was constructed in one step ligation catalysed by T4 DNA ligase 15
Empty 80S ribosomes were purified from nuclease-untreated RRL by centrifugation at 37,000 r.p.m. in a SW41Ti rotor for 2.5 h at 4°C through a 7-47% linear sucrose gradient in buffer containing 25 mM Tris-HCl (pH 7.5), 50 mM KCl, 5 mM MgCl2 and 1 mM DTT. After gradient fractionation, fractions containing 80S ribosomes were centrifuged at 108,000 r.p.m. (S140AT Sorvall-Hitachi rotor) for 1 h at 4°C, then the ribosomal pellet was dissolved in 80S/h4 complex resuspension buffer (20 mM HEPES-KOH (pH 7.6), 0.2 mM EDTA, 10 mM KCl, 1 mM MgCl2, 1 mM DTT).
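Because the Methods quote centrifugation speeds both as relative centrifugal force (g) and as rotor speed (r.p.m.), a conversion helper may be useful when adapting the protocol to another rotor. The sketch below uses the standard relation RCF = ω²r/g₀. The 153.1 mm maximum radius used in the example for the SW 41 Ti rotor is an assumption that should be checked against the rotor manual, and the function names are hypothetical.

```python
import math

STANDARD_GRAVITY = 9.80665  # m s^-2


def rcf_from_rpm(rpm, radius_mm):
    """Relative centrifugal force (multiples of g) at a given radius."""
    omega = 2.0 * math.pi * rpm / 60.0  # angular velocity in rad s^-1
    return omega ** 2 * (radius_mm / 1000.0) / STANDARD_GRAVITY


def rpm_from_rcf(rcf, radius_mm):
    """Rotor speed (r.p.m.) needed to reach a given RCF at a given radius."""
    omega = math.sqrt(rcf * STANDARD_GRAVITY / (radius_mm / 1000.0))
    return omega * 60.0 / (2.0 * math.pi)


# Assumed maximum radius of the SW 41 Ti rotor (verify before use).
SW41_RMAX_MM = 153.1

rcf_sucrose_gradient = rcf_from_rpm(37_000, SW41_RMAX_MM)
print(f"37,000 r.p.m. at {SW41_RMAX_MM} mm: {rcf_sucrose_gradient:,.0f} g")
```

The radius matters as much as the speed: the same r.p.m. gives a lower force at the top of the tube than at the bottom, so values quoted in g are easier to transfer between rotors.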
Data collection. A volume of 2.5 μl of freshly prepared 80S ribosome complexes, at 0.2-0.5 mg ml−1, was applied to 300 mesh holey carbon Quantifoil 2/2 grids (Quantifoil Micro Tools, Jena, Germany), blotted with filter paper from both sides for half a second in the temperature- and humidity-controlled Vitrobot apparatus (FEI, Eindhoven, Netherlands, T = 10°C, humidity 95%, blot force 8, blot time 0.5 s) and vitrified in liquid ethane pre-cooled by liquid nitrogen. Data were collected on the in-house spherical aberration (Cs) corrected Titan Krios S-FEG instrument (FEI, Eindhoven, Netherlands) operating at 300 kV acceleration voltage and at a nominal underfocus of Δz = −0.6 to −4.5 μm using a second-generation back-thinned direct electron detector CMOS (Falcon II) 4,096 × 4,096 camera and automated data collection with EPU software (FEI, Eindhoven, Netherlands). The camera was set up to collect seven frames, plus one total exposure image; total exposure time was 1 s with a dose of 60 ē Å−2 (3.5 ē Å−2 per frame) using a nominal magnification of ×59,000, resulting in 1.1 Å pixel size at the specimen level (images were coarsened by 2 for further processing). Data for the empty 80S, 80S/h4 1-142 and preliminary 80S/h4 ribosome complexes were collected on the in-house Polara Tecnai F30 electron microscope using a first-generation direct electron detector CMOS (Falcon I) 4,096 × 4,096 camera at a magnification of ×59,000 with a pixel size of 1.36 Å.
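The sampling figures quoted for data collection can be related to one another with simple arithmetic. The sketch below takes the stated pixel sizes and the 4,096-pixel detector width, and assumes that "coarsened by 2" means 2 × 2 binning; the function names are hypothetical.

```python
import math


def binned_pixel_size(pixel_size_a, binning):
    """Pixel size in angstroms after binning by an integer factor."""
    return pixel_size_a * binning


def nyquist_limit(pixel_size_a):
    """Finest resolvable spacing in angstroms: twice the pixel size."""
    return 2.0 * pixel_size_a


def field_of_view(pixel_size_a, pixels):
    """Image edge length in angstroms at the specimen level."""
    return pixel_size_a * pixels


krios_binned = binned_pixel_size(1.1, 2)        # assumed 2x binning
krios_nyquist = nyquist_limit(krios_binned)
krios_fov = field_of_view(1.1, 4096)
polara_nyquist = nyquist_limit(1.36)

print(f"Krios binned pixel: {krios_binned:.2f} A, Nyquist: {krios_nyquist:.2f} A")
print(f"Krios field of view: {krios_fov / 10:.1f} nm")
print(f"Polara Nyquist: {polara_nyquist:.2f} A")
```

Under these assumptions the binned data still sample well beyond the roughly 10 Å resolution reported for the map, so binning does not limit the reconstruction.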
Image processing. Stack alignment of the Titan Krios data, which included seven frames and a total exposure image (eight images in total per stack), was performed before particle picking using the whole-image motion correction method 32. Thereafter, an average image of the whole stack was used to pick 146,821 particles semi-automatically using EMAN2 Boxer 33 and RELION 34, and the contrast transfer function of every image was determined using CTFFIND3 (ref. 35) in the RELION workflow. Particle sorting was done by two-dimensional classification, resulting in 48,952 particles. Further three-dimensional classification resulted in five classes with 2,822, 7,893, 6,786, 3,412 and 5,363 particles (total 26,276 particles). Classes 1, 3, 4 and 5 looked similar, with h4 present in the folded state, A- and P-site tRNAs and eEF1A; these classes were merged for structure refinement (18,383 particles). Class 2 contained elongation factor eEF2, P/E-site tRNA and no density for h4, which corresponds to the elongated complex in which the h4 mRNA is unfolded and tRNA is already translocated (Supplementary Fig. 2). This complex is typical of cycloheximide inhibition, which happens after a first translocation step by blocking tRNA(Met) in the E-site 36. Hygromycin B, which typically prevents the translocation induced by eEF2 (refs 37-39), was probably not bound in this complex. Sorting was also applied to the 80S/h4 1-142 data, revealing the same mass of density for the 5′ core domain of h4, but with the 40S in different conformations (Supplementary Fig. 6). The resolution was estimated in RELION at the 0.143 FSC criterion 34, indicating an average resolution of 10.2 Å (Supplementary Fig. 7). Map interpretation was done using Chimera 40 and Coot 41 starting from our human ribosome atomic model 16
The mRNA from a partial 48S preinitiation complex in S. cerevisiae (PDB ID 3J81) was edited according to sequence differences and fitted in density using Coot and the UCSF Chimera package.
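The particle bookkeeping reported above can be verified with a short Python check. The counts are taken from the text; the variable names and the derived fractions are only illustrative.

```python
PICKED = 146_821      # particles picked semi-automatically
AFTER_2D = 48_952     # particles kept after two-dimensional classification

# Particle counts of the five three-dimensional classes, as listed in the text.
CLASSES_3D = {1: 2_822, 2: 7_893, 3: 6_786, 4: 3_412, 5: 5_363}

# Classes with folded h4, A- and P-site tRNAs and eEF1A, merged for refinement.
MERGED_CLASSES = (1, 3, 4, 5)

total_3d = sum(CLASSES_3D.values())
merged = sum(CLASSES_3D[c] for c in MERGED_CLASSES)

print(f"total in 3D classes: {total_3d}")
print(f"merged for refinement: {merged} ({merged / total_3d:.1%} of 3D total)")
print(f"eEF2-containing class 2: {CLASSES_3D[2]} ({CLASSES_3D[2] / total_3d:.1%})")
print(f"kept after 2D sorting: {AFTER_2D / PICKED:.1%} of picked particles")
```

The five class sizes sum to the stated 26,276 particles and the four merged classes to the stated 18,383, so the numbers in the text are internally consistent.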
The structure data file (.sdf) for 7-methyl-guanosine-5′-triphosphate was retrieved from PDB entry 3AM7 (ref. 43). The .pdb file generated from the .sdf file by eLBOW 44 in Phenix 42 was fitted in density using Chimera. Based on a comparative analysis of various tetraloops, we selected the tetraloop from h16 in a S. cerevisiae translation initiation complex (PDB ID 3JAM) to model base pairs involving U541-G28 and U542-A27. The geometry of the mRNA and h16 was regularized in Coot.
Data availability. The experimental map is available from the Electron Microscopy Data Bank (EMDB) under accession code EMD-4049. All other relevant data are available from the authors upon request.