Main

Nuclear pre-messenger RNA (pre-mRNA) splicing is catalyzed by the spliceosome, a large ribonucleoprotein particle consisting of five small nuclear RNAs (U1, U2, U4, U5 and U6) and >70 proteins1,2. The spliceosome catalyzes the same two-step transesterification reaction that occurs autocatalytically in mitochondrial group II self-splicing pre-mRNAs1,2, suggesting that the spliceosome is also a ribozyme. U6 RNA is a strong candidate for participating in spliceosomal catalysis: it is the most highly conserved of the spliceosomal RNAs1, it base pairs with the 5′ splice site at the time of the first transesterification reaction1 and certain mutations in U6 arrest splicing after spliceosome assembly but before the first or second catalytic steps3. In addition, several phosphate oxygens within U6 RNA have been identified as essential for splicing4,5. This finding is significant, because the spliceosome requires magnesium as a catalytic cofactor1 and metal ions are often coordinated by phosphate oxygens in ribozymes.

In the catalytically active spliceosome, U6 is base paired with U2 RNA, which also contains highly conserved sequences and base pairs to the intron1. The proposed secondary structure of the U2–U6–pre-mRNA complex includes a U6 RNA intramolecular stem-loop (ISL), which was originally identified in free U6 RNA6 (Fig. 1a). The ISL contains a phosphate group 5′ to U80 in yeast U6 that is essential for both steps of splicing4,5,7. Substitution of the Saccharomyces cerevisiae U80 Sp phosphate oxygen by sulfur inactivates splicing but does not interfere with spliceosome assembly7. The inactivating U80 Sp phosphorothioate substitution can be rescued for the first catalytic step of splicing by the addition of the thiophilic metal ion cadmium7. These data provide strong evidence that the U80 Sp phosphate oxygen of U6 RNA participates in coordination of a catalytically essential metal ion during the first step of splicing. Recently, a splicing-related reaction similar to the first transesterification has been shown to be catalyzed by fragments of U2 and U6 RNAs in the absence of proteins8. The catalytically active U2–U6 subcomplex includes the U6 ISL sequence. Here we report the solution structure of the U6 ISL and demonstrate that it is sufficient for stereo-selective binding of metal ion to the essential U80 phosphate. Intriguingly, stereo-selective binding is lost upon protonation of an adjacent C·A pair, which has a pKa close to physiological pH. These results suggest a potential mechanism for the regulation of splicing catalysis.

Figure 1: Schematic of the structural transitions of yeast U6 RNA during the pre-mRNA splicing cycle and NMR data of the U6 ISL RNA.

a, The three complexes containing U6 RNA include the U6 small nuclear ribonucleoprotein particle (U6 snRNP), which has the U6 intramolecular stem-loop (ISL) RNA secondary structure; the assembled but inactive spliceosome in which the U6 ISL cannot form because of extensive base pairing interactions with U4 RNA; and the catalytically active U2–U6 complex from the spliceosome, which allows the ISL to form again. An asterisk denotes the essential phosphate between A79 and U80. b, Sequence and secondary structure of the S. cerevisiae U6-A62G ISL RNA used in this study. The different regions are colored as follows: the terminal helix is blue; the internal loop, yellow; the pentaloop-proximal helix, purple; and the pentaloop, green. The A62G substitution is boxed. An asterisk denotes the essential phosphate between A79 and U80. c, 750 MHz 1H NMR spectrum of the imino peaks. Peak assignments are indicated. The spectrum was acquired at pH 6.0, 10 °C, 1 mM RNA and 50 mM NaCl. d, 600 MHz 1H-1H 2D NOE spectrum of the U6 ISL. The sequential H1′–H6/8 assignments are traced with lines. The spectrum was acquired at pH 7.0, 30 °C, 1 mM RNA and 50 mM NaCl. The NOESY mixing time was 250 ms. The H1′ resonances of residues 73 and 75 at 5.96 p.p.m., as well as the H8 resonances of 75 and 76 at 8.05 p.p.m., overlap in the 2D spectrum.

Spectral features and metal ion binding

The RNA that we chose to study corresponds to nucleotides 62–85 of S. cerevisiae U6 RNA (Fig. 1a,b), which form the highly conserved ISL secondary structure in the 3′ terminal half of the molecule6. Two RNAs were initially studied by NMR: one corresponding to the wild type sequence from S. cerevisiae and the other with an A62G mutation. Yeast with the A62G mutation have wild type growth rates at 30 °C (ref. 6). 2D NOE spectra indicate that the A62G ISL has the same overall structure as the wild type sequence, with identical chemical shifts for all resonances except those from the first two base pairs (data not shown), but that it produces higher quality NMR spectra. We therefore pursued the structure determination of the A62G variant containing 13C,15N-labeled nucleotides incorporated via transcription by T7 RNA polymerase.

The NMR spectra reveal interesting structural features of the U6 ISL. The 1D spectrum of the imino resonances reveals slowly exchanging imino peaks from eight Watson-Crick base pairs plus an additional imino belonging to G71 at 10.7 p.p.m., which does not have a chemical shift indicative of Watson-Crick base pairing (Fig. 1c). All of the observable imino protons could be assigned by 2D 1H-15N HMQC and 2D 1H-1H NOESY spectra (data not shown). Sequential NOEs (Fig. 1d) linking all the nucleotides in the internal loop indicate that this region maintains A-form helical stacking despite being an asymmetrical internal loop. NOEs between A73 and A75 in the pentaloop indicate that U74 is bulged out of the structure.

Because the U6 ISL contains a putative metal-binding site, we tested whether the RNA structure is influenced by the addition of Mg2+. Analysis of 2D NOESY, TOCSY and 1H-13C HSQC data in the presence and absence of Mg2+ indicates that the U6 ISL structure does not significantly change upon the addition of up to 50 molar equivalents of Mg2+ (data not shown).

To test whether the U6 ISL alone is sufficient to stereo-selectively bind the metal ion required for the first step of splicing, we analyzed cadmium ion binding to purified U80 Sp and Rp phosphorothioate-substituted U6 ISL by 31P-NMR. Inner-sphere coordination of Cd2+ to phosphorothioates causes an upfield (negative) 31P shift9. Stereo-specific cadmium binding to the U6 ISL at the U80 phosphate is indeed observed (Fig. 2a). 31P-NMR spectra of the Sp thio-substituted ISL show that the signal from the substituted phosphate shifts upfield upon Cd2+ addition at pH 7.0. By contrast, the 31P-NMR peak from the thiophosphate of the Rp-substituted molecule shows a smaller shift in the opposite, downfield direction (Fig. 2a). Chemical shift changes similar in both direction and magnitude have been observed for the well-studied hammerhead ribozyme, which possesses an inner-sphere cadmium ion-binding site that is stereo selective for Sp phosphorothioates9. Also as observed for the hammerhead ribozyme, the Cd2+ binding curves for the U6 ISL cannot be fit to a one ion–one binding site model, consistent with the observation that there are several metal-binding sites on the U6 ISL. Our results using the isolated U6 ISL correlate well with those obtained with the assembled spliceosome, for which only the Sp phosphorothioate diastereomer is rescued by cadmium ion7.

Figure 2: Stereo selectivity and pH-dependence of metal binding at U80 of the ISL.

a, Cadmium ion binding by U6 ISL phosphorothioate-substituted at U80. The change in chemical shift of the 31P signal from the purified Sp and Rp thiophosphate stereoisomers is plotted as a function of molar equivalents of added Cd2+. Black diamonds are U6 ISL U80 Sp phosphorothioate RNA at pH 7.0; black squares, U6 ISL U80 Rp phosphorothioate RNA at pH 7.0; open diamonds, U6 ISL U80 Sp phosphorothioate RNA at pH 5.4; and open squares, U6 ISL U80 Rp phosphorothioate RNA at pH 5.4. Open circles with plus symbols represent the Sp and Rp phosphorothioate substituted hexaU RNA at pH 7.0. b, Henderson-Hasselbalch curve fit of the change in the 13C-chemical shift of the A79 C2 carbon versus pH, as determined from 2D 1H-13C HSQC spectra recorded at different pH values. Black squares represent no added magnesium, with an apparent pKa = 6.5±0.1. Open circles are 4 mM MgCl2, with an apparent pKa = 6.0±0.1.

To test whether the stereo specificity of binding is a consequence of the structure of the U6 ISL or simply a feature of the chirality of the substitution, we also investigated cadmium binding to sulfur-substituted hexauridylate (UUU-s-UUU), where -s- denotes either a Sp or Rp phosphorothioate linkage (Fig. 2a). The observed cadmium binding to hexaU is extremely weak and non-stereo specific, indicating that introduction of a phosphorothioate into a flexible RNA cannot account for the observed stereo-selective cadmium binding at U80. Furthermore, the phosphorothioate substitution does not significantly alter the structure of the U6 ISL, because the same NOEs and ribose coupling constants are observed for both the native and thio-substituted U6 ISL RNAs (data not shown). A standard A-form helix does not detectably bind cadmium upon introduction of phosphorothioate9.

Intriguingly, the stereo specificity of metal binding at U80 is lost at low pH (Fig. 2a). Therefore, we examined the U6 ISL structure for pH-dependent effects of folding. 2D 1H-13C HSQC spectra were collected at a variety of pH values from 5.3 to 8.2. Throughout most of the molecule, little or no change in 1H- or 13C-chemical shifts was observed (data not shown). However, the A79 C2 13C-chemical shift displays large changes as a function of pH. The observed chemical shift changes can be attributed to protonation of the adjacent N1 atom and can be fit with a Henderson-Hasselbalch-type equation10, yielding an apparent pKa of 6.5±0.1 in the absence of magnesium (Fig. 2b). These data are consistent with protonation of A79 N1 near neutral pH, in agreement with in vivo chemical modification analysis6. The same proton chemical shifts and NOE intensities are observed throughout the molecule at pH 5.8 and 7.8, indicating that loss of protonation at A79 does not alter the RNA structure (data not shown). Notably, the pKa of A79 decreases by half a pH unit to 6.0±0.1 when 4 mM MgCl2 is present, suggesting that metal ion binding and protonation are mutually antagonistic.
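The two-state titration model behind such a fit can be sketched in a few lines of Python. This is only an illustration, not the fitting procedure used here (the fits were done in Origin as described in ref. 10). It assumes fast exchange between protonated and unprotonated A79, so that the observed shift is a population-weighted average. The function names, the grid-search fit and the limiting shift values passed in by the caller are all hypothetical.

```python
def protonated_fraction(ph, pka):
    """Fraction of sites protonated at a given pH (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def observed_shift(ph, pka, delta_unprot, delta_prot):
    """Population-weighted chemical shift under fast exchange (assumed)."""
    f = protonated_fraction(ph, pka)
    return f * delta_prot + (1.0 - f) * delta_unprot


def fit_pka(ph_values, shifts, delta_unprot, delta_prot):
    """Least-squares pKa from a grid search over 4.00-9.00 in 0.01 steps.

    The limiting shifts are treated as known here to keep the sketch short;
    a real fit would normally optimize them together with the pKa.
    """
    best_pka, best_sse = None, float("inf")
    for i in range(400, 901):
        pka = i / 100.0
        sse = sum(
            (observed_shift(ph, pka, delta_unprot, delta_prot) - s) ** 2
            for ph, s in zip(ph_values, shifts)
        )
        if sse < best_sse:
            best_pka, best_sse = pka, sse
    return best_pka
```

Under this model, an apparent pKa of 6.5 corresponds to roughly 24% protonation at pH 7.0, and a pKa of 6.0 to roughly 9%, which gives a sense of how strongly a half-unit pKa shift changes the protonated population near neutral pH.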

Structure of the U6 ISL

The structure determination of the U6 ISL included 521 conformationally restrictive NOE distance restraints (Table 1). The superimposition of the 40 lowest energy conformers over all heavy atoms is shown (Fig. 3a). These structures have an overall r.m.s. deviation of 1.4 Å to the mean structure (Table 1). As expected, the Watson-Crick paired regions adopt standard A-form helical geometry. The internal loop contains a C67·A79 wobble base pair, consistent with the observed protonation state for A79 (Figs 2, 3). The structure was determined at pH 7.0 and is unprotonated at A79 but still contains a wobble pair conformation identical to that of a protonated C·A pair, minus one hydrogen bond (Fig. 3). U80 is unpaired but stacked in the helix between A79 and G81. The pentaloop contains a sheared G71·A75 base pair and a 3′ adenine stack, which is facilitated by bulging out U74 (Fig. 3b).

Table 1 Structure determination statistics
Figure 3: Solution structure of the yeast U6 ISL.

a, Stereo view of the superimposition over all heavy atoms of the 40 conformers of the U6 ISL with the lowest energies. b, Stereo view of the pentaloop region of the lowest energy U6 ISL structure. c, Stereo view of the internal loop region of the lowest energy U6 ISL structure.

The GCAUA pentaloop adopts a GNRA-type fold11 and can be superimposed with a GCAA tetraloop (PDB entry 1ZIH) with an r.m.s. deviation of 2.4 Å for all common heavy atoms. The bulged U74 in the pentaloop is disordered and cannot be superimposed with the overall GNRA-type structure. This pentaloop variation of the GNRA tetraloop fold was observed in the bacteriophage lambda box B Nut site RNA12,13. The U6 ISL GCAUA pentaloop and the box B pentaloops GAAGA12 and GAAAA13 all adopt the same fold and share the consensus GNR(N)A (where N is any nucleotide, R is a purine and (N) denotes any bulged nucleotide).
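The GNR(N)A consensus stated above can be written as a simple pattern match. The sketch below is a sequence-level check only and says nothing about whether a loop actually adopts the fold; the function name and the choice of a regular expression are illustrative assumptions.

```python
import re

# G, any nucleotide (N), a purine (R = A or G), an optional bulged
# nucleotide (N), then A. Written for RNA letters (A, C, G, U).
GNR_N_A = re.compile(r"G[ACGU][AG][ACGU]?A")


def matches_gnr_n_a(loop):
    """Return True if the loop sequence fits the GNR(N)A consensus."""
    return GNR_N_A.fullmatch(loop.upper()) is not None
```

With this pattern, the U6 ISL pentaloop GCAUA, the box B pentaloops GAAGA and GAAAA, and the GCAA tetraloop all match, because the bulged position is optional.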

Structural basis for metal ion binding

To further investigate the structural basis for the observed stereo-selective metal ion binding at U80, we used a nonlinear Poisson Boltzmann model to calculate the electrostatic surface potential of the U6 ISL14. The Sp phosphate oxygen between A79 and U80 is located in the highly electronegative major groove (Fig. 4a), which is deep enough to accommodate a divalent cation, whereas the U80 Rp phosphate is angled toward the minor groove, which is less electronegative, shallow and does not provide a good cation binding pocket.

Figure 4: Charge distribution of the U6 ISL.

a, Calculated electrostatic surface potential of the U6 ISL RNA structure. A79 is unprotonated in these calculations. Red indicates negative charge, white is neutral and blue is positive charge. Electronegativity scale: red, −35; white, −5; and blue, 5. Major and minor groove views are indicated. In the major groove, the Sp U80 phosphate oxygen implicated in metal binding is circled by a dashed line. b, Upper, the protonated C67·A79+ base pair found in the U6 ISL. The protonated nitrogen is indicated with a plus sign. Lower, the near-isosteric C67·C79+ base pair proposed to occur in Trypanosoma brucei, Crithidia fasciculata, Leptomonas seymouri and Phytomonas sp. U6 RNAs.

The U80 Sp phosphate oxygen is within 4.5 Å of the A79 Rp phosphate oxygen (average P–P distance for the 20 lowest energy conformers). Interestingly, the preferred interphosphate oxygen distance for inner-sphere coordination of magnesium is 4.5 Å (ref. 15), whereas the typical interphosphate oxygen distance in an A-form helix is 5.5–6.5 Å (ref. 16), which is also the average P-P distance observed in the A-form regions of the U6 ISL NMR structure. Substitution of the A79 Rp phosphate oxygen with sulfur does not block splicing4,5; therefore, it cannot be as important as the U80 Sp phosphate oxygen for binding of the essential metal ion.
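An ensemble-averaged distance of this kind is straightforward to compute once atom coordinates have been extracted from each conformer. The sketch below is a minimal illustration: the data layout (one dictionary of atom label to coordinate per conformer), the atom labels and the function name are hypothetical, and it does not parse PDB files.

```python
import math


def mean_distance(conformers, atom_a, atom_b):
    """Average Euclidean distance (in the input units) between two atoms.

    `conformers` is a list of dicts mapping an atom label to an (x, y, z)
    tuple; this layout is an assumption made for the example.
    """
    distances = [math.dist(c[atom_a], c[atom_b]) for c in conformers]
    return sum(distances) / len(distances)


# Toy coordinates, invented so the arithmetic is easy to follow;
# they are not taken from the U6 ISL structure.
toy_ensemble = [
    {"A79_Rp_O": (0.0, 0.0, 0.0), "U80_Sp_O": (3.0, 4.0, 0.0)},  # 5.0 apart
    {"A79_Rp_O": (0.0, 0.0, 0.0), "U80_Sp_O": (0.0, 0.0, 3.0)},  # 3.0 apart
]
toy_mean = mean_distance(toy_ensemble, "A79_Rp_O", "U80_Sp_O")
```

The same function applies to any atom pair, so it can be reused to compare a site of interest against the corresponding distances in the A-form regions.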

Regulation of metal binding by base protonation

A functional group important in regulating splicing catalysis should be evolutionarily conserved. The only sequence variation that is known to exist for the C67·A79+ (+ denotes partial protonation at neutral pH) base pair is found in trypanosomatids, which have a C at position 79 in place of A17. We note that a C at position 79 is the only substitution that would allow formation of a protonated base pair with near-isosteric geometry to the C67·A79+ base pair. A C67·C79+ pair could adopt similar protonated wobble geometry via N3-amino hydrogen-bonding and N3 protonation at C79 (Fig. 4b). The C·C+ base pairing would position an ionizable functional group in the minor groove at the same position as the C·A+ wobble base pair. An isosteric U·G wobble pair is never observed at the 67/79 position, and this pairing would not allow for protonation at nucleotide 79. Taken together, these observations suggest that a protonated base at position 79 is important for spliceosome function. Protonated bases have been found near the active sites of several ribozymes18. Interestingly, the protonated cytidine in the HDV ribozyme shows an acidic pKa shift upon metal binding19,20, as we observe for A79 of the U6 ISL. Although protonation of the 67·79 base pair of the U6 ISL does not seem to be essential for catalysis because most mutations at these positions are not lethal6,21, the structural and phylogenetic evidence suggests that protonation at this position is important for splicing, perhaps by modulating the binding of an essential metal ion. Because the pKa of the C·A+ wobble pair is close to neutrality, the spliceosome could exploit the antagonism between proton binding and metal ion binding to regulate catalytic activity.

Evolutionary implications of U6 ISL structure

The GNR(N)A-fold of the ISL pentaloop explains the phylogenetic conservation observed among all known U6 RNA sequences. The first position is a G for all known U6 major spliceosomal RNAs, but the vertebrate and plant minor 'ATAC' spliceosomes have a U in the first position22,23. This sequence variation can be explained by the observation that the GNRA fold is a type of 'U-turn', which can be formed with either a G or U at the first position24. The third position of the pentaloop is a purine (R) in all known U6 sequences, consistent with the observation that the purine N7 in GNRA-type folds accepts a hydrogen bond from the 2′ OH of the G in the first position. Finally, the last position is invariantly an A to allow formation of the G·A pair (or a U-turn structure in the case of the ATAC U6 sequences). Therefore, all U6 ISLs likely adopt the same pentaloop structure.

Domain V of group II self-splicing introns has been proposed to be the evolutionary precursor of the U6 ISL because they have similar secondary structures and both have been implicated in metal ion coordination1. Here we reveal a further similarity between domain V and the U6 ISL: both contain a GNRA-fold at similar positions. Both domain V and the U6 ISL may use the GNRA-fold to facilitate tertiary contacts required to bring these domains in proximity to their respective splice sites.

Methods

RNA synthesis and purification.

The U6 ISL RNA with the A62G substitution was transcribed in vitro using purified His6-tagged T7 RNA polymerase and synthetic DNA oligonucleotides (Integrated DNA Technologies, Inc.). The wild type U6 ISL sequence and phosphorothioate-containing RNAs (Dharmacon, Inc.) were deprotected according to recommended procedures. RNA was purified by denaturing 15% polyacrylamide gel electrophoresis, identified by UV absorbance and excised from the gel. RNA was recovered with an electroelution apparatus (Schleicher and Schuell, Inc.), ethanol precipitated, purified by liquid chromatography on a BioRad LC system and a High Q strong anion exchange column (5 cm×1 cm), ethanol precipitated again and then desalted by gel filtration chromatography using a BioRad P6 column (5 cm×1 cm). The purified RNA was lyophilized, resuspended in water and brought to pH 7.0 by the addition of 1 M NaOH. 13C,15N-labeled RNA was prepared according to published procedures25. HPLC purification and nuclease digestions for the stereo-specific assignments of ISL RNA containing the Sp and Rp phosphorothioate diastereomers were performed as described26.

NMR spectroscopy.

All NMR spectra were recorded on Bruker DMX spectrometers at the National Magnetic Resonance Facility at Madison (NMRFAM) and at Bruker (Karlsruhe, Germany). All spectrometers were equipped with HCN triple-resonance triple-axis pulsed-field gradient probes.

Exchangeable resonances were assigned by reference to 2D NOESY (150 ms mixing time) and 2D 1H-15N HMQC spectra of the RNA in 90% H2O/10% D2O at 283 K. Non-exchangeable resonances were assigned by reference to 2D NOESY spectra (60, 120, 250 and 300 ms mixing times) and 2D TOCSY, 2D 1H-13C HSQC and 3D 1H-13C-1H HCCH TOCSY, 3D 1H-13C-1H HCCH COSY and 3D 1H-13C-1H NOESY-HMQC spectra of the RNA in 99.99% D2O at 303 K (ref. 27). Water suppression for samples in 90% H2O/10% D2O was achieved with a 1-1 spin-echo pulse sequence. For experiments in 99.99% D2O, the residual HDO resonance was suppressed with a low power presaturation pulse. 1D 31P spectra were acquired at 202 MHz (500 MHz for 1H) with a 5 mm quadrupole nucleus probe (QNP, Bruker). 1D 31P spectra with and without Cd2+ were collected in the presence of both 50 and 250 mM KCl to test for monovalent salt effects (none were seen). All spectra were referenced directly to an internal reference of 0.1 mM 3-(Trimethylsilyl)-1-propane sulfonic acid (TSP) for protons or indirectly referenced using the respective chemical shift index ratios.

NMR data were processed with XWINNMR (Bruker) and analyzed with Felix98 (MSI) and the NMR assignment program Sparky (http://www.cgl.ucsf.edu/home/sparky/). NOE peak volumes were integrated using the Gaussian peak fitting function in Sparky. The apparent pKa values were fit using Origin software (MicroCal) as described10.

Structure calculations.

NOE distances were estimated from the integrated peak volumes obtained from a 2D NOESY spectrum with a 250 ms mixing time, which was determined to be within the linear range of the NOE build-up curve. Distances were calibrated by setting the average integrated volume of the pyrimidine H5–H6 NOEs to 2.4 Å, using the r−6 distance relationship and the CALIBA macro in DYANA28. NOEs were then grouped into three categories, corresponding to strong (1.8–3.0 Å), medium (1.8–4.5 Å) and weak (3.0–6.0 Å). NOEs from exchangeable protons obtained from 2D NOESY spectra in 90% H2O/10% D2O, as well as NOEs from 3D NOESY-HMQC spectra, were qualitatively assigned as strong, medium or weak.
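The r−6 calibration described above amounts to a one-line scaling against the reference NOE. The sketch below is an illustration under stated assumptions and is not the CALIBA implementation: it assumes the NOE volume is proportional to r−6 in the linear build-up regime, and it assumes each estimated distance is assigned to the tightest category whose upper bound contains it, a binning rule that the text does not specify. All names are hypothetical.

```python
REF_DISTANCE = 2.4  # Angstrom, pyrimidine H5-H6 reference distance

# (label, lower bound, upper bound) in Angstrom, as listed in the text.
CATEGORIES = [
    ("strong", 1.8, 3.0),
    ("medium", 1.8, 4.5),
    ("weak", 3.0, 6.0),
]


def noe_distance(volume, ref_volume, ref_distance=REF_DISTANCE):
    """Estimate a distance from a peak volume using V proportional to r**-6."""
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("peak volumes must be positive")
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)


def restraint_category(distance):
    """Pick the first category whose upper bound covers the distance.

    Assumed rule: anything beyond the medium upper bound is treated as weak.
    """
    for label, lower, upper in CATEGORIES[:-1]:
        if distance <= upper:
            return label, lower, upper
    return CATEGORIES[-1]
```

Because of the sixth-root dependence, a peak 64 times weaker than the reference corresponds to only twice the reference distance (4.8 Å), which is why coarse distance categories are usually sufficient.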

The torsion angles α and ζ were loosely restrained to exclude the trans range (0±120°), based on all 31P chemical shifts falling into the range of −4 to −5 p.p.m. (ref. 29). The torsion angle χ was restrained to −160±20° for all nucleotides in the anti range, based on the integrated NOE volume of intranucleotide H1′ to aromatic NOE for a 60 ms NOESY spectrum. 2D TOCSY experiments with a 45 ms mixing time were used to analyze sugar pucker conformations. Nucleotides with strong H1′–H2′ and H1′–H3′ crosspeaks were restrained to an S-type range (145±30°) (C72 and U74). Nucleotides with intermediate crosspeak intensities (G62, G71, A73, U80 and C85) were left unrestrained. Nucleotides with absent H1′–H2′ crosspeaks were restrained to an N-type range (85±30°).
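Restraints written as center±tolerance must be evaluated with the 360° periodicity of torsion angles in mind. The following sketch shows one way to test whether an angle satisfies such a restraint; the helper names and the dictionary keys are hypothetical, and the numerical ranges are simply those quoted above.

```python
def angular_deviation(angle, center):
    """Signed difference angle - center in degrees, wrapped to [-180, 180)."""
    return (angle - center + 180.0) % 360.0 - 180.0


def satisfies(angle, center, tolerance):
    """True if the angle lies within center +/- tolerance, modulo 360."""
    return abs(angular_deviation(angle, center)) <= tolerance


# (center, tolerance) in degrees, taken from the restraint ranges in the text.
RESTRAINTS = {
    "alpha_zeta_non_trans": (0.0, 120.0),
    "chi_anti": (-160.0, 20.0),
    "pucker_S_type": (145.0, 30.0),
    "pucker_N_type": (85.0, 30.0),
}
```

The wrap-around matters most for χ: an angle reported as 190° is equivalent to −170° and therefore satisfies the −160±20° restraint, whereas a naive comparison of the raw numbers would reject it.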

DYANA28 was used to calculate 200 starting structures in torsion angle space, starting from random torsion angles. Ideal A-form geometry torsion angle restraints were included at this step but only for the Watson-Crick helical regions that were clearly A-form on the basis of NOE data. We performed 4,000 steps of restrained torsion angle molecular dynamics. The 100 lowest energy DYANA starting structures were then refined in Cartesian coordinate space with restrained molecular dynamics using X-PLOR30. Restrained molecular dynamics and simulated annealing (28 ps) were performed at 2,000 K with cooling to 100 K in 40,000 steps, with a 0.7 fs time step. The simulated annealing was followed by 200 steps of energy minimization using the Powell algorithm in X-PLOR. The 40 lowest energy refined structures were viewed and analyzed using MOLMOL31.

Structure analysis.

The electrostatic surface calculations were performed using QNIFFT14 and visualized with GRASP32. A conformational analysis of the RNA structure was achieved by using the AMIGOS algorithm33 to calculate pseudotorsion angles between O4′ and the phosphates in the 5′ and 3′ directions. The AMIGOS calculation reveals that the torsion angles fall within the allowed plot regions for A-form RNA and known structural motifs.

Coordinates.

The U6 ISL structure coordinates have been deposited in the PDB (accession code 1LC6). The U6 ISL chemical shifts have been submitted to BioMagResBank (accession code BMRB-5784).