Deconvolution of conformational exchange from Raman spectra of aqueous RNA nucleosides

Wilson, Alex L.; Outeiral, Carlos; Dowd, Sarah E.; Doig, Andrew J.; Popelier, Paul L. A.; Waltho, Jonathan P.; Almond, Andrew

doi:10.1038/s42004-020-0298-x

Download PDF

Article
Open access
Published: 06 May 2020

Deconvolution of conformational exchange from Raman spectra of aqueous RNA nucleosides

Communications Chemistry volume 3, Article number: 56 (2020) Cite this article

3427 Accesses
7 Citations
3 Altmetric
Metrics details

Subjects

This article has been updated

Abstract

Ribonucleic acids (RNAs) are key to the central dogma of molecular biology. While Raman spectroscopy holds great potential for studying RNA conformational dynamics, current computational Raman prediction and assignment methods are limited in terms of system size and inclusion of conformational exchange. Here, a framework is presented that predicts Raman spectra using mixtures of sub-spectra corresponding to major conformers calculated using classical and ab initio molecular dynamics. Experimental optimization allowed purines and pyrimidines to be characterized as predominantly syn and anti, respectively, and ribose into exchange between equivalent south and north populations. These measurements are in excellent agreement with Raman spectroscopy of ribonucleosides, and previous experimental and computational results. This framework provides a measure of ribonucleoside solution populations and conformational exchange in RNA subunits. It complements other experimental techniques and could be extended to other molecules, such as proteins and carbohydrates, enabling biological insights and providing a new analytical tool.

Ensemble determination by NMR data deconvolution

Article 11 May 2023

Optimal ¹³C NMR investigation of intrinsically disordered proteins at 1.2 GHz

Article 12 December 2023

Sensitivity enhancement of homonuclear multidimensional NMR correlations for labile sites in proteins, polysaccharides, and nucleic acids

Article Open access 21 October 2020

Ribonucleic acids (RNAs) are key players in the central dogma of molecular biology; genetic information is transported, translated, and regulated by RNA. Their importance was underlined by the discovery of self-replicating RNA ribozymes¹, supporting the RNA World hypothesis of the early evolutionary history of life². Molecular descriptions, deriving from three-dimensional (3D)-structural information, are required to understand the enigmatic mechanisms behind these diverse biological functions³. Being predominantly single stranded, RNAs are intrinsically more flexible than globular proteins and only super-atomic crystallographic resolutions (2.5–4 Å) are typically attainable⁴; ribose ring densities are often diffuse and ambiguous, indicating conformational degeneracy is present even in crystals^5,6. Macromolecular complexes can alternatively be studied with cryogenic electron microscopy but again atomic resolution is typically not achievable⁷. Solution-phase nuclear magnetic resonance (NMR) has also been instrumental in the determination of functional, conformational, and structural information about RNA. However, while NMR can provide atomic resolution, it is not suited to systems beyond RNA oligonucleotides as they require high concentrations, exhibit low chemical shift dispersion, and necessitate complex and expensive isotopic labeling^8,9,10. Established experimental techniques are therefore limited in their ability to achieve atomic resolution on RNA. Lack of resolution masks functionally-important ribose ring puckering and conformational exchange. For example, constituent ribose rings show a general preference for c3′-endo (north) puckering in helical polynucleotides⁶, and the c2′-endo (south) sugar pucker in non-helical polynucleotides. Dynamical interconversion is therefore responsible for important macromolecular changes between A and Z RNA and between A, B, and Z DNA^11,12. This is reinforced by the fact that locking the sugars into particular puckers perturbs the delicate hybridization process and provides a novel route to therapeutics, highlighting the functional importance of ribose conformational exchange¹³. Clearly, further experimental techniques are required to provide atomic resolution of backbone conformational preferences in RNA.

Both infrared and Raman spectroscopy are able to study solution phase RNA samples^14,15, without restrictions on molecular size, and potentially provide an inexpensive, rapid route to the elucidation of RNA conformation. However, deciphering the spectral complexity associated with large RNAs and assigning the constituent bands is a problem limiting the utility of Raman spectroscopy¹⁶. Computational prediction of vibrational spectra is an excellent route to assignment of Raman spectra, potentially expanding Raman’s domain of applicability to routine 3D-structural determination, but computational tractability limits current methods, both in terms of realistic system size and inclusion of conformational exchange^17,18,19,20. However, there have been recent successful attempts to simulate on the order of 10³–10⁴ atoms using combined classical and quantum methods^{21,22,23,24,25,26,27,28}. This type of scalable computational approach is required to accurately predict vibrational spectra of biological molecules, and include the contribution of physical effects such as conformational exchange. Consequently, there have been a number of recent attempts to derive spectra from ab initio simulations, but the contributions of slow conformational exchange to computed spectra are inadequately reproduced due to limits on feasible simulation timeframes²⁹.

Here, a Raman spectral prediction framework is applied to aqueous RNA nucleosides to distil spectral complexity into contributions of multiple conformers in a manner computationally scalable to much larger systems, by combining classical and ab initio molecular dynamics with experiment. Specifically, long-timescale classical simulations fully explore molecular phase space, and highly populous regions are further sampled with multiple short-timescale ab initio simulations, providing accurate dynamics around specific conformers. Autocorrelation of the molecular polarizability then allows for individual conformer spectral prediction and subsequent combination into an overall Raman spectrum. In this way, we achieve experimental agreement, opening up Raman spectroscopy as a simple, inexpensive experimental route to 3D-structure determination of nucleic acids to complement existing techniques. While we demonstrate this for the first time on RNA ribonucleoside building blocks, it may have fundamental applicability to a wide range of other molecules.

Results and discussion

Classical molecular dynamics ensemble generation

An ensemble of 3D-conformations for each nucleoside was initially created to provide a Boltzmann-weighted distribution in order to predict the most populated conformers and the effect of RNA conformational exchange. Classical molecular dynamics (MD) simulations (5 μs) with explicit water molecules were used for this purpose. The resultant conformational space was divided into two dimensions corresponding to the major degrees of freedom: glycosidic torsional angle (χ) and ribose ring pucker (P). For χ, syn spanned −90° to +90°, whilst anti described the remaining 180°, using the standard angle definitions: O4′-C1′-N9/N1-C4/C2 for purines/pyrimidines, respectively. For P, south (inclusive of the c2′-endo conformer) spanned +90° to +270°, whilst north (inclusive of the c3′-endo conformer) described the remaining 180° (Fig. 1), using the method of Altona³⁰. The resulting populations of each conformer were determined, with syn and anti dominating in the purines and pyrimidines, respectively (Fig. 2). All nucleosides, with the exception of guanosine, had a slight preference for the south ring conformer (~60%). Guanosine was observed to have a marked preference for the north conformer (~70%). The exchange rate between conformations was in the range 0.1–3.3 ns⁻¹ (Supplementary Figs. 1 and 2), putting the mean lifetimes of conformers in the hundreds of picoseconds to tens of nanoseconds range, in agreement with fast exchange observed in previous NMR studies³¹. In the pyrimidines anti to syn transitions had a slower rate (~0.1 ns⁻¹), consistent with the preference for the anti conformer, while in purines it was the syn to anti transition that had a slower rate (~0.4 ns⁻¹).

**Fig. 1: Ribonucleoside conformation.**

Fig. 2: Conformer distributions from classical simulations. Conformer distributions from 5 μs classical molecular dynamics (MD) simulations in water, using a force-field optimized for RNA (OPLS-AA/M).

Raman spectral prediction from classical molecular dynamics

The Raman spectrum of each nucleoside was initially approximated by a single 3D-structure selected from the most populous region of its conformer distribution (Fig. 2). Prior to the determination of the Raman-active vibrational modes, each nucleoside was quantum-mechanically energy-minimized; a process that scales cubically with system size³². The resulting discrete frequencies for Raman-active vibrations of these single conformers were broadened using a Lorentzian function to 5 cm⁻¹ at full-width half-maximum (FWHM), thereby accounting for intrinsic peak lifetime³³. Frequency scaling-factors were optimized to achieve experimental agreement³⁴. Band assignment of characteristic experimental peaks was performed for each nucleoside using these single, most-populous conformers (Supplementary Table 1). In all cases, this approach produced highly-resolved spectra with fine structure, but in general had a poor agreement with experiment (Fig. 3), especially in terms of the relative intensities of characteristic bands. For example, for adenosine, the intense triplet observed experimentally (1300–1400 cm⁻¹) was not predicted theoretically. Of particular note, experimental spectra were on average ~15 cm⁻¹ broader relative to predictions (Fig. 3). This implies that spectral linewidth is not solely due to intrinsic linewidth; non-uniform shape and width of experimental bands across all nucleosides points to this additional broadening being a result of conformational exchange³⁵. Therefore, it is apparent that predictions based on the single most populated conformer (Fig. 3) and any of the other canonical conformers (Supplementary Fig. 3) may be insufficient to interpret and decipher the complex Raman spectra of macromolecular systems, such as RNA. Moreover, poor scaling of this approach with respect to system size represents a potential bottleneck that limits its applicability. However, if the effect of conformational exchange could be taken into account theoretically, while achieving scalability, the opportunity exists to extract novel information on molecular dynamics from the experimental Raman spectra.

**Fig. 3: Comparison of different approaches to theoretically reproduce the experimental Raman spectra.**

The importance of including conformational exchange in Raman spectral predictions was confirmed by sampling 1000 3D-structures from each Boltzmann-weighted MD conformer distribution using aqueous molecular dynamics simulations and a force-field optimized for RNA (OPLS-AA/M). Each 3D-structure was geometrically optimized prior to Raman spectral calculation using B3LYP density functional theory (DFT), as above, and subsequent averaging over all 1000 generated spectra for each nucleoside was carried out to give single overall Raman spectra. These overall spectra were again frequency scaled, and broadened to 5 cm⁻¹ FWHM with a Lorentzian function. Inclusion of multiple conformers manifested as unresolved frequency and intensity variations within vibrational modes. This had the anticipated gross effect of further broadening the predicted spectra (Fig. 3) to within ~5 cm⁻¹ of experiment. Characteristic spectral features were more evident using this approach, such as the doublet observed in cytidine’s experimental spectrum (1200–1300 cm⁻¹) and the adenosine triplet (1300–1400 cm⁻¹), although relative peak intensities remained poorly reproduced across all four nucleosides. For example, the predicted singlet at ~800 cm⁻¹ in uridine had an exaggerated intensity (1.00) relative to experiment (0.36). Spectral calculations that modeled the 10 closest water molecules at the B3LYP DFT quantum mechanical level rather than the classical level very marginally improved the predictions (Supplementary Fig. 12), and calculations using an MD ensemble from the generalised AMBER force-field (GAFF) instead of OPLS led to an improvement in predicted spectra for purines, but gave a worse prediction for pyrimidines (Supplementary Fig. 13). Thus, while inclusion of conformer exchange provided a critical step towards a closer agreement with experiment, predicted intensities remained unreliable. Moreover, using this method, applicability to biological molecules was again limited by computational scaling with respect to system size.

Raman spectral prediction from ab initio molecular dynamics

In order to address the issue of scalability while also including local dynamics, ab initio molecular dynamics (AIMD) simulations were employed as they afford feasible linear scaling with system size³⁶. Specifically, an AIMD simulation of each nucleoside was started from a structure with a syn/south conformation (Supplementary Fig. 4). Raman spectra were subsequently determined by Fourier-transforming the autocorrelation-function of the molecular polarizability, further improving computational efficiency by removing the need to compute expensive polarizability derivatives³⁷. Local conformer dynamics were sampled for the syn/south conformer region per nucleoside, resulting in spectral linewidths that were broadened to within close agreement with experiment (Fig. 3). The adenosine triplet (1300–1400 cm⁻¹) was more clearly reproduced in the AIMD-based approach, with the central peak at a much greater intensity (0.92), in line with experiment (1.00). The relative intensities of guanosine peaks at ~1300 and ~1500 cm⁻¹ (0.81 cf. 1.00) were also more accurately reproduced relative to experiment (0.60 cf. 0.99). The syn conformers were predicted from the MD simulations (Fig. 2) to be highly populated relative to anti for the purines, so it is unsurprising that AIMD sampling around the syn region led to improved experimental agreement. On the other hand, as anti conformers were predicted to be of a much higher relative population in pyrimidines, sampling solely around the syn regions was expected to lead to poorer experimental agreement. Indeed, the intense doublet in cytidine (~1200–1300 cm⁻¹) observed experimentally was not predicted at all, and the intensity of the uridine peak at ~800 cm⁻¹ was over-exaggerated (0.96, experimentally 0.36). Therefore, whilst AIMD-simulation of the syn/south conformer leads to improvements in terms of linewidth and relative intensities of some purine peaks, for pyrimidines the poor prediction of intensities of key bands confirmed the need to take account of further conformers.

Additional AIMD simulations were carried out to determine whether spectra predicted from local sampling around different conformer regions manifested significant changes to peak intensities. For purines, additional AIMD simulations were carried out in the syn/north region only, as it was deemed that the anti conformer regions were of too-low a population to contribute significantly overall. For pyrimidines, additional AIMD simulations were performed in syn/north and anti/south regions (Supplementary Figs. 5 and 6). It was not necessary to perform separate anti/north simulations as facile interchange between anti/south and anti/north conformers was observed, resulting in anti/north–south AIMD simulations that contained a mix of both conformers. Raman sub-spectra were subsequently produced for each significantly-populated region using each of these simulated trajectories, as described above (Supplementary Fig. 7). The main differences between sub-spectra (representing each significantly-populated region) were the relative intensities of characteristic peaks. For example, in the syn/south sub-spectrum of adenosine, the peak at ~740 cm⁻¹ was predicted to have the highest relative intensity (1.00), whereas the same peak in the syn/north conformer was much less intense (0.51). Experimentally, this peak was between both predictions (0.64), suggesting contributions from both sub-spectra to the Raman spectrum. Furthermore, in syn/north the intensities of the three triplet peaks at ~1300–1400 cm⁻¹ (0.78, 1.00, 0.37) were much more accurately reproduced than for syn/south (0.49, 0.92, 0.55), relative to experiment (0.71, 1.00, 0.36). For syn/south of cytidine, both peaks of the intense, broad doublet (1200–1300 cm⁻¹) were predicted to have a very low intensity, indiscernible from spectral noise, whereas in syn/north the peak at ~1300 cm⁻¹ was recovered, and the peak at ~1250 cm⁻¹ was recovered in the anti/north–south sub-spectrum. These results exemplify how an overall Raman spectrum comprises spectral features of individual conformer sub-spectra, with substantially different relative intensities. As a consequence, an overall Raman spectrum is a reflection of the relative abundance of each conformer in solution. It was further presupposed that the effect of conformational exchange on Raman spectra could be modeled by mixing and weighting sub-spectra corresponding to significant populations. Moreover, the dramatic intensity changes of characteristic peaks between major conformer sub-spectra provides a theoretical framework for developing empirical rules to rapidly determine conformer populations in more complex molecules.

Sub-spectra for each significant population were recombined using their calculated classical MD weights, which resulted in a substantial syn bias for purines and anti bias for pyrimidines (Table 1). Additionally, spectra were re-combined by optimizing sub-spectral weights to maximize experimental agreement, thereby allowing experiment to dictate conformer populations (Table 1). Both combination methods led to significantly improved experimental agreement (Fig. 4). For example, inclusion of the anti conformers in the cytidine spectra led to the prediction of a strong peak in the 1200–1300 cm⁻¹ region. Such a peak was not present in the syn spectra. Similarly, in the uridine spectrum, the relative intensities of the peaks at ~800 and ~1250 cm⁻¹ were only adequately recovered by heavily weighting spectra towards the anti conformers. Both classical MD and optimizing to experiment resulted in strong anti-dominance for pyrimidines, with very little difference between spectra computed from each combination method. Likewise for adenosine, differences between each spectral combination method were minimal, with both performing well. This observation suggests that the initial classical MD simulation weights were likely highly representative of solution phase conformer populations. However, for guanosine, weighting optimization to experiment led to improved relative intensities of peaks at ~1200 and ~1300 cm⁻¹ when compared to the classical MD predictions. This translates to a likely higher solution population of syn/south conformers than predicted by classical MD. To corroborate the syn-dominance in purines and anti-dominance in pyrimidines, the lowest energy conformer in each conformational region was determined (Supplementary Table 2). For the purines, the lowest energy conformer was in the syn region, whereas for pyrimidines, the lowest energy conformer was in the anti region (Supplementary Figs. 8–11). This reinforces the prediction that, in solution, syn conformers dominate for purines, whereas anti conformers dominate for pyrimidines.

Table 1 Relative weighting of theoretical Raman spectra from each conformational region determined from AIMD local sampling. Two differing spectral combination approaches are contrasted in order to reproduce experimental Raman spectra: (A) relative conformer population weights derived from classical MD simulations, and (B) optimization of relative conformer population weights from experiment.

Full size table

**Fig. 4: Comparison of experiment with AIMD-based theoretical Raman spectra.**

Conformational population validation

Previous NMR measurements on ribonucleosides^38,39 demonstrated no strong preference for either ring pucker (south/north) in pyrimidines, but a definite bias towards the anti-orientation of the base was found and simulations using a specially-parameterized OPLS force-field, as described here, substantiate these conclusions. Similarly, for purines, no strong preference for either ring pucker (south/north) was observed, and while values for base orientation have not been determined by NMR due to aggregation, the OPLS force-field predicts a strong bias towards syn-orientation of the base. These previous conclusions are in excellent agreement with our results.

In conclusion, the use of multiple ab initio MD simulations to determine sub-spectra, which are then optimized to experimental Raman spectra, provides an independent measure of ribonucleoside solution populations and is computationally scalable beyond the realm of solution-phase NMR. Our framework thus provides a novel way to deconvolute a Raman spectrum into its constituent conformers in populations that agree with independent NMR measurements. Furthermore, as characteristic Raman bands manifest significant intensity differences between conformers, the opportunity exists to extract empirical rules for the estimation of conformer distributions without recourse to calculation. For example, a syn/south adenosine conformer has an intensity ratio of 1.00:0.92 between the intense singlet at ~740 cm⁻¹ and the central peak of the triplet at ~1300–1400 cm⁻¹, whereas this ratio is 0.69:1.00 in the syn/north conformer. Intermediate values for this ratio allow the relative populations of each conformer to be determined and similar associations can be inferred for the other nucleosides. It could be envisaged that in favorable situations and when the sequence of nucleotides is known, empirical rules such as these may be extended to allow for rapid deconvolution of conformational exchange in complex RNA. While the combination of classical and multiple ab initio molecular dynamics simulations with experimental Raman spectra enable a novel route to determine relative populations of RNA nucleosides in solution, it is anticipated that the extension to DNA would be straightforward. Moreover, the framework described here is not restricted to nucleosides and could now be attempted for a wide range of subunits from other important molecules, such as proteins, carbohydrates, and even industrial polymers. In concert with the computational scalability of our framework, this could confer Raman spectroscopy with an ability to explore the conformational exchange within biological subunits more widely. In doing so, spectral complexity can be deciphered and specific bands assigned to specific conformers, overcoming the previous limitations of Raman and opening it up to complement other experimental techniques. This technological advance can lead to new chemical and biological insight, provide a much needed new tool in analytical chemistry, and aid in the manufacture of molecules, such as therapeutics.

Methods

Raman spectral prediction from classical molecular dynamics

For each nucleoside, 5 μs classical molecular dynamics (MD) simulations were performed in water and coordinates were saved every 10 ps for later analysis. Ensembles of 1000 conformations were obtained by uniform sampling of every 50 coordinate sets. Each nucleoside was solvated in a TIP3P⁴⁰ water box using the Amber12 tool tleap⁴¹. Simulations were performed using the OPLS-AA/M force field for RNA⁴². Energy minimization was carried out using the L-BFGS minimization algorithm until the gradient of the internal energy per step was less than 1 kcal mol⁻¹. The system was brought to 298 K immediately by suitable scaling of the atomic velocities and then an initial equilibration simulation step in the NPT ensemble for 20 ns was performed using a Monte Carlo barostat (1 atm) with volume changes at 1 ns intervals, coupled to a Langevin thermostat (298 K) with a damping factor of 0.1 ps⁻¹. Subsequently, the equilibrium volume was calculated and fixed (45–50 nm³ in all cases) and the simulations were continued in the NVT ensemble for 5 μs. Hydrogen atoms were constrained using M-SHAKE⁴³, electrostatic interactions were calculated via cubic periodic boundary conditions, electrostatic and van der Waals interactions were truncated (9 Å), and the integration time-step was 4 fs with hydrogen mass (4 a.m.u.) repartitioning⁴⁴. All MD simulations used the OpenMM⁴⁵, but with a modification to the Lennard-Jones potential to account for the geometric mixing rule of OPLS-AA/M. A comparison of OPLS-AA/M with other force fields is presented in Supplementary Table 3.

Geometry optimizations were carried out using the ONIOM quantum mechanics/molecular mechanics (QM/MM) implementation in Gaussian09⁴⁶. For each frame the water molecules within 10 Å radius of the nucleoside were kept, and structures were considered optimized when the forces on all nuclei fell below 10⁻⁶ Hartree/Bohr. Nucleosides were treated with density functional theory (DFT) using the B3LYP functional^47,48,49,50, combined with the 6-31+G(d) Pople basis set, and water molecules were treated at the MM level using the Universal force field of Rappé et al.⁵¹ implemented in Gaussian09. No further implicit solvent model was used, as several shells of explicit water were present. It was confirmed that minimization maintained the conformational diversity present in the MD simulations (Supplementary Table 4). Frequencies and activities of Raman-active vibrational modes were determined within the double harmonic approximation using Gaussian09. Raman activities were then converted into intensities by accounting for excitation laser wavelength (532 nm) and temperature (293 K), using standard methods⁵². These intensities were then normalized for comparison to normalized experimental spectra. Lorentzian broadening to 5 cm⁻¹ at full-width half-peak-height, as well as frequency scaling, was performed with an in-house program (see Code availability).

Raman spectral predictions from ab initio molecular dynamics

Solvated nucleosides were equilibrated for 2.5 ps followed by 17.5 ps production runs using AIMD, and electron densities were saved every 2.5 fs in the discretized Gaussian CUBE format with a stride of 2. The QM subsystem consisted of the single nucleoside, enclosed in a non-periodic cubic box (20.0 Å), in contact with classical water molecules. Global cubic simulation cells were periodic cubic boxes (A = 24.25 Å, U = 23.18 Å, C = 23.43 Å). The QM subsystem used the dispersion-corrected BLYP-D3 functional^48,53,54 and the hybrid Gaussian and planewave (GPW) method⁵⁵, core electrons were described with Goedecker–Teter–Hutter (GTH) pseudopotentials^56,57,58, valence states with the Gaussian-type MOLOPT62 DZVP-MOLOPT-SR-GTH basis set of double-zeta polarized quality from the CP2K package, with a plane-wave cut-off of 280 Ry, and the MM system used identical AMBER parameters to classical MD described above. The equations of motion (classical and Born–Oppenheimer), were integrated using a timestep of 0.5 fs, 300 K using the Nosé thermostat^59,60,61 (timecon 100 fs) For each timestep, the electronic structure was explicitly quenched to a tolerance of 10^-6 Hartree. All AIMD simulations were performed with the QUICKSTEP program³⁶ from the CP2K suite⁶². The Raman spectra were directly determined from the Fourier transform of the autocorrelation function of the molecular polarizability, and were normalized³⁷. A positive electric field (5 × 10⁻⁴ a.u.) was applied along each of the three independent directions of space to get all polarizability tensor components. Numerical integration of the electron density within Voronoi cells used TRAVIS⁶³.

Combining ab initio molecular dynamics-derived spectra of individual conformers

AIMD-derived spectra of multiple conformers of individual nucleosides were combined into a global spectrum for each nucleoside, which was then fitted to experiment. Specifically, the multiple AIMD-derived spectra for each nucleoside were fitted to the corresponding experimental spectrum of that given nucleoside using a statistical procedure, which optimized: (i) the contribution of each spectrum, (ii) the frequency rescaling of every contribution, and (iii) a global frequency shift.

In digital format, spectral data were stored as a two-column array, containing a grid of wavenumbers and the corresponding intensities. Since experimental and predicted spectra often present disjoint wavenumbers, the first step of this process was to obtain intensities on a common grid of N_grid wavenumbers for all spectra (x^grid,i). After normalization, each spectrum was fitted using cubic splines, obtaining a function that describes spectral intensity in terms of the wavenumber. Introducing the rescaling factor of the j^th spectrum r_j, and the global shift factor s, this function has the form of Eq. (1), where y_i^conf,j is the intensity prediction for the i^th grid point and the j^th conformer.

$$y_i^{{{\mathrm{conf}},j}}\left( {r_j,s} \right) = f^{{{\mathrm{fit}},j}}\left( {r_jx^{{{\mathrm{grid}},i}} + s} \right) \; \text{for} \: 1\leq i \leq N_{\mathrm{grid}}$$

(1)

Here, f^fit,j is the sum of cubic splines obtained in the previous step, and y_i^exp is the experimental intensity at the i^th grid point. The parameters mentioned in Eq. (1), as well as the weights (w_j), were determined with a modified least-squares regression method, with a loss function that includes corrections to minimize overfitting, Eq. (2). The weights can be interpreted as relative populations of different conformers.

$$\begin{array}{l} L\left( {\mathbf{w}},{\mathbf{r}},{\mathbf{s}} \right) = \sum\limits_{i = 1}^{{N_{{\mathrm{grid}}}}} {\left\{ {y_i^{{\mathrm{exp}}} - \left[\sum_{j=1}^{N_{\text{conf}}} w_j y_i^{{\text{conf}}, j} \right] } \right\}}^2 + L_{{\mathrm{weight}}}\left( \mathbf{w} \right) + L_{{\mathrm{scaling}}}\left( \mathbf{r} \right) + L_{{\mathrm{shift}}}\left( s \right)\\ {\mathrm{where}}\qquad\qquad L_{{\mathrm{weight}}}\left( \mathbf{w} \right) = c_{{\mathrm{weight}}}\left[ {\mathop {\sum}\limits_{j = 1}^{N_{{\mathrm{conf}}}} {w_j^2 - 1} } \right]^{2}\qquad\qquad\qquad\qquad\qquad\\ \; L_{{\mathrm{scaling}}}\left( \mathbf{r} \right) = c_{{\mathrm{rescaling}}}\mathop {\sum}\limits_{j = 1}^{N_{{\mathrm{conf}}}} {\left( {\left| {r_j - 1} \right| - r_{{\mathrm{eq}}}} \right)} ^2\\ \;\; L_{{\mathrm{shift}}}\left( s \right) = \left\{ {\begin{array}{*{20}{c}} {c_{{\mathrm{shift}}}} & {{\mathrm{s}}\, < \,s_{{\mathrm{min}}},s \, > \, s_{{\mathrm{max}}}} \\ 0 & {s_{{\mathrm{min}}} \le s \le s_{{\mathrm{max}}}} \end{array}}\qquad \right.\end{array}$$

(2)

The first term on the right-hand side of Eq. (2) is the classical least-squares loss function that attempts to minimize the error in the predicted spectrum. The second term ensures that the sum of the weights adds up to 1, which is paramount for their interpretation as relative populations. The third term restricts the rescaling factor to lie close to an equilibrium value of ±5%, and the fourth term ensures that the global shift is constrained.

The choice of parameters has a key influence on the effect of this procedure. In this study, the following collection of parameters were used: c_weight = 1.0 × 10⁸, c_rescaling = 1.0 × 10², c_shift = 5.0 × 10⁸, r_eq = 0.05, s_min = 0.0 cm⁻¹, and s_max = 200.0 cm⁻¹. The least-squares loss function generally takes values in the ~10² order of magnitude, and therefore we selected penalties that would be loose or tight depending on the violation of the conditions. For example, the large penalty in the weights ensures that they always add up to 1 with significant precision. The Broyden–Fletcher–Goldfarb–Shanno (BFGS) algorithm was then used to find a minimum of the loss function. In order to find a point close to, or equal to, the global minimum, the optimization procedure was repeated 50 times with different random initial values, keeping the best result.

Experimental Raman spectra

The four ribonucleoside samples with purity >99% (Sigma-Aldrich Company Ltd., Gillingham, Dorset, UK) were dissolved in Milli-Q water (Merck KGaA, Darmstadt, Germany). Uridine and cytidine were dissolved at 50 mg/mL, and adenosine and guanosine were made up as saturated solutions to maximize their concentrations. Pyrimidines were self-buffering between pH 7–8 and were not further pH adjusted. Adenosine and guanosine were measured to have pH values of 6.9 and 6.7 at their saturated concentrations, respectively. Raman spectra were acquired using a ChiralRaman spectrometer (BioTools Inc., Jupiter, FL, USA) with a 532 nm laser of power 1.2 W. Samples of 200 μL were placed in rectangular quartz cells coated with magnesium fluoride (Starna Scientific Ltd., Hainault, Essex, UK), sealed to avoid evaporation, and held at a constant temperature (293 K). The final spectra were accumulated over 1 h, and directly prior to each sample acquisition, a spectrum of pure water was obtained under identical conditions. The pure water background was subtracted from the raw sample spectra and the result was further baselined using an asymmetric least squares smoothing approach (third-order polynomial)⁶⁴. Each sample had a sample and laser stabilization of 1 h, at which point it was confirmed that the spectral intensity was constant and thus neither sample degradation was occurring nor interference due to undissolved sample. Raman spectra were then recorded over 600–1600 cm⁻¹ with a resolution of 3 cm⁻¹, using laser illumination periods of 1.25 s. Spectra were truncated at 1600 cm⁻¹ as beyond this the water background made baseline correction unreliable, even using the above methods.

Data availability

The main data supporting the findings of this study are included in the paper and its Supplementary Information files. Additional raw data are available from the corresponding author on reasonable request.

Code availability

Lorentzian broadening of theoretically predicted Raman spectra, as well as frequency scaling, was performed with an in-house program. The source software code files for performing these calculations are available as Supplementary Data 1 and 2. Two versions are available: one that allows linewidths to be optimized to the Raman spectrum and one that fixes them to 5 cm⁻¹ at full-width half-peak-height (as used in Fig. 3). All parameters required are included in the program files.

Change history

13 May 2020
The original version of this Article was updated shortly after publication following an error that resulted in the ORCID ID of Carlos Outeiral being omitted and to correct the ORCID ID of Sarah E. Dowd.

References

Kruger, K. et al. Self-splicing RNA: autoexcision and autocyclization of the ribosomal RNA intervening sequence of tetrahymena. Cell 31, 147–157 (1982).
Article CAS PubMed Google Scholar
Gilbert, W. Origin of life: the RNA world. Nature 319, 618–618 (1986).
Article Google Scholar
Zhang, X., Cekan, P., Sigurdsson, S. T. & Qin, P. Z. Studying RNA using site-directed spin-labeling and continuous-wave electron paramagnetic resonance spectroscopy. Methods Enzymol. 469, 303–328 (2009).
Article CAS PubMed PubMed Central Google Scholar
Dunkle, J. A. et al. Structures of the bacterial ribosome in classical and hybrid states of tRNA binding. Science 332, 981–984 (2011).
Article CAS PubMed PubMed Central Google Scholar
Murray, L. J. W., Arendall, W. B., Richardson, D. C. & Richardson, J. S. RNA backbone is rotameric. Proc. Natl. Acad. Sci. U.S.A. 100, 13904–13909 (2003).
Article CAS PubMed PubMed Central Google Scholar
Richardson, J. S. et al. RNA backbone: consensus all-angle conformers and modular string nomenclature (an RNA Ontology Consortium contribution). RNA 14, 465–481 (2008).
Article CAS PubMed PubMed Central Google Scholar
Afonine, P. V. et al. New tools for the analysis and validation of cryo-EM maps and atomic models. Acta Crystallogr. D 74, 814–840 (2018).
Article CAS Google Scholar
Dethoff, E. A., Chugh, J., Mustoe, A. M. & Al-Hashimi, H. M. Functional complexity and regulation through RNA dynamics. Nature 482, 322–330 (2012).
Article CAS PubMed PubMed Central Google Scholar
Marchanka, A. & Carlomagno, T. Solid-state NMR spectroscopy of RNA. Methods Enzymol. 615, 333–371 (2019).
Article CAS PubMed Google Scholar
Fürtig, B., Richter, C., Wöhnert, J. & Schwalbe, H. NMR spectroscopy of RNA. ChemBioChem 4, 936–962 (2003).
Article PubMed CAS Google Scholar
Davis, P. W., Hall, K., Cruz, P., Tinoco, I. Jr. & Neilson, T. The tetraribonucleotide r C p G p C p G forms a left-handed Z-RNA double-helix. Nucleic Acids Res. 14, 1279–1291 (1986).
Article CAS PubMed PubMed Central Google Scholar
Feigon, J., Wang, A. H. J., van der Marel, G. A., Van Boom, J. H. & Rich, A. A one- and two-dimensional NMR study of the B to Z transition of (m5dC-dG)3 in methanolic solution. Nucleic Acids Res. 12, 1243–1263 (1984).
Article CAS PubMed PubMed Central Google Scholar
Petersen, M. & Wengel, J. LNA: a versatile tool for therapeutics and genomics. Trends Biotechnol. 21, 74–81 (2003).
Article CAS PubMed Google Scholar
Tsuboi, M. Application of infrared spectroscopy to structure studies of nucleic acids. Appl. Spectrosc. Rev. 3, 45–90 (1970).
Article Google Scholar
Lord, R. C. & Thomas, G. J. Raman spectral studies of nucleic acids and related molecules-I Ribonucleic acid derivatives. Spectrochim. Acta A 23, 2551–2591 (1967).
Article CAS Google Scholar
Koenig, J. L. Raman spectroscopy of biological molecules: a review. J. Polym. Sci. 6, 59–177 (1972).
CAS Google Scholar
Neese, F. Prediction of molecular properties and molecular spectroscopy with density functional theory: from fundamental theory to exchange-coupling. Coord. Chem. Rev. 253, 526–563 (2009).
Article CAS Google Scholar
Petrenko, T. & Neese, F. Analysis and prediction of absorption band shapes, fluorescence band shapes, resonance Raman intensities, and excitation profiles using the time-dependent theory of electronic spectroscopy. J. Chem. Phys. 127, 164319 (2007).
Article PubMed CAS Google Scholar
Rauhut, G., Knizia, G. & Werner, H.-J. Accurate calculation of vibrational frequencies using explicitly correlated coupled-cluster theory. J. Chem. Phys. 130, 054105 (2009).
Article PubMed CAS Google Scholar
Barone, V., Improta, R. & Rega, N. Quantum mechanical computations and spectroscopy: from small rigid molecules in the gas phase to large flexible molecules in solution. Acc. Chem. Res. 41, 605–616 (2008).
Article CAS PubMed Google Scholar
Andrushchenko, V. & Bouř, P. Applications of the Cartesian coordinate tensor transfer technique in the simulations of vibrational circular dichroism spectra of oligonucleotides. Chirality 22, E96–E114 (2010).
Article CAS PubMed Google Scholar
Andrushchenko, V., Tsankov, D., Krasteva, M., Wieser, H. & Bouř, P. Spectroscopic detection of DNA quadruplexes by vibrational circular dichroism. J. Am. Chem. Soc. 133, 15055–15064 (2011).
Article CAS PubMed Google Scholar
Kessler, J., Kapitán, J. & Bouř, P. First-principles predictions of vibrational Raman optical activity of globular proteins. J. Phys. Chem. Lett. 6, 3314–3319 (2015).
Article CAS Google Scholar
Kessler, J., Andrushchenko, V., Kapitán, J. & Bouř, P. Insight into vibrational circular dichroism of proteins by density functional modeling. Phys. Chem. Chem. Phys. 20, 4926–4935 (2018).
Article CAS PubMed Google Scholar
Přecechtělová, J. et al. Toward reproducing sequence trends in phosphorus chemical shifts for nucleic acids by MD/DFT calculations. J. Chem. Theory Comput. 9, 1641–1656 (2013).
Article PubMed CAS Google Scholar
Andrushchenko, V., Benda, L., Páv, O., Dračínský, M. & Bouř, P. Vibrational properties of the phosphate group investigated by molecular dynamics and density functional theory. J. Phys. Chem. B 119, 10682–10692 (2015).
Article CAS PubMed Google Scholar
Giovannini, T. et al. Polarizable embedding approach for the analytical calculation of Raman and Raman optical activity spectra of solvated systems. J. Chem. Theory Comput. 13, 4421–4435 (2017).
Article CAS PubMed Google Scholar
Biczysko, M. et al. Integrated computational approaches for spectroscopic studies of molecular systems in the gas phase and in solution: pyrimidine as a test case. Theor. Chem. Acc. 131, 1201 (2012).
Article CAS Google Scholar
Wendler, K., Brehm, M., Malberg, F., Kirchner, B. & Delle Site, L. Short time dynamics of ionic liquids in AIMD-based power spectra. J. Chem. Theory Comput. 8, 1570–1579 (2012).
Article CAS PubMed Google Scholar
Altona, C. & Sundaralingam, M. Conformational analysis of the sugar ring in nucleosides and nucleotides. New description using the concept of pseudorotation. J. Am. Chem. Soc. 94, 8205–8212 (1972).
Article CAS PubMed Google Scholar
Plavec, J., Roselt, P., Földesi, A. & Chattopadhyaya, J. Pseudorotation of the sugar moiety in pyrimidine nucleosides occurs on the same time-scale as the overall tumbling: 2H NMR relaxation time study. Magn. Reson. Chem. 36, 732–740 (1998).
Article CAS Google Scholar
Valiev, M. et al. NWChem: a comprehensive and scalable open-source solution for large scale molecular simulations. Comput. Phys. Commun. 181, 1477–1489 (2010).
Article CAS Google Scholar
Singh, R. K., Singh, S. N., Asthana, B. P. & Pathak, C. M. Deconvolution of Lorentzian Raman linewidth: techniques of polynomial fitting and extrapolation. J. Raman Spectrosc. 25, 423–428 (1994).
Article CAS Google Scholar
Pople, J. A. et al. Molecular orbital studies of vibrational frequencies. Int. J. Quantum Chem. 20, 269–278 (1981).
Article Google Scholar
Yu, N.-T. & Krimm, S. Raman spectroscopy: a conformational probe in biochemistry. Crit. Rev. Biochem. 4, 229–280 (1977).
Article CAS Google Scholar
VandeVondele, J. et al. Quickstep: fast and accurate density functional calculations using a mixed Gaussian and plane waves approach. Comput. Phys. Commun. 167, 103–128 (2005).
Article CAS Google Scholar
Thomas, M., Brehm, M., Fligg, R., Vöhringer, P. & Kirchner, B. Computing vibrational spectra from ab initio molecular dynamics. Phys. Chem. Chem. Phys. 15, 6608–6622 (2013).
Article CAS PubMed Google Scholar
Yildirim, I., Stern, H. A., Kennedy, S. D., Tubbs, J. D. & Turner, D. H. Reparameterization of RNA χ torsion parameters for the AMBER force field and comparison to NMR spectra for cytidine and uridine. J. Chem. Theory Comput. 6, 1520–1531 (2010).
Article CAS PubMed PubMed Central Google Scholar
Robertson, M. J., Tirado-Rives, J. & Jorgensen, W. L. Improved treatment of nucleosides and nucleotides in the OPLS-AA force field. Chem. Phys. Lett. 683, 276–280 (2017).
Article CAS PubMed PubMed Central Google Scholar
Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W. & Klein, M. L. Comparison of simple potential functions for simulating liquid water. J. Chem. Phys. 79, 926–935 (1983).
Article CAS Google Scholar
Case, D. A. et al. The Amber biomolecular simulation programs. J. Comput. Chem. 26, 1668–1688 (2005).
Article CAS PubMed PubMed Central Google Scholar
Robertson, M. J., Qian, Y., Robinson, M. C., Tirado-Rives, J. & Jorgensen, W. L. Development and testing of the OPLS-AA/M force field for RNA. J. Chem. Theory Comput. 15, 2734–2742 (2019).
Article CAS PubMed PubMed Central Google Scholar
Kräutler, V., van Gunsteren, W. F. & Hünenberger, P. H. A fast SHAKE algorithm to solve distance constraint equations for small molecules in molecular dynamics simulations. J. Comput. Chem. 22, 501–508 (2001).
Article Google Scholar
Buch, I., Harvey, M. J., Giorgino, T., Anderson, D. P. & De Fabritiis, G. High-throughput all-atom molecular dynamics simulations using distributed computing. J. Chem. Inf. Model. 50, 397–403 (2010).
Article CAS PubMed Google Scholar
Eastman, P. et al. OpenMM 7: rapid development of high performance algorithms for molecular dynamics. PLoS Comput. Biol. 13, e1005659 (2017).
Article PubMed PubMed Central CAS Google Scholar
Frisch, M. J. et al. Gaussian 09, Revision B.01 (Gaussian Inc., Wallingford, CT, 2009).
Becke, A. D. Density‐functional thermochemistry. III. The role of exact exchange. J. Chem. Phys. 98, 5648–5652 (1993).
Article CAS Google Scholar
Lee, C., Yang, W. & Parr, R. G. Development of the Colle–Salvetti correlation-energy formula into a functional of the electron density. Phys. Rev. B 37, 785–789 (1988).
Article CAS Google Scholar
Vosko, S. H., Wilk, L. & Nusair, M. Accurate spin-dependent electron liquid correlation energies for local spin density calculations: a critical analysis. Can. J. Phys. 58, 1200–1211 (1980).
Article CAS Google Scholar
Stephens, P. J., Devlin, F. J., Chabalowski, C. F. & Frisch, M. J. Ab initio calculation of vibrational absorption and circular dichroism spectra using density functional force fields. J. Phys. Chem. 98, 11623–11627 (1994).
Article CAS Google Scholar
Rappe, A. K., Casewit, C. J., Colwell, K. S., Goddard, W. A. & Skiff, W. M. UFF, a full periodic table force field for molecular mechanics and molecular dynamics simulations. J. Am. Chem. Soc. 114, 10024–10035 (1992).
Article CAS Google Scholar
Polavarapu, P. L. Ab initio vibrational Raman and Raman optical activity spectra. J. Phys. Chem. 94, 8106–8112 (1990).
Article CAS Google Scholar
Becke, A. D. Density-functional exchange-energy approximation with correct asymptotic behavior. Phys. Rev. A 38, 3098–3100 (1988).
Article CAS Google Scholar
Grimme, S., Antony, J., Ehrlich, S. & Krieg, H. A consistent and accurate ab initio parametrization of density functional dispersion correction (DFT-D) for the 94 elements H-Pu. J. Chem. Phys. 132, 154104 (2010).
Article CAS PubMed Google Scholar
Lippert, G. & HutterJ. & Parrinello M. A hybrid Gaussian and plane wave density functional scheme. Mol. Phys. 92, 477–488 (1997).
Article CAS Google Scholar
Krack, M. Pseudopotentials for H to Kr optimized for gradient-corrected exchange-correlation functionals. Theor. Chem. Acc. 114, 145–152 (2005).
Article CAS Google Scholar
Hartwigsen, C., Goedecker, S. & Hutter, J. Relativistic separable dual-space Gaussian pseudopotentials from H to Rn. Phys. Rev. B 58, 3641–3662 (1998).
Article CAS Google Scholar
Goedecker, S., Teter, M. & Hutter, J. Separable dual-space Gaussian pseudopotentials. Phys. Rev. B 54, 1703–1710 (1996).
Article CAS Google Scholar
Martyna, G. J., Klein, M. L. & Tuckerman, M. Nosé–Hoover chains: the canonical ensemble via continuous dynamics. J. Chem. Phys. 97, 2635–2643 (1992).
Article Google Scholar
Nosé, S. A unified formulation of the constant temperature molecular dynamics methods. J. Chem. Phys. 81, 511–519 (1984).
Article Google Scholar
Hoover, W. G. Canonical dynamics: equilibrium phase-space distributions. Phys. Rev. A 31, 1695–1697 (1985).
Article CAS Google Scholar
Hutter, J., Iannuzzi, M., Schiffmann, F. & VandeVondele, J. cp2k: atomistic simulations of condensed matter systems. WIREs. Comput. Mol. Sci. 4, 15–25 (2014).
Article CAS Google Scholar
Brehm, M. & Kirchner, B. TRAVIS—a free analyzer and visualizer for Monte Carlo and molecular dynamics trajectories. J. Chem. Inf. Model. 51, 2007–2023 (2011).
Article CAS PubMed Google Scholar
Lieber, C. A. & Mahadevan-Jansen, A. Automated method for subtraction of fluorescence from biological Raman spectra. Appl. Spectrosc. 57, 1363–1367 (2003).
Article CAS PubMed Google Scholar

Download references

Acknowledgements

These studies were supported by the UK Research and Innovation Grants (BB/M021637, EP/M507490/1, and EP/K005472).

Author information

Authors and Affiliations

Manchester Institute of Biotechnology and Department of Chemistry, School of Natural Science, Faculty of Science and Engineering, The University of Manchester, M1 7DN, Manchester, UK
Alex L. Wilson, Carlos Outeiral, Sarah E. Dowd, Paul L. A. Popelier, Jonathan P. Waltho & Andrew Almond
Division of Neuroscience and Experimental Psychology, Michael Smith Building, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, M13 9PT, Manchester, UK
Andrew J. Doig
Krebs Institute for Biomolecular Research, Department of Molecular Biology and Biotechnology, The University of Sheffield, S10 2TN, Sheffield, UK
Jonathan P. Waltho

Authors

Alex L. Wilson
View author publications
You can also search for this author in PubMed Google Scholar
Carlos Outeiral
View author publications
You can also search for this author in PubMed Google Scholar
Sarah E. Dowd
View author publications
You can also search for this author in PubMed Google Scholar
Andrew J. Doig
View author publications
You can also search for this author in PubMed Google Scholar
Paul L. A. Popelier
View author publications
You can also search for this author in PubMed Google Scholar
Jonathan P. Waltho
View author publications
You can also search for this author in PubMed Google Scholar
Andrew Almond
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

A.L.W., A.J.D., P.L.A.P., J.P.W., and A.A. conceived the research. A.A. carried out the classical MD simulations. A.L.W. and C.O. performed all DFT calculations and AIMD simulations. A.L.W. and A.A. carried out the theoretical Raman spectral calculations. S.E.D. performed the experimental Raman spectral determination for each nucleoside. A.L.W. and A.A. wrote the manuscript with input from all authors.

Corresponding author

Correspondence to Andrew Almond.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Description of Additional Supplementary Files

Supplementary Data 1

Supplementary Data 2

Peer Review File

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Wilson, A.L., Outeiral, C., Dowd, S.E. et al. Deconvolution of conformational exchange from Raman spectra of aqueous RNA nucleosides. Commun Chem 3, 56 (2020). https://doi.org/10.1038/s42004-020-0298-x

Download citation

Received: 05 September 2019
Accepted: 06 April 2020
Published: 06 May 2020
DOI: https://doi.org/10.1038/s42004-020-0298-x

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Results and discussion

Classical molecular dynamics ensemble generation

Raman spectral prediction from classical molecular dynamics

Raman spectral prediction from ab initio molecular dynamics

Conformational population validation

Methods

Raman spectral prediction from classical molecular dynamics

Raman spectral predictions from ab initio molecular dynamics

Combining ab initio molecular dynamics-derived spectra of individual conformers

Experimental Raman spectra

Data availability

Code availability

Change history

13 May 2020

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

Comments

Search

Quick links