The effect of the neutral cytidine protonated analogue pseudoisocytidine on the stability of i-motif structures

Incorporation of pseudoisocytidine (psC), a neutral analogue of protonated cytidine, in i-motifs has been studied by spectroscopic methods. Our results show that neutral psC:C base pairs can stabilize i-motifs at neutral pH, but the stabilization only occurs when psC:C base pairs are located at the ends of intercalated C:C+ stacks. When psC occupies central positions, the resulting i-motifs are only observed at low pH and psC:C+ or psC:psC+ hemiprotonated base pairs are formed instead of their neutral analogs. Overall, our results suggest that positively charged base pairs are necessary to stabilize this non-canonical DNA structure.


1. NMR assignment details
2. Supplementary Figures
Figure S1: Amino and non-exchangeable protons region of NOESY spectrum of CC0
Figure S2: Imino-amino/aromatic protons regions of NOESY spectrum of CC0
Figure S3: Imino-sugar protons region of NOESY spectrum of CC0
Figure S4: Schemes of the head-to-head and head-to-tail i-motif species of CC0
Figure S5: psC:psC, psC:psC+ and C:psC+ base pairs
Figure S6: Series of 1D 1H-NMR spectra of C7, CC8 and CC9 at different pH and T
Figure S7: Series of CD spectra vs. T of C0, C7, CC0, CC8 and CC9
Figure S8: Scheme of the head-to-tail species of CC9
Figure S9: Amino/aromatic protons regions of the NOESY spectrum of CC9
Figure S10: Exchangeable protons regions of CC8
Figure S11: Schemes of the head-to-head and head-to-tail i-motif species of CC8
Figure S12: Exchangeable protons regions of C7
Figure S13: Imino protons regions of C7 at different pH
Figure S14: Schemes of the head-to-head and head-to-tail i-motif species of C7
Figure S15: CD melting curves of HT0, HT-psC1, HT-psC17 and HT-psC28

NMR assignment details.
NMR assignment of the unmodified sequence d(TCCGTTTCCGT), CC0, was carried out at pH 4. The exchangeable proton spectra indicate that at this pH and T = 5 °C, CC0 is highly structured. Characteristic imino signals of hemiprotonated C:C+ base pairs are observed at 15-16 ppm, confirming the formation of i-motif structures. However, the non-exchangeable proton spectra show extensive overlap. Two broad cytosine H5-H6 cross-peaks (~7.85 and 7.40 ppm) are found in the TOCSY spectra, whereas up to twelve aromatic spin systems can be identified in the NOESY spectra. Since only eleven aromatic spin systems are expected for a unique symmetrical dimeric structure, we conclude that more than one species is present. Fortunately, the exchangeable proton region exhibits very little overlap, and the cross-peak patterns of the imino and amino signals provide key structural information.
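The spin-system count used in this argument can be reproduced with a short sketch. It assumes one aromatic spin system per residue of a symmetrical dimer (the two strands being equivalent); the function name `residue_labels` is hypothetical and introduced only for illustration.

```python
# Minimal sketch: expected aromatic spin systems for a symmetrical dimer of CC0.
# Assumption: each residue contributes one aromatic spin system and the two
# strands of a symmetrical dimer are magnetically equivalent.

CC0 = "TCCGTTTCCGT"  # sequence given in the text, d(TCCGTTTCCGT)


def residue_labels(seq):
    """Return residue labels such as T1, C2, ... numbered from the 5' end."""
    return [f"{base}{position}" for position, base in enumerate(seq, start=1)]


labels = residue_labels(CC0)
expected_spin_systems = len(labels)
cytosines = [label for label in labels if label.startswith("C")]

print(expected_spin_systems)  # number of residues per strand
print(cytosines)              # cytosine residues available for C:C+ pairing
```

Observing more aromatic spin systems than this per-strand count is what points to a second species in slow exchange.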
As shown in Figure S1, five different cytosine H41-H42 cross-peaks are found. All these amino proton signals exhibit cross-peaks with the imino signals observed at 15.48, 15.25 and 15.23 ppm (see Figure S2). One of these amino cross-peaks (the most intense) could be assigned to the amino protons of C2 and C8, which give degenerate signals. These amino protons only show cross-peaks with the imino signal at 15.48 ppm. The other four H41-H42 cross-peaks were assigned to two different pairs of C3 and C9 cytosine residues, indicating the formation of two i-motif species. One pair of amino protons (assigned to C3) shows cross-peaks with the imino signal at 15.25 ppm, whereas the other three pairs of amino protons exhibit cross-peaks with the imino signal at 15.23 ppm. Observation of H42C2/C8-H2'/H2''C2/C8 cross-peaks, characteristic of 3'-3' stacked C:C+ base pairs, confirms that these residues are located in central positions in the two species. On the other hand, the existence of four sets of H41/H42C3-H1G4 and H41/H42C9-H1G10 cross-peaks reveals that these cytosine residues are located at the ends of the C:C+ tract and are stacked with the contiguous guanine residues. The good dispersion of these cross-peaks, together with the observed amino-imino cross-peak pattern, allowed the assignment of the C3 and C9 residues to each of the observed species. We conclude that the two dimeric i-motif species maintain the same stacking order (see Figure S4) but have different strand orientations: head-to-head and head-to-tail. In the head-to-head species, hemiprotonated C:C+ base pairs are formed between equivalent cytosines, and each imino proton shows cross-peaks with a unique amino pair, at 15.25 ppm (C3) and 15.23 ppm (C9). In contrast, in the head-to-tail orientation, these base pairs are formed between non-equivalent cytosines, and each imino signal shows cross-peaks with two different pairs of amino protons, as seen for the signal at 15.23 ppm (C3 and C9).
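The distinction between the two orientations can be illustrated with a small sketch that enumerates the C:C+ pairs in each case. It rests on assumptions not stated explicitly in the text: that the head-to-tail partner of a residue is offset by six positions (the spacing between the two TCCGT repeats of CC0), and that C2 pairs with C8 in that species. The names `REPEAT_OFFSET`, `cc_pairs` and `amino_pairs_per_imino` are hypothetical.

```python
# Minimal sketch of the pairing logic, not the authors' assignment procedure.
# Assumption: in the head-to-tail dimer residue i pairs with residue i +/- 6,
# the offset between the two TCCGT repeats of d(TCCGTTTCCGT).

CC0 = "TCCGTTTCCGT"
REPEAT_OFFSET = 6  # hypothetical parameter: spacing between the TCCGT repeats


def cc_pairs(seq, orientation):
    """Return the sorted list of cytosine pairs (i, j), one residue per strand."""
    cytosines = [i for i, base in enumerate(seq, start=1) if base == "C"]
    pairs = set()
    for i in cytosines:
        if orientation == "head-to-head":
            partner = i  # equivalent residue on the other strand
        elif orientation == "head-to-tail":
            shifted = i + REPEAT_OFFSET
            partner = shifted if shifted <= len(seq) else i - REPEAT_OFFSET
        else:
            raise ValueError(f"unknown orientation: {orientation}")
        pairs.add(tuple(sorted((i, partner))))
    return sorted(pairs)


def amino_pairs_per_imino(pair):
    """Number of distinct amino proton pairs seen by the shared imino proton."""
    return len(set(pair))


for orientation in ("head-to-head", "head-to-tail"):
    for pair in cc_pairs(CC0, orientation):
        print(orientation, pair, amino_pairs_per_imino(pair))
```

Under these assumptions each head-to-head imino proton sees a single amino pair, whereas each head-to-tail imino proton sees two, which matches the cross-peak pattern described for the signals at 15.25 and 15.23 ppm.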
Despite the different orientation, the chemical environment of the central base pairs is very similar in both species. The degenerate signals observed for C2 and C8 are a consequence of this similarity.
Imino-imino cross-peaks between guanine and thymine residues, characteristic of G:T base-pair formation, are observed in both species. Moreover, H1-H1' contacts between guanine residues (G4/10-G4/10 for the head-to-head species and G4/10-G10/4 for the head-to-tail species) are observed, indicating an interaction between guanine residues through their minor groove sides and supporting the formation of two G:T:G:T minor groove tetrads, one on each side of the i-motif.
Assignment of the NMR spectra of the modified 11-mers, CC8 and CC9, was carried out in the same way as for the unmodified sequence CC0. The most prominent features in the NMR spectra are explained in the main text.