Intrastrand backbone-nucleobase interactions stabilize unwound right-handed helical structures of heteroduplexes of L-aTNA/RNA and SNA/RNA

Xeno nucleic acids, which are synthetic analogues of natural nucleic acids, have potential for use in nucleic acid drugs and as orthogonal genetic biopolymers and prebiotic precursors. Although few acyclic nucleic acids can stably bind to RNA and DNA, serinol nucleic acid (SNA) and L-threoninol nucleic acid (L-aTNA) stably bind to them. Here we disclose crystal structures of RNA hybridizing with SNA and with L-aTNA. The heteroduplexes show unwound right-handed helical structures. Unlike canonical A-type duplexes, the base pairs in the heteroduplexes align perpendicularly to the helical axes, and consequently helical pitches are large. The unwound helical structures originate from interactions between nucleobases and neighbouring backbones of L-aTNA and SNA through CH–O bonds. In addition, SNA and L-aTNA form a triplex structure via C:G*G parallel Hoogsteen interactions with RNA. The unique structural features of the RNA-recognizing mode of L-aTNA and SNA should prove useful in nanotechnology, biotechnology, and basic research into prebiotic chemistry.

Xeno nucleic acids (XNAs) are synthetic analogues that retain the natural nucleobases but have backbone structures different from those of DNA and RNA. They have potential for use in nucleic acid-based drugs, in the development of artificial genetic polymers, and in prebiotic chemistry 1-8. In the past decades, many nucleic acid analogues have been developed. Artificial analogues with 2′-ribose modifications such as 2′-fluoro, 2′-O-methyl, and 2′-O-methoxyethyl, as well as locked nucleic acids (LNAs), are used in nucleic acid-based drugs and drug candidates [9][10][11][12][13][14]. The increased binding affinities of these modified analogues for RNA result from the C3′-endo conformation of the pentose ring 3,15,16. In addition to RNA analogues, XNAs based on alternative sugars have been developed. Six-membered ring-based hexitol nucleic acid (HNA) has a restricted ring conformation that forms an A-type structure in the duplex with RNA 17,18. The five-membered ring-based α-L-threofuranosyl-(3′→2′) nucleic acid (TNA), which has been used as a model of a pre-RNA polymer, has its phosphodiester group connected at a different position from that in natural nucleic acids [19][20][21]. Although the backbone unit of TNA is one atom shorter than that of natural nucleic acids, which have a six-atom backbone repeat, TNA is capable of forming a stable duplex with complementary RNA 19. In addition to cyclic scaffolds, ribose-inspired acyclic XNAs have been synthesized and characterized because acyclic XNAs are extremely resistant to enzymatic degradation 22. However, acyclic ribose-modified analogues decrease the stability of the heteroduplex with RNA [23][24][25].
Acyclic nucleic acids with simple structures have also been developed, such as glycerol nucleic acid (GNA), which has the most simplified acyclic backbone, propylene glycol [26][27][28], and zip nucleic acid, in which a phosphonomethylglycerol unit connects residues through a six-bond system in analogy to natural nucleic acids 29. Although their homoduplexes are highly stable, the stabilities of their heteroduplexes with natural nucleic acids are much lower than those of unmodified DNA or RNA duplexes. Peptide nucleic acid (PNA), a non-charged acyclic nucleic acid, binds with high affinity to RNA and DNA [30][31][32]; however, synthesis of long oligomers and purine-rich sequences is technically very difficult owing to poor solubility.
We recently discovered that the acyclic nucleic acids serinol nucleic acid (SNA) and L-threoninol nucleic acid (L-aTNA) can form stable duplexes with RNA in a sequence-specific manner [33][34][35]. Because they are structurally simple, readily synthesized, highly water soluble, and highly resistant to nucleases, various applications based on hybridization with RNA have been realized, such as a highly sensitive molecular beacon and nucleic acid-based drug candidates, including siRNAs, anti-miRNA oligonucleotides, and exon-skipping antisense oligonucleotides [36][37][38][39][40][41][42][43]. However, how SNA and L-aTNA hybridize with natural nucleic acids has remained unknown. We initially assumed that the duplex of D-aTNA, the enantiomer of L-aTNA, is right-handed based on the similarity of its CD spectrum to that of DNA and on structural modelling 34,44, and that the L-aTNA homoduplex is therefore left-handed. However, based on CD studies of PNA of known structure 45, the CD signal of L-aTNA is indicative of right-handed helicity 35. Interestingly, the handedness of an SNA homoduplex depends on the oligomer sequence 33. Despite the importance of acyclic nucleic acids, only limited structural information is available. Structural information would aid the development of acyclic nucleic acids, their application as tools or materials, and the de novo design of artificial polymerases for these acyclic nucleic acids. In the present study, we prepared single crystals of L-aTNA/RNA and SNA/RNA heteroduplexes and solved the first crystal structures at 1.70-1.75 Å resolution. L-aTNA/RNA and SNA/RNA form right-handed helical structures with large helical pitch involving Watson-Crick base pairs and parallel-type Hoogsteen base pairs.

Results
Helicities and pairing mode of L-aTNA/RNA and SNA/RNA. Duplexes were prepared from an L-aTNA strand, L T8a (3′-GCAGCAGC-1′), with an RNA strand, R8Br (5′-GCUGC-Br U-GC-3′), and from an SNA strand, S8a ((S)-GCAGCAGC-(R)), with R8Br. Melting analyses and CD spectroscopy confirmed that complexes were formed (Supplementary Fig. 1). Crystals were obtained using the sitting-drop vapour diffusion method with PEG400 as precipitant. The L-aTNA/RNA crystal diffracted to 1.5-Å resolution and the SNA/RNA crystal diffracted to 1.7-Å resolution. Data collection and structure refinement statistics are summarized in Tables 1 and 2. The structures were solved by X-ray anomalous scattering using the Br atom of Br U on the RNA strand.
In the L-aTNA/RNA heteroduplex structure, the crystallographic asymmetric unit contains two right-handed duplexes ( L T8a-1/R8Br-1 and L T8a-2/R8Br-2) stabilized through canonical Watson-Crick base pairing (Fig. 1a). We refer to the duplex orientation as antiparallel in analogy to natural dsRNA. Hydrogen-bonding distances of the G:C and A:U Watson-Crick base pairs are consistent with those observed in RNA/RNA duplexes (Fig. 2). Surprisingly, the two L-aTNA/RNA duplexes in the asymmetric unit are connected via triplex interactions: C2( L T8a-1):G7(R8Br-1)*G1( L T8a-2) and C2( L T8a-2):G7(R8Br-2)*G1( L T8a-1) (colons indicate Watson-Crick pairs and asterisks indicate Hoogsteen interactions) (Figs. 1a and 2b). Electron density for the 3′-terminal C8 residues of R8Br-1 and R8Br-2 was not observed, suggesting that these bases are flipped out of the helical structure (Supplementary Fig. 2). This flipping is likely necessary to allow Hoogsteen base pairing for triplex formation. The direction of Hoogsteen base pairing of the G*G interactions in the L-aTNA/RNA structure is more similar to that observed in G-quadruplex structures than to that in conventional triplexes 46,47. The L-aTNA and L-aTNA strands are antiparallel in the triplex region; we therefore refer to the Hoogsteen base pairing of the L-aTNA and RNA strands as parallel. In addition, a higher-order helical structure stabilizes the crystal of the L-aTNA/RNA heteroduplex (Fig. 3a). The dimers of heteroduplexes connected by the triplex interactions stack end-to-end into a continuous helix. Furthermore, each dimer of heteroduplexes is wrapped by another dimer of duplexes in an antiparallel direction, and these helical structures are tightly packed in the crystal lattice (Fig. 3b, c).
The crystallographic asymmetric unit of SNA/RNA contains four SNA strands (S8a-1 to S8a-4) and four RNA strands (R8Br-3 to R8Br-6) (Fig. 1b and Supplementary Fig. 3). Similar to the L-aTNA/RNA duplex, a single SNA/RNA duplex is an antiparallel right-handed helix stabilized by canonical Watson-Crick base pairing, and two duplexes are joined by two parallel triplexes formed through C(SNA):G(RNA)*G(SNA) Hoogsteen base pairs, resulting in a dimer-of-duplexes structure (Fig. 2). Although electron densities of the C8 residues at the 3′-terminus of the RNA strands are not observed in the crystal structure of L-aTNA/RNA, flipped-out C8 residues of RNA strands (R8Br-3, -4, and -6) are observed in the SNA/RNA crystal (Supplementary Fig. 4). As in the L-aTNA/RNA crystal, the dimers of SNA/RNA duplexes are stacked end-to-end, thereby forming continuous helices in both the perpendicular and horizontal directions (Fig. 3d, e).
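The antiparallel Watson-Crick register described above can be checked with a short script. This is only an illustrative sketch: it assumes that residue i of the 8-mer XNA strand pairs with residue 9 − i of the RNA strand (inferred from the C2:G7 pairs named in the text), it treats Br U as U, and the function and variable names are hypothetical.

```python
# Watson-Crick partners; BrU (5-bromouridine) is treated as U in this sketch.
WC_PARTNER = {"G": "C", "C": "G", "A": "U", "U": "A"}


def antiparallel_pairs(xna, rna):
    """Pair residue i of the XNA strand with residue (n + 1 - i) of the RNA strand.

    Returns tuples (xna_index, xna_base, rna_index, rna_base, is_watson_crick)
    with 1-based residue numbers.
    """
    if len(xna) != len(rna):
        raise ValueError("strands must have the same length")
    n = len(xna)
    pairs = []
    for i, base in enumerate(xna, start=1):
        j = n + 1 - i  # antiparallel register
        partner = rna[j - 1]
        pairs.append((i, base, j, partner, WC_PARTNER.get(base) == partner))
    return pairs


# Sequences from the text: L T8a and S8a share GCAGCAGC; R8Br is GCUGC-BrU-GC.
XNA_8MER = "GCAGCAGC"  # written 3'->1' for L-aTNA, (S)->(R) for SNA
R8BR = "GCUGCUGC"      # written 5'->3'; position 6 is BrU, treated as U

pairs = antiparallel_pairs(XNA_8MER, R8BR)
all_paired = all(is_wc for *_, is_wc in pairs)

for i, base, j, partner, is_wc in pairs:
    print(f"{base}{i}(XNA) : {partner}{j}(RNA)  Watson-Crick={is_wc}")
```

Under these assumptions every position forms a G:C or A:U pair, and the second entry corresponds to the C2:G7 pair that takes part in the C:G*G triplex.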
Unwound helical structures of L-aTNA/RNA and SNA/RNA duplexes. The helical parameters of the Watson-Crick duplex regions of the L-aTNA/RNA and SNA/RNA structures were calculated using the program 3DNA (Table 3 and Supplementary Tables 1 and 2) [48][49][50]. Sugar puckers of most RNA residues are C3′-endo, a Northern-type conformation, as observed in a typical A-type duplex. Sugars of G1 of R8Br-1 and R8Br-2 in the L-aTNA/RNA duplexes and sugars of G1 and C2 of R8Br-3, C2, G4, and Br U6 of R8Br-4, and G1 of R8Br-6 in the SNA/RNA duplexes adopt the C2′-exo conformation, which also lies in the Northern range. Interestingly, however, the helical structures of L-aTNA/RNA and SNA/RNA clearly differ from those of RNA/RNA and LNA/RNA, which are typically categorized as A-type duplexes (Fig. 4a). The helical axis-base inclinations for L-aTNA/RNA (average 1.6°) and SNA/RNA (average 1.2°) are much smaller than those of an A-type RNA/RNA duplex (average 16.6°, calculated from PDB 3ND4) and other XNA/RNA heteroduplexes (Table 3) 51. The values are similar to that observed in B-type DNA duplexes (average 4.4°, observed in PDB 3BSE), in which base pairs are stacked perpendicular to the axis 52. In contrast, the X-displacement values of L-aTNA/RNA and SNA/RNA are much larger in magnitude than that of dsDNA, which shows almost no displacement from the helical axis (−6.4 Å and −5.7 Å for L-aTNA/RNA and SNA/RNA, respectively; −0.5 Å for dsDNA). The lower inclination and larger displacement cause a lower helical twist for the L-aTNA/RNA and SNA/RNA structures (22.7° and 24.2°, respectively) compared to 33.0° for dsRNA and 30.0° for LNA/RNA (Table 3). The duplex structures of L-aTNA/RNA and SNA/RNA look similar to that of the PNA/RNA duplex (RMSD 1.1 Å between N2′, C7′, and C8′ of PNA and C1′ and P of RNA in the PNA/RNA duplex, and N4′, C5′, and C6′ of L-aTNA and C1′ and P of RNA in the L-aTNA/RNA duplex) (Fig. 4a).
However, the base pairs in the PNA/RNA duplex appear to be tilted rather than parallel, whereas the base pairs of L-aTNA/RNA are kept parallel to one another (Supplementary Fig. 5) 32. As a result, the position of the helical axis and the values of the helical parameters of PNA/RNA differ from those of L-aTNA/RNA and SNA/RNA (Table 3). Thus, L-aTNA/RNA and SNA/RNA duplexes form unwound and stretched structures with wider major and minor grooves relative to dsRNA and dsDNA (Fig. 4b and Table 3) 16,18,24.
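To make the link between helical twist and unwinding concrete, the average twist values quoted above can be converted into base pairs per helical turn (360° divided by the twist). The sketch below is illustrative only; the helical rise needed for a pitch in Å is reported in Table 3 rather than in the text, so `helical_pitch` takes the rise as a caller-supplied argument, and all function names are hypothetical.

```python
def base_pairs_per_turn(twist_deg):
    """Number of base-pair steps needed to complete one 360 degree turn."""
    if twist_deg <= 0:
        raise ValueError("helical twist must be positive")
    return 360.0 / twist_deg


def helical_pitch(twist_deg, rise_angstrom):
    """Pitch (in angstroms) = base pairs per turn x rise per base pair."""
    return base_pairs_per_turn(twist_deg) * rise_angstrom


# Average helical twists (degrees) quoted in the text.
AVERAGE_TWIST = {
    "L-aTNA/RNA": 22.7,
    "SNA/RNA": 24.2,
    "dsRNA": 33.0,
    "LNA/RNA": 30.0,
}

for name, twist in AVERAGE_TWIST.items():
    print(f"{name}: {base_pairs_per_turn(twist):.1f} bp per turn")
```

With these twists, the L-aTNA/RNA and SNA/RNA duplexes need roughly 15 to 16 base pairs per turn, compared with about 11 for dsRNA, which is the sense in which the heteroduplexes are unwound.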
Torsion angles of RNA, L-aTNA, and SNA. To see how the RNA strands in L-aTNA/RNA and SNA/RNA adapt to the unwound helical structures, we next compared the torsion angles of RNA in dsRNA, L-aTNA/RNA, and SNA/RNA duplexes (Fig. 5, Table 4, and Supplementary Tables 3 and 4). The values of the backbone torsion angle δ associated with the sugar ring (C5′-C4′-C3′-O3′) of the RNA in the heteroduplexes with L-aTNA and SNA are between 70° and 100°, the range observed for a typical A-type helix 3, with the exception of those for the terminal residues of each strand. In contrast, the average values of the backbone torsion angle α associated with the phosphodiester bond (O3′-P-O5′-C5′) are somewhat smaller for the RNA strand in the heteroduplexes than those observed in the dsRNA duplex. The average values of the glycosidic torsion angle χ between sugar and base (O4′-C1′-N1-C2 for pyrimidines, O4′-C1′-N9-C4 for purines) are also smaller than those observed in dsRNA. These differences, although very small, make the P-P distances of L-aTNA/RNA and SNA/RNA larger. Consequently, the helical pitches of the heteroduplexes are larger than that of dsRNA (Table 4).
Although the torsion angles of L-aTNA and SNA are also defined as α to ζ (Fig. 5), the number of bonds between the backbone and the nucleobase N1 of pyrimidines or N9 of purines differs: three bonds from C4′ to N1 or N9 in RNA, versus four bonds from C2′ (corresponding to C4′ of ribose) to N1 or N9 in L-aTNA and SNA. In addition, an amide group connects the backbone and the nucleobase in L-aTNA and SNA. As expected from the differences in chemical structure between L-aTNA or SNA and RNA, the torsion angles α to ζ of L-aTNA and SNA differ from those observed in RNA (Fig. 5 and Table 5).
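Each torsion angle discussed above (for example δ, defined by C5′-C4′-C3′-O3′) is the dihedral angle of four consecutive atoms. The following sketch shows the standard atan2-based dihedral calculation from Cartesian coordinates; it is not the procedure used in this study, the helper names are hypothetical, and the coordinates in the usage lines are made-up points chosen only to give a simple geometry.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_angle(p1, p2, p3, p4):
    """Signed dihedral angle (degrees, -180 to 180) for atoms p1-p2-p3-p4.

    Positive values correspond to a clockwise rotation of the p3-p4 bond
    relative to the p1-p2 bond when viewed along the p2->p3 bond.
    """
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)  # normal of the plane p1-p2-p3
    n2 = _cross(b2, b3)  # normal of the plane p2-p3-p4
    b2_length = math.sqrt(_dot(b2, b2))
    y = b2_length * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates (angstroms) standing in for C5', C4', C3', O3'.
delta = torsion_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(f"delta = {delta:.1f} degrees")
```

Applied to the four atoms that define a given torsion in each residue of a deposited structure, the same function would give per-residue values that can then be averaged per strand.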
Intrastrand interactions stabilize L-aTNA or SNA in helical structures. Nucleobases of L-aTNA and SNA are connected to the backbone through an amide bond. All amide bonds in the backbones of L-aTNA and SNA are in the trans configuration, and the carbonyl oxygens are turned inward in the helical structures of L-aTNA/RNA and SNA/RNA (Fig. 6). Some of the NH groups of the L-aTNA and SNA backbones form water-mediated hydrogen bonds to O3′ of the phosphodiester linkage (Fig. 6). Interestingly, the carbonyl oxygens are located in close proximity to the C8 atoms of guanine/adenine or the C6 atoms of cytosine of the adjacent residue on the 1′-terminal side (C-O distance: 3.0-4.0 Å) and to the C6′ atoms of the backbone of the adjacent residue on the 1′-terminal side (C-O distance: 3.2-5.0 Å) (Fig. 6 and Supplementary Table 5). Except for the terminal residues, most of the C-O distances are within the van der Waals distance cutoff of 3.7 Å. In the SNA case, the C5 atom of the terminal cytosine residue, instead of the C6′ atom of the backbone, is adjacent to the carbonyl oxygen of the neighbouring guanine residue (C-O distance: 3.2-3.5 Å). In addition, the carbonyl oxygen at the C2 position of cytosine lies close to C6′ of the neighbouring residue (C-O distance: 3.1-3.4 Å). These observations suggest that weak CH-O hydrogen bonds between the nucleobase and the backbone of neighbouring residues are extensively formed within the L-aTNA and SNA strands (Fig. 6). The aromatic C8 of purines and C5 and C6 of pyrimidines are relatively polarized; the carbonyl oxygens therefore putatively act as acceptors of CH-O hydrogen bonds 53. CH-O interactions between aromatic CH and carbonyl oxygens are also observed in the PNA strand of the PNA/RNA heteroduplex, in which the PNA backbone and nucleobases are connected by amide bonds 32. The tendency to form CH-O interactions is lower there than in the L-aTNA/RNA and SNA/RNA duplexes; the helicity of the PNA/RNA duplex is therefore higher than that of L-aTNA/RNA and SNA/RNA, although it is still lower than that of dsRNA (Table 3).
In addition, the insufficient number of CH-O interactions might account for the backbone flexibility of the PNA/RNA duplex. These findings reveal that intrastrand hydrogen-bonding networks between nucleobases and neighbouring backbones of L-aTNA and SNA stabilize the acyclic structures and tune the formation of heteroduplexes with RNA.
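The distance criterion used above for putative CH-O contacts (a C-O separation within the 3.7 Å van der Waals cutoff) can be expressed as a simple filter. This is a minimal sketch with hypothetical names and made-up coordinates; it considers distance only and ignores the C-H...O angle, which a fuller geometric analysis would also examine.

```python
import math

VDW_CUTOFF_ANGSTROM = 3.7  # C-O distance cutoff quoted in the text


def c_o_distance(carbon_xyz, oxygen_xyz):
    """Euclidean distance (angstroms) between a carbon and an oxygen atom."""
    return math.dist(carbon_xyz, oxygen_xyz)


def close_contacts(contacts, cutoff=VDW_CUTOFF_ANGSTROM):
    """Keep only the labelled C/O atom pairs whose distance is within the cutoff.

    contacts maps a label to a (carbon_xyz, oxygen_xyz) tuple.
    Returns a dict mapping the same labels to distances.
    """
    selected = {}
    for label, (carbon_xyz, oxygen_xyz) in contacts.items():
        d = c_o_distance(carbon_xyz, oxygen_xyz)
        if d <= cutoff:
            selected[label] = d
    return selected


# Hypothetical coordinates for illustration; not taken from the deposited structures.
example = {
    "hypothetical C8...O": ((0.0, 0.0, 0.0), (3.2, 0.0, 0.0)),
    "hypothetical C6'...O": ((0.0, 0.0, 0.0), (0.0, 4.5, 0.0)),
}

for label, d in close_contacts(example).items():
    print(f"{label}: {d:.2f} A")
```

Only the first hypothetical pair passes the filter here, mirroring how most non-terminal C-O distances in the text fall inside the cutoff while some backbone C6′ contacts (up to 5.0 Å) do not.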
Dimer formation of L-aTNA/RNA and SNA/RNA duplexes in solution. Finally, to investigate whether the C:G*G triplex-mediated dimer-of-duplexes structures are formed between XNA and RNA in solution, we performed nanoESI-MS analyses of solutions of L T8a and R8Br and of S8a and R8Br under nondenaturing conditions. In the spectrum of the solution of L T8a and R8Br, a peak at the expected molecular mass of the dimer of heteroduplexes was observed, as were peaks corresponding to the single strands and the L T8a/R8Br duplex (Fig. 7a). A peak at the expected mass of the dimer of S8a/R8Br heteroduplexes was also observed (Fig. 7b). These results strongly suggest that the C:G*G triplex-stabilized dimers of duplexes form in solution. We were then interested in whether the triplexes are formed by homo-oligomers. nanoESI-MS analyses of mixtures of the 8-mer GCAGCAGC and the 7-mer GCU(T)GCU(T)G of each type of oligomer were performed. In the case of RNA, peaks corresponding to the single strands of R8a and R7b and to the R8a/R7b duplex were observed in the spectrum, but a peak corresponding to the dimer of R8a/R7b duplexes was not (Supplementary Fig. 6). For L-aTNA, which forms a stable homoduplex structure, the peak of the dimer of L T8a/ L T7b duplexes was clearly observed (Fig. 7c and Supplementary Fig. 7). The same result was obtained for S8a and S7b (Fig. 7d). These data suggest that homo-L-aTNA and homo-SNA oligomers can form parallel-type Hoogsteen base pairs and that the Hoogsteen triplex interactions uniquely stabilize the formation of dimers of L-aTNA and SNA duplexes.
peptides, proteins, and tRNA, and also molecular-molecular interactions [54][55][56][57]. It is possible that artificial nucleic acids having a desired helicity and folding can be designed by introducing CH-O hydrogen bonds at certain points.

Discussion
In previous work, we demonstrated that D-aTNA does not form stable duplexes with RNA 34. The D-aTNA, L-aTNA, and SNA monomers differ only in the presence or position of a single methyl group. It is likely that the methyl group in D-aTNA would cause a steric clash with the oxygens of the neighbouring phosphate group in a right-handed helical structure (Supplementary Fig. 8). It is also possible that the methyl group, which is located in the major groove in the L-aTNA/RNA duplex but in the minor groove in the putative right-handed D-aTNA/RNA duplex, has a negative effect on the minor groove environment (Fig. 4b and Supplementary Fig. 8). To avoid these effects, D-aTNA is expected to prefer left-handed helix formation.
We found that two duplexes interact through triplex formation to form a dimer-of-duplexes structure. Interestingly, the triplexes were formed by parallel-type Hoogsteen base pairing between a G and a Watson-Crick G:C base pair. Conventionally, a G:C Watson-Crick base pair in DNA and RNA forms a parallel Hoogsteen base pair with C, but this type of interaction is disfavoured at physiological pH because protonation of the C in the third strand is required for the Hoogsteen interaction [58][59][60]. In addition, the canonical C:G*G triplex is formed with the two Gs in an antiparallel orientation 60. In L-aTNA/RNA and SNA/RNA, the orientation of the third nucleobase (G) differs from that observed in the typical C:G*G triplex, which allows a parallel-type Hoogsteen interaction. The unique triplex-forming abilities of L-aTNA and SNA will be an advantage for triplex-based theranostics.
Nucleic acids play important roles as the blueprints for construction of all living organisms. It is hypothesized that RNA served as the precursor in a prebiotic world [6][7][8]21. On the other hand, studies of acyclic XNAs have shown that acyclic nucleic acids can form homoduplexes and heteroduplexes with RNA or DNA, even when amino acid derivatives are used as backbones 26,[29][30][31]34,35, indicating that ribose is not necessary for formation of a stable duplex structure. These findings raise the fundamental question of why nature selected ribose as the backbone of genetic material and amino acids as the backbone of proteins, the products of genes. It is also possible that acyclic nucleic acids capable of hybridizing with RNA and derived from amino acid derivatives, similar to our SNA and L-aTNA, served as evolutionary intermediates or competitors of genetic material. To answer this question, it might be important to consider the optimal helical structure for compact packing of a large genetic polymer.
Thus, the structural data reported here will expand the scope of application of acyclic nucleic acid analogues in prebiotic studies as well as in nucleic acid-based drugs and nanotechnology.
Purified oligomers were dissolved in 10 mM Tris-HCl (pH 7.0), and duplexes were then prepared at a final concentration of 1 mM. S8a/R8Br and L T8a/R8Br were annealed by heating for 10 min at 95 °C and then gradually cooling to 4 °C. Crystallization conditions were screened with commercially available sparse-matrix screening kits. Crystals of S8a/R8Br were grown using the sitting-drop method in 0.1 M HEPES (pH 6.5), 75 mM CaCl2, and 28% PEG400, and crystals of L T8a/R8Br were grown using the sitting-drop method in 0.1 M HEPES (pH 7.5), 200 mM CaCl2, and 28% PEG400 at 20 °C.
X-ray data collection and refinement. X-ray datasets were collected on the BL44XU beamline at SPring-8, Japan. X-ray diffraction datasets were integrated and scaled using XDS 61 and AIMLESS 62. The crystal structures were solved using the multi-wavelength anomalous dispersion method relying on the Br atoms in the RNA. The initial phases were determined with Phenix AutoSol 63. The obtained electron density maps were very clear, and the initial coordinates were built manually using COOT 64. Model refinement was conducted using REFMAC5 65 and phenix.refine 63. Topology files for the model refinement of the nucleoside moieties of L-aTNA and SNA monomers were created with the PRODRG2 Server 66. For L T8a/R8Br crystals, merohedral twinning was suspected during refinement because the refined coordinates had an R work of 36% and an R free of 40% in the C2 space group. Therefore, the structure in the I2 space group was solved by molecular replacement using the C2 structure as a search model with MOLREP 67, which dramatically improved the refinement statistics to an R work of 26.9% and an R free of 28.8%. The crystallographic parameters and final refinement statistics of L T8a/R8Br and S8a/R8Br are summarized in Tables 1 and 2.
Spectra were recorded at 5 °C using a JASCO model J-820 instrument.
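For readers less familiar with the R work and R free values quoted in the refinement paragraph, the conventional crystallographic R-factor is the summed absolute difference between observed and calculated structure-factor amplitudes divided by the summed observed amplitudes. The sketch below illustrates that definition only; it is not the refinement programs' implementation, it assumes the amplitudes are already on a common scale unless a scale factor is supplied, and the amplitudes in the usage lines are invented.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R-factor: sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|).

    f_obs and f_calc are sequences of structure-factor amplitudes for the
    same reflections; scale (k) is an optional overall scale factor.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must list the same reflections")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - scale * abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


# Invented amplitudes for illustration only.
example_r = r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 25.0])
print(f"R = {example_r:.3f}")
```

R work is this quantity evaluated on the reflections used in refinement, and R free is the same quantity evaluated on a small test set of reflections excluded from refinement.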

Data availability
Structural data that support the findings of this study have been deposited in the PDB under the accession codes 7BPF and 7BPG. All data analysis results generated during this study are included in this published article and its supplementary information file.