Introduction

It has been over six decades since the structure of double helical DNA was first unveiled1. To chemists and biologists working in the field of molecular recognition at the time, the structure provided more than just insight into the mechanism for the storage and transmission of genetic information. It provided a paradigm for recognition of genetic materials in the way the two opposing strands are held together: through hydrogen-bonding interactions between adenine (A) and thymine (T), and between cytosine (C) and guanine (G). These so-called ‘Watson-Crick’ base-pairing rules are routinely employed in the design of oligonucleotide molecules for targeting single-stranded or perturbed regions of double-stranded DNA or RNA; however, they are rarely applied in the recognition of double helical B-form DNA (B-DNA), the most stable form of the DNA double helix. The reason is that most oligonucleotide molecules developed to date do not have sufficient binding energy to invade B-DNA. To circumvent this challenge, past research efforts have generally focused on the minor and major grooves, in large part because of the ease of accessibility of the chemical groups and the precedent for their recognition in Nature2,3. This pursuit has led to the development of several major classes of antigene molecules including polyamides4, triplex-forming oligonucleotides5, zinc-finger peptides6,7, and transcription activator-like effectors (TALEs)8,9. Notwithstanding considerable advances in these areas, issues of sequence selection and/or specificity still remain, to a certain extent, with many of these classes of molecules. Recently, the CRISPR-Cas9 system has been harnessed from bacteria for the manipulation of genomes and has achieved remarkable success10,11. Even with this state-of-the-art technology, though, an improvement in sequence-specificity is warranted for many of the anticipated biomedical applications because of the concern for off-target genome modifications12,13.

In concert with these efforts, Nielsen and coworkers14,15 have shown over the past two decades that peptide nucleic acid (PNA), a nucleic acid mimic comprising a pseudopeptide backbone (Fig. 1a), can invade B-DNA. Strand invasion of DNA by PNA occurs predominantly through Watson-Crick16, or in combination with Hoogsteen base-pairing17, depending on the sequence context. The simplicity and the generality in sequence design, along with its exquisite recognition-specificity, make PNA an attractive antigene agent. However, with the original (achiral) backbone design, PNA recognition is restricted to mostly homopurine and homopyrimidine targets. Mixed-sequence PNA generally does not have sufficient binding energy to invade B-DNA. Progress has been made to relax the sequence constraint with the application of tail-clamp18,19 and double-duplex invasion strategies20, but they are not without limitations21. More recently we showed that PNA, when preorganized into a right-handed helical motif by installing an (R)-stereogenic center at the γ-backbone (Fig. 1b), can invade B-DNA without sequence limitation (Fig. 1c)22. This newly endowed property of PNA has been exploited in a number of biological applications, including diagnostics23,24,25 and gene editing26. Despite this recent advance, it remains a challenge to invade B-DNA at elevated, physiologically relevant ionic strengths. With increasing ionic strengths, particularly in the presence of divalent metal cations, such as Mg2+, the DNA double helix becomes significantly thermodynamically more stable. As such, additional binding energy would be required to invade B-DNA, a major hurdle facing the design of oligonucleotide molecules for targeting mixed-sequence double helical B-DNA (or RNA). It should be noted that our work is focused on targeting double stranded DNA in vitro and not in its native state in a cell, which is epigenetically modified, wrapped around histones, and bound to other proteins.

Fig. 1

Chemical structures and binding modes of γPNAs. Structure of (a) PNA and (b) γPNA, (c) binding mode of γPNA containing natural nucleobases, and (d) that containing JBs. e Hydrogen-bonding interactions of natural bases, and (f) that of JBs with the canonical base-pairs

The required binding energy, in principle, could be attained by replacing natural nucleobases with synthetic analogs27, such as the replacements of adenine with 2,6-diaminopurine28 and cytosine with G-clamp29, with improved hydrogen-bonding and base-stacking interactions. However, such an approach could compromise recognition-specificity due to the propensity of the resultant ultra-high affinity probes to bind to DNA (or RNA) with closely related sequences, in addition to their intended binding sites. The challenge is in how to design oligonucleotide molecules that can invade B-DNA under physiologically relevant conditions without compromising sequence-specificity.

Here we report the use of a class of shape-selective, bifacial nucleic acid recognition elements for targeting double helical DNA with improvements in binding energy and sequence-specificity, enabling an effective strand invasion of double helical B-DNA, as well as RNA, under a physiologically simulated condition.

Results

Design rationale of bifacial recognition

To augment the binding energy of γPNA, while simultaneously improving its recognition specificity, we envisioned the application of Janus bases (or JBs) as recognition elements since they have the potential to form hydrogen-bonds with nucleobases in both strands of the DNA double helix (Fig. 1d). We expected their interactions with double helical DNA to be more favorable than those of natural nucleobases (Fig. 1e), as a result of a significant increase in the number of hydrogen-bonds and in the degree of base-stacking interactions of the resulting triplex as compared to those of a duplex (Fig. 1f). An improvement in base-stacking interactions was anticipated due to the expanded aromatic ring-systems of JBs and as a result of the formation of base-triads. For every two hydrogen-bonds in an A–T or T–A pair, and for every three in a C–G or G–C pair, that are broken, five new ones are formed upon the invasion of DNA by JBγPNA. Moreover, we expected their binding to be more sequence-specific because a single base-mismatch that would normally occur on one face of a natural nucleobase would occur on both faces of a JB. The concept of bifacial nucleic acid recognition, however, is not new. It was conceived by Lehn30 more than two decades ago and, subsequently, expounded upon by others in the development of Janus-wedges31,32,33,34,35,36,37,38. Despite concerted efforts from several research groups, only a small set of Janus-wedges has been developed, and they vary considerably in shape and size, such that they cannot be effectively combined in a modular format for recognition of a non-homogeneous nucleic acid sequence. Examples include those developed by Chen and McLaughlin31,32, which contained single-aromatic ring systems. As such, not only are they degenerate in recognition, unable to distinguish a C–G from a G–C pair or an A–T from a T–A pair, but they also possess neither the necessary binding energy to invade B-DNA nor a means to suppress self-hybridization. Self-hybridization is an intrinsic property of JBs, one that presents a major challenge in the design of bifacial nucleic acid recognition elements for targeting (canonical) double helical DNA or RNA. In contrast, the JBs under development are novel in structure and enabling in recognition capabilities. They were strategically designed and optimized so that they contain the appropriate shapes, sizes, chemical functionalities, and tautomeric structures for proper recognition of the respective DNA base-pairs (Fig. 1f), in addition to the asymmetry in the helically-folded γPNA backbone that enables them to hybridize to the designated base-pairs in double helical DNA or RNA, but not to each other.
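The hydrogen-bond tally described above can be sketched as simple bookkeeping. The snippet below is only an illustration: it uses the counts stated in the text (two bonds broken per A–T or T–A pair, three per C–G or G–C pair, five formed per base-triad), while the function and constant names are hypothetical and not part of the study. The example sequence is the six-unit Watson-strand target 5'-CGCGCC-3' given in the Fig. 5 caption.

```python
# Hydrogen bonds lost when one Watson-Crick base-pair is opened (from the text).
BROKEN_PER_PAIR = {"A": 2, "T": 2, "C": 3, "G": 3}

# Hydrogen bonds gained per base-triad formed with a Janus base (from the text).
FORMED_PER_TRIAD = 5


def hbond_balance(watson_strand):
    """Return (broken, formed, net gain) for invading the given target site.

    `watson_strand` is the DNA sequence of one strand of the target duplex;
    each position contributes one opened base-pair and one new base-triad.
    """
    seq = watson_strand.upper()
    unknown = set(seq) - set(BROKEN_PER_PAIR)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    broken = sum(BROKEN_PER_PAIR[base] for base in seq)
    formed = FORMED_PER_TRIAD * len(seq)
    return broken, formed, formed - broken


broken, formed, net = hbond_balance("CGCGCC")
print(broken, formed, net)  # 18 30 12
```

For this all-G/C site the count gives 18 bonds broken and 30 formed. This is pure counting and says nothing about the actual free energy of invasion, which also depends on stacking, solvation, and backbone effects.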

Molecular dynamics simulations

Molecular dynamics (MD) simulations were initially performed to assess the feasibility of JB recognition (Supplementary Table 1). A dodecameric γPNA containing a mixture of all four JBs, H-Lys-EBFDBEFDFDFB-NH2, was chosen as a model system for computational modeling (Fig. 2a). This extended sequence was selected to ensure proper binding with the Watson strand, as well as the Crick strand, of DNA, and to ensure that, should the probe fail to hybridize to itself, the failure results from steric clash in the backbone and not from the short sequence. The four C-E-G, G-F-C, A-B-T, and T-D-A base-triads were built and optimized at the HF/6-31G* level of theory and grafted onto the respective DNA and γPNA backbone (Supplementary Fig. 1)39. The structure of DNA-JBγPNA-DNA was created using the NAB module of Ambertools40 (Fig. 2b and Supplementary Data 1) and that of JBγPNA-JBγPNA was adopted from an existing NMR structure (Fig. 2f, g, and Supplementary Data 2)41. The result showed that the W-P portion of the triplex was stable after 500 ns (Fig. 2c and Supplementary Movie 1) and remained intact throughout the simulations (Fig. 2d), while the P-C segment displayed a significant structural distortion (Fig. 2e). This is reflected in the number of hydrogen-bonds and in the inter-strand interaction energy (Supplementary Figs. 2–4). The weaker interaction of P-C, as compared to that of W-P, is attributed to the smaller number of hydrogen-bonds and to the fact that hybridization occurs in a less favorable parallel orientation42. In contrast, as a result of steric clash in the backbone, the structure of JBγPNA-JBγPNA unraveled upon restraint release (Fig. 2h, Supplementary Fig. 5 and Supplementary Movie 2). Self-hybridization was a major concern in the design of JBγPNA, or any bifacial nucleic acid recognition system for targeting the canonical base-pairs of DNA or RNA for that matter, due to the complementarity of the two faces of JBs. However, we conjectured that such an event is less likely to occur with an asymmetrical, right-handed helically-induced chiral γPNA than with the achiral counterpart39,43. Indeed, the simulations revealed that not only can JBγPNA hybridize to both strands of the DNA double helix, but that it is unable to hybridize to itself, a prerequisite for a successful design of the bifacial nucleic acid system.

Fig. 2

MD simulations of the triplex and duplex structures. a–e DNA-JBγPNA-DNA triplex, and (f–h) JBγPNA-JBγPNA duplex. a Sequence of triplex, (b) initial structure of triplex, (c) simulated structure of triplex after 500 ns, and (d) and (e) are the same as (c) but with the respective C and W strands removed for clarity. f Sequence of parallel duplex, (g) initial structure of duplex, and (h) simulated structure of duplex after 500 ns

Chemical synthesis and probe design

Encouraged by the results of MD simulations, we synthesized nucleobases E and F44, along with the corresponding γPNA monomers (Supplementary Fig. 6) and oligomer, for the initial testing (Fig. 3a and Supplementary Figs. 7 and 8). They were chosen based on the presumption that, if JBγPNA containing such recognition elements is able to invade B-DNA with the corresponding sequence, then those that contain B and D, along with their mixture with E and F, should be able to accomplish the same task but more effectively, since G-C and C-G pairs are thermodynamically more stable and, thus, more difficult to invade than A-T and T-A pairs. We selected a relatively short JBγPNA (P1), six units in length, for the initial proof-of-concept study because MD simulations suggest that this length is sufficient for binding DNA (or RNA) (Fig. 3a). P2, a homolog of P1 containing natural nucleobases, was also prepared for comparison. Binding studies were conducted with model DNA targets containing matching sequence and binding-orientation (W1 and C1, Fig. 3b, c), as well as those with mismatched sequence (W2 and C2) and mismatched binding-orientation (W3 and C3). Figure 3d shows the preferential binding orientation of the triplex. Double-stranded DNA and RNA were also prepared (Fig. 3e, f), and their bindings with P1 and P2 were assessed using electrophoretic mobility-shift assays (EMSA). The difference between the two perfectly-matched DNA targets, HP1 (Fig. 3e, i) and HP1D (Fig. 3f), is that the former is an intramolecular hairpin and the latter is an intermolecular duplex—designed to assess the recognition generality of P1.

Fig. 3

Probes and nucleic acid targets employed in this study along with their CD spectra. a Sequence of P1 (blue) and P2 (black), (b) Watson strands (W1, W2, W3), (c) Crick strands (C1, C2, C3), (d) depiction of a bound W1-P1-C1 triplex, (e) intramolecular DNA (HP1 and HP2) and RNA (HP1R) targets, (f) intermolecular DNA target (HP1D), (g) CD spectra of the individual strands (P1: solid line, W1: dashed line, C1: dotted line), and (h) CD spectra of W1-P1-C1 (red line), W1-P1 (blue line), W1-C1 (black line), and P1-C1 (green line) complexes. In g and h the concentration of each strand was 2.5 µM, prepared in a 1× PBS buffer (137 mM NaCl, 2.7 mM KCl, 10 mM NaPi, pH 7.4) and recorded at 25 °C

Conformational analysis

Circular dichroism (CD) was employed to assess the conformation of P1, along with those of the bound complexes. Consistent with the previous finding43, P1 adopted a right-handed helical motif, as evidenced by the biphasic exciton coupling pattern with maxima at 235, 255, and 340 nm and minima at 210 and 275 nm (Fig. 3g). We ruled out the possibility of self-hybridization on the basis of concentration-dependent measurements (Supplementary Fig. 9), along with other supporting evidence as discussed in the sections below. The W1-C1 mixture showed a modest differential in CD signals. On the other hand, notable differences in the signal strengths and spectral patterns were observed with W1-P1, P1-C1, and W1-P1-C1, in comparison to the sum of the individual strands (Fig. 3h and Supplementary Fig. 10), suggesting that their interactions are more favorable than that of W1-C1. Such drastic changes in the CD signals, however, were not observed with the mismatched sequence (W2 and C2) or the mismatched binding-orientation (W3 and C3) (Supplementary Fig. 11). Collectively, these results provide supporting evidence for the binding of P1 with the Watson strand, as well as the Crick strand, of the DNA double helix in a sequence- and orientation-specific manner.

Thermal and thermodynamic analyses

UV-melting experiment confirmed the formation of a W1-C1 duplex, although relatively weak, with a melting transition (Tm) of ~37 °C (Fig. 4). However, under identical conditions, no discernible Tm was observed for P1-C1, suggesting that either the duplex did not form or that it did, but its melting profile merged with that of P1, which exhibited a broad melting pattern with Tm in the 35–80 °C range (Supplementary Fig. 12). The former scenario is unlikely based on the CD finding. The unusually high and broad Tm of P1 could be attributed to the increase in base-stacking interactions of E and F, in comparison to that of the natural nucleobases. A mixture of W1, P1, and C1 at an equimolar ratio displayed two well-defined melting transitions at 40 and 74 °C (Fig. 4). We assigned the first transition to the melting of P1-C1 and the second to that of W1-P1. W1-P1 and W1-P2 showed similar Tms, 74 °C and 72 °C, respectively (Supplementary Fig. 13). This result was expected since they both possess the same number of hydrogen-bonds. The thermodynamic parameters for W1-C1, W1-P1, and W1-P2 duplexes, as determined by van’t Hoff analyses, are shown in Table 1. In contrast to the observations made with W1 and C1, samples containing P1 and mismatched sequence (W2 and C2, Supplementary Fig. 14) or mismatched binding-orientation DNA (W3 and C3, Supplementary Fig. 15) did not yield well-defined melting patterns. These results are consistent with the CD findings, indicating that P1 has strong binding sequence and orientation preferences.
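The text does not spell out how the van't Hoff analyses were carried out. As a minimal sketch, one common two-state treatment for a non-self-complementary duplex at equal strand concentrations relates the melting temperature to ΔH, ΔS, and the total strand concentration Ct through Tm = ΔH / (ΔS + R ln(Ct/4)). The function names and the ΔH and ΔS values below are hypothetical placeholders, not the values in Table 1; only the 2.5 µM per-strand concentration is taken from the text.

```python
import math

R_CAL = 1.987  # gas constant in cal mol^-1 K^-1


def delta_g(dh_cal, ds_cal, temp_k):
    """Gibbs free energy (cal/mol) from enthalpy and entropy at temp_k (kelvin)."""
    return dh_cal - temp_k * ds_cal


def two_state_tm(dh_cal, ds_cal, total_strand_molar):
    """Two-state Tm (kelvin) of a non-self-complementary duplex.

    Assumes both strands are present at equal concentration, so that the
    equilibrium constant at the midpoint is 4 / Ct.
    """
    return dh_cal / (ds_cal + R_CAL * math.log(total_strand_molar / 4.0))


# Hypothetical illustration values (NOT taken from Table 1).
DH = -50_000.0   # cal/mol
DS = -135.0      # cal/(mol K)
CT = 5.0e-6      # mol/L: 2.5 uM of each of two strands, as used in the text

tm_k = two_state_tm(DH, DS, CT)
print(f"Tm = {tm_k - 273.15:.1f} C")
print(f"dG(37 C) = {delta_g(DH, DS, 310.15) / 1000:.2f} kcal/mol")
```

A two-state model of this kind would not describe the broad transition reported for P1 alone, so it should be read as a description of the well-defined duplex transitions only.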

Fig. 4

Thermal stability of the bound complexes. UV-melting profiles of W1-P1-C1 (red line), W1-P1 (blue line), W1-C1 (black line), and P1-C1 (green line). The concentration of each strand was 2.5 µM, prepared in a 1× PBS buffer

Table 1 Thermodynamic parameters

Confirmation of binding by NMR

To further substantiate the binding of P1 with W1 and C1, we carried out 1D and 2D NMR experiments at variable temperatures and concentrations. Figure 5a shows 1H-NMR spectra of the various combinations, prepared in a PBS buffer containing H2O:D2O at a 9:1 ratio; those of the individual strands are shown in Supplementary Figs. 16–18. Several key observations were noted. First, the NMR spectrum of P1 was devoid of any imino proton signals in the 10.0–20.0 ppm region (Fig. 5a, i), indicating that self-hybridization did not take place even at such a high concentration (500 µM strand concentration). Second, C1 showed two imino proton signals at 12.95 and 12.80 ppm (Fig. 5a, iii, and Supplementary Fig. 19), as a result of the formation of a partial C1-C1 duplex stabilized by a terminal GG-diad. Self-hybridization did not occur with W1 because it lacked the terminal purine-cap (Fig. 5a, ii). Third, the titration of P1 with W1-C1 resulted in a gradual disappearance of the imino proton signals of G12, G2/G8, G10, and G4 (Fig. 5a, compare vi to iii and iv), concomitant with the appearance of those of G2’ and G4’ of W1-P1 and other imino protons of P1-C1 (Fig. 5a, compare vi to v, and Supplementary Figs. 20 and 21; the other controls are shown in Supplementary Figs. 22–27). COSY and NOESY experiments enabled the assignment of imino protons of W1-C1 (Supplementary Figs. 28–33 and Supplementary Table 2) and W1-P1 (Supplementary Figs. 34–39). Fourth, the W1-P1 complex appeared to be highly stable, as judged from the line sharpness and the persistence of imino proton signals at a temperature as high as 65 °C (Fig. 5b and Supplementary Figs. 40 and 41). While it is not currently feasible to assign the imino proton signals of E and F due to significant spectral overlaps with those of DNA nucleobases, a cursory inspection of the W1-P1-C1 complex (Supplementary Fig. 42) confirms the binding of P1 with W1 and C1, as noted from the emergence of the aromatic W1-P1 proton signals (asterisk) and from the disappearance of those of C1 (filled circle). This suggestion was further corroborated by findings from EMSA (Fig. 5c) and MALDI-TOF MS (Supplementary Fig. 43) experiments, showing the W1-P1 duplex to be the most stable (compare lane 2 to lane 1), followed by P1-C1 (compare lane 4 to lane 3) and W1-C1 (compare lane 5 to lanes 1 and 3).

Fig. 5

Imino proton signals of the bound complexes. a 1H-NMR spectra of the indicated samples in the 10.0–13.4 ppm range (no signals were observed beyond 13.4 ppm). Note P1, W1, C1, W1-C1: black line; W1-P1: blue line; W1-C1-P1: red line. b Imino proton signals of a W1-P1 duplex as a function of temperature. c Result of an EMSA showing the binding of various partners. Experimental conditions: pre-annealed DNA (W1F or C1F*) was mixed with P1 in a PR buffer (137 mM NaCl, 2.7 mM KCl, 10 mM NaPi, 2 mM MgCl2, pH 7.4) and incubated at 37 °C for 4 h; the resulting mixtures were separated by 10% non-denaturing PAGE. The sequences of FITC-attached DNA are as follows: W1F, 5’-CGCGCC-3’; and C1F*, 5’-(T)12GGCGCG-3’. The poly-(T)12 tail in C1F* was employed to provide separation between the two target strands. The strand concentrations of W1F, C1F*, and P1 were 1 μM each

Assessment of DNA strand invasion

To determine whether JBγPNA can invade a double helical B-DNA, we carried out EMSA comparing the binding of P1 and P2 with model DNA targets containing a perfectly-matched HP1 (Fig. 3e, i) and a base-pair inversion HP2 (Fig. 3e, ii) at a moderate ionic strength (1× PBS buffer), as well as at a physiologically relevant (PR) condition45. Figure 6a depicts compositions of the various binding complexes. The result demonstrated that P1 can invade a highly stable HP1 not only in PBS (Fig. 6b, lanes 2–4) but also in a PR buffer (lanes 5–7). Formation of the HP1-P1 invasion complex was confirmed by MALDI-TOF MS (Supplementary Fig. 44). The UV-melting profiles of the individual duplexes are shown in Supplementary Fig. 45. Binding occurred in a concentration-dependent manner, resulting in a complex that remained intact throughout the electrophoresis, as inferred from the sharpness of the shifted bands. Binding was relatively fast, complete within 10 min at 37 °C (Supplementary Fig. 46). In contrast, no evidence of binding was observed with P2, not even in a PBS buffer (compare lanes 8 and 9 to lanes 4 and 7, respectively). The fact that P1 and P2 have similar binding energy with the Watson strand (Table 1), but that only P1 was able to invade HP1 in PBS, as well as in a PR buffer, highlights the significant contribution of hydrogen-bonding interactions and, perhaps, steric clashes with the Crick strand of the DNA double helix in stabilization of the invasion complex. No binding was observed with the mismatched HP2 (Fig. 6c). In addition to the hairpin structure, P1 was also able to target an internal binding site embedded within a highly stable double helix (Supplementary Fig. 45), as demonstrated with HP1D (Fig. 3f and Fig. 6d). The result showed that being able to simultaneously target both strands of the DNA double helix is essential for the invasion of B-DNA at a physiologically relevant ionic strength. A similar observation was made by Nielsen20 in the invasion of DNA by pseudo-complementary PNAs, although in that case strand invasion requires simultaneous binding of two separate strands of PNA. Such a design, however, has practical limitations due to the slow hybridization kinetics and low invasion efficiency21.

Fig. 6

Invasion of double-stranded DNA and RNA by P1 and P2. a Compositions of the various complexes (HP1-P1, HP1-P2, HP1R-P1, HP1D-P1). b Comparison of binding of P1 (lanes 2–7) and P2 (lanes 8 and 9) with the perfectly-matched DNA hairpin (HP1) at a moderate (1× PBS) (lanes 1–4, and 8) and at a physiologically relevant (PR) ionic strength (lanes 5–7, and 9). The concentration of HP1 was 1 µM and those of P1 and P2 are as shown. The samples were prepared in the indicated buffers and incubated at 37 °C for 4 h prior to separation by non-denaturing PAGE. c Result of EMSA of P1 with mismatched HP2, (d) with HP1D, and (e) with HP1R RNA. In c–e, the samples were prepared in a PR buffer, incubated at the same temperature and for the same duration, and separated under conditions identical to those in b. PR buffer: 137 mM NaCl, 2.7 mM KCl, 10 mM NaPi, 2 mM MgCl2, pH 7.4

Strand invasion of RNA

Interestingly, we noticed that P1 was also able to invade an RNA double helix, HP1R (Fig. 3e, iii), in a PR buffer (Fig. 6e), where P2 was not. This result is intriguing, suggesting the possibility of targeting the secondary and tertiary structures of RNA. Such RNA structural motifs are highly abundant in cells and are known to play key roles in many biological processes46. Generally, they are challenging to target with high selectivity using conventional antisense oligonucleotides47,48 or small-molecule ligands49,50, although some progress has been made with the development and approval of the drugs nusinersen51 and patisiran52, and in the application of oligonucleotides for targeting RNA repeat expansions53.

Discussion

The A–T (or A–U) and C–G base-pairing interactions are employed by Nature in the storage and transmission of genetic information, and by researchers in the recognition of genetic materials because of their exquisite recognition sequence-specificity. However, such principles have rarely been applied in the recognition of double helical DNA or RNA because most oligonucleotide molecules developed to date do not have sufficient binding energy to invade such a canonical duplex structure. The present study is focused on determining whether a bifacial JBγPNA could be designed, and, if so, whether it could invade a double helical DNA (or RNA) at a physiologically relevant ionic strength. A critical requirement for the successful development of such a dual-recognition nucleic acid system is its preferential binding with DNA (or RNA) but not with itself. We showed that not only can JBγPNA be prepared, but that it can form hydrogen-bonds with nucleobases in both strands of the DNA double helix without any evidence for self-hybridization. The helically-induced chirality in the backbone prevents JBγPNA from approaching and properly hybridizing to each other. In addition to its tight binding with the Watson strand, and to a lesser extent with the Crick strand, JBγPNA can invade double helical B-DNA, as well as RNA, under a simulated physiological condition.

A significant structural perturbation in the P-C segment of the triplex, as revealed by MD simulations, was notable but not unexpected. This is because the number of H-bonds formed in P-C (12 H-bonds) is smaller than that in W-P (18 H-bonds), and binding occurs in an unfavorable parallel orientation. This result was substantiated by experimental findings, as demonstrated by CD, UV-melting, NMR, and EMSA measurements. The fact that both P1 and P2 form the same number of H-bonds with the Watson strand and that the resulting duplexes exhibited virtually identical thermodynamic stability, but P1 was able to invade B-DNA whereas P2 was not, indicates the importance of being able to engage the Crick strand in stabilization of the invasion complex. It is well established that the DNA double helix is relatively dynamic, with the deoxyribose phosphate backbones and the nucleobases in constant flux with continual bending and twisting motions54. The reason that most oligonucleotide molecules are unable to productively invade B-DNA is not a lack of base-pair accessibility, but rather their inability to compete with the complementary DNA strand, especially in the context of a relatively long double helix27. In addition to the binding energy gained in the formation of P-C, although relatively small compared to the overall contribution, the P-C interactions may provide a physical barrier that prevents the complementary DNA strand from re-hybridizing with its partner. Such a phenomenon has been illustrated by Nielsen and coworkers20.

While this proof-of-concept study is focused on the synthesis and evaluation of the binding property of a relatively short probe, the binding energy that would be required to invade a biologically relevant genomic DNA target, estimated to be ~18-nt in length, could be attained by expanding the probe length accordingly. This is because a γPNA-DNA duplex is thermodynamically more stable than DNA-DNA on a per-unit basis, as demonstrated in a previous study55. We expected the thermodynamic stability of JBγPNA-DNA to be even greater than that of γPNA-DNA. If necessary, additional binding energy could be attained by employing second-generation JBs, such as those shown in Supplementary Fig. 47a, with improved hydrogen-bonding and base-stacking interactions. For instance, E* and F** are capable of forming six hydrogen-bonds each with the respective C–G and G–C base-pairs, as compared to five each for E and F. Although F** is not the most stable isomer, previous studies showed that tautomers interconvert on a relatively fast time scale, and that rare tautomers can occur if they aid in the folding and stabilization of nucleic acid structures56.

Another potential benefit of a bifacial nucleic acid system is that it could be designed to bind more favorably with a double-stranded over a single-stranded DNA (or RNA) by furnishing it with a protective companion (Supplementary Fig. 47b). The application of toehold and strand displacement reactions to tune the binding kinetics and thermodynamics of nucleic acids, for the purpose of controlling their binding selectivity, has already been demonstrated57. Since binding occurs on both sides, a relatively short companion strand would suffice to provide protection against single-stranded DNA (or RNA) binding, and to maintain the required kinetics and thermodynamics for a successful invasion of double helical DNA (or RNA)58. Besides its protective role, the companion strand could be functionalized with a specific chemical group, such as guanidine, for improving cellular uptake59, and possibly for enhancing the rate of DNA (or RNA) strand invasion60. A distinctive advantage of such a molecular feature is that once it is displaced, upon a successful invasion of DNA (or RNA) by the designer probe, the companion strand is quenched through self-hybridization. Such a probe design may provide a safer and more effective means for targeting double-stranded DNA (or RNA) than the conventional antisense or antigene approach for biomedical applications.

In addition to the four JBs shown in Fig. 1f, which are designed to bind to the canonical base-pairs, the remaining twelve JBs within this class that are designed to bind to non-canonical base-pairs44 could be employed in combination to target the secondary and tertiary structures of RNA, in attempts to elucidate their physiological functions and to develop therapeutic interventions for treating genetic disorders. However, this does not imply that conventional antisense molecules, those containing natural nucleobases, cannot be effectively used to manipulate the structures and functions of RNA. In fact, there are a number of diseases that are considered ‘undruggable’ by small molecules that can be corrected by targeting single-stranded RNA61; cases in point are the recent development and approval of nusinersen (Spinraza) by Ionis Pharmaceuticals and patisiran (Onpattro) by Alnylam Pharmaceuticals for treatment of spinal muscular atrophy and hereditary transthyretin-mediated amyloidosis, respectively. The present work provides another dimension in the design of ‘millamolecular’ oligonucleotide molecules for targeting nucleic acid biopolymers, especially those structured regions that may be difficult to access by conventional reagents.

In summary, we have shown that γPNA containing a selected set of JBs could be developed and could hybridize to both strands of the DNA double helix without undergoing self-hybridization. JBγPNA is able to invade a highly stable double helical B-DNA at a physiologically relevant ionic strength, where a homolog containing natural nucleobases is not. Overall, the bifacial recognition mode is general, applicable to targeting not only double helical DNA, but also RNA. Due to their tight binding, significantly shorter probes could potentially be used to target the secondary and tertiary structures of RNA in our efforts to decipher their physiological roles and to develop molecular therapies for treating genetic disorders.

Methods

Molecular dynamics simulations

See Supplementary Methods.

UV-melting analysis

All UV-melting samples were prepared by mixing probes with DNA targets at the indicated concentrations and in the appropriate buffers, and annealed by incubation at 90 °C for 5 min followed by gradual cooling to room temperature. UV-melting curves were collected using an Agilent Cary UV-Vis 300 spectrometer equipped with a thermoelectrically controlled multi-cell holder. The curves were collected by monitoring UV-absorption at 260 nm from 20 to 95 °C in the heating runs, and from 95 to 20 °C in the cooling runs, both at a rate of 1 °C per min. The heating and cooling curves were nearly identical, indicating that the hybridization process is reversible. The recorded curves were smoothed using a 10-point adjacent averaging algorithm. The first-order derivatives of the melting curves were taken to determine the melting temperatures of the duplex and single strand.
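The smoothing and derivative steps described above can be sketched as follows. This is an illustrative reconstruction rather than the processing code used in the study: the function names are hypothetical, the absorbance curve is synthetic (a single sigmoid with an assumed midpoint of 74 °C), and the exact windowing of the instrument or plotting software may differ from the simple 10-point window used here.

```python
import math


def adjacent_average(values, window=10):
    """Smooth by replacing each point with the mean of up to `window` neighbours.

    An even window cannot be centred exactly, so the window here covers
    indices i - window//2 to i + window//2 - 1 and is truncated at the ends.
    """
    half = window // 2
    n = len(values)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half)  # exclusive upper bound
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def first_derivative(x, y):
    """Finite-difference dy/dx: central in the interior, one-sided at the ends."""
    n = len(x)
    deriv = []
    for i in range(n):
        lo = max(0, i - 1)
        hi = min(n - 1, i + 1)
        deriv.append((y[hi] - y[lo]) / (x[hi] - x[lo]))
    return deriv


def melting_temperature(temps, absorbance, window=10):
    """Temperature at which the smoothed dA/dT is largest (single transition)."""
    smoothed = adjacent_average(absorbance, window)
    deriv = first_derivative(temps, smoothed)
    peak = max(range(len(deriv)), key=deriv.__getitem__)
    return temps[peak]


# Synthetic heating run: 20-95 C in 1 C steps, sigmoidal rise centred at 74 C.
temps = [float(t) for t in range(20, 96)]
absorbance = [0.50 + 0.10 / (1.0 + math.exp(-(t - 74.0) / 3.0)) for t in temps]

tm = melting_temperature(temps, absorbance)
print(f"estimated Tm: {tm:.0f} C")
```

The sketch reports only the largest derivative peak. A sample with two transitions, such as the equimolar W1, P1, and C1 mixture, would need each local maximum of the derivative to be located separately.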

Circular dichroism analysis

The samples were prepared in a 1× PBS buffer. All circular dichroism (CD) spectra represent an average of at least 15 scans collected at a rate of 100 nm/min between 200–400 nm, in a 1-cm path-length cuvette at 25 °C. The CD spectrum from buffer solution was subtracted from the sample spectra, which were then smoothed via a 10-point adjacent averaging algorithm.

Nuclear magnetic resonance analysis

All 1D nuclear magnetic resonance (NMR) spectra for DNA and DNA:PNA complex were recorded on a 500 MHz spectrometer using Watergate p3919gp in 90:10% H2O:D2O and in D2O solvent. All 2D NOESY spectra for DNA and DNA:PNA complex were recorded on a 500 MHz spectrometer using Watergate NOESY or noesygpph19 as a pulse program with 100, 200, and 300 ms time scale. All COSY spectra were recorded using cosygpprqf as the pulse program.

Mass spectrometric analysis

For PNA, a solution of α-cyano-4-hydroxycinnamic acid (10 mg of α-cyano-4-hydroxycinnamic acid in 500 μL of water with 0.1% TFA and 500 μL of acetonitrile with 0.1% TFA) was used as the matrix for MALDI-TOF analysis. The PNA samples were prepared by mixing 2 μL of matrix with 1 μL of PNA (1–5 μM) at 37 °C. MALDI-TOF analysis was performed about 10 min after spotting. For DNA and DNA:PNA complex, a solution of 6-aza-2-thiothymine (ATT, 10 mg/mL) in 1:1 (v:v) CH3CN/ammonium citrate (20 mM) was prepared. The DNA and DNA:PNA complex samples were prepared by mixing 2 μL of matrix and 1 μL of DNA or DNA:PNA complex (either in RNase-free water or in a 1× PBS buffer). The MALDI-TOF plate was dried in a vacuum desiccator for 10 min after spotting the sample and analyzed by MALDI-TOF MS.

Electrophoretic mobility-shift assay

All the DNA samples were prepared in the indicated buffers, either in 1× PBS (137 mM NaCl, 2.7 mM KCl, 10 mM NaPi, pH 7.4) or in a PR buffer (137 mM NaCl, 2.7 mM KCl, 10 mM NaPi, 2 mM MgCl2, pH 7.4), and annealed by heating to 90 °C for 5 min followed by gradual cooling to room temperature. PNA and DNA targets were mixed at the indicated concentrations and incubated at 37 °C for 4 h. The samples were then loaded onto a 10% non-denaturing polyacrylamide gel with 1× Tris-borate buffer and electrophoretically separated at 120 V for 60 min. Gels containing fluorescently labeled DNA were visualized directly using a UV transilluminator. Gels without a fluorescent label were stained with SYBR-Gold and visualized using a UV transilluminator.

Preparation and characterization of PNA P1 and P2 and biophysical studies with DNA

See Supplementary Methods.