Enzyme-guided DNA Sewing Architecture

With the advent of nanotechnology, a variety of nanoarchitectures with varied physicochemical properties have been designed. Owing to its unique characteristics, DNA has been used as a functional building block for novel nanoarchitectures. In particular, the self-assembly of long DNA molecules via short DNA staples has been utilized to attain such constructs. However, this approach has demanding prerequisites (e.g., complicated computer programs) and gives low product yields. It also has limitations that remain to be overcome: for instance, (i) thermal instability under moderate environments and (ii) a restriction in size caused by the limited length of scaffold strands. Alternatively, the enzymatic sewing of short DNA blocks into long DNA assemblies is simple to design, but it is more error-prone because the underlying sequence data are underdeveloped. Here, we present, for the first time, a comprehensive study on directly combining DNA structures into higher-order DNA sewing constructs through 5′-end cohesive ligation by T4 ligase. The synthesized DNA nanomaterials were also utilized for effective detection and real-time diagnosis of cancer-specific and cytosolic RNA markers. This generalized protocol for generic DNA sewing is expected to be useful in DNA nanotechnology as well as in other nucleic acid-related fields.

(Fig. 1a and see Supplementary Fig. S1a,b online). More detailed sequence information is given in Table S1. Owing to its intrinsic anisotropy, T-DNA, which comprises three different Y-DNAs, can carry two different Y-DNAs at the two accessible positions of the central Y-DNA (CY). The Y-DNAs at these two locations were designated West Y-DNA (WY) and East Y-DNA (EY), respectively. To estimate the ligation yield, WY-CY and EY-CY were analyzed using a gel electrophoresis mobility shift assay (GEMSA) (see Supplementary Fig. S1c,d online).
To effectively synthesize the T-DNA model formed by the interlocking of three different Y-DNAs, several environmental parameters were first tested (Fig. 1b). With regard to the productivity of the ligase, the molar ratio of reactants and the salt concentration were investigated. Various molar concentrations of Y-DNAs (0.3, 0.6, 1.2, and 1.5 μM) were successively tested with fixed amounts of adenosine triphosphate (1 mM) and T4 ligase (30 Weiss units) (Fig. 1c). However, no significant effects on T-DNA formation were observed. In addition, the molar ratio of WY-CY to EY (1:0.5, 1:1, 1:2, and 1:4) was varied with a fixed amount of WY-CY (Fig. 1d). Because only 0.3 μM EY was available to form T-DNA with 0.6 μM WY-CY at the 1:0.5 ratio, the T-DNA band intensity is expected to be much lower in that group. Meanwhile, no significant differences were observed among the 1:1, 1:2 and 1:4 groups. Several NaCl concentrations (15, 50, 100, 200 and 400 mM) were also tested at room temperature to determine their effect on product yield. No deviations in T-DNA mass were found at concentrations lower than 200 mM. At higher NaCl concentrations, product yield decreased sharply, likely due to impaired T4 ligase (Fig. 1e) [16,17].
Figure 1 caption (partial): concentrations of 15, 50, 100, 200 and 400 mM were tested. Each data point represents the mean of triplicate experiments; error bars represent the SD.
The ligation process consists of protein-DNA recognition followed by catalytic reactions for nick sealing. In the protein-DNA recognition step, the stereospecific interaction of the protein with its DNA substrate is a major concern. It is well known that the major and minor grooves of the double-stranded DNA structure play an important role in protein-DNA recognition [18][19][20][21]. Proteins recognize specific DNA sequences through two types of mechanism: (i) hydrogen bonds with specific bases, and (ii) sequence-dependent deformations of the DNA helix. In addition, arginine residues of proteins bind electrostatically to the minor grooves of DNA. In particular, the binding efficiency may be determined by the shape of the minor groove in the DNA structure. Likewise, DNA ligase and RNA ligase may contact the minor groove of the DNA helix. Nick joining in double-stranded DNA by T4 DNA ligase involves three catalytic steps: formation of the enzyme-adenylate, formation of the double-stranded DNA-adenylate, and nick sealing. It is inferred that nick recognition and the activity of DNA ligase at the end points of two different overhangs may depend on both minor groove stereochemistry and thermodynamics [22][23][24]. Thus, the characteristic thermodynamics and stereochemistry of DNA should be considered when identifying new DNA sewing architectures built through the enzymatic activity of ligase.
End sequence-dependent investigation for the formation of DNA sewing nanostructure. In this study, the Gibbs energy and minor groove of the end sequences were the principal factors investigated with regard to sequence-dependent ligase efficiency. It is known that the width of the minor groove varies with the nucleotide sequence [20]. The distance between the phosphate backbones is critically affected by the specific sequence arrangement, with changes in the negative electrostatic potential along the minor groove; AT-rich sequences tend to have narrower minor grooves than GC-rich sequences. Likewise, the Gibbs free energy, which indicates the thermodynamic state of the DNA double helix, is highly dependent on the DNA sequence. Correspondingly, the nearest-neighbor thermodynamic parameters described in Supplementary Fig. S2 online suggest that AT-rich sequences form less stable duplexes (less negative Gibbs free energy) than GC-rich sequences.
Each overhang sequence in all candidates was selected to have 50% GC content, at which it is expected to form a helical duplex [3,25]. Gibbs energies of all base-pair combinations were calculated using nearest-neighbor base-base thermodynamics (see Supplementary Fig. S2 online) [26][27][28][29]. The base-pair combinations were classified into four subgroups according to hydrogen bonding order. It is known that the released Gibbs free energy of DNA depends on the sequential arrangement [...] (Fig. 2c,d). It is evident that the distinctive energy and minor groove structures of 5′-end cohesive base sequences are easily recognized by T4 ligase, which is then activated to seal the nick sites.
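The nearest-neighbor calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the stacking values below are the commonly quoted unified nearest-neighbor set (an assumption; the parameters actually used are those in Supplementary Fig. S2), the helper names `gc_fraction` and `stacking_dg` are hypothetical, and duplex initiation and terminal corrections are deliberately omitted, so the numbers need not match those reported here.

```python
# Nearest-neighbor stacking free energies at 37 °C in kcal/mol, keyed by the
# 5'->3' dinucleotide on one strand.
# ASSUMPTION: commonly quoted unified parameter set; the paper's own table
# (Supplementary Fig. S2) may differ.
NN_DG = {
    "AA": -1.00, "TT": -1.00,
    "AT": -0.88,
    "TA": -0.58,
    "CA": -1.45, "TG": -1.45,
    "GT": -1.44, "AC": -1.44,
    "CT": -1.28, "AG": -1.28,
    "GA": -1.30, "TC": -1.30,
    "CG": -2.17,
    "GC": -2.24,
    "GG": -1.84, "CC": -1.84,
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the overhang."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def stacking_dg(seq: str) -> float:
    """Sum of nearest-neighbor stacking terms along the overhang.

    Initiation and terminal penalties are NOT included (simplification).
    """
    seq = seq.upper()
    return sum(NN_DG[seq[i:i + 2]] for i in range(len(seq) - 1))


if __name__ == "__main__":
    for overhang in ("GACT", "GAGT", "GTCT"):
        print(overhang, gc_fraction(overhang), round(stacking_dg(overhang), 2))
```

A four-base overhang contributes three stacking terms, so candidates with the same GC content can still differ in Gibbs energy depending on the order of the bases.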
Of the 52 pairs, five cohesive base-pair sequences were chosen and tested to relate the thermodynamic properties of 5′-end cohesive base-pair sequences to T4 ligase efficiency (see Supplementary Fig. S4a online). These pairs were further tested to elucidate the effect on the yield of partial products (exemplified by either WY-CY or EY-CY) and of complete T-DNA. The Gibbs energies of three of the pairs were equal or similar to each other, whereas those of the remaining pairs differed. The total T-DNA yield was highest at around −4.52 kcal/mol (see Supplementary Fig. S4b online). These results indicate that the thermodynamic properties of the cohesive ends influence the efficiency of T4 ligase.
Evaluation of abnormal sewing on DNA nanostructures. To further examine how differences in the Gibbs energy of the base sequences affect the final product yield, the mutual interactions of bases at the end positions were profiled (Fig. 3a). Undesirable mismatches were intentionally positioned at the end sequences of Y-DNAs. Unexpected base-base pairings with thermodynamic stability comparable to that of a Watson-Crick pair have been observed [30][31][32][33][34]. Here, two Y-DNA blocks with the same body sequences but different overhang bases are non-complementary to each other at one, two, or a larger number of bases (Fig. 3b and see Supplementary Fig. S5 online). With one or two mismatched base pairs, partial T-DNAs were unexpectedly obtained. In contrast, no mismatch ligation was observed with three or four mismatched base pairs. As the number of mismatched base pairs increased, the likelihood of mismatch ligation decreased dramatically because of the instability of the thermodynamic properties and of the helical structure.
After investigating abnormal ligation, we asked why CY, which has two different cohesive ends for EY and WY, could form EY-CY via mismatch ligation even though a WY binding site is present. This became a significant issue when we fine-tuned several ligation parameters in DNA assembly. Among the mismatch ligation cases (see Supplementary Fig. S5 online), AGTC and AGAC, the overhangs of WY and EY respectively, were selected as the model sequences for the study of mismatch ligation. Undesirable ligations were also suspected to be caused by other environmental variables such as salt concentration and incubation temperature [11][32][33][34]. EY (AGAC) and CY (GACT and GTCT at its two sites) were intentionally ligated with the expectation of forming partial T-DNA (CY-EY). A similar experiment was carried out using WY (AGTC). However, a significant amount of complete T-DNA was observed, which may have been induced by mismatch ligation of either GACT with AGAC or GTCT with AGTC. After the temperature and salt concentration were adjusted, no such mismatch ligations were produced (Fig. 4a). Non-Watson-Crick base pairs were significantly decreased either at 37 °C or at salt concentrations above 150 mM. Incubation at a higher temperature may favor hybridization of the correct cohesive base pairs, so that such mismatch ligations are significantly suppressed. Moreover, it is assumed that salt affects the DNA helical structure, which may ultimately influence the ability of T4 ligase to recognize certain sequences [32,34]. This may recover the final ligation efficiency, thus minimizing mismatch ligation. On this basis, we propose guidelines for the construction of pure enzyme-guided DNA sewing nanostructures through the selective end cohesive ligation of DNA blocks (Fig. 4b).
Irrespective of the environmental conditions during synthesis, a complete DNA construct was achieved through the simple interconnection of a few DNA blocks via ligase activity using the rules suggested in the table of Fig. 4: the Gibbs free energy of the four-base cohesive hybridization should be in the range of −4.0 to −5.0 kcal/mol, and the cohesive ends should contain no mismatched base pairs. These rules can be applied to the creation of a networked DNA sewing nanoconstruct based on any type of end cohesive sequence.
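The two rules above can be expressed as a simple check. The following Python sketch is illustrative only: the function names are hypothetical, both overhangs are written 5′ to 3′, and the Gibbs energy is supplied by the caller (for example from a nearest-neighbor calculation) rather than computed here.

```python
# Translation table for Watson-Crick complements.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(overhang_a: str, overhang_b: str) -> int:
    """Count positions where overhang_b fails to pair with overhang_a.

    Both overhangs are given 5'->3'; a perfect partner of overhang_a is
    its reverse complement.
    """
    if len(overhang_a) != len(overhang_b):
        raise ValueError("overhangs must have the same length")
    partner = reverse_complement(overhang_a)
    return sum(x != y for x, y in zip(partner, overhang_b.upper()))


def passes_sewing_rules(overhang_a: str, overhang_b: str, delta_g: float,
                        dg_min: float = -5.0, dg_max: float = -4.0) -> bool:
    """Apply the two rules: no mismatches and delta_g within the window.

    delta_g is in kcal/mol and is provided by the caller (assumption:
    it refers to the four-base cohesive hybridization).
    """
    no_mismatch = count_mismatches(overhang_a, overhang_b) == 0
    return no_mismatch and dg_min <= delta_g <= dg_max


if __name__ == "__main__":
    # GACT (CY) with AGTC (WY) is fully complementary; GACT with AGAC (EY)
    # differs at one position, the mismatch case discussed in the text.
    # The delta_g value passed here is only an example input.
    print(count_mismatches("GACT", "AGTC"), count_mismatches("GACT", "AGAC"))
    print(passes_sewing_rules("GACT", "AGTC", delta_g=-4.52))
```

Keeping the energy as an input leaves the choice of thermodynamic parameters and salt conditions to the user, which matters because the window is defined for the conditions used in this study.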

Identification of in vivo cytosolic RNA markers with functional DNA sewing nanomaterials.
To practically evaluate the complete T-DNAs suggested, we used them as novel diagnostic probes.
Functional modules containing a loop structure of 31 bases were appended (Table S2a) to specifically identify cancer-specific RNA markers. Such functional Y-DNAs, abbreviated as L-DNA, were ligated to the central Y-DNA to form functional T-DNA (Fig. 5a), represented as LT-DNA. This formation was confirmed using GEMSA and was compared with normal T-DNAs (see Supplementary Fig. S6 online). Upon the addition of a single-stranded oligonucleotide complementary to the loop sequence of L-DNA, a gel band was produced, indicating that the added oligonucleotides were successfully captured by the loop sequence of LT-DNA. Building on these results, two different L-DNAs corresponding to EZH2-encoding messenger RNA and microRNA 21 (termed miRNA21) were tested as specific breast cancer markers. These sequences were simultaneously ligated to either side of a CY to form a versatile LT-DNA (Table S2b gives the sequence information of all oligonucleotides used in the L-DNAs). The LT-DNAs were shown to have the potential to selectively capture target oligo-ligands. In addition to the stem-loop structure, L-DNAs containing either Cy3™ or Cy5™ fluorescent dyes were created. The fluorescence signal was quenched by the quencher Iowa Black® RQ through fluorescence resonance energy transfer (FRET) (Fig. 5b,c). When such RNA markers react with the stem-loop structure of LT-DNA, the configuration opens into a linear shape, which relieves the FRET quenching and produces very strong fluorescence emission; the emission strengths of Cy3™ and Cy5™ were amplified five- and two-fold, respectively. More interestingly, two different RNA markers corresponding to EZH2 and miR21 were simultaneously detected using the LT-DNAs; addition of the EZH2 RNA marker induced no Cy3™ enhancement, while miR21 did not produce fluorescence emission from Cy5™.
These results strongly indicate that this versatile DNA nanoconstruct can be used for multi-detection of several RNA markers with high selectivity and sensitivity.

Discussion
In this study, we proposed predictive criteria for the optimized selection of 5′-end overhang sequences for the directed assembly of DNA blocks through enzymatic ligation. The thermodynamics and stereochemistry of the 5′ overhang sequences significantly influenced yield and purity. In addition, these data provide evidence for the essential role of the DNA substrate in DNA-T4 ligase recognition and in the ligase activation mechanism. Using fluorescence codes, we investigated the anisotropy and cancer-diagnostic capacity of the DNA constructs. Such a predictive model will allow the design of new interwoven DNA materials for nucleic acid-based bioapplications, including genetics and gene sequencing.

Materials and Methods
Synthesis of model DNA sewing materials. To synthesize a Y-DNA block, three different oligonucleotides were designed and manufactured. Table S1 lists the T-DNA and Y-DNA sequence information. All oligonucleotides were provided by Integrated DNA Technologies (IDT, Inc., Coralville, IA). Y-DNA (6 μM) was annealed in a buffer composed of 50 mM NaCl, 10 mM Tris-HCl (pH 8.0) and [...]

Imaging analysis. The DNA products were evaluated by analyzing gel electrophoretic images of 3% agarose gels. Gel electrophoresis was performed at 100 V for 40 min, and the gels were immediately stained with ethidium bromide (EtBr) (2 μg/ml) for 20 min. The gel images were visualized using a GelDoc-It imaging system with Launch VisionWorksLS (UVP) and were analyzed using TotalLab Quant gel quantification software version 2.01, provided by ImageMaster (TotalLab Ltd., Newcastle upon Tyne, UK). Yields and ligation efficiency were calculated as the product band divided by the total bands. T-DNAs migrated more slowly through the gels than Y-DNAs, consistent with the higher molecular weight of T-DNA (79713.2 Da) compared with Y-DNA (26902.5 Da).

Calculation of Gibbs energy of 5′ cohesive bases. The Gibbs energy of the cohesive ends was calculated from nearest-neighbor thermodynamic properties. Gibbs energies were further verified using the IDT web tool OligoAnalyzer 3.1. Gibbs energies were calculated at 6 μM oligo, 15 mM Na⁺, 0 mM Mg²⁺ and 0 mM dNTPs.
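The yield definition used in the imaging analysis (product band divided by total bands) amounts to a one-line ratio. The sketch below is a hypothetical illustration in Python; the function name, the band labels and the intensity values are made up for the example and are not measured data.

```python
def ligation_yield(band_intensities: dict, product: str) -> float:
    """Yield = intensity of the product band / sum of all band intensities.

    band_intensities maps a band label to its integrated intensity from a
    single gel lane (arbitrary units).
    """
    total = sum(band_intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return band_intensities[product] / total


if __name__ == "__main__":
    # Illustrative intensities only (arbitrary units), not data from the study.
    lane = {"T-DNA": 60.0, "WY-CY": 25.0, "Y-DNA": 15.0}
    print(f"T-DNA yield: {ligation_yield(lane, 'T-DNA'):.1%}")
```

Normalizing within a lane makes the yield independent of loading differences between lanes, provided that all bands are stained with comparable efficiency.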

Optimization parameters in
Design of functional DNA sewing materials and evaluation of their target binding ability. A new stem-loop DNA was designed so that its characteristic functionality was added to Y-DNA. One oligonucleotide contained an additional loop of 31 bases, which was capable of specifically identifying cancer-specific RNA markers depending on its sequence. L-DNA consisted of three parts: (i) the basic structural part of Y-DNA, (ii) a capturing part that binds target DNA and (iii) a stem part (five base pairs) that maintains the loop structure. Using two overhang sequences with high ligation efficiency (GACT and GAGT), LT-DNA was synthesized. The synthesis protocol was the same as that of T-DNA: L-DNAs (0.6 μM) were combined with T4 ligase (2 μl) and 10× ligase buffer (5 μl) in a total volume of 50 μl at 37 °C. After the construction of LT-DNA, target DNA (6 × 10⁻⁵ μmol) complementary to the capturing part sequence was added to the LT-DNA solution at 37 °C. LT-DNA was allowed to capture target DNA for four hours. After the reaction, products were evaluated using GEMSA.
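The amounts quoted above can be cross-checked with a short unit conversion. The Python sketch below assumes that the 0.6 μM L-DNA concentration refers to the 50 μl reaction volume; under that assumption the added target (6 × 10⁻⁵ μmol) corresponds to about a two-fold molar excess. The function name is hypothetical.

```python
def micromoles(conc_uM: float, volume_ul: float) -> float:
    """Amount in micromoles from a concentration (μM) and a volume (μl).

    1 μM = 1 μmol per litre and 1 μl = 1e-6 litre.
    """
    return conc_uM * volume_ul * 1e-6


if __name__ == "__main__":
    # ASSUMPTION: 0.6 μM L-DNA in the 50 μl ligation reaction.
    l_dna = micromoles(0.6, 50)
    target = 6e-5  # μmol of target DNA, as stated in the protocol
    print(f"L-DNA: {l_dna:.1e} μmol, target/L-DNA ratio: {target / l_dna:.1f}")
```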
Fluorescence measurements of multifunctional DNA materials. Two oligonucleotides with 5′ ends modified with a dark quencher (Iowa Black RQ) and cyanine dyes (Cy3 or Cy5) were provided by Integrated DNA Technologies (IDT, Inc., Coralville, IA) (Table S2b). Once the stem-loop structure was formed, the quencher (Iowa Black RQ) suppressed the fluorescence intensity of the cyanine dye (Cy3 or Cy5). After target RNAs were captured by LT-DNA, the loop structures were disrupted and the fluorescence increased. All measurements were made in 100 μl solutions containing 0.6 μM LT-DNA with twice the molar amount (12 × 10⁻⁵ μmol) of complementary target RNA (mRNA EZH2 and miRNA21) relative to the LT-DNA. The fluorescence of these reaction mixtures was measured by excitation with 512 nm and 550 nm laser light sources in a SpectraMax M5 (Molecular Devices, Sunnyvale, CA). Cy3 was measured with excitation at 512 nm/emission at 614 nm, and Cy5 was measured with excitation