Encoding quantized fluorescence states with fractal DNA frameworks

Signal amplification in biological systems is achieved by cooperatively recruiting multiple copies of regulatory biomolecules. Nevertheless, the multiplexing capability of artificial fluorescent amplifiers is limited by their size limit and lack of modularity. Here, we develop Cayley tree-like fractal DNA frameworks to topologically encode fluorescence states for multiplexed detection of low-abundance targets. Taking advantage of the self-similar topology of the Cayley tree, we use only 16 DNA strands to construct n-node (n = 53) structures of up to 5 megadalton. The high level of degeneracy allows encoding of 36 colours with 7 nodes by site-specific anchoring of distinct fluorophores onto a structure. The fractal topology minimises fluorescence crosstalk and allows quantitative decoding of quantized fluorescence states. We demonstrate a spectrum of rigid-yet-flexible super-multiplex structures for encoded fluorescence detection of single-molecule recognition events and multiplexed discrimination of living cells. Thus, the topological engineering approach enriches the toolbox for high-throughput cell imaging.
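The node count quoted above is consistent with a simple shell-by-shell count of a Cayley tree. The sketch below assumes (this is not stated in the text) that the 53-node structure has a coordination number of 4 and three generations around a central node; the function name and its arguments are illustrative.

```python
def cayley_tree_nodes(coordination, generations):
    """Count the nodes of a Cayley tree, shell by shell.

    coordination: number of neighbours of each interior node (assumed value).
    generations:  number of shells around the central node (assumed value).
    """
    total = 1                # the central node
    shell = coordination     # the first shell holds `coordination` nodes
    for _ in range(generations):
        total += shell
        shell *= coordination - 1   # each shell node branches into (coordination - 1) children
    return total


if __name__ == "__main__":
    # 1 + 4 + 12 + 36 nodes for coordination 4 and three generations
    print(cayley_tree_nodes(4, 3))
```

Under the same counting, a linear arrangement (coordination number 2) with three generations has 7 nodes.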


Supplementary Methods
HPLC characterization and purification. HPLC purification and characterization of TDN nodes and FDF structures with no more than 7 nodes were carried out on an Agilent 1260 system. An SEC column (Phenomenex BioSec-SEC-S4000, 300 × 7.8 mm) was used to characterize and purify the TDNs and some of the FDFs (F2,n, F3,1, and F4,1). Chromatograms were recorded at 260 nm. The mobile phase was 25 mM Tris-HCl, pH 7.2, 450 mM NaCl, with a flow rate of 1 ml min⁻¹.
Gel characterization and purification. For gel characterization and purification of TDNs, polyacrylamide gel electrophoresis (PAGE) was carried out in an 8% polyacrylamide gel (acrylamide/bis-acrylamide ratio, 29:1) using a Bio-Rad vertical gel electrophoresis system (typically 120 V, 100 min). For FDFs, electrophoresis was carried out in an agarose gel (Bio-Rad, 1% w/v) using a Bio-Rad horizontal gel electrophoresis system (typically 85 V, 60 min). The loading buffer contained 50% glycerol and tracking dyes (Bromophenol Blue and Xylene Cyanol FF). The electrophoresis buffer was 1× TBE (tris-borate-EDTA). After electrophoresis, the gel was stained with GelRed (Biotium, USA) following the protocol provided by the manufacturer.
STORM imaging of FDFs. STORM imaging (N-STORM super-resolution microscope, Nikon) was performed using inclined illumination with an excitation intensity of ~200 W/cm² at 405 nm, 561 nm or 647 nm, following the protocols provided by the manufacturer. All images were reconstructed from time-lapse movies of more than 15,000 frames acquired with a 20 ms integration time. For monocolour imaging, the images were reconstructed using spot-finding and Gaussian-fitting algorithms in ImageJ. Fluorescent microspheres were used as fiducial markers: the position of each marker was tracked, and the resulting drift correction was applied in the final super-resolution reconstruction. For multicolour imaging, Nikon N-STORM analysis software was used for image reconstruction.
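Fiducial-based drift correction can be illustrated with a minimal sketch. It is not the routine used by ImageJ or the Nikon software. It assumes that drift is a pure translation, that every fiducial is detected in every frame, and that the drift in each frame is the mean displacement of the fiducials from their first-frame positions. The function name and data layout are hypothetical.

```python
def correct_drift(localizations, fiducial_tracks):
    """Subtract fiducial-derived drift from single-molecule localizations.

    localizations:   list of (frame_index, x, y) tuples.
    fiducial_tracks: list of tracks, one per fiducial marker; each track is a
                     list of (x, y) positions with one entry per frame.
    """
    n_frames = len(fiducial_tracks[0])
    n_fiducials = len(fiducial_tracks)
    drift = []
    for t in range(n_frames):
        # Mean displacement of all fiducials relative to frame 0
        dx = sum(track[t][0] - track[0][0] for track in fiducial_tracks) / n_fiducials
        dy = sum(track[t][1] - track[0][1] for track in fiducial_tracks) / n_fiducials
        drift.append((dx, dy))
    # Shift every localization back by the drift of its frame
    return [(f, x - drift[f][0], y - drift[f][1]) for f, x, y in localizations]


if __name__ == "__main__":
    track = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]   # one fiducial drifting along x
    print(correct_drift([(2, 12.0, 5.0)], [track]))
```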

Molecular Dynamics Simulations. The simulations of linear and network DNA tetrahedron structures were carried out using oxDNA with the sequence-dependent parametrization of hydrogen-bonding and stacking interactions¹,². The simulations were carried out on NVIDIA GPUs using molecular dynamics (MD) simulation with an Andersen-like thermostat and a simulation time step of 0.009 ps³. The temperature was set to 20 °C. We ran the MD simulation of each DNA nanostructure for a number of steps corresponding to 30 μs. Moreover, to speed up the sampling of different conformations, a diffusion coefficient of 7.6×10⁻⁸ m²/s was applied for a 14 bp duplex in the simulation, which corresponds to approximately 600 times faster diffusion than observed experimentally⁴. To obtain the average size of each nanostructure and calculate the mean deviation, we saved 1000 different conformations from the MD simulation. We then randomly picked one structure from the ensemble and aligned all of the remaining structures onto it so that the root-mean-square distance between the centers of mass of all corresponding nucleotides was minimized.
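The alignment step can be sketched with the Kabsch algorithm, a standard way to find the rigid rotation and translation that minimize the root-mean-square distance between two matched point sets. The text does not say which algorithm or tool was used, so this is only an illustration. It assumes NumPy is available and that each conformation is an (N, 3) array of nucleotide centre-of-mass coordinates in the same nucleotide order. The function names are hypothetical.

```python
import numpy as np


def kabsch_align(mobile, target):
    """Rigidly superimpose `mobile` onto `target`; return (aligned coordinates, RMSD)."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    P = mobile - mobile_centroid           # centred mobile coordinates
    Q = target - target_centroid           # centred target coordinates
    U, _, Vt = np.linalg.svd(P.T @ Q)      # SVD of the 3x3 covariance matrix
    # Flip the last axis if needed so that the result is a proper rotation
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    aligned = P @ R.T + target_centroid
    rmsd = float(np.sqrt(((aligned - target) ** 2).sum(axis=1).mean()))
    return aligned, rmsd


def align_ensemble(conformations, reference_index=0):
    """Align every conformation onto the chosen reference conformation."""
    reference = np.asarray(conformations[reference_index], dtype=float)
    return [kabsch_align(c, reference) for c in conformations]
```

In this sketch, averaging the aligned coordinates over the ensemble gives a mean structure, and the per-conformation RMSD values give one measure of deviation from the reference.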
Decoding accuracy estimation. To evaluate the reliability of barcode decoding, we established an estimation model based on computer-generated samples bearing random errors. Take the 7-node FDF barcodes as an example. We first used MATLAB to generate a large number of three-dimensional vectors representing the barcodes (36 barcode species, N = 1000 each) as combinations of three fluorophore species (e.g., the vector [1,1,5] indicates the barcode with 1 Cy5, 1 ROX, and 5 A488 dyes). Based on our experimental observations, we assumed that each fluorophore on a barcode has a 5% chance of being lost in fluorophore counting and a 1% chance of being overcounted. Variations with these probabilities were introduced into the computer-generated vectors, mimicking real barcode samples with errors in fluorophore counting.
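The sample-generation step can be sketched in Python (the original work used MATLAB, and its code is not reproduced here). The sketch assumes that each of the 7 nodes carries exactly one of the three dyes, which gives C(9,2) = 36 count vectors, in agreement with the 36 barcode species. It further assumes that loss and overcounting are mutually exclusive events for each dye and that an overcounted dye is counted twice. The function names are hypothetical.

```python
import random
from itertools import product

N_NODES = 7      # nodes per barcode (from the text)
P_LOSS = 0.05    # chance that a dye is missed in counting (from the text)
P_OVER = 0.01    # chance that a dye is overcounted (from the text)


def enumerate_barcodes(n_nodes=N_NODES, n_dyes=3):
    """All dye-count vectors whose entries sum to n_nodes."""
    return [c for c in product(range(n_nodes + 1), repeat=n_dyes) if sum(c) == n_nodes]


def perturb(barcode, rng, p_loss=P_LOSS, p_over=P_OVER):
    """Return the observed counts after random counting errors."""
    observed = []
    for count in barcode:
        seen = 0
        for _ in range(count):
            r = rng.random()
            if r < p_loss:
                continue             # dye lost: contributes nothing
            elif r < p_loss + p_over:
                seen += 2            # dye overcounted: assumed to be counted twice
            else:
                seen += 1            # dye counted correctly
        observed.append(seen)
    return tuple(observed)


if __name__ == "__main__":
    rng = random.Random(0)
    barcodes = enumerate_barcodes()
    # 1000 noisy samples per barcode species, as in the text
    samples = [(b, perturb(b, rng)) for b in barcodes for _ in range(1000)]
    print(len(barcodes), len(samples))
```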
Next, the vectors of these computer-generated samples were identified one by one using MATLAB. Briefly, if a sample vector matches any of the standard barcodes, it is marked as "matched" (Supplementary Fig. 16). Otherwise, it is "unmatched". Among the matched samples, a few still matched incorrect barcodes ("matched incorrectly"). Conversely, some of the unmatched samples could still be correctly identified via the cosine similarity analysis given by Equation (1):
$$\cos\theta = \frac{\mathrm{sample}\cdot\mathrm{reference}}{\lVert\mathrm{sample}\rVert\,\lVert\mathrm{reference}\rVert} \qquad (1)$$
where "sample" refers to a sample vector that needs to be identified; "reference" indicates the vector of a standard barcode; and θ is the angle between them in their vector space, which reflects their similarity (a larger cos θ indicates a higher similarity).
Each unmatched sample was compared with every reference barcode using this equation. The reference barcode giving the maximal cos θ is regarded as the estimated answer for a given sample. If the estimated answer is correct ("estimated correctly"), the sample is also regarded as correctly decoded.
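The two-stage decoding rule (exact match first, then the reference with maximal cos θ) can be sketched as follows. This is a Python illustration, not the MATLAB code used in the study. The reference set is assumed to consist of all three-dye count vectors summing to 7, and the function names are hypothetical.

```python
import math
from itertools import product

# Assumed reference set: all (Cy5, ROX, A488) count vectors summing to 7 nodes
REFERENCES = [c for c in product(range(8), repeat=3) if sum(c) == 7]


def cosine(a, b):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def decode(sample, references=REFERENCES):
    """Return (decoded barcode, status) for one observed count vector."""
    sample = tuple(sample)
    if sample in references:
        return sample, "matched"
    # Unmatched: take the reference with the largest cosine similarity
    best = max(references, key=lambda ref: cosine(sample, ref))
    return best, "estimated"


if __name__ == "__main__":
    print(decode((1, 1, 5)))   # an exact match
    print(decode((1, 1, 4)))   # one dye missing: resolved by cosine similarity
```

Comparing each decoded barcode with the barcode from which the sample was generated then sorts the samples into the four classes described above: matched correctly, matched incorrectly, estimated correctly, and estimated incorrectly.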
Analysis of stability in cell medium. To analyze the serum stability of FDFs, the constructs were suspended in 1640 culture medium containing 10% (v/v) fetal bovine serum (FBS) to a final structure concentration of 10 nM. The mixture was incubated at 37 °C, and 20 µl aliquots were collected after 2, 8, 12, and 24 h for analysis by gel electrophoresis.

Supplementary Tables
Supplementary Table 1. Sequences used in this study.