Scalable amplification of strand subsets from chip-synthesized oligonucleotide libraries

Synthetic oligonucleotides are the main cost factor for studies in DNA nanotechnology, genetics and synthetic biology, which all require thousands of these at high quality. Inexpensive chip-synthesized oligonucleotide libraries can contain hundreds of thousands of distinct sequences, but only at sub-femtomole quantities per strand. Here we present a selective oligonucleotide amplification method, based on three rounds of rolling-circle amplification, that produces nanomole amounts of single-stranded oligonucleotides per millilitre reaction. In a multistep one-pot procedure, subsets of hundreds or thousands of single-stranded DNAs with different lengths can be selectively amplified and purified together. These oligonucleotides are used to fold several DNA nanostructures and as primary fluorescence in situ hybridization probes. The amplification cost is lower than that of other reported methods (typically around US$ 20 per nanomole total oligonucleotides produced) and is dominated by the use of commercial enzymes.

Typically, bands longer than 50 nt correspond to incompletely cut concatemers. Lanes with odd numbers are one-barcode designs, lanes with even numbers two-barcode designs. The gel demonstrates that all 16 amplification reactions produced a comparable amount of oligonucleotides. Reactions 2 and 4 contained oligonucleotides for a scaffold-free structure (c); 13 and 14 contained the 48-helix bundle structure, and 15 and 16 contained the planar rectangular origami structure. (b) HPLC trace at 260 nm of subpool nicking reaction 2. Fraction 1 contains the excess of the two split nicking primers; fraction 2 contains the 32 nt target oligonucleotides; fraction 3 the excised intervening sequence; fraction 4 contains the 48 nt target oligonucleotides; and fraction 5 all uncut concatemers. (c) TEM image of a scaffold-free single-stranded tile structure folded with fractions 2 and 4 and synthetic oligonucleotides. Even though all oligonucleotides in a fraction have the same length, they can have slightly different retention times due to their different base compositions, giving rise to the multiple peaks observed in the chromatogram.

Lane 1: 10 bp marker; lane 2: an aliquot of the crude oligonucleotide library, which contained too little material to show up in the gel (otherwise, a band at 100 nt should be visible). (3-6) 2 µl aliquots of the 5 last HindIII digest, (7-10) analysis of the crude double nicking reaction before workup. Bands at 70 nt are the primary FISH probes; bands at 100 nt are FISH probes plus the 30 nt intervening sequence. (8) was probe set 82D2-82D5; (9) contained probe set 82A1. For the FISH experiments, the crude double nicking reaction was purified by HPLC and the 70 nt products were collected. Subpool (6/10) did not amplify (presumably due to a pipetting error).

Supplementary Note 1: One- or two-barcode design?
In this work, we tested two different template strand designs: one with one and one with two 10 nt barcodes per intervening sequence (Supplementary Fig. 1). The rationale for one barcode was to keep the template oligonucleotides as short as possible, because sequence errors accumulate with increasing length during oligonucleotide synthesis. Two barcodes, on the other hand, increase the sequence space available for different barcodes.
The two-barcode design has several advantages over the one-barcode design:
1) Structures shown in Figure 4 were folded from oligonucleotides amplified with the two-barcode design. Both designs amplified oligonucleotides that could be folded; however, the oligonucleotides amplified from the two-barcode designs folded more reliably and with
2) The intervening sequence is longer and can be removed more easily by HPLC chromatography (discussed in Supplementary Note 3). This is particularly true where short oligonucleotides (such as 25-mers) were produced. For these, a larger retention time difference of short incompletely nicked oligonucleotides (e.g. 25 nt + the intervening sequence) from the longest target sequence (e.g. 52 nt + intervening sequence) is desirable.
3) For the two-barcode design, we also tested nicking primers consisting of two halves, the Nb.BsrDI nicking primer and the Nt.BspQI nicking primer. The split nicking primers were only 20 nt long, and the excess of primers could be separated more easily by anion exchange chromatography from target sequences (typically 26-48 nt) than the 40 nt full-length nicking primer.
4) The intervening sequence is cut close to the center (Supplementary Fig. 1d) after the second round restriction digest, whereas the one-barcode version is cut close to the transition to the target sequence (MMM sequence in b). The latter may lead to a less efficient religation for the third and final RCA step because of the low melting temperature of the short fragment.
For the FISH experiments, however, we chose the one-barcode design to keep the length of the template at 100 nt (recommendation from LC Sciences).
Moreover, we tested different nicking enzymes cutting the upstream strand (Nb.BtsI and Nb.BsrDI), but found no significant differences in the performance of the two enzymes in the digest of RCA products.

Supplementary Note 2: Oligonucleotide and primer sequences
Oligonucleotide libraries: To generate the sequences for a library order, we attached barcode and restriction enzyme sequences to the reverse complement of the target sequence as in Supplementary Fig. 1. Note that the second round primers of the one-barcode design are identical to the nicking primers used after the final round of rolling circle amplification. "48 hb" is a so-called multilayered 48-helix bundle origami structure (Fig. 4b); 6x6x64 is a scaffold-free DNA brick structure (adapted from Ref. 5).
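The template construction described above can be sketched in a few lines of Python. This is a minimal illustration only: the flank sequences, the example target and the function names are hypothetical placeholders, and the exact arrangement of barcodes and enzyme sites follows Supplementary Fig. 1, which is not reproduced here.

```python
# Minimal sketch: build a library template from a target sequence.
# LEFT and RIGHT are hypothetical placeholders, NOT the barcodes or
# enzyme recognition sites used in this work.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def build_template(target: str, left_flank: str, right_flank: str) -> str:
    """Flank the reverse complement of the target with the two halves
    of an intervening sequence (barcode plus enzyme sites)."""
    return left_flank + reverse_complement(target) + right_flank


# Hypothetical 15 nt flanks, together a 30 nt intervening sequence.
LEFT = "ACGTACGTACGTACG"
RIGHT = "TTGCATTGCATTGCA"

target = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGC"  # hypothetical 32 nt target
template = build_template(target, LEFT, RIGHT)
print(len(template), template)
```

Because the template is later circularized, how the intervening sequence is divided between the two flanks is a design choice rather than a constraint in this sketch.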

Primers for the FISH library:
The library for the FISH probes was ordered from LC Sciences in the one-barcode design to keep the final length of the sequences at 100 nt (70 nt target sequences + 30 nt intervening sequences). Note that the second round primers are identical to the nicking primers.
For the HPLC workup, guanidinium was chosen as the cation because it is slightly denaturing (chaotropic). Sodium is the most common cation for anion exchange chromatography, but it stabilizes DNA duplexes. To achieve fully denaturing conditions with a NaCl buffer, the column had to be heated to 85 °C, which decreases its lifetime considerably, and more DNA damage can be expected than at lower temperatures. With guanidinium chloride, denaturing conditions were obtained at lower temperatures and with a comparable resolution.

The HPLC trace in Supplementary Fig. 2 b shows a fully denaturing experiment, where primers and intervening sequence elute as single-stranded DNA without forming a duplex.
To achieve fully denaturing conditions, the crude material can be desalted and the temperature can be increased (e.g. to 85 °C).
Partially stabilizing the duplex, either by lower temperatures or by precipitating the crude reaction in the presence of 10 mM MgCl2, allows stable double strands such as the complex between intervening sequence and nicking primer(s) to stay double-stranded and to have retention times comparable to 60 nt (single barcode) or 80 nt (two barcodes). This way, most of the unwanted intervening sequence can be removed more easily from staple strands (Supplementary Fig. 3 and Fig. 3, lane 4).
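The 60 nt and 80 nt figures quoted above are consistent with a simple rule of thumb, sketched below, in which a fully paired duplex elutes like a single strand whose length is the sum of both strands. This rule, the function names, and the assumption that the one-barcode nicking primer is 30 nt and the two-barcode intervening sequence is 40 nt are illustrative assumptions, not statements from the protocol.

```python
# Rough heuristic (assumption): on anion exchange, a fully paired duplex
# behaves like a single strand with the combined length of both strands.

def duplex_equivalent_nt(strand_a_nt: int, strand_b_nt: int) -> int:
    """Apparent single-strand length of a duplex under the heuristic."""
    return strand_a_nt + strand_b_nt


def separation_margin_nt(longest_target_nt: int,
                         intervening_nt: int,
                         primer_nt: int) -> int:
    """Length gap between the duplex and the longest target strand."""
    return duplex_equivalent_nt(intervening_nt, primer_nt) - longest_target_nt


# One-barcode: 30 nt intervening sequence, primer assumed to be 30 nt.
one_barcode = duplex_equivalent_nt(30, 30)
# Two-barcode: intervening sequence assumed 40 nt, 40 nt of nicking primer
# (either full length or two 20 nt halves).
two_barcode = duplex_equivalent_nt(40, 40)
# Gap to a 48 nt target in the two-barcode design.
margin = separation_margin_nt(48, 40, 40)
print(one_barcode, two_barcode, margin)
```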

Supplementary Note 4: Comparison with other methods
Our method compares favorably with other published amplification methods. In this note we summarize the advantages of the method presented herein over competing methods.

Supplementary Note 5: Yields and costs
Starting from attomoles of circular templates, we determined by quantitative real-time PCR that an 8-hour RCA reaction can yield up to ~9400 copies of templates of 72 nucleotides. The last rolling circle amplification (and in some experiments even the second round), however, yields a viscous, gelatinous solution, which is very difficult to pipette because the concatemers tangle up. This also makes it impossible to aliquot an RCA reaction that reaches these high concentrations. Experimenting with different reaction conditions revealed an upper limit of around 20 µM copies of typical 30-50-mers in an RCA reaction.
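To relate the ~20 µM upper limit to the amounts obtained per millilitre of reaction, the conversion can be sketched as follows. The function names are hypothetical, and the mass estimate uses the common approximation of about 330 g/mol per nucleotide of single-stranded DNA, which is an assumption and not a value from this work.

```python
# Convert a copy concentration in an RCA reaction into molar amount and
# approximate mass. 1 µM = 1 nmol/mL, so µM x mL gives nmol directly.

AVG_NT_MASS_G_PER_MOL = 330.0  # assumed average mass per ssDNA nucleotide


def copies_nmol(conc_uM: float, volume_mL: float) -> float:
    """Amount of oligonucleotide copies in nanomoles."""
    return conc_uM * volume_mL


def approx_mass_ug(amount_nmol: float, length_nt: int) -> float:
    """Approximate mass in micrograms for oligonucleotides of one length."""
    # nmol x g/mol = ng; divide by 1000 to obtain µg.
    return amount_nmol * length_nt * AVG_NT_MASS_G_PER_MOL / 1000.0


amount = copies_nmol(20.0, 1.0)    # 20 µM copies in a 1 mL reaction
mass = approx_mass_ug(amount, 40)  # a 40-mer, mid-range of 30-50 nt
print(f"{amount:.0f} nmol, about {mass:.0f} µg")
```

Under these assumptions, the concentration limit corresponds to roughly 20 nmol of copies per millilitre, in line with the nanomole-per-millilitre scale stated in the abstract.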
This is in good agreement with the findings of Dahl (Ref. 24). The third round is therefore not limited by the amount of template or dNTPs, but mainly by the upper concentration limit of concatemer copies. The mechanism for this is not clear, but it may be due to steric hindrance of tangled concatemers. Due to the gelatinous nature of the RCA products, the concentration of concatemers could only be determined by one of two methods: 1) Through the digestion of a final nicking reaction and the workup with a silica column
The current cost is about 0.1 cents/base in commercial oligonucleotide libraries. We expect this price to drop substantially, as the production cost of oligonucleotide libraries is not dominated by chemical costs as in standard column-based oligonucleotide synthesis.
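As a rough illustration of how these figures combine, the sketch below estimates the library purchase cost from the per-base price and the amplification cost from the approximate US$ 20 per nanomole quoted in the abstract. The function names and the example library size are hypothetical.

```python
# Back-of-the-envelope cost estimate. The two constants come from the
# text; the library size used in the example is hypothetical.

CENTS_PER_BASE = 0.1   # chip-synthesized library price per base
USD_PER_NMOL = 20.0    # typical amplification cost per nmol of product


def library_cost_usd(n_sequences: int, length_nt: int,
                     cents_per_base: float = CENTS_PER_BASE) -> float:
    """Purchase cost of a library of equal-length template strands."""
    return n_sequences * length_nt * cents_per_base / 100.0


def amplification_cost_usd(total_nmol: float,
                           usd_per_nmol: float = USD_PER_NMOL) -> float:
    """Enzyme-dominated cost of producing a given total amount."""
    return total_nmol * usd_per_nmol


per_template = library_cost_usd(1, 100)       # one 100 nt template
whole_library = library_cost_usd(10_000, 100)  # hypothetical library size
production = amplification_cost_usd(5.0)       # 5 nmol of total product
print(per_template, whole_library, production)
```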
These data suggest that 100% digestion of the concatemers cannot be reached. We do not have a satisfactory explanation for this behavior; increasing the reaction time to 16 hours also did not eliminate the concatemers. Unfortunately, we found that residual concatemers present in the oligonucleotide pool impede the folding of both scaffold-free single-stranded tile structures and DNA origami. Moreover, the nicking primers, which were always added in large excess, may be problematic in certain applications. We therefore decided to limit the usage of nicking enzymes to 0.8 U/µl of final RCA reaction, to increase the reaction time to 16 h, at which point the desired products are already the main product, and to remove the remaining concatemers and nicking primers by anion exchange HPLC (Supplementary Note 3). In applications such as FISH, where residual concatemers or nicking primers may be tolerated, one may skip the HPLC step.

RCA yield
Initially, we did not test different concentrations of dNTPs as we had no indication that the amplification yield in our protocol is suboptimal. Our yield exceeds the 15 µM copies On the other hand, an elevated dNTP concentration could inhibit the polymerase efficiency and therefore we tested the RCA efficiency as a function of the dNTP concentration between 0.1 and 1.6 mM (each) for two different amplification times (3 h and 14 h).
The experiment in Supplementary Figure 7 simulates a last round RCA reaction. For this, 3 synthetic templates (67, 77 and 87 nt) with a randomized target sequence were circularized, and aliquots were amplified in parallel with different dNTP concentrations. After the indicated times, the reactions were heat inactivated, a nicking primer was added and the RCA product was digested to monomers with Nt.BspQI. Only one of the two nicking enzymes was used, to generate an easy-to-analyze digestion pattern in which only template monomers and multimers are produced, but no fragments containing unequal numbers of target and intervening sequences. The digestion products were analyzed by denaturing PAGE, and the intensities of the gel images were extracted with ImageJ. In the left diagram, the normalized total mass of RCA products (monomers and all multimers) was analyzed. The right diagram shows the normalized mass of digested monomers alone.
The left diagram reveals that higher dNTP concentrations yield more RCA product; therefore, no inhibition of the polymerase seems to take place here. However, the normalized mass of digested monomers reaches a maximum at 0.4 mM. Therefore, a final concentration of 0.4 mM may be closer to optimal for the production of oligonucleotides than the ~0.9 mM used in the protocol. The differences between the two conditions are, however, only a few percent. The data further suggest that shortening the RCA reaction time to 3 hours does not drastically decrease the yield.
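The two normalized quantities compared above can be computed from band intensities as in the following sketch. It assumes that each curve is normalized to its own maximum, which the text does not specify, and the intensity values are made-up placeholders chosen only to exercise the code; they are not measurements from Supplementary Figure 7.

```python
# Sketch of the gel quantification: total RCA product versus digested
# monomers, each normalized to its own maximum (an assumption).
# Intensities are arbitrary placeholder numbers, NOT measured data.

def normalize(values: dict) -> dict:
    """Scale all values so that the largest one equals 1.0."""
    peak = max(values.values())
    return {key: value / peak for key, value in values.items()}


# dNTP concentration (mM each) -> (monomer intensity, multimer intensity)
lanes = {
    0.1: (10.0, 5.0),
    0.4: (30.0, 20.0),
    1.6: (25.0, 45.0),
}

# Left diagram: monomers plus all multimers.
total_mass = normalize({c: mono + multi for c, (mono, multi) in lanes.items()})
# Right diagram: digested monomers alone.
monomer_mass = normalize({c: mono for c, (mono, _) in lanes.items()})

for conc in sorted(lanes):
    print(conc, round(total_mass[conc], 2), round(monomer_mass[conc], 2))
```

Plotting the two dictionaries against dNTP concentration would give curves of the same kind as the left and right diagrams, with the position of each maximum read off directly.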