Compared with conventional methods, single-molecule real-time (SMRT) DNA sequencing exhibits longer read lengths, less GC bias, and the ability to read DNA base modifications. However, reading DNA sequence from sub-nanogram quantities is impractical owing to inefficient delivery of DNA molecules into the confines of zero-mode waveguides, the zeptolitre optical cavities in which DNA sequencing proceeds. Here, we show that the efficiency of voltage-induced DNA loading into waveguides equipped with nanopores at their floors is five orders of magnitude greater than that of existing methods. In addition, we find that DNA loading is nearly length-independent, unlike diffusive loading, which is biased towards shorter fragments. We demonstrate loading and proof-of-principle four-colour sequence readout of a polymerase-bound 20,000-base-pair-long DNA template within seconds from a sub-nanogram input quantity, a step towards low-input DNA sequencing and mammalian epigenomic mapping of native DNA samples.
Single-molecule real-time (SMRT) DNA sequencing1 has opened many avenues in genomic interrogation1,2,3. In SMRT sequencing, DNA strand replication by an individual DNA polymerase is optically measured using fluorescently labelled dNTP analogues. An essential component of SMRT sequencing is the zero-mode waveguide (ZMW)4, a zeptolitre-volume cylindrical cavity (∼100 nm diameter and height) in which the DNA–polymerase complex is immobilized4. The main advantages of SMRT sequencing over second-generation sequencing methods include long average read lengths of more than 10,000 bases and lack of GC bias3,5,6, critical for gap-free sequencing, and the ability to directly detect DNA base modifications by monitoring polymerase kinetics2. Apart from DNA sequencing, ZMWs have been exploited for single-molecule RNA sequencing and epigenetics7 as well as a variety of other single-molecule studies8,9,10,11,12,13.
A critical limiting step of SMRT sequencing is the loading of long DNA templates into ZMW confinements. For a DNA template to be sequenced, a polymerase-bound DNA template must bind to the bottom of the ZMW through biotin–streptavidin (Stv) chemistry, a process that requires substantial DNA sampling time inside the ZMW. Mismatch between the equilibrium hydrodynamic diameter of long DNAs (>560 nm for >10,000 base pairs (bp)14) and the ZMW diameter (100–150 nm) creates an entropic barrier to molecular entry under diffusive conditions15,16. Under diffusive conditions this barrier biases entry of short DNA templates over long ones, or conversely, favours fast escape of longer DNA from the confinement over short DNA escape17. Although magnetic bead assays have been developed to improve loading efficiencies, input DNA requirements are still above nanogram levels, and it is critical that shorter DNA fragments are completely removed to avoid competitive binding. Therefore, despite available methods for producing sequencing libraries from low-input DNA (for example, sub-nanogram)18,19, the potential of SMRT sequencing for epigenetics from low-input libraries, for example, from needle biopsies and single cells, can only be realized when sub-nanogram inputs can be efficiently loaded into ZMWs.
We have recently introduced nanopore-ZMWs (NZMWs)20, which allow rapid electrical loading of DNA molecules from solution into ZMW cavities. In this device, an array of waveguides, which have nanopores at their bases, sits atop thin insulating membranes. Application of voltage across NZMWs generates an electric field that draws charged molecules into the sequencing volume. In this work, we investigate electrophoretic packaging and binding of DNA molecules inside NZMWs. We find that DNA loading rates are virtually DNA length independent, and that overall loading efficiencies are 5–6 orders of magnitude higher than for diffusive loading/binding. Second, despite the presence of a nanopore in an NZMW, which normally translocates DNA coils, we find extremely long dwell times of DNA inside NZMWs, which we attribute to coil frustration due to an interplay of the electric field and geometric confinement. Despite this, binding of Stv-end-labelled DNA to the biotinylated NZMW floor is highly efficient, which is surprising given the coil entanglement inside the NZMW cavity. Finally, we demonstrate the rapid loading from sub-nanogram amounts of a 20,000 bp DNA template, and show proof-of-principle four-colour sequence readout from this template sequence.
Figure 1 describes the main features of our experimental setup. A scanning electron micrograph of a ZMW array on a silicon wafer is shown in Fig. 1a, along with a transmission electron micrograph of one NZMW from a small sub-array generated on the device. Our microscope design spectrally probes each NZMW in the array, while allowing simultaneous electrical control over DNA loading using a pair of electrodes. The use of three laser lines allowed excitation of YOYO-1-stained DNA for studies of its packing inside NZMWs, as well as for four-colour readout of the SMRT sequencing nucleotide analogues. A confocal pinhole array is placed in registry with the NZMWs, and spectral resolution is achieved using a prism that linearly disperses the emission from each NZMW21 (Fig. 1b), allowing detection of the four dye-phospholinked DNA bases (Fig. 1c). Inherent photoluminescence background from the silicon nitride substrate, which has overwhelmingly high orange–red photoluminescence22,23 (Fig. 1d), was reduced ∼40-fold by first fabricating NZMWs on a 20-nm-thick SiO2 film atop plasma-etched silicon nitride, and then back-etching to remove the photoluminescent nitride layer (Supplementary Fig. 2). Photoluminescence background of the resulting freestanding SiO2 membrane layer is then sufficiently low, allowing high signal-to-noise single-molecule fluorescence measurements.
Voltage-driven DNA packing into NZMWs
Force-induced DNA packaging into confinements is a common process in viral life cycles24. This process requires energy to overcome entropic, elastic and electrostatic contributions associated with DNA compression. For the very same reasons, the efficiency of diffusion-based DNA loading into ZMWs is very low, particularly for large DNA molecules (>1 kilobase pairs (kb)), whose radius of gyration exceeds the ZMW diameter. In NZMWs, the nonlinear d.c. electric field20 generated by voltage application (Fig. 2a, left inset) can provide the DNA with the required energy for DNA packing (see also Supplementary Fig. 3). We studied this process by fluorescently labelling DNA (see Methods), and then recording NZMW fluorescence as a function of time. Using an array of six NZMWs, each with a 3–5 nm nanopore at its base, we investigated DNA capture into the NZMWs21,22 for DNA lengths ranging in size from 1 kb to 48.5 kb. On arrival in the NZMW illumination volume the DNA molecule emits fluorescence, resulting in intense spikes that are followed by slow YOYO-1 bleaching (mean decay constant = 350 ± 240 ms, see Supplementary Fig. 4). Colour-coded representative traces are plotted for a range of DNA lengths at 200 mV applied bias (Fig. 2a). As seen in the spike traces, spike intensity correlates with DNA length. A greater intensity for longer DNA molecules implies that more DNA bases enter the NZMW excitation volume4. By assuming uniform vertical DNA packing and an exponentially decaying ZMW illumination field, we solved a simple model for the expected fluorescence, F, as a function of number of base pairs, N (Supplementary Section 5):
$$F(N) = A\left(1 - e^{-bN/\Lambda}\right) \qquad (1)$$
where A is an amplitude fit parameter, b is the height of packed DNA per base pair, and Λ is a decay length constant. In Fig. 2b, we plot the experimental peak intensities measured for 35–340 molecules of each length, where the dashed line is a two-parameter fit to equation (1) using Λ = 20 nm (refs 4,25).
The fit goes through the data points within error, yielding b = 2.5 ± 1.5 nm kb–1. This value of b corresponds to the height of a DNA cross-section, ∼2 nm, which can be thought of as the DNA wrapping around the inner ZMW volume once for every 1 kb length (340 nm contour length). Based on this fit, we obtain a DNA base pair density within the NZMW of 0.050 bp nm–3, roughly an order of magnitude smaller than that within viruses (∼0.2–0.6 bp nm–3)26, yet almost four orders of magnitude higher than in free solution (∼10−5 bp nm–3)14.
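The packing numbers quoted above can be checked with a short calculation. The sketch below assumes the saturating-exponential form that follows from uniform vertical packing under an exponentially decaying illumination field, F(N) = A(1 − exp(−bN/Λ)), and a cylindrical cavity of 100 nm diameter; the function names and the cavity diameter used for the density estimate are illustrative assumptions, not values taken from the Supplementary Information.

```python
import math


def expected_fluorescence(n_kb, amplitude, b_nm_per_kb, decay_nm=20.0):
    """Assumed model: F(N) = A * (1 - exp(-b * N / Lambda)).

    n_kb is the DNA length in kb, b_nm_per_kb the packed height per kb,
    and decay_nm the illumination decay length (20 nm in the text).
    """
    return amplitude * (1.0 - math.exp(-b_nm_per_kb * n_kb / decay_nm))


def packed_density_bp_per_nm3(b_nm_per_kb, zmw_diameter_nm=100.0):
    """Base pairs per nm^3 if each kb fills a disc of height b in the cavity."""
    radius_nm = zmw_diameter_nm / 2.0
    volume_per_kb_nm3 = math.pi * radius_nm ** 2 * b_nm_per_kb
    return 1000.0 / volume_per_kb_nm3


# Fitted height per kb from the text, assumed 100 nm cavity diameter.
density = packed_density_bp_per_nm3(2.5)

# Fraction of the saturation signal reached by the shortest and longest DNA.
fraction_1kb = expected_fluorescence(1.0, 1.0, 2.5)
fraction_lambda = expected_fluorescence(48.5, 1.0, 2.5)
```

With b = 2.5 nm kb–1 this gives a density close to 0.05 bp nm–3, in line with the value reported above, and shows that under this model a 48.5 kb molecule essentially saturates the illuminated volume while a 1 kb molecule reaches only a small fraction of the plateau.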
The first step of DNA packing into an NZMW is capture of the molecule from bulk into the NZMW volume. From the interspike duration statistics for different DNA lengths, in Fig. 2c (top) we plot the mean capture rate RC as a function of DNA length, where RC has been concentration-normalized (voltage = 200 mV). Given the energy barrier for confining the DNA coil inside an NZMW, one would expect shorter DNAs to be captured more efficiently than longer molecules. The shortest polymer studied was 1 kb DNA, which has a radius of gyration of ∼60 nm (ref. 27). The rates were weakly dependent on DNA lengths in the range 1–48.5 kb, mildly increasing with DNA length. This contrasts with diffusive loading into ZMWs, in which case shorter molecules (below 1 kb) are strongly favoured20, but is overall not vastly different from DNA capture behaviour into nanopores28 or nanopipettes29,30. Absence of any dependence of capture on DNA length into NZMWs would suggest that capture is mostly governed by voltage-induced DNA drift outside the NZMW31,32, which displays no length dependence owing to the invariant mobility of DNA in free solution electrophoresis (shown to be valid for DNA longer than 400 bp)33. However, the mild favouring of longer DNA fragments in our case suggests that a slight barrier at the mouth of the ZMW exists, in which case a similar species with a higher charge is more efficiently trapped by the protruding field outside the NZMW mouth, as observed for DNA capture into 4 nm pores28. Once captured, the DNA packs into the NZMW, consistent with our observations of fast fluorescence spikes that increase in amplitude with DNA length (Fig. 2b).
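As an illustration of how a concentration-normalized capture rate can be obtained from interspike durations, the following minimal sketch takes the arrival times of fluorescence spikes in one NZMW and divides the inverse mean interspike interval by the bulk concentration. The function name, the choice of pM as the concentration unit and the example spike times are hypothetical and are not taken from the measurements reported here.

```python
def capture_rate(spike_times_s, concentration_pM):
    """Mean capture rate per unit concentration (s^-1 pM^-1).

    spike_times_s: sorted arrival times of fluorescence spikes, in seconds.
    concentration_pM: bulk DNA concentration used for normalization.
    """
    if len(spike_times_s) < 2:
        raise ValueError("need at least two spikes to form an interval")
    intervals = [later - earlier
                 for earlier, later in zip(spike_times_s, spike_times_s[1:])]
    mean_interval_s = sum(intervals) / len(intervals)
    return 1.0 / mean_interval_s / concentration_pM


# Made-up spike arrival times for a single NZMW at a made-up 10 pM concentration.
example_rate = capture_rate([0.0, 0.5, 1.5, 2.0, 3.0], concentration_pM=10.0)
```

Normalizing by concentration in this way is what allows rates measured for different DNA lengths, each prepared at a different molar concentration, to be compared on one axis.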
Efficient DNA capture allows rapid loading of large DNA molecules from ultralow-concentration samples into NZMWs for long-read sequencing. Table 1 summarizes capture data for long DNA fragments using a higher voltage range than for the data in Fig. 2, normalized to nucleic acid mass (units of pg−1 min−1) per 1 µl volume in order to relate to nucleic DNA masses in a single eukaryotic cell (6 pg per cell for human cell DNA). In Table 1, we also present data for prokaryotic ribosomal RNA (rRNA) capture, which exhibits a similar efficiency (Supplementary Section 6). Given the linear relationship between capture and concentration in solid-state nanopores34,35,36, extrapolation leads to sub-minute loading timescales for a 10 pg DNA sample (in a 1 µl volume). For comparison, conventional magnetic bead loading of 10 kb SMRTbell samples requires 1.5 ng of DNA and an hour loading time for ∼30–35% loading efficiency, which corresponds to loading rates that are five orders of magnitude slower than our NZMWs. Finally, handling DNA in an ∼1 µl volume is relatively straightforward, and further demonstrated to be compatible with solid-state nanopore measurements35.
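Because the rates here are normalized to nucleic acid mass rather than molar concentration, it can help to convert between the two. The sketch below does this under the common approximation of about 650 g mol–1 per base pair of double-stranded DNA, which is an assumption and not a value stated in this work; the function names are illustrative. As a consistency check, a 20 kb template at 3 pM (the concentration used in the sequencing experiment described later) comes out near 40 pg µl–1.

```python
AVOGADRO = 6.02214076e23          # molecules per mole
GRAMS_PER_MOLE_PER_BP = 650.0     # assumed average mass of one dsDNA base pair


def molar_to_mass_pg_per_ul(concentration_pM, length_bp):
    """Convert a molar DNA concentration (pM) to a mass concentration (pg/ul)."""
    grams_per_litre = concentration_pM * 1e-12 * length_bp * GRAMS_PER_MOLE_PER_BP
    # 1 g/L equals 1e12 pg per 1e6 ul, i.e. 1e6 pg/ul.
    return grams_per_litre * 1e6


def molecules_in_sample(mass_pg, length_bp):
    """Number of DNA molecules of a given length in a sample of given mass."""
    moles = mass_pg * 1e-12 / (length_bp * GRAMS_PER_MOLE_PER_BP)
    return moles * AVOGADRO


mass_conc = molar_to_mass_pg_per_ul(3.0, 20_000)    # about 39 pg/ul
n_molecules = molecules_in_sample(10.0, 20_000)     # molecules in a 10 pg sample
```

Under this approximation, a 10 pg sample of 20 kb fragments contains a few hundred thousand molecules, which gives a sense of how few templates are available for loading at the single-cell scale.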
To understand DNA dynamics inside the NZMW under applied voltage, we quantified the lower bound duration of the fluorescence pulses, that is, minimum DNA residence time (or dwell times) in the NZMW, as a function of DNA length (Fig. 2c, bottom). Residence times weakly increase with DNA length in the range 1–48.5 kb, with timescales (160–570 ms) at least two orders of magnitude longer than observed ‘docking times’ for 48.5 kb molecules in both optical and electrical experiments near planar nanopores (≤4 ms)37,38. Eventually, we observed DNA departure from the NZMW, either via translocation through the nanopore or escape from the top. This is observed as a sharp drop in fluorescence to its baseline value, such as the one seen in the inset, purple trace in Fig. 2a. Previous single-molecule studies of confined polymers have observed orders of magnitude longer relaxation times compared with unconfined polymers39. Despite the strong electric field outside the pore entrance, which in a planar pore causes rapid DNA threading after a short millisecond-scale docking period, we hypothesize that long DNA dwell times of the packed DNA in the NZMW are related to coil frustration due to a compressed DNA state in the NZMW.
Efficient DNA binding to NZMW surface
In SMRT sequencing, template DNA is first bound to a DNA polymerase:Stv fusion protein, so that the complex can bind to surface biotin groups on the waveguide floors. The biotin–Stv bond is among the strongest non-covalent bonds40. Based on the reaction's binding constant41, ∼22 ms reaction time of a Stv molecule with a single surface biotin is required in a 100 nm ZMW (Supplementary Section 7). The dashed red line in Fig. 2c (bottom) shows that this minimal reaction time is far shorter than DNA molecule residence times for all DNAs, which should ensure DNA:Stv binding to a biotinylated surface. To confirm binding, we conjugated YOYO-1 stained end-biotinylated DNA with Stv to preform a DNA–Stv complex, followed by loading experiments. The two traces in Fig. 2d show typical fluorescence versus time traces under constant applied voltage for 48.5 kb Stv–DNA versus free DNA capture into NZMWs. Clearly, bleaching of the YOYO-1 dye precedes the observation of DNA escape for the Stv–DNA, and a comparison of the traces shows a much longer Stv–DNA residence time in the NZMW (>20 s) than for free DNA (typically 0.5–2 s).
To show that DNA binding is strictly due to Stv binding to biotins at the NZMW base, we performed fluorescence measurements of dual-labelled Stv–DNA27,35,37,38,39,41 to a 1 × 4 array of biotin-functionalized NZMWs (see Methods). A dual-labelled 1.5 kb Stv–DNA (Stv = Alexa647, DNA = YOYO-1) was probed as follows: first, red fluorescence was observed to detect Stv binding, and then voltage was switched off and YOYO-1 fluorescence was probed to report on DNA capture. As shown in Fig. 3a, we observed short-lived spikes in fluorescence from ZMWs, indicating arrival of Stv into the ZMW volumes, followed by single-step dye photobleaching (Fig. 3a inset, red trace). After the voltage was turned off, all four NZMWs exhibited fluorescence spikes with smooth photobleaching curves characteristic of YOYO-1 bleaching (Fig. 3a inset, green trace). Figure 3a is an image showing the integrated false-colour fluorescence from the ‘voltage on’ and ‘voltage off’ periods, which demonstrates highly efficient Stv binding in NZMWs. In summary, (1) Stv binds at or near the bottom of the NZMW, as indicated by the bright red fluorescence in all NZMWs, and (2) DNA remains immobilized in the NZMW, even after the voltage has been switched off. In Fig. 3b, we show successive frames of Stv–DNA capture at 300 mV, followed by slow bleaching that persists even after the voltage has been turned off.
We probed the binding of a 48.5 kb Stv–DNA (λ-DNA) labelled with YOYO-1. In this case of an extremely long DNA template, both entry and binding should be entropically disfavoured because the molecule's radius of gyration is larger than the NZMW diameter, and because its conformation should be restricted inside the smaller NZMW. We find that capture and binding are both extremely efficient, even for a 48.5 kb fragment. Figure 3c presents integrated images of a 5 × 5 NZMW array before and after a voltage pulse in the presence of λ-DNA. On applying 400 mV for 23 s, 21 of the 25 NZMWs contained DNA, indicated by fluorescence from DNA-filled NZMWs that persisted after voltage release. After a 5 s 1 V pulse, 23 out of the 25 pores were loaded, which corresponds to a loading efficiency of 92% (Supplementary Fig. 6). Notably, throughout the experiment we did not observe fluorescence from ZMWs adjacent to the 5 × 5 NZMW array. Since DNAs may enter ordinary ZMWs only via diffusion, we reason that the energetic barrier for DNA packing is too costly to allow efficient entry and binding, so long DNA molecules in the solution are entropically trapped outside the ZMWs.
Sub-nanogram long DNA template capture and sequencing
The high rate and yield with which long DNA molecules pack into NZMWs and bind to their surfaces through biotin–Stv linkages present an opportunity for low-input DNA sequencing. However, the first step towards this is to show that voltage loading a DNA–polymerase complex into NZMWs is compatible with subsequent SMRT sequencing, since the applied voltage could, for example, dissociate the polymerase from the DNA. We first probed this by voltage-loading a pre-bound complex of 72-nucleotide circular DNA and DNA polymerase (Fig. 4a). We fabricated a 2 × 2 NZMW array on a membrane with roughly 100 ZMWs, the NZMW cis surface was biotinylated, and the sequencing cell was assembled such that the cis solution contained a 1 nM concentration of template and 330 nM of fluorescently labelled dNTP analogues (see Methods for details). Figure 4b displays fluorescence images (red channel only) of the waveguide array at several points during this experiment. Because the solution lacked Mg2+, the bound enzyme was inactive. On application of voltage, the template DNA was drawn into the NZMWs, while simultaneously negatively charged dNTP analogue molecules were focused into the NZMWs, resulting in increased fluorescence at each NZMW (Fig. 4b(ii)). We plot an abridged 640-nm-illumination fluorescence trace from the top left NZMW in the array shown in Fig. 4b. We first applied three ∼1-s-long 750-mV pulses (Fig. 4b(ii)), which captured and immobilized template DNAs on the biotinylated NZMW bases. We then verified the captured enzyme activity by adding Mg2+ to the solution, which activates the polymerase, as indicated by the appearance of discrete fluorescence bursts from the NZMWs that signal base incorporations into the growing DNA strand (Fig. 4b(iv)). The other ZMWs on the membrane exhibited no continued fluorescence because they did not load any template DNA in the rapid loading time, despite the template being short (a typical loading process takes minutes to hours in SMRT cells).
In contrast, all four NZMWs captured DNA during this ∼3 s loading period. To further verify that the fluorescence bursts came from polymerase synthesis, we spiked the system with KCl to a final concentration of 850 mM, which inactivated the enzymes, evidenced by the cessation of bursts following KCl addition (Fig. 4b(v)). This capture-and-activate experiment (see Supplementary Movie 1) illustrates that NZMW-based DNA–polymerase trapping does not impact the polymerase activity, indicated by successful enzyme activation by Mg2+ addition and enzyme inhibition at high salinity.
Finally, we demonstrate that a long DNA template can be loaded at sub-nanogram levels and its sequence can be read out. The template molecule was a 20 kb SMRTbell sequencing construct whose sequence has previously been determined. After forming a template–primer–polymerase construct, we applied a 2 s, 600 mV pulse to load a molecule into an NZMW from a 3 pM bulk concentration (40 pg µl–1), followed by activation using Mg2+. In Fig. 4c, we plot successive camera frame montages during several 1.1-s-long excerpts of the sequencing process (frame rate was 72.7 frames per second, two rows of 40 frames shown per excerpt). As seen from the montages, base incorporations by the polymerase result in fluorescence that persists over several frames1. The vertical position of a fluorescence burst on the charge-coupled device (CCD) corresponds to a particular nucleotide analogue (Fig. 1), and in the top excerpt we highlight the first instance of each base incorporation. To analyse whether the sequence of bases read by our system corresponds to the template sequence, we wrote a Python program for base-calling (Supplementary Section 10). The program analyses the fluorescence bursts to find their mean position on the CCD, duration and chronological order, and compares these with control datasets that contain pure dNTPs (see Fig. 1b, bottom left). Post-analysis, colour-coded raw burst data for the 20 kb SMRTbell are shown in Fig. 4d (see Supplementary Movie 2, green box). Using our algorithm, we obtained reads that map to the 20 kb SMRTbell template sequence (available from an independent Pacific Biosciences sequencing run) with 67% single-read accuracy and typical read lengths of up to 1.6 kb. While this result demonstrates the compatibility of our NZMWs with low-input capture of long fragments and their sequencing, further signal optimization and base-calling improvements are needed.
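To make the base-calling idea concrete, the following is a minimal sketch of one way to turn per-frame spectra into base calls: frames whose summed intensity exceeds a threshold are grouped into bursts, each burst's intensity-weighted mean vertical position is computed, and the burst is assigned to the nucleotide whose reference position is nearest. This is not the program described in Supplementary Section 10; the data layout, threshold, minimum burst length and reference positions are all hypothetical, and background subtraction is omitted.

```python
from dataclasses import dataclass


@dataclass
class Burst:
    start_frame: int      # index of the first frame in the burst
    n_frames: int         # burst duration in frames
    mean_position: float  # intensity-weighted vertical position (pixel rows)
    base: str             # nucleotide with the nearest reference position


def call_bases(frames, reference_positions, threshold, min_frames=2):
    """Group above-threshold frames into bursts and assign a base to each.

    frames: list of per-frame intensity profiles along the spectral axis.
    reference_positions: dict mapping base letter to its expected row,
        for example from control runs with a single nucleotide analogue.
    threshold: non-negative cut on the summed frame intensity.
    """
    bursts = []
    run = []  # frame indices belonging to the burst currently being built
    for index, frame in enumerate(list(frames) + [None]):  # None closes the last run
        if frame is not None and sum(frame) > threshold:
            run.append(index)
            continue
        if len(run) >= min_frames:
            weight = 0.0
            moment = 0.0
            for j in run:
                for row, value in enumerate(frames[j]):
                    weight += value
                    moment += row * value
            position = moment / weight
            base = min(reference_positions,
                       key=lambda b: abs(reference_positions[b] - position))
            bursts.append(Burst(run[0], len(run), position, base))
        run = []
    return bursts


# Synthetic example: four spectral rows, one hypothetical row per base.
dark, a_like, g_like = [0, 0, 0, 0], [9, 1, 0, 0], [0, 1, 8, 1]
frames = [dark, a_like, a_like, a_like, dark, g_like, g_like,
          dark, dark, a_like, dark]
calls = call_bases(frames, {"A": 0, "C": 1, "G": 2, "T": 3}, threshold=5)
read = "".join(burst.base for burst in calls)
```

In this synthetic trace the isolated single-frame event near the end is rejected by the minimum-duration filter, so only two bursts are called. At the 72.7 frames per second quoted above, a burst's duration in seconds would be its frame count divided by 72.7.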
We have demonstrated that integration of ZMWs with nanopores allows DNA loading for SMRT sequencing from unprecedentedly small input quantities. Our detailed study of voltage-induced DNA packing kinetics into NZMWs using fluorescently stained DNA molecules reveals loading rates that are nearly DNA-length-independent in the range 1.0–48.5 kb, which allows sequencing from broader library length distributions without accompanying loading bias. Owing to compression of the voltage-loaded DNA coil, a molecule resides in the NZMW volume for timescales that are much longer than docking times in the absence of the cylindrical waveguide confinement above the pore. This allows efficient binding of Stv-tagged DNA to the biotinylated NZMW surface, critical for subsequent interrogation using optical probing. Finally, we have demonstrated a proof-of-concept sequencing assay where a fragment, in this case a 20 kb SMRTbell template with known sequence, was loaded from sub-nanogram quantities in seconds, and sequence data were obtained immediately thereafter. Further improvement of NZMW device throughput and integration with massively parallel SMRT sequencing would present an unprecedented ability to study sequence and base modification information from precious DNA samples, for example single cells or needle biopsies, without any deleterious DNA amplification steps.
To biotinylate the NZMW membrane surface, NZMW chips (see Supplementary Section 1 for fabrication details) were immersed in hot piranha solution (3:1 H2SO4:H2O2) for 5 min and thoroughly rinsed in deionized water. They were then dried under vacuum and baked at 85 °C for 10 min. After baking, chips were immediately immersed in a room-temperature solution of 0.5 mg ml−1 biotin–poly(ethylene glycol)–silane dissolved in 200 proof ethanol for two or more hours.
Sample molecule preparation
Biotinylated 1,519 bp DNA was prepared via polymerase chain reaction (PCR) with a biotinylated primer. Biotinylated λ-DNA was prepared by extending λ-DNA single-stranded overhangs with Klenow fragment polymerase in the presence of biotinylated dNTPs. DNA molecules were incubated with YOYO-1 at a 10:1 bp:dye ratio at 65 °C for 30 min. To conjugate to Stv, biotinylated molecules were incubated with a 2× stoichiometric excess of streptavidin for 20 min at room temperature. Circular DNAs were ligated from a 5′-phosphorylated single-stranded molecule with CircLigase II (Epicentre) using a standard protocol42. The 20 kb SMRTbell template was prepared from NoLimits 20,000 bp DNA Fragment (Thermo Fisher Scientific, Inc.) using the SMRTbell template prep kit 1.0 (Pacific Biosciences, Inc.) and size-selected with a 15 kb cut-off using a BluePippin instrument (Sage Science, Inc.).
Primer binding was performed by incubating primer with template at a 20:1 concentration ratio at 80 °C for 2 min, followed by cooling to 30 °C at 1 °C s−1 (Biorad CFX96). A 6× stoichiometric ratio of polymerase (Pacific Biosciences P6) was then incubated with primer-bound template at 30 °C for 4 h (proprietary buffer solutions), followed by 37 °C for 30 min. Samples were then put in 50% glycerol with dithiothreitol and placed at −20 °C for storage.
All relevant data are available from the authors, and/or are included with the manuscript as source data or Supplementary Information.
We acknowledge Y.-C. Tsai, I. Vilfan, J. Hanes, R. Lam and M. McCauley for aid in sample preparation, as well as J. Sutin for assistance with the multimode fibre setup on our microscope. This work was supported by funding from the National Institutes of Health (HG006873 and HG009186, to M.W. and J.K.). This work was performed in part at the Cornell Nanoscale Facility, a member of the National Nanotechnology Infrastructure Network (NNIN), which is supported by the National Science Foundation (grant ECCS-1542081).
About this article
Nature Methods (2017)