Introduction

DNA sequencing is a fundamental technology in the biological and medical sciences. Recently, several analytical methods have been developed to detect DNA or RNA at the single molecule level using chemical or physical microscopic technologies1,2,3. In particular, ion channels have been shown to detect individual DNA or RNA strands, leading to the promise of high-speed sequencing and analysis of DNA4,5,6,7,8,9,10,11,12,13.

In 1996, Kasianowicz et al.4 first demonstrated that the α-hemolysin (αHL) channel could be used to detect nucleic acids at the single molecule level. The αHL channel has a 1.5 nm-diameter limiting aperture14,15,16,17 and its voltage-dependent gating can be controlled, such that the pore remains open indefinitely17, making it an ideal candidate for nanopore-based detection and discrimination. Individual single-stranded polyanionic nucleic acids are driven through the pore by the applied electric field and the polynucleotides cause well-defined, transient reductions in the pore conductance4,8,10,12. Because the residence time of the polynucleotide in the pore is proportional to the RNA or DNA contour length, it was suggested that a nanopore may be able to sequence DNA in a ticker-tape fashion if the current signatures of the four bases can be discriminated from each other4. In another approach towards nanopore sequencing4,13,18, an αHL channel with a covalently linked adaptor in the pore was used to identify unlabeled nucleoside-5′-monophosphates one by one following exonuclease cleavage19. However, a complete exonuclease-nanopore system based on this concept to sequence DNA has so far not been demonstrated.

Despite the ability of nanopores to detect and characterize some physical properties of DNA at the single molecule level, the more demanding goal of accurate base-to-base sequencing by passing a single stranded DNA through the nanopore has not yet been realized. Oxford Nanopore Technologies recently announced the ability to accomplish strand sequencing in a nanopore at 3-base resolution with an error rate of 4% (ref. 20). Another group reported single base resolution strand sequencing with a nanopore, but had difficulty correctly determining homopolymer sequences21.

The native αHL channel has an inherent ability for high-resolution molecular discrimination. For example, it can discriminate between aqueous H+ and D+ ions17 and Robertson et al.22 have recently demonstrated that the αHL channel can easily separate poly(ethylene glycol) (PEG) molecules at better than the single monomer level. In the latter study, a molecular mass or size spectrum estimated from the mean current caused by individual PEG molecules entering the pore easily resolves individual ethylene glycol repeat units. In addition, the mean residence time of the polymer in the pore increases with the PEG size23. Based on these previous investigations using nanopores to detect and distinguish molecules with different structures and the fact that DNA polymerase can recognize nucleotide analogs with extensive modification at the 5′-terminal phosphate group as efficient substrates24,25,26,27,28, we propose a novel nanopore-based sequencing by synthesis (Nano-SBS) strategy that will accurately differentiate each of the four differently sized tags attached to the 5′-phosphate of each nucleotide at the single molecule level for sequence determination. The basic principle of the Nano-SBS approach is as follows. As each nucleotide is incorporated into the growing DNA strand during the polymerase reaction, its tag is released by phosphodiester bond formation (Fig. 1). The tags will enter a nanopore in the order of release, producing unique ionic current blockade signatures due to their distinct chemical structures, thereby determining DNA sequence electronically at the single molecule level with single base resolution. We demonstrated that the 5′-terminal phosphate position of the nucleotide is unique in its ability to tolerate sizable modifications by large tags based on PEG molecules without affecting polymerase recognition. This overcomes the inherent constraints imposed by the small differences among the 4 bases, a challenge which all other nanopore sequencing methods have faced for decades. Thus, the proposed Nano-SBS approach identifies individual bases by the detection and differentiation of the large tags released during the polymerase reaction instead of the small nucleotides themselves. The tags are large molecules that have slow diffusion rates, which greatly increases their chance of entering the nanopore and producing unique ionic current blockade signals. As proof-of-principle, we attached four coumarin-PEG tags of different lengths to the terminal phosphate of 2′-deoxyguanosine-5′-tetraphosphate. We demonstrate efficient incorporation of the nucleotide analogs during the polymerase reaction and better than baseline discrimination among the four tags based on their nanopore ionic current blockade signatures. This approach coupled with polymerase covalently attached to the nanopores in an array format should yield a single-molecule Nano-SBS platform.

Figure 1

Mechanism of primer extension and release of tag-polyphosphate for nanopore detection.

Results

General principle of single molecule electronic DNA sequencing by synthesis using PEG-labeled nucleotides and nanopore detection

The single molecule electronic Nano-SBS system, which is shown schematically in Fig. 2, depicts the DNA polymerase bound in close proximity to the nanopore entrance. A template to be sequenced is added along with the primer. To this template-primer complex, four differently tagged nucleotides are added to the bulk aqueous phase. After polymerase-catalyzed incorporation of the correct nucleotide, the tag-attached polyphosphate will be released and pass through the nanopore to generate a unique ionic current blockade signal, thereby identifying the added base electronically because the tags have distinct chemical structures. An example of four continuous nucleotide incorporation reactions with different tags for each base is shown in Supplementary Fig. S1. An array of nanopores, each with a covalently attached polymerase adjacent to the pore entrance, will allow single-molecule SBS.

Figure 2

Schematic of single molecule DNA sequencing by a nanopore with phosphate-tagged nucleotides.

In the envisioned approach, each of the four nucleotides will carry a different tag. During SBS, these tags, attached via the 5′-phosphate of the nucleotide, will be released into the nanopore one at a time where they will produce unique current blockade signatures. For the purpose of illustration only, several tags are shown inside the pore in the order of their sequential release; in actuality only one tag will enter the pore at a time. A large array of such nanopores could lead to highly parallel, high throughput DNA sequencing.

This tag-based Nano-SBS system offers the following advantages over strand sequencing through nanopores: (1) it overcomes the inherent constraints imposed by the small differences among the 4 bases by instead using 4 large and distinct molecular tags, which are easily differentiated by a nanopore; and (2) there is no need to slow down the transit speed of the tag through the pore as long as the tag is detectable, because the polymerase extension and tag release rate is much slower than the tag interaction time with the pore. This would also eliminate phasing issues inherent to strand sequencing. Here, we describe the synthesis and efficient incorporation of a new class of nucleotide analogs with 5′-phosphate-attached tags. These tags consist of four different length PEGs and a coumarin moiety. We also demonstrate four distinct ionic current blockade patterns produced by these tags in an α-hemolysin channel at the single molecule level. This proof-of-principle study of the separate elements of the proposed Nano-SBS system demonstrates the feasibility of integrating them into a single molecule electronic SBS nanopore sequencer in the future.
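The readout principle outlined in this section can be illustrated with a toy sketch; it is not the authors' software. Each blockade event is assigned to the nearest tag peak and the base carried by that tag is reported. The peak centers in `HYPOTHETICAL_PEAKS` and the tag-to-base mapping are placeholders invented for illustration, not measured values. Only the ordering (larger PEG tags give deeper blockades) follows the text.

```python
# Hypothetical peak centers of the blockade ratio for each tag (placeholders,
# NOT measured values). Larger PEG tags block more current, so they sit at
# lower ratios.
HYPOTHETICAL_PEAKS = {
    "PEG36": 0.10,
    "PEG24": 0.20,
    "PEG20": 0.25,
    "PEG16": 0.30,
}

# Hypothetical tag-to-base assignment; the study itself tagged only dG4P.
HYPOTHETICAL_TAG_TO_BASE = {"PEG16": "A", "PEG20": "C", "PEG24": "G", "PEG36": "T"}


def call_base(ratio):
    """Assign one blockade ratio to the closest tag peak and return its base."""
    tag = min(HYPOTHETICAL_PEAKS, key=lambda t: abs(HYPOTHETICAL_PEAKS[t] - ratio))
    return HYPOTHETICAL_TAG_TO_BASE[tag]


def call_sequence(ratios):
    """Translate blockade ratios, in order of tag release, into a base string."""
    return "".join(call_base(r) for r in ratios)
```

Because the tags are released in the order of incorporation, the order of the events is the order of the bases; a real implementation would also have to reject events that match no tag, such as unreacted nucleotide analogs.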

Design, synthesis and characterization of PEG-labeled nucleotides

The four 5′-phosphate tagged 2′-deoxyguanosine-5′-tetraphosphates (Fig. 3) were synthesized according to the generalized synthetic scheme shown in Fig. 4. First, 2′-deoxyguanosine-5′-triphosphate (dGTP) was converted to 2′-deoxyguanosine-5′-tetraphosphate (dG4P). Then, a diaminoheptane linker was added to the terminal phosphate of the tetraphosphate to produce dG4P-heptyl-NH2 (Product A) in order to attach different length PEG tags. In a separate set of reactions, 6-methoxy-coumarin N-hydroxysuccinimidyl ester was reacted with one of four amino-PEGn-COOH molecules with 16, 20, 24 or 36 ethylene glycol units, to produce coumarin-PEGn-COOH molecules, which were subsequently converted to the corresponding NHS-esters (Product B). The coupling of dG4P-heptyl-NH2 (Product A) with the coumarin-PEGn-NHS esters (Product B) yields the four final nucleotide analogs, abbreviated coumarin-PEGn-dG4Ps (Fig. 4, n = 16, 20, 24, 36). The coumarin moiety was used as a prototype modifier to further tune the size of the tag as well as to track the purification of intermediates and the final nucleotide analogs. Synthesis of the expected coumarin-PEGn-dG4P molecules was confirmed by MALDI-TOF mass spectrometry (Supplementary Fig. S2).

Figure 3

Structures of four coumarin-PEGn-dG4P nucleotide analogs.

Figure 4

Synthetic scheme for four coumarin-PEGn-deoxyguanosine-5′-tetraphosphates (coumarin-PEGn-dG4P, n = 16, 20, 24, 36).

We next tested the coumarin-PEGn-dG4P nucleotide analogs in polymerase extension reactions using the Therminator DNA polymerase. A primer-loop-template was designed where the next complementary base was a C, enabling dGMP to be added to the DNA primer (Supplementary Fig. S3). Coumarin-PEGn-triphosphate is released during the reaction (Supplementary Fig. S4). MALDI-TOF-MS confirmed that indeed each of the four coumarin-PEGn-dG4P nucleotide analogs gave the correct extension product with 100% incorporation efficiency, as shown by the appearance of a single peak at ~8,290 daltons in the mass spectra (Fig. 5). The absence of a primer peak at 7,966 daltons suggested that the reaction proceeded essentially to completion. An important feature of the Nano-SBS approach is that the extended DNA chains contain all natural nucleotides without any modifications, allowing SBS to continue over extensive lengths. All the extension products shown in Fig. 5 represented the incorporation of the coumarin-PEGn-dG4P nucleotide analogs, with no products derived from potential residual dGTP or dG4P, since the molecules were purified twice in an HPLC system that separates these molecules effectively with a retention time difference of more than 10 min between the two groups of compounds. To further exclude this possibility, we treated the purified coumarin-PEGn-dG4P nucleotide analogs with alkaline phosphatase, which would degrade any contaminating tri- or tetra-phosphate to the free nucleoside, and used the resulting HPLC-repurified coumarin-PEGn-dG4P nucleotide analogs in extension reactions.

Figure 5

MALDI-TOF MS measurement of the extension products obtained with the four coumarin-PEGn-dG4P nucleotide analogs.

A template-loop primer (Supplementary Fig. S3), in which the template contained a C at the next position, was used along with one of the four coumarin-PEGn-dG4P nucleotide analogs for the polymerase reaction. In each case, 2′-dGMP is incorporated into the growing DNA strand, yielding a single base primer extension product with 100% efficiency, shown as a single peak in each mass spectrum at ~8,290 daltons.

The tags released during incorporation of the coumarin-PEGn-dG4P nucleotide analogs in polymerase reactions should be coumarin-PEGn-triphosphates (coumarin-PEGn-P3). To reduce the complexity of the charge on the tags, we treated the released tags (coumarin-PEGn-P3) with alkaline phosphatase, yielding coumarin-PEGn-NH2 tags (Supplementary Fig. S4), and then analyzed these tags for their nanopore current blockade effects. In further developing the Nano-SBS system, we can pursue such treatment of the released coumarin-PEGn-P3 tags with alkaline phosphatase, which would be attached to the entrance of the nanopores downstream of the polymerase, to generate coumarin-PEGn-NH2 tags. Alternatively, we can optimize the conditions for using nanopores to directly detect the released charged coumarin-PEGn-triphosphate tags. For the proof-of-principle studies reported here, in order to obtain large amounts of material for testing by MALDI-TOF MS and protein nanopores, we produced synthetic versions of the expected released tags (coumarin-PEGn-NH2) by acid hydrolysis of the four coumarin-PEGn-dG4P nucleotide analogs to cleave the P-N bond between the polyphosphate and heptylamine moiety (Supplementary Fig. S4). The expected coumarin-PEGn-NH2 molecules were confirmed by MALDI-TOF-MS analysis, following HPLC purification (Fig. 6). MALDI-TOF-MS results indicated that the coumarin-PEGn-NH2 tags generated by acid hydrolysis were identical to the tags produced by alkaline phosphatase treatment of the released coumarin-PEGn-triphosphate tags during the polymerase reaction.

Figure 6

Characterization of the released coumarin-PEGn-NH2 tags by MALDI-TOF MS.

Coumarin-PEGn-NH2 tags generated by acid hydrolysis of coumarin-PEG16-dG4P yielding coumarin-PEG16-NH2 (blue, m/z = 1115), coumarin-PEG20-dG4P yielding coumarin-PEG20-NH2 (green, m/z = 1289), coumarin-PEG24-dG4P yielding coumarin-PEG24-NH2 (orange, m/z = 1465) and coumarin-PEG36-dG4P yielding coumarin-PEG36-NH2 (red, m/z = 1991), are identical to the tags produced by alkaline phosphatase treatment of the released coumarin-PEGn-triphosphate tags during the polymerase reaction (Supplementary Fig. S4), as shown by MALDI-TOF-MS analysis. A composite image of four separately obtained MS spectra is shown. The structures of the coumarin-PEGn-NH2 tags are shown to the right.
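As a quick consistency check on the m/z values quoted above, the mass difference between consecutive tags can be divided by the difference in the number of ethylene glycol units. This is only an illustrative sketch: the 44.05 g/mol figure is the standard mass of a C2H4O repeat unit, and the function name is hypothetical.

```python
EG_UNIT_MASS = 44.05  # mass of one C2H4O repeat unit, g/mol

# m/z values quoted in the caption, keyed by number of ethylene glycol units
reported_mz = {16: 1115, 20: 1289, 24: 1465, 36: 1991}


def mass_per_unit(mz_by_n):
    """Mass increment per added ethylene glycol unit between neighboring tags."""
    sizes = sorted(mz_by_n)
    return {
        (a, b): (mz_by_n[b] - mz_by_n[a]) / (b - a)
        for a, b in zip(sizes, sizes[1:])
    }


spacing = mass_per_unit(reported_mz)
for (a, b), value in spacing.items():
    print(f"PEG{a} -> PEG{b}: {value:.1f} per ethylene glycol unit")
```

Each increment should come out close to `EG_UNIT_MASS` if the tags differ only in PEG length.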

Characterization of the current blocking effect of the tags in nanopores

To demonstrate the feasibility of our proposed electronic single molecule SBS approach, we measured a heterogeneous mixture of the four coumarin-PEGn-NH2 tags for their current blockade effects on a single αHL nanopore (Fig. 7). The top of Fig. 7 shows the profile of current blockade versus time. The lower left of Fig. 7 shows a representative subset of the time series data, indicating that inside the nanopore, PEG tags produce current blockades that are characteristic of their size. The relative frequency distribution of the histogram of blockade events (⟨i⟩/⟨i_open⟩) shows four well separated and distinct peaks for the four coumarin-PEGn-NH2 tags (n = 36, 24, 20 and 16 from left to right respectively in Fig. 7, lower right). To highlight the wide separation of the peaks and offer clear evidence that detection of a specific nucleotide might be accomplished by the unique blockade signal afforded by its released PEG tag, the peaks are fitted with single Gaussian functions and the corresponding 6σ error distributions are shown (colored rectangles at top in Fig. 7, lower right). We also characterized separately each of the coumarin-PEGn-NH2 molecules with the pore (data not shown), which confirmed the identity of the different-sized PEG-related peaks shown in Fig. 7. These results suggest that a single base could be discriminated with accuracy better than 1 in 5 × 10^8 events, represented in Fig. 7, lower right, by using A, C, G and T designations, which would occur when four different nucleotides with four different length PEGs such as those tested here are used for DNA sequencing.
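The quoted figure of 1 in 5 × 10^8 is consistent with a simple Gaussian calculation, sketched here under the assumption that it refers to the two-sided probability of an event falling more than 6σ from the mean of its own peak. The function name is illustrative.

```python
import math


def two_sided_tail(n_sigma):
    """Probability that a Gaussian variable lies more than n_sigma from its mean."""
    return math.erfc(n_sigma / math.sqrt(2))


p = two_sided_tail(6.0)
print(f"P(|z| > 6) = {p:.2e}, i.e. about 1 in {1 / p:.1e} events")
```

This treats each peak as an ideal Gaussian, as the fit in the text does; real blockade distributions may have heavier tails.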

Figure 7

Discrimination of the coumarin-PEGn-NH2 tags in protein nanopores at single molecule level.

Four coumarin-PEGn-NH2 tags (n = 16, 20, 24 and 36), derived from the corresponding coumarin-PEGn-dG4P nucleotide analogs, were pooled and diluted in 4 M KCl, 10 mM Tris, pH 7.2 for nanopore measurement. (Top) Profile of current blockade versus time. (Lower Left) A subset of the time series data indicates that when these PEG tags enter a single α-hemolysin ion channel, they cause current blockades that are characteristic of their size. (Lower Right) A histogram of the mean current blockade caused by the four individual PEG tags (sizes 36, 24, 20 and 16 from left to right) shows baseline resolution with a 10 kHz measurement bandwidth. The colored bars at the top represent the 6σ distribution of the data (assuming Gaussian distributions for each of four PEG tags that could represent each of the four DNA nucleotides), which suggests that a single base could be discriminated with an accuracy better than 1 in 5 × 10^8 events, represented in this figure by using A, C, G and T designations, which would occur when four different nucleotides with these four different length PEGs are used for DNA sequencing.

Discussion

As described above, a single αHL ion channel can separate single molecules based on their size and easily resolves a mixture of PEGs to better than the size of a single monomer unit (i.e., < 44 g/mol)16,18,22. This high resolution arises from the interactions between the PEG polymer, the electrolytes (mobile cations) and amino acid side chains that line the αHL channel's lumen16. These interactions allow the pore to be used as a nanometer-scale sensor that is specific to the size, charge and chemical property of an analyte.

Here, such analysis is extended to PEGs with different chemical groups on either terminus. The single channel ionic current recording in Fig. 7 (top and lower left) illustrates the blockades caused by the four different sized coumarin-PEGn-NH2 molecules, one at a time. As with unmodified PEG, each of the current blockades is unimodal (i.e., described well with Gaussian distributions and well-defined mean values).

To accurately discriminate between the four bases (A, C, G and T) for strand nanopore sequencing, one or more of the following strategies need to be adopted: (1) enhance and differentiate the strength of the detection signals; (2) develop an effective method to discern and process the electronic blockade signals generated; (3) control the translocation rate of nucleic acids through the pore, e.g., by slowing down DNA movement; and (4) design and make new and more effective synthetic nanopores. As we demonstrated here, the Nano-SBS approach has transformed the problem of resolving the 4 individual bases to that of discriminating among 4 large well-differentiated tags, which essentially solves the first three problems.

DNA sequencing by synthesis is the dominant platform for genomics research and personalized medicine29,30,31,32,33. Kumar et al. first reported the modification of nucleoside-5′-triphosphates, either by introducing more phosphate groups to produce tetra- and penta-phosphates and introducing fluorophores directly to the terminal phosphate, or by attaching a linker between the terminal phosphate and the fluorophore24,25. Tetra- and penta-phosphates were shown to be better DNA polymerase substrates, and fluorophore-labeled phosphate nucleotides have been used widely for DNA sequencing26,27,28,34. Here, we have demonstrated a novel approach to enhance discrimination of the four nucleotides by modifying them at the terminal phosphate moiety with distinct large chemical tags for single molecule electronic SBS. The physical and chemical properties of the tag can be further adjusted to optimize the nanopore capture efficiency and measurement accuracy. For instance, the insertion of a positively charged linker consisting of four lysines or arginines between the polyphosphate and the PEG will produce precursors with a neutral charge and released tags with a net positive charge. Using the appropriate magnitude and sign of the potential23, the released tags, but not nucleotide substrates, will be transported through the pore.

The coumarin moiety on the tagged nucleotides can be replaced with other molecules of larger size or different charge to further enhance nanopore discrimination. Clearly, it is important that every tag released in a polymerase reaction be maintained in the proper order for real-time single molecule Nano-SBS. Even so, some unreacted nucleotide analogs might enter the pore. Thus, the ability to discriminate between cleaved tags and unreacted nucleotide analogs will be important; fortunately, these two groups of molecules should be easily differentiated by a nanopore due to their significant size and charge differences. In addition, it has not escaped us that the tagged nucleotide Nano-SBS approach can be implemented in a straightforward way by adding the four nucleotides (A, C, G and T) labeled with identical tags on the 5′-phosphate in a stepwise fashion to reduce the overall complexity of the system, analogous to pyrosequencing30 and the Ion Torrent approach33. However, unlike those methods, the Nano-SBS approach has the advantage of single molecule sensitivity without the requirement for DNA amplification, and hence no issues with sequencing through homopolymeric regions, since tags released at each position of the homopolymer are detected discretely by the nanopore at the single-molecule level.

The single molecule electronic Nano-SBS approach described here should be applicable to either protein nanopores (e.g., αHL; Mycobacterium smegmatis porin A, MspA)35, or solid-state nanopores36,37,38,39,40,41,42. These options will provide nanopores with different properties that are appropriate for detecting a library of tags. To implement this novel strategy for DNA sequencing, an array of nanopores43 can be constructed on a planar surface to facilitate massively parallel DNA sequencing.

In conclusion, we have conducted proof-of-principle studies for a novel single molecule electronic Nano-SBS platform that will measure the tags released from the nucleotide substrates during the polymerase reaction, for sequence determination. In its full implementation in the future, it should be capable of long, accurate reads and potentially offer very high throughput electronic single molecule DNA sequencing.

Methods

Synthesis of coumarin-PEGn-dG4P nucleotide analogs

The synthesis of coumarin-PEGn-dG4P involves three steps (A, B, C) as shown in Fig. 4. All of the nucleotide analogs were purified by reverse-phase HPLC on a 150 × 4.6 mm column (Supelco), mobile phase: A, 8.6 mM Et3N/100 mM 1,1,1,3,3,3-hexafluoro-2-propanol in water (pH 8.1); B, methanol. Elution was performed from 100% A isocratic over 10 min followed by a linear gradient of 0-50% B for 20 min and then 50% B isocratic over another 30 min.
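The elution program can be written as a piecewise function of time. The sketch below is one reading of the description (10 min at 100% A, a 20 min linear ramp to 50% B, then 30 min at 50% B); the function name and the 60 min total are derived for illustration and are not instrument settings taken from the authors.

```python
def percent_b(t_min):
    """Percentage of mobile phase B (methanol) at time t_min into the run."""
    if t_min < 0 or t_min > 60:
        raise ValueError("time outside the 60 min program")
    if t_min <= 10:
        return 0.0                           # 100% A, isocratic
    if t_min <= 30:
        return 50.0 * (t_min - 10) / 20      # linear ramp from 0% to 50% B
    return 50.0                              # 50% B, isocratic
```

Under this reading, the retention times reported for the final analogs (31.7 to 34.3 min) fall in the 50% B isocratic segment.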

Synthesis of 2′-deoxyguanosine-5′-tetraphosphate (dG4P)

The synthesis of 2′-dG4P was carried out starting from 2′-dGTP. 300 µmoles of 2′-dGTP (triethylammonium salt) were converted to the tributylammonium salt by using 1.5 mmol (5 eq) of tributylamine in anhydrous pyridine (5 ml). The resulting solution was concentrated to dryness and co-evaporated twice with 5 ml of anhydrous DMF. The dGTP (tributylammonium salt) was dissolved in 5 ml anhydrous DMF and 1.5 mmol 1,1′-carbonyldiimidazole (CDI) was added. The reaction was stirred for 6 hr, after which 12 µl methanol was added and stirring continued for 30 min. To this solution, 1.5 mmol phosphoric acid (tributylammonium salt, in DMF) was added and the reaction mixture was stirred overnight at room temperature. The reaction mixture was diluted with water and purified on a Sephadex-A25 column using a 0.1 M to 1 M TEAB gradient (pH 7.5). The dG4P elutes at the end of the gradient. The appropriate fractions were combined and further purified by reverse-phase HPLC to yield 175 µmol of the pure tetraphosphate (dG4P). 31P-NMR: δ, −10.7 (d, 1P, α-P), −11.32 (d, 1P, δ-P), −23.23 (dd, 2P, β, γ-P); ESI-MS (-ve mode): Calc. 587.2; Found 585.9 (M-2).
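The stoichiometry in this procedure can be checked with a few lines of arithmetic. The helper names below are illustrative; the quantities are the ones stated above.

```python
def equivalents(reagent_umol, substrate_umol):
    """Molar equivalents of a reagent relative to the limiting substrate."""
    return reagent_umol / substrate_umol


def percent_yield(product_umol, substrate_umol):
    """Isolated yield as a percentage of the starting material."""
    return 100.0 * product_umol / substrate_umol


dgtp_umol = 300.0
print("tributylamine:", equivalents(1500.0, dgtp_umol), "eq")
print("CDI:", equivalents(1500.0, dgtp_umol), "eq")
print("phosphoric acid:", equivalents(1500.0, dgtp_umol), "eq")
print(f"dG4P yield: {percent_yield(175.0, dgtp_umol):.1f}%")
```

The same `percent_yield` helper applies to the later steps, for example the ~20 µmol of dG4P-heptyl-NH2 obtained from 80 µmol of dG4P.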

Synthesis of dG4P-heptyl-NH2 (Product A)

To 80 µmol dG4P in 2 ml water and 3.5 ml 0.1 M 1-methylimidazole-HCl (pH 6) were added 154 mg EDAC and 260 mg diaminoheptane. The pH of the resulting solution was adjusted to 6 with concentrated HCl and stirred at room temperature overnight. This solution was diluted with water and purified by Sephadex-A25 ion-exchange chromatography followed by reverse-phase HPLC to yield ~20 µmol dG4P-heptyl-NH2 (Product A), which was characterized by ESI-MS (-ve mode): calc. 699.1; Found 698.1 (M-1).

Synthesis of coumarin-PEGn-NHS esters (Product B)

Amino-PEGn-acids (1 eq) [Amino-d(PEG)16, 20, 24, 36-acids; Quanta Biodesign] were dissolved in 0.1 M sodium carbonate-sodium bicarbonate buffer (pH 8.6), followed by addition of coumarin-NHS (1 eq) in DMF and the reaction mixture was stirred overnight. The coumarin-PEGn-acids obtained were purified by silica-gel chromatography using a CH2Cl2-MeOH (5-15%) mixture and the appropriate fractions were combined. These compounds were analyzed by MALDI-TOF MS (Supplementary Table 1).

Reaction of the coumarin-PEGn-acids with 1.5 eq. of disuccinimidyl carbonate (DSC) and 2 eq. of triethylamine in anhydrous DMF for 2 h yields the corresponding coumarin-PEGn-NHS esters (Product B). The coumarin-PEGn-NHS esters, which move slightly faster than the corresponding acids on silica-gel plates, were purified by silica-gel chromatography using a CH2Cl2-MeOH (5-15%) mixture and then used in the next step.

Synthesis of Coumarin-PEGn-dG4P

dG4P-heptyl-NH2 (Product A) synthesized above was taken up in 0.1 M sodium carbonate-bicarbonate buffer (pH 8.6) and to this stirred solution was added 1 eq. of one of the coumarin-PEGn-NHS esters (Product B) in DMF. The resulting mixture was stirred overnight at room temperature and then purified on a silica-gel cartridge (15-25% MeOH in CH2Cl2 to remove unreacted coumarin-acid or -NHS ester and then 5:4:1 isopropanol/NH4OH/H2O). The crude product was further purified twice by reverse-phase HPLC to provide pure coumarin-PEGn-dG4P: coumarin-PEG16-dG4P (retention time, 31.7 min); coumarin-PEG20-dG4P (retention time, 32.2 min); coumarin-PEG24-dG4P (retention time, 33.0 min); coumarin-PEG36-dG4P (retention time, 34.3 min). The structure of all the molecules was confirmed by MALDI-TOF MS (Supplementary Table 2).

DNA polymerase extension reactions using coumarin-PEGn-dG4P nucleotide analogs

Extension reactions were performed using a template-loop-primer (5′-GATCGCGCCGCGCCTTGGCGCGGCGC-3′, M.W. 7966), in which the next complementary base on the template is a C, allowing extension by a single G (Supplementary Fig. S3). Each extension reaction was carried out in a GeneAmp PCR System 9700 thermal cycler (Applied Biosystems) at 65°C for 25 minutes in 20 µl reactions consisting of 3 µM template-loop-primer, 1X Therminator γ buffer [50 mM KCl, 20 mM Tris-HCl, 5 mM MgSO4, 0.02% IGEPAL CA-630 (pH 9.2)], 2 units of Therminator γ DNA polymerase (New England Biolabs) and 15 µM of one of the coumarin-PEGn-dG4P nucleotide analogs. The DNA extension products were precipitated with ethanol, purified through C18 ZipTip columns (Millipore) and characterized by MALDI-TOF MS. As shown in Fig. 5 in the main text, four identical extension products (expected molecular weight 8,295) were obtained. These polymerase extension reactions for each coumarin-PEGn-dG4P were repeated and the released products (coumarin-PEGn-triphosphates, Supplementary Fig. S4) were treated with alkaline phosphatase (1 U at 37°C for 15 min) to yield the coumarin-PEGn-NH2 tags, which were extracted into dichloromethane and characterized by MALDI-TOF-MS.
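The design of the template-loop-primer and the expected product mass can be verified with a short sketch. The stem length of 10 bases is an assumption read off the sequence itself, and 329.2 is the standard mass added per incorporated dGMP (free dGMP, 347.2, minus water); the function and constant names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
DGMP_RESIDUE_MASS = 329.2  # dGMP (347.2) minus water (18.0), g/mol


def next_incorporated_base(hairpin, stem_len):
    """Base added to the 3' end of a self-priming hairpin.

    The last `stem_len` bases are assumed to pair with an internal stretch;
    the template base just 5' of that stretch directs the next incorporation.
    """
    three_prime = hairpin[-stem_len:]
    partner = "".join(COMPLEMENT[b] for b in reversed(three_prime))
    pos = hairpin.find(partner)
    if pos <= 0:
        raise ValueError("no paired stretch with a free template base found")
    return COMPLEMENT[hairpin[pos - 1]]


hairpin = "GATCGCGCCGCGCCTTGGCGCGGCGC"
base = next_incorporated_base(hairpin, stem_len=10)  # stem_len is an assumption
expected_mass = 7966 + DGMP_RESIDUE_MASS
print(base, round(expected_mass))
```

The result should agree with the single G extension and the expected molecular weight stated in the text.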

Acid hydrolysis of coumarin-PEGn-dG4P to yield the coumarin-PEGn-NH2 tags (Supplementary Fig. S4)

Acetic acid was added to the coumarin-PEGn-dG4P nucleotide analogs to a final concentration of 10% and the solution was vigorously shaken overnight to ensure the hydrolysis of the N-P bond between the δ phosphate and the heptylamine to yield the coumarin-PEGn-NH2 tags. The solution was dried using a CentriVap and resuspended in an appropriate volume of water. A 1 µl aliquot was collected for MALDI-TOF MS characterization and a second aliquot was measured at 260 nm and 350 nm using a NanoDrop ND-1000 spectrophotometer for further characterization of the resulting molecules. The resulting coumarin-PEGn-NH2 tags were the expected size as measured by MALDI-TOF MS (see Fig. 6 and Supplementary Table 3).

Nanopore Measurements of the Coumarin-PEGn-NH2 Tags

Membrane and channel formation

Single α-hemolysin (αHL) channels were inserted into solvent-free planar lipid bilayer membranes44 fabricated across an ~80 µm diameter hole in a 25 µm thick Teflon partition separating two electrolyte solution wells as described previously45. 4 M KCl, 10 mM Tris titrated to pH 7.2 with citric acid was used throughout the experiment. Membranes were formed by first wetting the partition with 1% v/v hexadecane/pentane. 10 mg/mL diphytanoyl phosphatidylcholine (DPhyPC) in pentane was spread at both air-electrolyte solution interfaces with the solution levels well below the hole in the Teflon partition. After 10 min, the solution levels were raised above the hole to spontaneously form a membrane. Approximately 0.5 µL of 0.5 mg/mL αHL was injected into the solution immediately adjacent to the membrane and the ionic current was observed until a single channel inserted into the membrane. The cis chamber contents were then exchanged with protein-free electrolyte solution to maintain a single channel for the duration of the experiment.

Coumarin-PEGn-NH2 molecules (n = 16, 20, 24 and 36) were added to the trans side of the pore (defined as the β-barrel side of the channel) to a final concentration between 0.4 µmol/L and 1 µmol/L of each component. Ionic current was recorded between two matched Ag/AgCl electrodes (3 M KCl) at a fixed potential (−40 mV) for approximately 15 min to achieve sufficient counting statistics. Data were recorded with a 4-pole Bessel filter at 10 kHz, oversampled at 50 kHz.

Data analysis

Data were analyzed off-line with an in-house program written in LabVIEW (National Instruments) as described previously23. In brief, blockades were located with an event detector based on a simple threshold algorithm set at 5σ of the current noise in the open state. When an event was detected, the points in the rise time and decay time were discarded (~60 µs and 20 µs, respectively). The mean blockade depth was calculated from the remaining points and the open channel current was calculated from the mean of 0.8 ms of open channel data separated 0.2 ms from the threshold. The data were reported as a ratio of the means (⟨i⟩/⟨i_open⟩) and the nanopore spectra were calculated as a histogram of these values.
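The threshold-based analysis described above can be sketched in Python as follows. This is not the authors' LabVIEW program; it is a minimal illustration that assumes the trace holds current magnitudes sampled at 50 kHz (20 µs per point), that the open-channel window ends 0.2 ms before the threshold crossing, and that events without enough preceding baseline are skipped. All names are hypothetical.

```python
from statistics import mean

SAMPLE_US = 20  # 50 kHz sampling gives one point every 20 microseconds


def blockade_ratios(trace, open_level, noise_sigma, n_sigma=5.0,
                    rise_us=60, decay_us=20, gap_us=200, window_us=800):
    """Return the ratio of mean blockade current to mean open current per event.

    `trace` holds current magnitudes (assumption: sign already removed).
    """
    threshold = open_level - n_sigma * noise_sigma
    rise, decay = rise_us // SAMPLE_US, decay_us // SAMPLE_US
    gap, window = gap_us // SAMPLE_US, window_us // SAMPLE_US
    ratios = []
    i, n = 0, len(trace)
    while i < n:
        if trace[i] >= threshold:
            i += 1
            continue
        start = i
        while i < n and trace[i] < threshold:
            i += 1
        # Drop the points in the rise and decay portions of the event.
        core = trace[start + rise:max(i - decay, 0)]
        first_open = start - gap - window
        if not core or first_open < 0:
            continue  # event too short, or not enough preceding baseline
        i_open = mean(trace[first_open:start - gap])
        ratios.append(mean(core) / i_open)
    return ratios


# Synthetic trace: open level 100, one blockade at 30 with sloped edges.
trace = [100.0] * 100 + [80.0, 50.0, 35.0] + [30.0] * 20 + [60.0] + [100.0] * 100
ratios = blockade_ratios(trace, open_level=100.0, noise_sigma=1.0)
```

The nanopore spectrum is then a histogram of the returned ratios. A fuller implementation would also need to handle the case where the open-channel window overlaps a preceding event, which this sketch ignores.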