Single-molecule long-read DNA sequencing with biological nanopores is fast and high-throughput but suffers reduced accuracy in homonucleotide stretches. We now combine the CsgG nanopore with the 35-residue N-terminal region of its extracellular interaction partner CsgF to produce a dual-constriction pore with improved signal and base-calling accuracy for homopolymer regions. The electron cryo-microscopy structure of CsgG in complex with full-length CsgF shows that the 33 N-terminal residues of CsgF bind inside the β-barrel of the pore, forming a defined second constriction. In complexes of CsgG bound to a 35-residue CsgF constriction peptide, the second constriction is separated from the primary constriction by ~25 Å. We find that both constrictions contribute to electrical signal modulation during single-stranded DNA translocation. DNA sequencing using a prototype CsgG–CsgF protein pore with two constrictions improved single-read accuracy by 25 to 70% in homopolymers up to 9 nucleotides long.
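As an illustration of what "homopolymers up to 9 nucleotides long" refers to, the sketch below locates homonucleotide stretches in a sequence. It is a minimal example, not the analysis pipeline used in the study; the function name `homopolymer_runs` and the `min_len` parameter are hypothetical.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=2):
    """Return (start, base, length) for every run of identical bases.

    Illustrative helper only: names and defaults are assumptions,
    not part of the published method.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


# A toy sequence (not taken from the paper) with one 4-nt T stretch.
print(homopolymer_runs("ACGTTTTAAC", min_len=3))
```

Per-run accuracy could then be assessed by comparing the called length of each run in a read with its length in the reference, binned by run length.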
Coordinates and the electron potential maps for the CsgG–CsgF cryo-EM structure have been deposited in the PDB and EMDB under accession codes 6SI7 and EMD-10206, respectively. R9 pores are proprietary mutants of E. coli CsgG developed by ONT and are available as membrane-embedded single pores incorporated in Flongle, MinION, GridION and PromethION flow cells.
Bayley, H. & Cremer, P. S. Stochastic sensors inspired by biology. Nature 413, 226–230 (2001).
Howorka, S., Cheley, S. & Bayley, H. Sequence-specific detection of individual DNA strands using engineered nanopores. Nat. Biotechnol. 19, 636–639 (2001).
Meller, A., Nivon, L., Brandin, E., Golovchenko, J. & Branton, D. Rapid nanopore discrimination between single polynucleotide molecules. Proc. Natl Acad. Sci. USA 97, 1079–1084 (2000).
Akeson, M., Branton, D., Kasianowicz, J. J., Brandin, E. & Deamer, D. W. Microsecond time-scale discrimination among polycytidylic acid, polyadenylic acid, and polyuridylic acid as homopolymers or as segments within single RNA molecules. Biophys. J. 77, 3227–3233 (1999).
Benner, S. et al. Sequence-specific detection of individual DNA polymerase complexes in real time using a nanopore. Nat. Nanotechnol. 2, 718–724 (2007).
Olasagasti, F. et al. Replication of individual DNA molecules under electronic control using a protein nanopore. Nat. Nanotechnol. 5, 798–806 (2010).
Wang, S., Zhao, Z., Haque, F. & Guo, P. Engineering of protein nanopores for sequencing, chemical or protein sensing and disease diagnosis. Curr. Opin. Biotechnol. 51, 80–89 (2018).
Kasianowicz, J. J., Brandin, E., Branton, D. & Deamer, D. W. Characterization of individual polynucleotide molecules using a membrane channel. Proc. Natl Acad. Sci. USA 93, 13770–13773 (1996).
Butler, T. Z., Pavlenok, M., Derrington, I. M., Niederweis, M. & Gundlach, J. H. Single-molecule DNA detection with an engineered MspA protein nanopore. Proc. Natl Acad. Sci. USA 105, 20647–20652 (2008).
Stoddart, D., Franceschini, L., Heron, A., Bayley, H. & Maglia, G. DNA stretching and optimization of nucleobase recognition in enzymatic nanopore sequencing. Nanotechnology 26, 084002 (2015).
Stoddart, D. et al. Nucleobase recognition in ssDNA at the central constriction of the alpha-hemolysin pore. Nano Lett. 10, 3633–3637 (2010).
Stoddart, D., Heron, A. J., Mikhailova, E., Maglia, G. & Bayley, H. Single-nucleotide discrimination in immobilized DNA oligonucleotides with a biological nanopore. Proc. Natl Acad. Sci. USA 106, 7702–7707 (2009).
Maglia, G., Heron, A. J., Stoddart, D., Japrung, D. & Bayley, H. Analysis of single nucleic acid molecules with protein nanopores. Methods Enzymol. 475, 591–623 (2010).
Cherf, G. M. et al. Automated forward and reverse ratcheting of DNA in a nanopore at 5-Å precision. Nat. Biotechnol. 30, 344–348 (2012).
Manrao, E. A. et al. Reading DNA at single-nucleotide resolution with a mutant MspA nanopore and phi29 DNA polymerase. Nat. Biotechnol. 30, 349–353 (2012).
Brown, C. G. No Thanks, I’ve already got one. YouTube https://www.youtube.com/watch?v=nizGyutn6v4 (2016).
Goyal, P. et al. Structural and mechanistic insights into the bacterial amyloid secretion channel CsgG. Nature 516, 250–253 (2014).
Robinson, L. S., Ashman, E. M., Hultgren, S. J. & Chapman, M. R. Secretion of curli fibre subunits is mediated by the outer membrane-localized CsgG protein. Mol. Microbiol. 59, 870–881 (2006).
Van Gerven, N., Van der Verren, S. E., Reiter, D. M. & Remaut, H. The role of functional amyloids in bacterial virulence. J. Mol. Biol. 430, 3657–3684 (2018).
Cao, B. et al. Structure of the nonameric bacterial amyloid secretion channel. Proc. Natl Acad. Sci. USA 111, E5439–E5444 (2014).
Chapman, M. R. et al. Role of Escherichia coli curli operons in directing amyloid fiber formation. Science 295, 851–855 (2002).
Nenninger, A. A., Robinson, L. S. & Hultgren, S. J. Localized and efficient curli nucleation requires the chaperone-like amyloid assembly protein CsgF. Proc. Natl Acad. Sci. USA 106, 900–905 (2009).
Nenninger, A. A. et al. CsgE is a curli secretion specificity factor that prevents amyloid fibre aggregation. Mol. Microbiol. 81, 486–499 (2011).
Schubeis, T. et al. Structural and functional characterization of the curli adaptor protein CsgF. FEBS Lett. 592, 1020–1029 (2018).
Chi, Q., Wang, G. & Jian, J. The persistence length and length per base of single-stranded DNA obtained from fluorescence correlation spectroscopy measurements using mean field theory. Phys. A 392, 1072–1079 (2013).
Jain, M., Olsen, H. E., Paten, B. & Akeson, M. The Oxford Nanopore MinION: delivery of nanopore sequencing to the genomics community. Genome Biol. 17, 239 (2016).
Carter, J. M. & Hussain, S. Robust long-read native DNA sequencing using the ONT CsgG Nanopore system. Wellcome Open Res. 2, 23 (2017).
Wick, R. R., Judd, L. M. & Holt, K. E. Performance of neural network basecalling tools for Oxford Nanopore sequencing. Genome Biol. 20, 129 (2019).
Tang, G. et al. EMAN2: an extensible image processing suite for electron microscopy. J. Struct. Biol. 157, 38–46 (2007).
Pettersen, E. F. et al. UCSF Chimera–a visualization system for exploratory research and analysis. J. Comput. Chem. 25, 1605–1612 (2004).
Robinson, J. T. et al. Integrative genomics viewer. Nat. Biotechnol. 29, 24–26 (2011).
Medaka v.0.8.1 (Oxford Nanopore Technologies, 2018); https://nanoporetech.github.io/medaka/
Miroux, B. & Walker, J. E. Over-production of proteins in Escherichia coli: mutant hosts that allow synthesis of some membrane proteins and globular proteins at high levels. J. Mol. Biol. 260, 289–298 (1996).
Casadaban, M. J. Transposition and fusion of the lac genes to selected promoters in Escherichia coli using bacteriophage lambda and Mu. J. Mol. Biol. 104, 541–555 (1976).
Zheng, S. Q. et al. MotionCor2: anisotropic correction of beam-induced motion for improved cryo-electron microscopy. Nat. Methods 14, 331–332 (2017).
Rohou, A. & Grigorieff, N. CTFFIND4: fast and accurate defocus estimation from electron micrographs. J. Struct. Biol. 192, 216–221 (2015).
Kimanius, D., Forsberg, B. O., Scheres, S. H. & Lindahl, E. Accelerated cryo-EM structure determination with parallelisation using GPUs in RELION-2. Elife 5, e18722 (2016).
Reboul, C. F., Eager, M., Elmlund, D. & Elmlund, H. Single-particle cryo-EM-improved ab initio 3D reconstruction with SIMPLE/PRIME. Protein Sci. 27, 51–61 (2018).
Emsley, P., Lohkamp, B., Scott, W. G. & Cowtan, K. Features and development of Coot. Acta Crystallogr. D 66, 486–501 (2010).
Afonine, P. V. et al. Real-space refinement in PHENIX for cryo-EM and crystallography. Acta Crystallogr. D 74, 531–544 (2018).
Smart, O. S., Neduvelil, J. G., Wang, X., Wallace, B. A. & Sansom, M. S. HOLE: a program for the analysis of the pore dimensions of ion channel structural models. J. Mol. Graph. 14, 354–360 (1996).
R: A Language and Environment for Statistical Computing (R Foundation for Statistical Computing, 2017); https://www.R-project.org/
Corpet, F. Multiple sequence alignment with hierarchical clustering. Nucleic Acids Res. 16, 10881–10890 (1988).
We are grateful to R. Thompson and J. van Rooyen for assistance during cryo-EM data collection on Titan Krios 1 at the Astbury Biostructure Laboratory, Leeds and Krios m02 at Diamond - eBIC, Harwell Science and Innovation Campus, UK, respectively. We thank R. Efremov for advice on cryo-EM image processing, and are grateful to S. Young at ONT for helpful discussion and advice on MinION data analysis. This work received funding from the European Research Council under the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 649082 (BAS-SBBT). S.E.V.d.V. is a recipient of a PhD fellowship of the Flanders Research Foundation (FWO).
VIB and ONT have jointly filed two provisional patent applications on the construction and use of dual-constriction pores in nanopore sensing applications (PCT/GB2018/051858 and PCT/GB2018/051191). VIB has a funded research collaboration agreement with ONT related to CsgG-derived nanopores. ONT uses CsgG-derived nanopores in its MinION, GridION and PromethION nanopore sequencing devices. As inventors on VIB intellectual property, S.E.V.d.V., N.V.G. and H.R. receive a share in royalty payments. R.H., P.S., J.K., M.J., E.J.W. and L.J. are employees of ONT and own company share options.
a, SDS-PAGE of CsgG or the CsgG:CsgF complex obtained by tandem affinity purification of the outer membrane proteins extracted from cells expressing CsgG-Strep II (pPG1) or CsgG-Strep II and CsgF-His (pNA62), respectively. Gel representative of n>10 experiments. b, Representative 2D class averages for the CsgG:CsgF dataset enriched for single pores (that is, C9 CsgG:CsgF complexes), generated using SIMPLE and used for 3D reconstruction using Relion-2.0. c, Off-axis top view and cross-sectional side view of the CsgG:CsgF cryo-EM 3D electron potential map reconstructed to 3.4 Å. d, Representative region of the electron potential map of the CsgG:CsgF complex. The region of focus is the constriction helix of FCP, stacking against the lumen of the CsgG β-barrel. One CsgF protomer is highlighted in purple, the others in grey; CsgG is depicted in gold. Heteroatoms are in blue (nitrogens) and red (oxygens). The electron potential map is cut off at a contour of 0.5, shown in stick and mesh representation, and rendered using UCSF Chimera 1.10.2. e, Fourier shell correlation (FSC) curves of the final 3D reconstruction (black: FSC corrected map, green: FSC unmasked map, blue: FSC masked map, red: FSC phase-randomized unmasked map).
a, Production and temperature stability assessment of the CsgG:CsgF pore complex. Incubation of purified CsgG and CsgF in a 1:1 ratio results in the formation of an SDS-stable CsgG:CsgF pore complex that is heat stable up to 70 °C. b, The N-terminal residues of CsgF insert into the CsgG channel and form a second region of constriction, whilst the remaining ~100 residues form a cap-like head structure (Fig. 1e, f). For nanopore sensing purposes, we sought to produce a complex of CsgG with the CsgF constriction peptide (FCP), lacking the neck and head domains. To do so, CsgG was complexed with CsgF mutants modified to insert a TEV cleavage site at position 30, 35 or 45. The reconstituted CsgG:CsgF pore complexes were digested with TEV protease and analysed by SDS-PAGE (c). M: molecular mass marker; Lanes 1, 2: Strep II-tag affinity-purified CsgG:CsgF complex and excess CsgG; Lane 3: isolation of CsgG:CsgF complex by size-exclusion chromatography; Lane 4: CsgG:CsgF35-TEV cleaved with TEV protease to generate the CsgG:FCP complex; Lane 5: flow-through of CsgG:FCP after Strep purification; Lane 6: CsgG:FCP heated to 60 °C for 10 minutes; Lane 7: CsgG:FCP complex eluted from the Strep column; Lane 8: CsgG pore as the control; Lane 9: TEV protease as the control.
a, Multiple sequence alignment (Multalin, ref. 43) of 22 representative CsgF sequences. Aligned sequences are shown as mature proteins (that is, lacking their N-terminal signal peptide). The N-terminal 33 residues of the mature protein form a continuous stretch of high sequence conservation (48% average pairwise sequence identity) encompassing the region that interacts with CsgG and forms the CsgF constriction peptide. CsgF homologues included in the multiple sequence alignment are UniProt entries Q88H88; A0A143HJA0; Q5E245; Q084E5; F0LZU2; A0A136HQR0; A0A0W1SRL3; B0UH01; Q6NAU5; G8PUY5; A0A0S2ETP7; E3I1Z1; F3Z094; A0A176T7M2; D2QPP8; N2IYT1; W7QHV5; D4ZLW2; D2QT92; A0A167UJA2. b, Schematic diagram of CsgF protein architecture. (SP) signal peptide, cleaved upon secretion; (FCP) CsgF constriction peptide; the CsgF neck and head regions are coloured green.
a, Schematic representation of the electrophysiology setup of CsgG-based nanopores as used for polynucleotide sequencing. CsgG-based channels (G) are reconstituted into artificial membranes with the periplasmic vestibule and β-barrel exposed to the cis and trans sides, respectively. Polynucleotide–enzyme (E) complexes are added to the cis side and current reads are recorded under an electric potential (Δψ) of 100 to 300 mV. b, c, Representative single-channel traces (b) and current–voltage (I–V) curves (c) for wild-type CsgG, CsgGF56Q and CsgGR9 and their FCP complexes: CsgG:FCP, CsgGF56Q:FCP and CsgGR9:FCP. I–V curves show mean ± 95% confidence interval of at least 60 single channels per pore, with the exception of wild-type CsgG (36 single channels) and CsgG:FCP (14 single channels).
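The "mean ± 95% confidence interval" summary used for the I–V curves can be sketched as follows. This is a minimal example under stated assumptions: the function name `mean_ci95` is hypothetical, and the interval uses a normal approximation, which is reasonable for 60 or more channels but somewhat too narrow for small samples such as 14 channels, where a Student's t quantile would be the more usual choice. The caption does not state which estimator the authors used.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci95(currents):
    """Return (mean, half_width) of a 95% confidence interval.

    `currents` holds one current value per single channel at a given
    voltage. Normal approximation; an assumption, not the authors' code.
    """
    n = len(currents)
    if n < 2:
        raise ValueError("need at least two channels")
    sem = stdev(currents) / sqrt(n)      # standard error of the mean
    z = NormalDist().inv_cdf(0.975)      # two-sided 95% quantile, ~1.96
    return mean(currents), z * sem


# Toy values for illustration only (not measured data).
centre, half_width = mean_ci95([10.0, 12.0, 14.0])
print(f"{centre:.2f} ± {half_width:.2f}")
```

Repeating this at each applied voltage gives one point and one error bar per voltage on the I–V curve.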
a, b, Single-channel conductance traces of two representative CsgGR9:FCP nanopores during a 24 h sequencing run, recorded at -180 mV. The data show that both CsgGR9 and CsgGR9:FCP are predominantly in a sequencing, DNA-occupied state, with apo pores capturing new DNA strands within seconds. The two traces show a CsgGR9:FCP pore complex that stays intact for the full 24 h sequencing run (a), as well as a pore complex that shows dissociation of the FCP peptides during the sequencing run (at ~19 h; b). Upon FCP dissociation, the channel continues sequencing, now as a CsgGR9 apo pore (labeled CsgGR9). Arrows indicate the average conductance levels of the open pore and the DNA-occupied pore during sequencing intervals. The zoomed-in panels show two representative 30 s time windows of the sequencing run of the intact CsgGR9:FCP channel (left) and the CsgGR9 channel following dissociation of FCP (right). The full and zoomed-in sequencing runs show high DNA capture rates for CsgGR9:FCP channels throughout the 24 h sequencing run. c, Scatter plot of the open pore current of 25 CsgGR9:FCP channels during 24 h sequencing runs, recorded at -180 mV. Open pore plots for CsgGR9:FCP pores that stay intact throughout the 24 h run (n=22) and pores that lose FCP (n=3) are coloured blue and red, respectively.
Extended Data Fig. 6 Constriction mapping oligos and single read basecalls for CsgGR9 and CsgGR9:FCP nanopores.
a, Set of static polyA ssDNA oligonucleotides in which one base is missing from the DNA backbone (iSpc3). These oligos, dubbed SS20 to SS38, differ in the location of the abasic nucleotide and were used to map the constriction position in CsgGF56Q or CsgGF56Q:FCP (Fig. 3d). A biotin modification at the 3’ end of each strand is complexed with monovalent streptavidin to block translocation of the oligo and give a defined distance marker between the pore entrance (block site) and pore constriction (site of increased conductance when occupied by the abasic nucleotide; Fig. 3c). SS27-SS28 and SS32 (highlighted red) have their abasic nucleotide located at the CsgG and FCP constriction, respectively (Fig. 3d, e). b, Comparison of errors in single-read (n=26) basecalls from CsgGR9 and CsgGR9:FCP pores that have been aligned to a representative region of the E. coli reference genome sequence. The region displayed corresponds to the locus 14,098 to 14,115. The figure is plotted using the Integrative Genomics Viewer software (ref. 31). Pink and purple bars correspond to single reads in the forward and reverse directions, respectively. Black horizontal bars correspond to deletions in the basecalls, where the number corresponds to the number of deletions at the specific loci. Individual substitutions are labeled with the miscalled nucleotide (C in blue, T in red, G in orange and A in green). Insertions are labeled ‘I’ (purple). Grey bars on top of the list of single reads of the CsgGR9 and CsgGR9:FCP pores correspond to the consensus accuracy per position.
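The constriction-mapping logic in panel a can be sketched as a search for local maxima in current across the ordered oligo series: an oligo whose abasic site sits in a constriction passes more current than its neighbours. The sketch below is illustrative only. The function name `constriction_candidates` is hypothetical and the current values are invented for demonstration; they are not measurements from the study.

```python
def constriction_candidates(series):
    """Return oligo names at local maxima of current.

    `series` is a list of (oligo_name, current) pairs ordered by the
    position of the abasic nucleotide. For a plateau of equal values,
    only the first oligo of the plateau is reported. Hypothetical helper.
    """
    peaks = []
    for i in range(1, len(series) - 1):
        name, value = series[i]
        higher_than_left = value > series[i - 1][1]
        not_lower_than_right = value >= series[i + 1][1]
        if higher_than_left and not_lower_than_right:
            peaks.append(name)
    return peaks


# Invented currents (arbitrary units) shaped to mimic two constrictions.
toy_series = [
    ("SS25", 30), ("SS26", 31), ("SS27", 38), ("SS28", 38), ("SS29", 32),
    ("SS30", 30), ("SS31", 31), ("SS32", 36), ("SS33", 30),
]
print(constriction_candidates(toy_series))
```

With a dual-constriction pore, two separated maxima would be expected in such a series, whereas a single-constriction pore would give one.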
Van der Verren, S.E., Van Gerven, N., Jonckheere, W. et al. A dual-constriction biological nanopore resolves homonucleotide sequences with high fidelity. Nat Biotechnol 38, 1415–1420 (2020). https://doi.org/10.1038/s41587-020-0570-8