Abstract
Single-cell genetic heterogeneity is ubiquitous in microbial populations and an important aspect of microbial biology; however, we lack a broadly applicable and accessible method to study this heterogeneity in microbial populations. Here, we show a simple, robust and generalizable method for high-throughput single-cell sequencing of target genetic loci in diverse microbes using simple droplet microfluidics devices (droplet targeted amplicon sequencing; DoTA-seq). DoTA-seq serves as a platform to perform diverse assays for single-cell genetic analysis of microbial populations. Using DoTA-seq, we demonstrate the ability to simultaneously track the prevalence and taxonomic associations of >10 antibiotic-resistance genes and plasmids within human and mouse gut microbial communities. This workflow is a powerful and accessible platform for high-throughput single-cell sequencing of diverse microbial populations.
Data availability
The database of ARGs (CARD database) used to design DoTA-seq primers can be accessed at card.mcmaster.ca. The database of plasmid replicon genes used to design DoTA-seq primers can be accessed as part of MOB-suite at https://github.com/phac-nml/mob-suite. The MAGs used to corroborate plasmid–taxa relationships can be accessed at https://opendata.lifebit.ai/table/SGB. The raw reads and 16S profiling abundance data of the ZymoBIOMICS human fecal community were taken from https://www.fecalreferencedb.com/.
For all DoTA-seq runs, processed data containing barcodes and their mapped associated reads are available from Zenodo at https://doi.org/10.5281/zenodo.6537689.
Raw sequencing reads are available from Zenodo at https://doi.org/10.5281/zenodo.10380035. Source data are provided with this paper.
Code availability
All code used in the analysis of DoTA-seq sequencing data is available on Zenodo at https://doi.org/10.5281/zenodo.10380035. Up-to-date versions of the scripts and code may be found on GitHub at https://github.com/lanfreem/DoTA-seq-Paper.
Acknowledgements
We thank Y.-Y. Cheng of UW Madison for providing the E. coli and B. subtilis strains and for helpful discussions. We thank R. Clark for assistance with the synthetic human gut community. We are also grateful to L. Comstock of the University of Chicago for providing the B. fragilis recombinase deletion strain and to B. Pfleger of UW Madison for providing the P. putida KT2440 strain. We thank L. Brinkman of the UW Madison animal resources and compliance for providing mouse fecal samples. This research was supported by the National Institute of Allergy and Infectious Diseases under grant nos. R21 AI156438 and R21 AI159980 for O.S.V., the National Institute of General Medical Sciences under grant nos. R01 GM038660 for R.L. and R35 GM124774 for O.S.V., the Army Research Office under grant no. W911NF-19-1-0269, the U.S. Department of Agriculture Hatch Award WIS05004 for R.L. and the Burroughs Wellcome Fund through the Career Award at the Scientific Interface for F.L. The funders had no role in study design, data collection and analysis, decision to publish or preparation of the manuscript.
Author information
Contributions
F.L. and O.S.V. conceived the study. F.L., O.S.V., J.S. and R.L. designed the experiments and interpreted the data. F.L. performed experiments and analyzed data. J.S. designed the DoTA-seq assay and analyzed data for experiments involving phase variation in B. fragilis. T.D.R. designed scripts for analysis of single-cell digital PCR data. Z.Z. developed the data-analysis pipeline for natural microbiome samples. K.K. carried out single-cell digital PCR experiments. F.L. and O.S.V. wrote the paper. F.L., O.S.V., R.L., K.A. and T.D.R. contributed to the revision of the paper.
Ethics declarations
Competing interests
F.L. and O.S.V. have filed a related US non-provisional patent application entitled ‘Methods for isolating and barcoding nucleic acid’ (Application 18/311,010). The remaining authors declare no competing interests.
Peer review
Peer review information
Nature Methods thanks Linas Mazutis and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. Primary Handling Editors: Lei Tang and Lin Tang, in collaboration with the Nature Methods team. Peer reviewer reports are available.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Tracking cell lysis efficiencies using Cellbrite-Fix 555 and SYBR Green staining in >1000 droplets or gels in each step of the lysis protocol.
Cellbrite (CF555) stains cell membrane and cell wall components while SYBR Green stains DNA, allowing us to identify whether a cell is intact (CF555 and SYBR staining) or lysed (only SYBR staining). (a) Representative fluorescence microscopy overlay of Green Channel (SYBR staining), Red Channel (CF555 staining), and Brightfield (Droplets/Gels) as a water-in-oil emulsion, as gels before lysis (top two images) and as gels after lysis (bottom image). Arrows denote representative droplets containing cells. Scale bars represent 100 μm. (b) The percentage of droplets containing fluorescent particles (number of fluorescent particles divided by the total number of droplets or gels) by SYBR or CF555 staining for each condition. Before lysis, CF555 and SYBR fluorescent particles are approximately equal in abundance. Following lysis, CF555 particles are substantially reduced, indicating lysis of the cells. n represents the total number of droplets analyzed for each condition. The discrepancy between the percentage of encapsulated cells in the pre-lysis emulsion and in the pre-lysis gel is due to the approximate nature of the droplet counting algorithm: because the algorithm is imperfect, some droplets are not counted and some are counted multiple times. The percentages of stained cells are therefore only approximate. (c) The percentage of droplets containing unlysed cells at each step. Unlysed cells are defined as particles that are fluorescent in both the CF555 and SYBR Green channels. Data shown are the result of one independent experiment.
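The intact/lysed rule described in this caption can be sketched in a few lines of Python. This is only an illustration of the stated definitions, not the authors' image-analysis code: the function names, the per-particle boolean calls and the example numbers are all hypothetical, and the "membrane_only" label is an added category that the caption does not mention.

```python
from collections import Counter

def classify_particle(sybr_positive: bool, cf555_positive: bool) -> str:
    """Label one detected particle from its two fluorescence calls."""
    if sybr_positive and cf555_positive:
        return "intact"         # DNA and membrane/cell wall signal both present
    if sybr_positive:
        return "lysed"          # DNA signal only
    if cf555_positive:
        return "membrane_only"  # assumption: not described in the caption
    return "unstained"

def percent_unlysed(particles, n_droplets: int) -> float:
    """Percentage of droplets holding an unlysed (double-positive) particle."""
    counts = Counter(classify_particle(s, c) for s, c in particles)
    return 100.0 * counts["intact"] / n_droplets

# Made-up calls as (SYBR positive, CF555 positive) for five particles.
particles = [(True, True), (True, False), (True, False), (True, True), (True, False)]
print(percent_unlysed(particles, n_droplets=50))
```

As in panel (b), the denominator is the total number of droplets or gels counted, so any miscounting of droplets carries through to the percentage.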
Extended Data Fig. 2 Coefficient of variation for technical replicates is negatively correlated with number of cells sequenced.
Scatter plot of the average number of cells for each species versus the coefficient of variation (CV) of gene prevalence between two technical replicates, for every gene-species pair. CV is calculated as the standard deviation divided by the mean. The negative Spearman correlation (Rho) suggests that a moderate fraction of the variance found between technical replicates can be attributed to stochasticity due to small numbers of cells. P-value is based on comparison to a two-sided t-test null distribution.
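As a rough illustration of the two statistics named in this caption, the sketch below computes a CV for each pair of replicate prevalences and a Spearman rank correlation against cell number. It is a minimal pure-Python version under stated assumptions: the sample standard deviation (n - 1 denominator) is assumed because the caption does not say which form was used, the helper names are hypothetical, and the numbers are invented.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Standard deviation divided by the mean (sample s.d. is an assumption)."""
    return stdev(values) / mean(values)

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman_rho(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5

# Invented gene-species pairs: average cell count and two replicate prevalences.
cells = [20, 200, 2000]
replicates = [(0.10, 0.30), (0.20, 0.26), (0.50, 0.51)]
cvs = [coefficient_of_variation(pair) for pair in replicates]
print(spearman_rho(cells, cvs))
```

In this toy input the CV falls as the cell count rises, so the rank correlation is negative, which is the direction of the relationship the caption reports.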
Extended Data Fig. 3 Qualitative check of target gene amplification balance.
Target amplification balance for genes in the synthetic community is plotted as a histogram on a log y-scale as a qualitative check of relative amplification efficiency. The X axis represents the fraction of the reads of each barcode group that represents the 16S rRNA gene (i.e., barcodes in the middle of the x axis have half their reads map to the 16S gene and half map to the target gene, and barcodes on the far right contain mostly reads that map to the target gene and not the 16S gene). The distributions of ratios for low-prevalence genes (ErmQ, tetW, highlighted in red) are biased towards target genes, meaning that the target capture efficiency is high for those targets despite the observed low prevalence. Only cells that contain both 16S reads and at least one additional target gene are included.
Extended Data Fig. 4 BL rpoB gene is detected via PCR on genomic DNA extract.
Agarose gel showing PCR amplification of the Bifidobacterium-specific rpoB gene on the genomic DNA extracted from pure cultures of B. adolescentis (BA), B. pseudocatenulatum (BP), B. longum (BL), or the 25-member synthetic human gut community (Comm). NTC represents the no-template control. MW represents the molecular weight marker. Gel represents data from one independent experiment.
Extended Data Fig. 5 Independent confirmation of DoTA-seq results in Fig. 2 by colony and single-cell PCR genotyping.
Droplet PCR genotyping for (a) B. fragilis (BF) and the cepA gene, (b) P. johnsonii (PJ) and the mef(en2) gene, or (c) PJ and the tetQ gene. Cells were fixed and subjected to digital PCR amplification for species marker genes and ARGs in an emulsion PCR (see Methods). Blue bars show the proportion of droplets that showed amplification for the species marker gene as well as the ARG; green bars show the proportion of droplets that showed amplification for the species marker gene (rpoB) but not the ARG. Time 0 and time 2 represent timepoints 0 and 2 from the synthetic community antibiotic experiment, respectively. n represents the total number of species marker gene-positive droplets counted for each condition. (d) Colony PCR genotyping of B. fragilis (BF) colonies originating from the synthetic human gut community experiment exposed to antibiotics. Colonies from glycerol stocks of samples taken at timepoints 1 and 2 of the synthetic community antibiotics experiment shown in Fig. 3 were genotyped by qPCR for the BF-specific rpoB gene and the antibiotic-resistance gene cepA. The proportion of cepA-positive colonies is shown in red, and the proportion of cepA-negative colonies is shown in blue. n represents the number of colonies that were positively identified as BF by successful amplification of the BF-specific rpoB gene. In total, 24 colonies from timepoint 1 and 32 colonies from timepoint 2 were genotyped (many colonies were not BF). Data are the result of one independent experiment.
Extended Data Fig. 6 Cell suspensions extracted from fecal material closely resemble the source material.
Relative abundance of taxa (family level) as profiled by metagenomic sequencing for whole mouse fecal material (Y axis) and cells extracted from mouse fecal material using the gentle centrifugation method (X axis).
Extended Data Fig. 7 Single-cell sequencing from ZymoBIOMICS preserved human fecal microbiome results in under-representation of Gram-negative taxa.
Comparison of relative abundance of the major taxa (>0.1% abundance by 16S profiling) between standard 16S profiling and DoTA-seq of the ZymoBIOMICS human fecal microbiome standard. Because this sample was preserved in DNA/RNA Shield, which prematurely lyses many types of cells (see Supplementary Fig. 9), many taxa are missing or under-represented in DoTA-seq compared to 16S profiling.
Extended Data Fig. 8 ZymoBIOMICS samples show premature lysis and under-representation of Gram-negative bacteria in single-cell sequencing.
This chart shows the DoTA-seq relative abundance of a ZymoBIOMICS microbial community standard, which is a commercial mock microbial community mix preserved in DNA/RNA Shield. Comparing the relative abundances with and without washing the cells before droplet making, we found that Gram-negative cells (in red) are almost undetected after a 2X wash. This suggests that Gram-negative cells were already lysed in the buffer and that washing removed the cell debris and free-floating genomic DNA, leaving only the Gram-positive cells intact for single-cell sequencing.
Extended Data Fig. 9 DoTA-seq elucidates diverse genetic subpopulations in B. fragilis generated via promoter inversion.
(a) Schematic of the potential diversity generated by B. fragilis CPS operons. There are a total of 8 CPS operons referred to as A-H, 7 of which contain promoters flanked by inverted repeats (triangles). These promoters switch ON and OFF through recombination at the inverted repeats driven by an endogenous recombinase (mpi). (b) Schematic of the DoTA-seq reaction with primers designed to flank all 7 invertible promoters. Primers are represented by half-headed arrows. Gray vertical lines represent regions of complementarity between amplicons and barcodes. P5 and P7 represent the Illumina sequencing adaptor sequences. (c) Bar plot of the relative frequencies of unique CPS promoter states in a single B. fragilis colony. Promoter states are represented by a 7-letter code, where a letter (A-H) denotes that the given promoter is turned ON, and '-' denotes that the given promoter is turned OFF. Data points represent technical replicates. Error bars represent 1 s.d. from the mean of technical replicates (n = 3). Bar height represents the mean. Since a subset of combinatorial promoter states were rare in the population and not observed in all technical replicates, we computed the stochastic limit of detection (Supplementary Note 4). (d) An undirected graph network representation of the CPS promoter state subpopulations in (c). Nodes represent CPS promoter combinatorial states, with diameter proportional to relative frequency, and edges connect nodes that are one promoter flip away from each other. (e) Network representation of the measured combinatorial promoter states where the recombinase (mpi) responsible for promoter inversions was deleted. In this strain, the entire population is locked in a single state (A–E-H). Node diameters and edges are defined as in (d).
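The network in panels (d) and (e) can be sketched as follows: each promoter state is a fixed-length string, and two states are joined by an edge when exactly one position differs in ON/OFF status. This is a minimal illustration rather than the authors' plotting code. The function names, the example states and their frequencies are hypothetical, and the example assumes for illustration only that the seven positions correspond to promoters A, B, D, E, F, G and H.

```python
from itertools import combinations

def flips_apart(state_a: str, state_b: str) -> int:
    """Number of promoter positions whose ON/OFF status differs."""
    if len(state_a) != len(state_b):
        raise ValueError("states must have the same length")
    return sum((a == "-") != (b == "-") for a, b in zip(state_a, state_b))

def build_state_graph(frequencies):
    """Return (nodes, edges): nodes map state to relative frequency,
    edges join states that are exactly one promoter flip apart."""
    nodes = dict(frequencies)
    edges = [
        (a, b)
        for a, b in combinations(sorted(nodes), 2)
        if flips_apart(a, b) == 1
    ]
    return nodes, edges

# Invented relative frequencies for four hypothetical promoter states.
frequencies = {
    "A--E--H": 0.60,
    "A--E---": 0.25,
    "A-DE--H": 0.10,
    "-------": 0.05,
}
nodes, edges = build_state_graph(frequencies)
for a, b in edges:
    print(a, "<->", b)
```

The node frequencies would set the drawn diameters; a population locked in a single state, as in panel (e), yields one node and no edges.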
Supplementary information
Supplementary Information
Supplementary Fig. 1 and Supplementary Notes 1–6.
Supplementary Table
Supplementary Table 1
Source data
Source Data Fig. 1
Summary DoTA-seq results for control experiments.
Source Data Fig. 2
Summary DoTA-seq results for synthetic gut community passaging experiments.
Source Data Fig. 3
Summary DoTA-seq results for mouse and human fecal samples.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lan, F., Saba, J., Ross, T.D. et al. Massively parallel single-cell sequencing of diverse microbial populations. Nat Methods 21, 228–235 (2024). https://doi.org/10.1038/s41592-023-02157-7
DOI: https://doi.org/10.1038/s41592-023-02157-7