We describe sequence tag–based analysis of microbial populations (STAMP) for characterization of pathogen population dynamics during infection. STAMP analyzes the frequency changes of genetically 'barcoded' organisms to quantify population bottlenecks and infer the founding population size. Analyses of intraintestinal Vibrio cholerae revealed infection-stage and region-specific host barriers to infection and showed unexpected V. cholerae migration counter to intestinal flow. STAMP provides a robust, widely applicable analytical framework for high-confidence characterization of in vivo microbial dissemination.
1. Levin, B.R., Lipsitch, M. & Bonhoeffer, S. Science 283, 806–809 (1999).
2. Gutiérrez, S., Michalakis, Y. & Blanc, S. Curr. Opin. Virol. 2, 546–555 (2012).
3. Watson, K.G. & Holden, D.W. Cell. Microbiol. 12, 1389–1397 (2010).
4. Wright, S. Nature 166, 247–249 (1950).
5. Krimbas, C.B. & Tsakas, S. Evolution 25, 454–460 (1971).
6. Nei, M. & Tajima, F. Genetics 98, 625–640 (1981).
7. Pollak, E. Genetics 104, 531–548 (1983).
8. Moxon, E.R. & Murphy, P.A. Proc. Natl. Acad. Sci. USA 75, 1534–1536 (1978).
9. Margolis, E. & Levin, B.R. J. Infect. Dis. 196, 1068–1075 (2007).
10. Barnes, P.D., Bergman, M.A., Mecsas, J. & Isberg, R.R. J. Exp. Med. 203, 1591–1601 (2006).
11. Li, Y., Thompson, C.M., Trzciński, K. & Lipsitch, M. Infect. Immun. 81, 4534–4543 (2013).
12. Grant, A.J. et al. PLoS Biol. 6, e74 (2008).
13. Kaiser, P., Slack, E., Grant, A.J., Hardt, W.-D. & Regoes, R.R. PLoS Pathog. 9, e1003532 (2013).
14. Lim, C.H. et al. PLoS Pathog. 10, e1004270 (2014).
15. Ritchie, J.M., Rui, H., Bronson, R.T. & Waldor, M.K. mBio 1, e00047–10 (2010).
16. Charlesworth, B. Nat. Rev. Genet. 10, 195–205 (2009).
17. Cavalli-Sforza, L.L. & Edwards, A.W. Am. J. Hum. Genet. 19, 233–257 (1967).
18. Angelichio, M.J., Spector, J., Waldor, M.K. & Camilli, A. Infect. Immun. 67, 3733–3739 (1999).
19. Fu, Y., Waldor, M.K. & Mekalanos, J.J. Cell Host Microbe 14, 652–663 (2013).
20. Chiang, S.L. & Mekalanos, J.J. Mol. Microbiol. 27, 797–805 (1998).
21. Thelin, K.H. & Taylor, R.K. Infect. Immun. 64, 2853–2856 (1996).
22. Gibson, D.G. et al. Nat. Methods 6, 343–345 (2009).
23. House, B.L., Mortimer, M.W. & Kahn, M.L. Appl. Environ. Microbiol. 70, 2806–2815 (2004).
24. Simon, R., Priefer, U. & Pühler, A. Biotechnology (NY) 1, 784–791 (1983).
25. Davis, M.P.A., van Dongen, S., Abreu-Goodger, C., Bartonicek, N. & Enright, A.J. Methods 63, 41–49 (2013).
26. Caporaso, J.G. et al. Nat. Methods 7, 335–336 (2010).
27. Edgar, R.C. Bioinformatics 26, 2460–2461 (2010).
The authors thank D. Munera for help with animal experiments, M. Chao and T. Lieberman for discussions, and S. Lory, L. Comstock and members of the Waldor lab for comments on the manuscript. This work was supported by Swiss Foundation for Grants in Biology and Medicine (http://www.samw.ch) grant PASMP3_142724 /1 (S.A.), Swiss National Science Foundation (http://www.snf.ch) grant PBEZP3_140163 and German Academic Exchange Service (http://www.daad.org) grant D/11/45747 (P.A.z.W.), the National Institute of General Medical Sciences of the US National Institutes of Health (NIH) award number U54GM088558 (H.-H.C. and M.L.), NIH grant R37 AI–042347 and Howard Hughes Medical Institute (M.K.W.). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript, and the content is solely the responsibility of the authors and does not necessarily represent the official views of the funders.
The authors declare no competing financial interests.
Integrated supplementary information
Supplementary Figure 1 Simulation demonstrating that calculation of Ne using a large number of sequence tags provides an accurate high-resolution estimation of Nb.
(a,b) We simulated populations with 5, 50, 500 and 10^8 tags present at equal frequencies. Each population was passed through a bottleneck that reduced it to 10^1–10^6 bacteria (bottleneck population size). For the 5-, 50- and 500-tag simulations, this was followed by a second sampling step (5 × 10^5) that matches the number of sequenced barcodes. The 10^8-tag simulation represents the ideal case in which (virtually) each bacterium has a unique tag, so that after passage through the simulated bottleneck each bacterium is expected to carry a distinct barcode. Bottlenecks were simulated by multinomial sampling with replacement, and we used equations (1) and (2) from Krimbas & Tsakas (ref. 5) to determine Nb. The results of 1,000 independent simulations are shown. To illustrate the relative deviation from the theoretically expected Nb, data are normalized to the simulated bottleneck size, and the red dotted line indicates the theoretically expected Nb (shown as 100%). (a) Box plot indicating the median (black line), interquartile range (box) and 95% confidence interval (whiskers). (b) The same data with the scale changed so that outliers (black squares) can be visualized. Note that the median of 1,000 independent simulations accurately predicts Nb even with only 5 tags; however, the wide distribution of data points makes Nb estimates from so few tags inaccurate or impossible (negative values) when the number of experiments is small (i.e., few animal infections).
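The simulation described in this legend can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. Equations (1) and (2) are not reproduced in this section, so the estimator below is an assumed temporal-method form: F is the mean over tags of (f0 − fs)^2 / (f0(1 − f0)), and Nb ≈ g / (F − 1/S), with g = 1 and S the number of reads in the second sampling step. All function names are hypothetical, and the run is scaled down (50 simulations, 2 × 10^4 reads) to keep pure-Python runtime short.

```python
import random
import statistics
from collections import Counter

def estimate_nb(f0, fs, reads_sample=None, generations=1):
    """Assumed temporal-method estimator: Nb ~ g / (F - 1/S)."""
    k = len(f0)
    f_hat = sum((a - b) ** 2 / (a * (1 - a)) for a, b in zip(f0, fs)) / k
    if reads_sample:
        f_hat -= 1 / reads_sample  # correct for the noise of the read-sampling step
    return generations / f_hat     # can be negative when F is smaller than the correction

def sample_frequencies(freqs, n, rng):
    """Draw n individuals with replacement (multinomial) and return tag frequencies."""
    counts = Counter(rng.choices(range(len(freqs)), weights=freqs, k=n))
    return [counts[i] / n for i in range(len(freqs))]

def simulate_once(n_tags, bottleneck, reads, rng):
    f0 = [1 / n_tags] * n_tags                          # equal starting frequencies
    survivors = sample_frequencies(f0, bottleneck, rng) # passage through the bottleneck
    observed = sample_frequencies(survivors, reads, rng)  # sequencing-depth sampling
    return estimate_nb(f0, observed, reads_sample=reads)

rng = random.Random(0)
estimates = [simulate_once(50, 1000, 20_000, rng) for _ in range(50)]
median_nb = statistics.median(estimates)
print(f"median estimate: {median_nb:.0f} (simulated bottleneck: 1000)")
```

Rerunning with `n_tags=5` shows the much wider spread of single estimates that the legend describes, including occasional negative values.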
Supplementary Figure 3 The sequence tags are selectively neutral and stably integrated into the genome during the course of the experiment.
(a) Growth curves reveal the neutrality of the tags. V. cholerae strains containing different tags (pSoA158.1-pSoA158.32; black lines) were grown in LB medium with selection for the barcode (LB-Carb-Strep), and the absorbance at 600 nm was recorded at 10 min intervals for 20 h. The wild type (C6706; red line) grown in LB-Strep is given as a reference. (b) Same as in a, but without selection for the barcode (LB-Strep). (c) The stability of tag insertion was tested by growing V. cholerae in liquid culture without selection for the tags (LB-Strep) for 20 h and comparing the cfu on agar plates without selection for the tag (LB-Strep) and on plates with selection for the tag (LB-Carb-Strep). To control for the technical variability of the assay, the same culture was also plated twice on plates without selection for the tag (LB-Strep). No significant difference (p = 0.30; Wilcoxon rank sum test) was detected between the two assays. Individual tags (pSoA158.1-pSoA158.7) were tested in biologically independent triplicates. The bold line indicates the overall median for the indicated condition; the dotted line highlights the expected 1:1 ratio.
Supplementary Figure 4 Determination of the optimal theoretical framework and similarity threshold for calculation of bottleneck population sizes.
Correlation between the experimentally determined bottleneck population size (bacterial load) and the estimated bottleneck population size (Nb) obtained with the methods of Krimbas & Tsakas (ref. 5; black symbols), Nei & Tajima (ref. 6; mid-grey symbols) and Pollak (ref. 7; light grey symbols). The diamond, square and triangle symbols represent biologically independent replicates. Each graph uses the same sequencing data, clustered with a different similarity threshold (Sim. threshold) during tag enumeration with uclust. The threshold is given in the header of each graph. A sequence similarity threshold of 1.0 produced negative Nb values for some data points, which are not displayed in the graph. Note that the methods of Nei & Tajima and Pollak produce very similar results, so their symbols overlap. The same dataset, analyzed according to Krimbas & Tsakas with a sequence similarity threshold of 0.9 and after INOC54 correction, is shown in Fig. 1a and was used as a calibration curve for the animal experiments. The INOC54 correction removed nonspecific tags but had minimal influence on the results, which indicates that Nb determination is robust and can tolerate the loss of several tags.
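To illustrate why the Nei & Tajima and Pollak symbols overlap, the sketch below compares three frequency-variance statistics on one simulated dataset. The formulas are the commonly cited forms of these estimators and are assumptions here, since this section does not state them; in particular, the reference-frequency denominator used for the Krimbas & Tsakas variant is assumed. Function names and the simulated data are hypothetical.

```python
import random
from collections import Counter

def f_krimbas_tsakas(x, y):
    # Assumed form: variance scaled by the reference frequency x only.
    return sum((a - b) ** 2 / (a * (1 - a)) for a, b in zip(x, y)) / len(x)

def f_nei_tajima(x, y):
    # Commonly cited Fc: denominator is the mean frequency minus the product.
    return sum((a - b) ** 2 / ((a + b) / 2 - a * b) for a, b in zip(x, y)) / len(x)

def f_pollak(x, y):
    # Commonly cited Fk: denominator is the mean frequency, averaged over k - 1.
    return sum((a - b) ** 2 / ((a + b) / 2) for a, b in zip(x, y)) / (len(x) - 1)

# Hypothetical data: 500 equally frequent tags passed through a bottleneck of 1,000.
# Every tag is present in the reference, so no denominator is zero.
rng = random.Random(1)
n_tags, bottleneck = 500, 1000
x = [1 / n_tags] * n_tags
counts = Counter(rng.choices(range(n_tags), k=bottleneck))
y = [counts[i] / bottleneck for i in range(n_tags)]

f_kt, f_c, f_k = f_krimbas_tsakas(x, y), f_nei_tajima(x, y), f_pollak(x, y)
print(f"Krimbas & Tsakas form: {f_kt:.5f}")
print(f"Nei & Tajima Fc:       {f_c:.5f}")
print(f"Pollak Fk:             {f_k:.5f}")
```

With many tags, each tag frequency is small, so the product term in Fc and the k versus k − 1 averaging barely change the result, which is consistent with the overlap noted in the legend.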
Supplementary Figure 5 High–spatial resolution analysis of bottleneck population sizes in the proximal SI.
An additional representative example (see also Fig. 1b) of bottleneck population size (Nb', black dots) and bacterial load (cfu, red squares) at 20 h post-infection throughout the gastro-intestinal tract of a single animal after infection with 10^9 tagged V. cholerae. The dashed grey lines mark the resolution limit for Nb' estimation. The sampling sites are indicated in light red in the schematic diagram of the gastro-intestinal tract.
To exclude the possibility that the increase in the founding population in the proximal small intestine, which occurs in the late phase of infection, is caused by the uptake of tagged V. cholerae in food or stool, rabbits were fitted with a pet cone and housed individually after infection to prevent food intake. Bottleneck population size (Nb', black dots) and bacterial load (cfu, red squares) in the proximal (I1), middle (I2) and distal (I3) small intestine, cecal fluid (Cf) and colon (Co) of three infant rabbits from a single litter at 20 h post-infection with a dose of 10^9 cfu. The marked similarity of the Nb' values shown here to those in Figs. 1b and 2a (late phase; p = 0.71 for I1, 0.67 for I2, 0.83 for I3, 0.67 for Cf and 0.49 for Co; Wilcoxon rank sum tests) argues against the idea that coprophagia is a primary explanation for the high Nb' values in I1 during the late phase of infection. The dotted lines indicate the resolution limit for Nb' estimation. The sample medians are represented by horizontal lines. Corresponding Nb' and bacterial load values from the same animal are aligned vertically and always appear in the same sequential arrangement throughout the sample loci. The sampling sites are indicated in red in the schematic diagram of the gastro-intestinal tract.
Supplementary Figure 7 Onset of fluid accumulation in the GI tract during the late phase of the disease correlates with the backward movement of V. cholerae from the distal (I3) to the proximal (I1) SI.
The volume of fluid accumulated in the cecum (black dots; a proxy for the action of cholera toxin) of 19 infant rabbits from 12 different litters was measured in the early, middle and late phases of infection (~2 h, ~7 h and ~20 h post-infection). Sample medians are represented by horizontal lines.
(a-d) Scatter plots of tag frequencies from different inocula and sequencing runs. All inoculum cultures were started from aliquots of the same frozen library. Samples of the same inoculum culture (A and A') were processed in parallel and sequenced in the same sequencing run (a). Samples from two independent inoculum cultures (B and C) were processed independently and sequenced in the same sequencing run (b). Samples of the same inoculum culture (B and B') were processed in parallel and sequenced in separate sequencing runs (c). Samples from two independent inoculum cultures (A and B) were processed independently and sequenced in separate sequencing runs (d). The correlation coefficients of the linear regression (R^2) are given in the figure.
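The R^2 comparison of two inoculum samples can be reproduced along the following lines. This is a minimal sketch: the read counts are invented for illustration, the function names are hypothetical, and it relies on the fact that R^2 of a simple linear regression equals the squared Pearson correlation.

```python
def to_frequencies(counts):
    """Convert per-tag read counts into relative tag frequencies."""
    total = sum(counts)
    return [c / total for c in counts]

def r_squared(x, y):
    """R^2 of a simple linear regression of y on x (squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical read counts per tag for two samples of the same inoculum culture.
counts_a = [120, 95, 143, 88, 101, 110]
counts_a_prime = [118, 99, 139, 91, 97, 113]

r2 = r_squared(to_frequencies(counts_a), to_frequencies(counts_a_prime))
print(f"R^2 = {r2:.3f}")
```

Working with frequencies rather than raw counts keeps samples of different sequencing depth comparable.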
Cite this article
Abel, S., Abel zur Wiesch, P., Chang, HH. et al. Sequence tag–based analysis of microbial population dynamics. Nat Methods 12, 223–226 (2015). https://doi.org/10.1038/nmeth.3253