  • Brief Communication
Quantifying and comparing bacterial growth dynamics in multiple metagenomic samples

Abstract

The accurate quantification of microbial growth dynamics for species without complete genome sequences is biologically important but computationally challenging in metagenomics. Here we present the dynamic estimator of microbial communities (DEMIC; https://sourceforge.net/projects/demic/), a multi-sample algorithm based on contigs and coverage values, which infers the relative distances of contigs from the replication origin and accurately compares bacterial growth rates between samples. We demonstrate the robust performance of DEMIC across various sample sizes and assembly qualities using multiple synthetic and real datasets.

Fig. 1: Computational pipeline of DEMIC.
Fig. 2: Performance evaluation of DEMIC based on sequencing datasets of three species.
Fig. 3: Performance evaluation of DEMIC based on simulated data of 45 closely related species from five phyla.

Data availability

The accession numbers and weblinks for all real datasets are provided in the Methods. Simulated data are available upon request from the corresponding author.

Acknowledgements

This research was supported by grant R01GM123056 (H.L.) from the National Institutes of Health.

Author information

Contributions

H.L. and Y.G. conceived and designed the project. Y.G. implemented the method. Both authors analyzed the data, and wrote and edited the manuscript.

Corresponding author

Correspondence to Hongzhe Li.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Integrated supplementary information

Supplementary Figure 1 Peak-to-trough ratio and pipeline of DEMIC.

(a) For most bacteria, DNA replication starts from a fixed origin in a circular genome. (b) Replication forks proceed bidirectionally, and more than two replication forks may occur in fast-growing bacteria. The DNA copy number of a genome region is higher the nearer the region lies to the replication origin and lower the farther away it lies. (c) When a complete genome sequence is available, an ordinary linear regression model can be fitted between genome locations and logarithm-transformed sequencing coverages, and the growth dynamics of a bacterial population can be measured by the coverage ratio between the replication origin (peak) and terminus (trough). The peak-to-trough ratio (PTR) cannot be calculated directly without a complete genome.
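The regression described in panel (c) can be illustrated with a minimal sketch. It assumes that each genome window already has a relative distance from the origin (0 at the origin, 1 at the terminus) and uses a base-2 logarithm; the function names and the toy coverages are hypothetical and are not taken from DEMIC or PTRC.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def estimate_ptr(distances, coverages):
    """Peak-to-trough ratio from a fit of log2 coverage against distance.

    distances: relative distance of each window from the origin
               (0 = origin, 1 = terminus); an assumption of this sketch.
    coverages: read coverage of each window (must be positive).
    """
    log_cov = [math.log2(c) for c in coverages]
    _, slope = fit_line(distances, log_cov)
    # Fitted log2 coverage is `intercept` at the origin and
    # `intercept + slope` at the terminus, so the ratio is 2 ** (-slope).
    return 2 ** (-slope)

# Toy data (hypothetical): coverage halves between origin and terminus.
distances = [0.0, 0.25, 0.5, 0.75, 1.0]
coverages = [40.0 * 2 ** (-d) for d in distances]
ptr = estimate_ptr(distances, coverages)
print(round(ptr, 3))
```

With a complete genome the distances are known; for contigs they are not, which is the gap the multi-sample approach is meant to fill.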

Supplementary Figure 2 The average read coverages of four species in 50 samples of the synthetic dataset.

Each sample is a mixture of two to four real sequencing datasets from different species: Lactobacillus gasseri, Enterococcus faecalis, Citrobacter rodentium and Escherichia coli.

Supplementary Figure 3 Number of growth rate (PTR) estimates by three computational methods for the three species with contig clusters generated by the binning algorithm from the synthetic dataset.

Whereas PTRC and DEMIC successfully estimated all 122 growth rates, iRep only output 59 growth rates by default, and other growth rates were categorized as ‘unfiltered’.

Supplementary Figure 4 Scatterplots and correlations of the PTR estimates from DEMIC (red) and iRep (blue) with PTR values (Pearson’s r value) in 36 sequencing datasets of Lactobacillus gasseri.

The shaded areas indicate the 99% confidence interval.

Supplementary Figure 5 Evaluation of effects of sample sizes on the performances of DEMIC (red) and iRep (blue) based on L. gasseri, E. faecalis and C. rodentium (n = 10 for each).

Box plots of correlations between the estimated PTRs and true PTRs (Pearson’s r values) of all evaluations, indicating the median (center line), first and third quartiles (box edges), and 1.5 times the interquartile range (whiskers) of the correlations.

Supplementary Figure 6 Phylogenetic tree generated by iTOL for 45 species from five phyla in simulated data that were randomly selected from records of DoriC.

According to NCBI Taxonomy, six species have synonymous names (genus or species): Desulfotomaculum nigrificans (Desulfotomaculum carboxydivorans), Cutibacterium acnes (Propionibacterium acnes), Sphaerochaeta coccoides (Spirochaeta coccoides), Pseudopropionibacterium propionicum (Propionibacterium propionicum), Acidipropionibacterium acidipropionici (Propionibacterium acidipropionici) and Sediminispirochaeta smaragdinae (Spirochaeta smaragdinae).

Supplementary Figure 7

A total of 1,336 PTRs randomly assigned to 45 species from 15 genera of five phyla and 50 samples.

Supplementary Figure 8

A total of 1,336 average coverages randomly assigned for 45 species from 15 genera of five phyla and 50 samples.

Supplementary Figure 9 True versus estimated PTRs from DEMIC and iRep for 41 species represented by contig clusters.

Symbol shape indicates whether species were filtered or not in the estimates from iRep.

Supplementary Figure 10 An example of contig filtering in DEMIC.

(a-b) iRep failed to accurately estimate the growth rates of two closely related species, P. terrae and P. polymyxa, that were mixed into the same contig cluster by the binning algorithm. (c) In the contig cluster, P. polymyxa is the dominant species, but the cluster contains a high proportion of contaminating contigs from P. terrae. DEMIC effectively filtered out contigs from P. terrae and kept most of the contigs from P. polymyxa by iteratively updating the contig cluster based on the PC1 distribution of all remaining contigs. (d) DEMIC estimates were highly correlated with PTRs of P. polymyxa (r = 0.994, n = 28).
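The idea of iteratively pruning a contig cluster by its PC1 distribution can be sketched as follows. This is only an illustration under stated assumptions: the caption does not give DEMIC's actual filtering criterion, so the rule used here (drop contigs whose PC1 score lies more than k median absolute deviations from the median, then recompute) is hypothetical, as are the function names and the toy log coverages.

```python
import statistics

def pc1_scores(rows):
    """Project each row (one contig, one log coverage per sample) onto PC1."""
    n_cols = len(rows[0])
    col_means = [sum(r[j] for r in rows) / len(rows) for j in range(n_cols)]
    centered = [[r[j] - col_means[j] for j in range(n_cols)] for r in rows]
    # Unnormalized covariance of the sample columns; scale does not change PC1.
    cov = [[sum(r[a] * r[b] for r in centered) for b in range(n_cols)]
           for a in range(n_cols)]
    vec = [1.0] * n_cols
    for _ in range(200):  # power iteration towards the leading eigenvector
        nxt = [sum(cov[a][b] * vec[b] for b in range(n_cols))
               for a in range(n_cols)]
        norm = sum(v * v for v in nxt) ** 0.5
        if norm == 0:
            break
        vec = [v / norm for v in nxt]
    return [sum(r[j] * vec[j] for j in range(n_cols)) for r in centered]

def filter_contigs(log_cov, k=3.0, max_rounds=10):
    """Return indices of contigs kept after iterative PC1 outlier removal.

    The k * MAD cutoff is an assumption of this sketch, not DEMIC's rule.
    """
    kept = list(range(len(log_cov)))
    for _ in range(max_rounds):
        scores = pc1_scores([log_cov[i] for i in kept])
        med = statistics.median(scores)
        mad = statistics.median([abs(s - med) for s in scores])
        if mad == 0:
            break
        survivors = [i for i, s in zip(kept, scores) if abs(s - med) <= k * mad]
        if len(survivors) == len(kept):
            break  # nothing removed, the cluster is stable
        kept = survivors
    return kept

# Toy data (hypothetical): 8 contigs from a dominant species whose log
# coverage falls with distance from the origin in two samples, plus
# 2 contaminating contigs with a different coverage pattern.
dominant = [[5.0 - d / 7, 4.0 - 0.5 * d / 7] for d in range(8)]
contaminant = [[8.0, 1.0], [8.2, 0.9]]
kept = filter_contigs(dominant + contaminant)
print(kept)
```

In this toy case the two contaminating contigs are removed in the first round and the second round leaves the cluster unchanged, which is the stopping condition of the loop.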

Supplementary Figure 11 Applicable contig clusters and computational resources of DEMIC in two real metagenomic datasets using MetaBAT as the binning algorithm.

iRep was applied to the same SAM records and contig clusters, whereas PTRC was applied using a complete genome library that is independent of the contig clusters generated by MetaBAT.

Supplementary Figure 12 Applicable contig clusters and computational resources of DEMIC in two real metagenomic datasets using MaxBin as the binning algorithm.

iRep was applied to the same SAM records and contig clusters, whereas PTRC was applied using a complete genome library that is independent of the contig clusters generated by MaxBin.

Supplementary Figure 13 Growth dynamics PTR estimates by DEMIC for the RedSea datasets.

(a) Overview of the estimated growth rates (ePTRs) for contig clusters from seawater samples of different depths at eight RedSea stations. (b-c) An example of depth-related variation in growth rates estimated by DEMIC. The estimated growth rates were significantly lower at a depth of 500 m than at 10 m and 100 m (one-sided Mann-Whitney U test; n = 5,5,3,6 for 10 m, 100 m, 200 m and 500 m, respectively) for contig cluster 36 generated by MetaBAT, which has 60% completeness and an average identity of 92% with Marinobacter adhaerens. The box plots indicate the median (center line), first and third quartiles (box edges), and 1.5 times the interquartile range (whiskers).
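A one-sided Mann-Whitney U test of the kind named in this caption can be sketched for small groups by exact enumeration. The ePTR values below are invented for illustration and are not data from the paper; only the group sizes (n = 6 at 500 m, n = 5 at 10 m) follow the caption, and the function names are hypothetical.

```python
from itertools import combinations

def u_statistic(x, y):
    """Number of (x, y) pairs with x > y, counting ties as one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)

def mann_whitney_less(x, y):
    """Exact one-sided p-value for the alternative 'x tends to be smaller than y'.

    Enumerates every way of assigning len(x) of the pooled values to group x,
    which is only practical for small samples such as those in the caption.
    """
    pooled = list(x) + list(y)
    observed = u_statistic(x, y)
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(x)):
        chosen = set(idx)
        gx = [pooled[i] for i in chosen]
        gy = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if u_statistic(gx, gy) <= observed:
            hits += 1
    return hits / total

# Hypothetical ePTR values, not taken from the paper.
eptr_500m = [1.21, 1.18, 1.25, 1.30, 1.22, 1.27]
eptr_10m = [1.52, 1.61, 1.48, 1.70, 1.55]
p_value = mann_whitney_less(eptr_500m, eptr_10m)
print(p_value)
```

For larger groups, a library routine with a normal approximation would be the practical choice; the enumeration above is kept only because it makes the definition of the p-value explicit.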

Supplementary Figure 14 Growth dynamics PTR estimates by DEMIC for the PLEASE datasets.

(a) Overview of a subset of contig clusters with estimated bacterial growth rates (ePTRs) in healthy and Crohn’s disease samples at baseline. (b) Completeness and contamination of contig clusters in the datasets. For each contig cluster, the size and composition of the pie represent, respectively, the number of samples for which DEMIC could estimate growth rates and the proportions of disease/control status and treatment duration among those samples. (c) Some species represented by contig clusters showed different growth rates between healthy and disease subjects, but this difference disappeared completely or partially after anti-TNF or enteral diet treatment (one-sided Mann-Whitney U test, p value < 0.05 after FDR correction). ePTR: estimated PTR from DEMIC. For metabat2.239, n = 3,9,8,3,2; for metabat2.259, n = 4,13,12,13,7; for metabat2.55, n = 4,22,14,17,15 in the control, Crohn’s disease baseline, week 1, week 4 and week 8 groups, respectively. The box plots indicate the median (center line), first and third quartiles (box edges), and 1.5 times the interquartile range (whiskers).
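The caption reports p values after FDR correction without naming the procedure. As a minimal sketch, assuming the widely used Benjamini-Hochberg step-up adjustment (an assumption, not something the caption states), the correction could look like this; the raw p values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values, one per contig-cluster comparison.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
print(adj, significant)
```

In this toy example only the smallest raw p value remains below 0.05 after adjustment, which shows why the number of comparisons matters when reading the per-cluster results.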

Supplementary information

Supplementary Figures and Tables

Supplementary Figures 1–14 and Supplementary Tables 1–3

Reporting Summary

About this article

Cite this article

Gao, Y., Li, H. Quantifying and comparing bacterial growth dynamics in multiple metagenomic samples. Nat Methods 15, 1041–1044 (2018). https://doi.org/10.1038/s41592-018-0182-0

  • DOI: https://doi.org/10.1038/s41592-018-0182-0
