Gut bacterial communities of diarrheic patients with indications of Clostridioides difficile infection

Schneider, Dominik; Thürmer, Andrea; Gollnow, Kathleen; Lugert, Raimond; Gunka, Katrin; Groß, Uwe; Daniel, Rolf

doi:10.1038/sdata.2017.152

Data Descriptor
Open access
Published: 17 October 2017

Scientific Data volume 4, Article number: 170152 (2017)

Abstract

We present bacterial 16S rRNA gene datasets derived from stool samples of 44 patients with diarrhea indicative of a Clostridioides difficile infection. For 20 of these patients, C. difficile infection was confirmed by clinical evidence. Stool samples from patients originating from Germany, Ghana, and Indonesia were taken and subjected to DNA isolation. DNA was also isolated from stool samples of 35 asymptomatic control individuals. The bacterial community structure was assessed by 16S rRNA gene analysis (V3-V4 region). Metadata from patients and control individuals include gender, age, country, presence of diarrhea, concomitant diseases, and results of microbiological tests to diagnose C. difficile presence. We provide initial data analysis and a dataset overview. During processing of the paired-end sequencing data, reads were merged and quality-filtered, primer sequences were removed, and reads were truncated to 400 bp and dereplicated. Singletons were removed, sequences were sorted by cluster size and clustered at 97% sequence similarity, and chimeric sequences were discarded. Taxonomy was assigned to each operational taxonomic unit by BLASTn searches against Silva database 123.1 and an OTU table was constructed.

Design Type(s)	observation design • disease state design
Measurement Type(s)	rRNA_16S
Technology Type(s)	DNA sequencing
Factor Type(s)	National Origin • Sign or Symptom
Sample Characteristic(s)	Homo sapiens • feces

Machine-accessible metadata file describing the reported data (ISA-Tab format)

Background & Summary

Infections with Clostridioides difficile (formerly Clostridium difficile, see Lawson et al.¹) have significantly increased over the past decade²⁻⁵. The organism is a Gram-positive, obligate anaerobic, spore-forming bacterium, which is frequently found as a member of the gut microbiome in healthy individuals, but can also act as a human pathogen causing disease that ranges from severe diarrhea to life-threatening toxic megacolon⁶. It produces two potent exotoxins, toxin A (enterotoxin, tcdA) and toxin B (cytotoxin, tcdB)⁷. Some isolates also express a third, so-called binary toxin (C. difficile transferase, CDT)⁸. The risk of suffering from a C. difficile infection increases with prior broad-spectrum antibiotic treatment, which supports the assumption that an imbalanced gut microbiome increases the likelihood of a C. difficile infection⁹.

In this data report, we provide the bacterial community composition in stool samples of 79 human individuals, including 44 patients with diarrhea indicative of infection with C. difficile and 35 asymptomatic control individuals from regions of Germany (Seesen, Lower Saxony), Ghana (Eikwe, Western Region), and Indonesia (Medan, Sumatra). For 20 of the 44 patients, clinical evidence of a C. difficile infection was obtained. For the remaining patients, the presence of C. difficile was indicated by 16S rRNA gene data or MALDI-TOF mass spectrometry. In total, we provide 20,844,594 paired-end 16S rRNA gene reads sequenced with the v3 chemistry of Illumina and a MiSeq instrument. Correspondingly, this dataset represents a total of 10,422,297 bacterial 16S rRNA gene sequences. After all processing steps, which included read-merging, quality-filtering, primer sequence removal, dereplication, singleton removal, read-trimming, chimera removal, and removal of extrinsic domains (Archaea, chloroplasts), 7,204,189 (69.1%) high-quality 16S rRNA gene sequences remained for analysis (see Table 1 (available online only) for 16S rRNA gene sequence processing statistics). Additionally, we supply metadata including gender, age, country, presence or absence of diarrhea, C. difficile ribotype, toxin PCR ribotype, toxin test from stool, concomitant diseases at time of sampling, and antibiotic treatment within the last three months (Table 2 (available online only)).
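The read accounting in the paragraph above can be checked with a few lines of Python. This is only a sanity check on the totals quoted in the text; the variable names are illustrative and not part of the dataset.

```python
# Totals quoted in the text.
paired_end_reads = 20_844_594      # forward + reverse reads
retained_sequences = 7_204_189     # sequences left after all processing steps
individuals = 79                   # 44 patients + 35 controls

# Each sequenced fragment yields one forward and one reverse read.
sequences = paired_end_reads // 2

# Fraction of sequences that survived processing.
retained_fraction = retained_sequences / sequences

# One forward and one reverse FASTQ file per individual.
fastq_files = individuals * 2

print(f"{sequences:,} sequences, {retained_fraction:.1%} retained, {fastq_files} FASTQ files")
```

The file count agrees with the 158 FASTQ files listed under Data Records.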

Table 1 16S rRNA gene sequence processing statistics and biosample accession numbers

Table 2 Metadata of patients

The dataset contributes to unveiling the significance of the gut microbiome in diseased and asymptomatic patients. In a first analysis, we observed C. difficile as a low-abundance (mainly <1%, with one exception) bacterial community member in stool samples (Fig. 1). The exception was patient_029 (male, age 91), who showed a high abundance of C. difficile (42.67%).
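Relative abundances such as those quoted above are percentages of a sample's total read count. A minimal sketch follows; the OTU names and counts are hypothetical and chosen only so that one OTU makes up 42.67% of a 10,000-read sample. They are not taken from the published OTU table.

```python
def relative_abundance(counts):
    """Convert per-OTU read counts of one sample into percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample contains no reads")
    return {otu: 100.0 * n / total for otu, n in counts.items()}


# Hypothetical sample with 10,000 reads in total.
sample = {"OTU_a": 4267, "OTU_b": 3000, "OTU_c": 2733}
percentages = relative_abundance(sample)

for otu, pct in percentages.items():
    print(f"{otu}: {pct:.2f}%")
```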

**Figure 1: Bacterial community composition at family level of human stool samples analysed in this study.**

Whether the low abundance of C. difficile in most stool samples from diarrheic patients might indicate adhesion or invasion of C. difficile to the intestinal epithelium remains to be analyzed. However, a similar study also observed low abundances of C. difficile in CDI patients¹⁰. Furthermore, C. difficile is not the only potential pathogen of diseased patients. The stool samples of some patients contain other potentially pathogenic bacterial species belonging to different genera such as Escherichia/Shigella, Salmonella or Staphylococcus. In addition, some stool samples also contained facultative human-pathogenic Klebsiella and Pseudomonas species. These results support the hypothesis that the gut microbiome contributes to the pathogenic potential or at least can be used as an indicator of C. difficile infections. This is of special interest for C. difficile infections in Ghana, as most of the genomes of strains from this African country analyzed so far lack the toxin genes¹¹. Furthermore, most German patients were older than the patients from the other regions and showed a typical C. difficile infection profile, including treatment with antibiotics and presence of mainly toxin-positive strains. In contrast, patients from Ghana and Indonesia were younger, had received less antibiotic treatment than the German patients, and harboured predominantly toxin-negative strains (Table 2 (available online only)).

The UniFrac¹²-based comparison of bacterial community structure shows variations in structure and diversity within potentially C. difficile-infected and reference patients (Fig. 2). We observed a low but significant correlation of the bacterial microbiome with patients who exhibited diarrhea (P=0.006, r²=0.0709) and with patients diagnosed C. difficile positive by microbiological tests (P=0.017, r²=0.0628). In general, patients that have been diagnosed C. difficile positive harbour a less diverse bacterial microbiome (Fig. 2), which has also been observed recently¹³,¹⁴.

**Figure 2: Multivariate analysis of the bacterial community from human stool samples.**

Methods

Stool sample preparation and processing

This study was approved by the Ethical Committee of the University Medical Center, Göttingen, Germany (2011-03-29). Diarrhea was defined as the passage of three or more loose or liquid defecations per day. Upon informed consent, randomly selected patients with diarrhea and non-diarrheal volunteers agreed to submit a stool sample using stool containers and complete a standardised questionnaire about their lifestyle and medical history. Within two hours after the stool samples were provided, they were cultured on Clostridium difficile agar base used with selective supplement (Oxoid, Basingstoke, Hampshire, UK) and 7% (v/v) defibrinated human blood for 48 h at 38 °C under anaerobic conditions using gas packs (bioMérieux, Marcy-l’Étoile, France). Stool samples were also tested for the presence of C. difficile glutamate dehydrogenase (GDH) antigen and toxins A and B by the C. DIFF QUIK CHEK COMPLETE test (Techlab, Blacksburg, USA). In addition, the stool sample that was used for C. difficile identification was also frozen immediately after being taken from the patients, stored at −20 °C for a maximum of 11 months (based on duration of local sampling period) and transported within 24 h to Göttingen (Germany), where identification of C. difficile was confirmed by recultivation and MALDI-TOF mass spectrometry using Biotyper (Bruker Daltonics, Bremen, Germany) with score values of ≥2.000. All C. difficile strains were further characterized by toxin determination using the RealStar Clostridium difficile PCR Kit 1.0 (Altona Diagnostics, Hamburg, Germany). Ribotyping and toxinotyping were kindly performed by L. von Müller (Homburg, Germany) and M. Rupnik (Maribor, Slovenia) as previously reported¹¹. In addition, the Luminex xTag GPP test was used for all Ghanaian stool samples according to the manufacturer’s instructions (Luminex, Hertogenbosch, The Netherlands) in order to identify C. difficile and other potential intestinal pathogens¹¹. The stool sample was also used for DNA isolation in order to determine bacterial community composition.

Nucleic acid extraction and amplification of 16S rRNA genes

DNA was extracted from all stool samples using the MagNA Pure LC 2.0 Instrument with the MagNA Pure LC Total Nucleic Acid Isolation kit following the instructions of the manufacturer (Roche, Mannheim, Germany). Bacterial 16S rRNA gene amplicons were generated using fusion primers TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG-CCTACGGGNGGCWGCAG (MiSeq_overhang-D-Bact-0341-b-S-17) and GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG-GACTACHVGGGTATCTAATCC (MiSeq_overhang-S-D-Bact-0785-a-A-21) including bacteria-targeting primers from Klindworth et al.¹⁵. The PCR reaction mixture with a total volume of 50 μl contained 1 U Phusion high fidelity DNA polymerase (Biozym Scientific, Oldendorf, Germany), 5% DMSO, 0.2 mM of each primer, 200 μM dNTP, 0.2 μl of 50 mM MgCl₂, and 25 ng of isolated DNA. The thermal cycling scheme for bacterial amplicons was as follows: initial denaturation for 1 min at 98 °C, 25 cycles of 45 s at 98 °C, 45 s at 60 °C, and 30 s at 72 °C, and a final extension at 72 °C for 5 min. The resulting PCR products were checked by agarose gel electrophoresis for appropriate size and purified using the magnetic bead capture kit NucleoMag PCR (Macherey-Nagel, Düren, Germany) as recommended by the manufacturer. Quantification of the PCR products was performed using the Quant-iT dsDNA HS assay kit and a Qubit fluorometer (Invitrogen GmbH, Karlsruhe, Germany) following the manufacturer’s instructions. PCR products were used to attach indices and Illumina sequencing adapters using the Nextera XT Index kit (Illumina, San Diego). Index PCR was performed using 5 μl of template PCR product, 2.5 μl of each index primer, 12.5 μl of 2x KAPA HiFi HotStart ReadyMix and 2.5 μl PCR grade water. The thermal cycling scheme was as follows: 95 °C for 3 min, 8 cycles of 30 s at 95 °C, 30 s at 55 °C and 30 s at 72 °C, and a final extension at 72 °C for 5 min.
Bacterial 16S rRNA genes were sequenced using the dual index paired-end (v3, 2×300 bp) approach for the Illumina MiSeq platform as recommended by the manufacturer.
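The 16S-specific parts of the two fusion primers contain IUPAC ambiguity codes (N, W, H, V), so each primer stands for a small pool of sequences. The following sketch expands the codes into a regular expression and counts the variants. It illustrates the notation only and is not how cutadapt or any other tool in the workflow matches primers; the function names are hypothetical.

```python
import re

# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# 16S-specific primer parts from the text, without the MiSeq overhang.
FORWARD = "CCTACGGGNGGCWGCAG"      # Bact-0341-b-S-17
REVERSE = "GACTACHVGGGTATCTAATCC"  # Bact-0785-a-A-21


def degeneracy(primer):
    """Number of distinct sequences a degenerate primer represents."""
    n = 1
    for base in primer.upper():
        n *= len(IUPAC[base])
    return n


def primer_regex(primer):
    """Compile a regular expression matching every variant of the primer."""
    parts = []
    for base in primer.upper():
        options = IUPAC[base]
        parts.append(options if len(options) == 1 else "[" + options + "]")
    return re.compile("".join(parts))


print(degeneracy(FORWARD), degeneracy(REVERSE))
print(primer_regex(FORWARD).pattern)
```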

16S rRNA gene sequence processing and analyses

Demultiplexing and clipping of sequence adapters from raw sequences were performed by employing CASAVA data analysis software (Illumina). Paired-end sequences were merged using PEAR v0.9.10¹⁶ with default parameters. Subsequently, sequences with an average quality score lower than 20 and sequences containing unresolved bases were removed with the split_libraries_fastq.py script from QIIME 1.9.1¹⁷. We additionally removed non-clipped reverse and forward primer sequences by employing cutadapt 1.10¹⁸ with default settings. For operational taxonomic unit (OTU) clustering, we used USEARCH version 8.1.1861¹⁹ with the UPARSE²⁰ algorithm to truncate reads to 400 bp (-fastx_truncate), dereplicate (-derep_fulllength), sort by cluster size and remove singletons (-sortbysize). Subsequently, OTUs were clustered at 97% sequence identity using USEARCH (-cluster_otus), which includes de novo chimera removal. Additionally, chimeric sequences were removed using UCHIME²¹, included in the USEARCH software package, in reference mode (-uchime_ref) against the RDP reference file trainset15_092015.fasta²². All quality-filtered sequences were mapped to chimera-free OTUs and an OTU table was created using USEARCH (-usearch_global). Taxonomic classification of the picked reference sequences (OTUs) was performed with parallel_assign_taxonomy_blast.py against SILVA SSU database release 123.1²³. Extrinsic domain OTUs, chloroplasts, and unclassified OTUs were removed from the dataset by employing filter_otu_table.py. Sample comparisons were performed at the same surveying effort, utilizing the lowest number of sequences by random resampling (10,000 reads per sample). Species richness, alpha and beta diversity estimates were determined using the QIIME script alpha_rarefaction.py. Non-metric multidimensional scaling (NMDS) and statistical tests were performed with the vegan package²⁴ in R²⁵.
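Two of the steps described above, dereplication with singleton removal and random resampling to an equal depth, can be illustrated with the Python standard library. This is a conceptual sketch only: it does not reproduce the USEARCH or QIIME implementations, the function names are hypothetical, and the sketch assumes that reads shorter than the truncation length are discarded.

```python
import random
from collections import Counter


def dereplicate(reads, truncate_to=400, min_size=2):
    """Truncate reads, collapse identical ones, drop rare ones, sort by size.

    With min_size=2 singletons are removed. Reads shorter than truncate_to
    are discarded (an assumption of this sketch).
    """
    truncated = [r[:truncate_to] for r in reads if len(r) >= truncate_to]
    counts = Counter(truncated)
    kept = [(seq, n) for seq, n in counts.items() if n >= min_size]
    # Largest clusters first; ties broken alphabetically for reproducibility.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


def rarefy(otu_counts, depth, seed=0):
    """Randomly subsample one sample's OTU counts to `depth` reads
    without replacement."""
    pool = [otu for otu, n in otu_counts.items() for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the requested depth")
    rng = random.Random(seed)
    return Counter(rng.sample(pool, depth))


# Toy data: truncation to 8 bp instead of 400 bp keeps the example readable.
toy_reads = ["ACGTACGT"] * 3 + ["TTTTGGGG"] * 2 + ["ACGTACGA", "ACG"]
print(dereplicate(toy_reads, truncate_to=8))

toy_sample = {"OTU_1": 600, "OTU_2": 300, "OTU_3": 100}
subsample = rarefy(toy_sample, depth=500)
print(sum(subsample.values()))
```

Fixing the seed makes the subsample reproducible. In the study the resampling depth was 10,000 reads per sample.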

Data Records

The paired-end reads of the 16S rRNA gene sequencing were deposited in the National Center for Biotechnology Information (Data Citation 1). The dataset consists of 158 zipped FASTQ files that were processed by the CASAVA software (Illumina), which includes demultiplexing and removal of adapter sequences. The OTU table (otu_table_PRJNA353065.xlsx) used for all analyses and the corresponding representative OTU sequences clustered at 97% genetic identity (otu_sequences_PRJNA353065.fasta) are accessible at figshare.com (Data Citation 2).

Technical Validation

Success of 16S rRNA gene amplicon generation was controlled by reviewing the amplicon size (approximately 550 bp) and absence of contaminations on an agarose gel. Additionally, negative (PCR reaction without template) and positive controls (genomic DNA of E. coli DH5α) were performed to ensure purity of the employed reagents. To reduce possible PCR biases, all PCRs were performed in triplicate and, after purification, pooled in equimolar amounts.
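Equimolar pooling means combining the same number of molecules from each purified PCR product. A rough sketch of the arithmetic is given below, assuming the common approximation of 660 g/mol per base pair of double-stranded DNA and the amplicon size of about 550 bp stated above. The replicate names, concentrations, and target amount are hypothetical.

```python
def dsdna_nanomolar(conc_ng_per_ul, length_bp, bp_mass=660.0):
    """Convert a dsDNA mass concentration (ng/ul) into molarity (nM).

    bp_mass is the approximate average molar mass of one base pair in g/mol.
    """
    return conc_ng_per_ul / (bp_mass * length_bp) * 1e6


def equimolar_volumes(concentrations, length_bp, target_fmol):
    """Volume (ul) of each product that contains target_fmol of amplicon.

    1 nM equals 1 fmol/ul, so volume = amount / molarity.
    """
    return {
        name: target_fmol / dsdna_nanomolar(conc, length_bp)
        for name, conc in concentrations.items()
    }


# Hypothetical Qubit readings (ng/ul) for three replicate PCRs.
replicates = {"rep_1": 10.0, "rep_2": 20.0, "rep_3": 5.0}
volumes = equimolar_volumes(replicates, length_bp=550, target_fmol=50.0)

for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} ul")
```

A product at twice the concentration contributes half the volume, so every replicate adds the same molar amount to the pool.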

Usage Notes

The OTU table (otu_table_PRJNA353065.xlsx) used for all analyses and the corresponding representative OTU sequences clustered at 97% genetic identity (otu_sequences_PRJNA353065.fasta) are accessible at figshare (Data Citation 2).
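The representative OTU sequences are distributed as a FASTA file, which can be read without third-party libraries. The parser below is a minimal sketch; the record names and sequences in the example are hypothetical and do not reflect the actual headers in otu_sequences_PRJNA353065.fasta.

```python
import io


def read_fasta(handle):
    """Read FASTA records from a text handle into a {name: sequence} dict.

    The record name is the first whitespace-delimited token of the header.
    """
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records[name] = "".join(chunks)
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            chunks = []
        else:
            if name is None:
                raise ValueError("sequence data found before first header")
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


# Hypothetical two-record example; sequences may span several lines.
example = io.StringIO(">OTU_1 size=120\nACGTAC\nGTTA\n>OTU_2 size=35\nTTGACC\n")
otus = read_fasta(example)
print({name: len(seq) for name, seq in otus.items()})
```

To read the deposited file, the same function could be applied to a handle opened with `open("otu_sequences_PRJNA353065.fasta")` after downloading it from figshare.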

Additional Information

How to cite this article: Schneider, D. et al. Gut bacterial communities of diarrheic patients with indications of Clostridioides difficile infection. Sci. Data 4:170152 doi: 10.1038/sdata.2017.152 (2017).

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Table 3 Diversity metrics for each bacterial microbiome at a sequence depth of 10,000 16S rRNA gene reads

References

1. Lawson, P. A., Citron, D. M., Tyrrell, K. L. & Finegold, S. M. Reclassification of Clostridium difficile as Clostridioides difficile (Hall and O’Toole 1935) Prévot 1938. Anaerobe 40, 95–99 (2016).
2. Honda, H. & Dubberke, E. R. Clostridium difficile infection: a re-emerging threat. Mo. Med. 106, 287–291 (2009).
3. Rupnik, M., Wilcox, M. H. & Gerding, D. N. Clostridium difficile infection: new developments in epidemiology and pathogenesis. Nat. Rev. Microbiol. 7, 526–536 (2009).
4. Lessa, F. C., Gould, C. V. & McDonald, L. C. Current status of Clostridium difficile infection epidemiology. Clin. Infect. Dis. 55 (Suppl 2): S65–S70 (2012).
5. Ghose, C. Clostridium difficile infection in the twenty-first century. Emerg. Microbes Infect. 2, e62 (2013).
6. Hensgens, M. P. M. et al. Clostridium difficile infection in the community: a zoonotic disease? Clin. Microbiol. Infect. 16, 635–645 (2012).
7. Hatheway, C. L. Toxigenic clostridia. Clin. Microbiol. Rev. 3, 66–98 (1990).
8. Hemmasi, S. et al. Interaction of the Clostridium difficile Binary Toxin CDT and Its Host Cell Receptor, Lipolysis-stimulated Lipoprotein Receptor (LSR). J. Biol. Chem. 290, 14031–14044 (2015).
9. Almeida, R., Gerbaba, T. & Petrof, E. O. Recurrent Clostridium difficile infection and the microbiome. J. Gastroenterol. 51, 1–10 (2016).
10. Schubert, A. M. et al. Microbiome data distinguish patients with Clostridium difficile infection and non-C. difficile-associated diarrhea from healthy controls. MBio. 5, e01021–14 (2014).
11. Janssen, I. et al. High prevalence of nontoxigenic Clostridium difficile isolated from hospitalized and non-hospitalized individuals in rural Ghana. Int. J. Med. Microbiol. 306, 652–656 (2016).
12. Lozupone, C. & Knight, R. UniFrac: a new phylogenetic method for comparing microbial communities. Appl. Environ. Microbiol. 71, 8228–8235 (2005).
13. Seekatz, A. M., Rao, K., Santhosh, K. & Young, V. B. Dynamics of the fecal microbiome in patients with recurrent and nonrecurrent Clostridium difficile infection. Genome Med. 8, 47 (2016).
14. Longo, D. L., Leffler, D. A. & Lamont, J. T. Clostridium difficile Infection. N. Engl. J. Med. 372, 1539–1548 (2015).
15. Klindworth, A. et al. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res. 41, e1 (2013).
16. Zhang, J., Kobert, K., Flouri, T. & Stamatakis, A. PEAR: a fast and accurate Illumina Paired-End reAd mergeR. Bioinformatics 30, 614–620 (2014).
17. Caporaso, J. G. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).
18. Martin, M. Cutadapt removes adapter sequences from high-throughput sequencing reads. EMBnet.journal 17, 10–12 (2011).
19. Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
20. Edgar, R. C. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat. Methods 10, 996–998 (2013).
21. Edgar, R. C., Haas, B. J., Clemente, J. C., Quince, C. & Knight, R. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27, 2194–2200 (2011).
22. Cole, J. R. et al. Ribosomal Database Project: data and tools for high throughput rRNA analysis. Nucleic Acids Res. 42, D633–D642 (2014).
23. Quast, C. et al. The SILVA ribosomal RNA gene database project: improved data processing and web-based tools. Nucleic Acids Res. 41, D590–D596 (2013).
24. Oksanen, J. et al. vegan: Community Ecology Package (2016).
25. R Core Team. R: A Language and Environment for Statistical Computing (2015).
26. Faith, D. P. Conservation evaluation and phylogenetic diversity. Biol. Conserv. 61, 1–10 (1992).

Data Citations

Schneider, D. NCBI Sequence Read Archive SRP093596 (2016)
Schneider, D. Figshare https://doi.org/10.6084/m9.figshare.c.3877591.v1 (2017)

Acknowledgements

This work was funded by the Federal State of Lower Saxony, Niedersächsisches Vorab (VWZN2889). We thank L. von Müller and M. Rupnik for kindly having performed ribotyping and toxinotyping. We thank the patients and healthy volunteers who provided their stool samples.

Author information

Uwe Groß and Rolf Daniel: These authors contributed equally to this work.

Authors and Affiliations

Genomic and Applied Microbiology and Göttingen Genomics Laboratory, Institute of Microbiology and Genetics, Georg-August-University of Göttingen, 37077 Göttingen, Germany
Dominik Schneider, Andrea Thürmer, Kathleen Gollnow & Rolf Daniel
Institute of Medical Microbiology, University Medical Center Göttingen, 37075 Göttingen, Germany
Raimond Lugert, Katrin Gunka & Uwe Groß

Contributions

R.D. and D.S. conceived the study and the experiments. R.L. extracted the DNA. A.T. and K.Go. performed sample preparation and sequencing. U.G. organized the sample collection. K.Gu. and U.G. contributed data. D.S. analyzed sequence data. D.S., U.G., A.T. and R.D. wrote the manuscript. All authors interpreted the results and reviewed the manuscript.

Corresponding author

Correspondence to Rolf Daniel.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article.

About this article

Received: 20 December 2016
Accepted: 15 August 2017
Published: 17 October 2017
DOI: https://doi.org/10.1038/sdata.2017.152
