Rare transmission of commensal and pathogenic bacteria in the gut microbiome of hospitalized adults

Bacterial bloodstream infections are a major cause of morbidity and mortality among patients undergoing hematopoietic cell transplantation (HCT). Although previous research has demonstrated that pathogens may translocate from the gut microbiome into the bloodstream to cause infections, the mechanisms by which HCT patients acquire pathogens in their microbiome have not yet been described. Here, we use linked-read and short-read metagenomic sequencing to analyze 401 stool samples collected from 149 adults undergoing HCT and hospitalized in the same unit over three years, many of whom were roommates. We use metagenomic assembly and strain-specific comparison methods to search for high-identity bacterial strains, which may indicate transmission between the gut microbiomes of patients. Overall, the microbiomes of patients who share time and space in the hospital do not converge in taxonomic composition. However, we do observe six pairs of patients who harbor identical or nearly identical strains of the pathogen Enterococcus faecium, or the gut commensals Akkermansia muciniphila and Hungatella hathewayi. These shared strains may result from direct transmission between patients who shared a room and bathroom, acquisition from a common hospital source, or transmission from an unsampled intermediate. We also identify multiple patients with identical strains of species commonly found in commercial probiotics, including Lactobacillus rhamnosus and Streptococcus thermophilus. In summary, our findings indicate that sharing of identical pathogens between the gut microbiomes of multiple patients is a rare phenomenon. Furthermore, the observed potential transmission of commensal, immunomodulatory microbes suggests that exposure to other humans may contribute to microbiome reassembly post-HCT.

Figure: Analysis of hospital geography. a) Layout of rooms in the HCT ward. Room numbers are indicated and double-occupancy rooms are underlined. b) Network view of patients who were roommates for at least 24 hours. Each node represents a single patient, colored according to whether they have a banked stool sample or metagenomic sequencing data. Edges are drawn between patients who were roommates, and edge width represents the length of overlap in the same room. c) Histogram of the number of rooms patients occupied for at least 24 hours. d) Histogram of the number of unique roommates patients had for at least 24 hours.
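The roommate network in panel b can be illustrated with a minimal sketch. The record layout, patient and room identifiers, and dates below are hypothetical. Applying the 24-hour threshold to the summed overlap of a pair, rather than to each contiguous stay, is an assumption and not a detail stated in the caption.

```python
from datetime import datetime, timedelta
from itertools import combinations

# Hypothetical room-stay records: (patient_id, room, start, end).
stays = [
    ("P1", "101", datetime(2020, 1, 1, 8), datetime(2020, 1, 5, 8)),
    ("P2", "101", datetime(2020, 1, 3, 8), datetime(2020, 1, 6, 8)),
    ("P3", "101", datetime(2020, 1, 5, 0), datetime(2020, 1, 7, 0)),
    ("P4", "102", datetime(2020, 1, 1, 8), datetime(2020, 1, 9, 8)),
]

MIN_OVERLAP = timedelta(hours=24)  # threshold named in the caption


def roommate_edges(stays, min_overlap=MIN_OVERLAP):
    """Return {(patient_a, patient_b): total shared time} for roommate pairs.

    Assumption: overlaps from repeated shared stays are summed before the
    threshold is applied.
    """
    overlap = {}
    for (p1, r1, s1, e1), (p2, r2, s2, e2) in combinations(stays, 2):
        if p1 == p2 or r1 != r2:
            continue  # same patient, or different rooms
        shared = min(e1, e2) - max(s1, s2)
        if shared > timedelta(0):
            key = tuple(sorted((p1, p2)))
            overlap[key] = overlap.get(key, timedelta(0)) + shared
    return {pair: t for pair, t in overlap.items() if t >= min_overlap}


edges = roommate_edges(stays)

# Number of unique roommates per patient (the quantity plotted in panel d).
unique_roommates = {}
for a, b in edges:
    unique_roommates.setdefault(a, set()).add(b)
    unique_roommates.setdefault(b, set()).add(a)
```

In this toy input, the shared time stored for each pair plays the role of the edge width, and the size of each set in `unique_roommates` is the node degree.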
Figure labels: sample 11360_05_SR; genes vanRA, vanSA, vanHA, vanA, vanXA, vanYA, vanZA.

Bacteroides stercoris
Comparison of samples from the same patient

Figure S5: Enterococcus faecium (a) and Escherichia coli (b) strains compared to external datasets, including hospitalized adult and pediatric HCT patients, hospitalized infants and vancomycin-resistant E. faecium isolates [3,69-73]. Panels are separated according to whether comparisons were made within the data in this manuscript (Bhatt-Bhatt), between our data and external data (Bhatt-SRA) or within external data (SRA-SRA).

In linked-read sequencing libraries, we were able to estimate the impact of barcode swapping. There are ~10 million possible 10X barcodes (these are the barcodes which convey long-range information, different from the sample index barcodes). While a subset of 10X

We also attempted to measure the degree of barcode swapping in dual-indexed lanes of short-read Illumina sequencing. Using the uniquely identifiable p-crAssphage genome as a marker for swapping, we observed roughly 0.5% of sequencing reads swapped between samples on the same lane that shared one index sequence. Samples on the lane that shared no sequencing indices often had p-crAssphage below 1e-5%. Simple relative abundance metrics cannot distinguish between barcode swapping and a true difference in abundance between samples. However, even with the 0.5% rate of swapping, we regularly observed >5x coverage of the p-crAssphage genome in what we believe to be the truly negative samples, and the resulting inStrain comparisons revealed sufficient paired genome coverage and 100% popANI.
We never observed identical p-crAssphage genomes between samples from different patients sequenced with unique dual indices or on different lanes.
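To see why a swap rate near 0.5% can still yield several-fold coverage of a marker genome in a sample that truly lacks it, the arithmetic can be sketched as follows. This is an illustration under stated assumptions, not the procedure used for the measurements above. The source read count and read length are hypothetical, and the genome length is an approximate crAssphage size used as a placeholder.

```python
def swapped_marker_coverage(source_marker_reads, swap_fraction,
                            read_length_bp, genome_length_bp):
    """Expected mean coverage of a marker genome in a truly negative sample.

    Assumes a fixed fraction of the source sample's marker reads is
    misassigned to the negative sample, spread evenly over the genome.
    """
    if genome_length_bp <= 0:
        raise ValueError("genome_length_bp must be positive")
    swapped_reads = source_marker_reads * swap_fraction
    return swapped_reads * read_length_bp / genome_length_bp


coverage = swapped_marker_coverage(
    source_marker_reads=2_000_000,  # hypothetical marker reads in the source sample
    swap_fraction=0.005,            # the ~0.5% rate reported in the text
    read_length_bp=150,             # hypothetical read length
    genome_length_bp=97_000,        # approximate crAssphage genome size (assumption)
)
print(f"{coverage:.1f}x mean coverage from swapped reads alone")
```

With these illustrative inputs the swapped reads alone exceed 5x coverage, which is consistent with the observation that abundance by itself cannot separate swapping from true presence.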
For short-read sequencing samples, we know which pairs of samples share one of two index sequences and could therefore be affected by swapping. However, we cannot estimate the impact of barcode swapping as was done for the linked-read datasets. We therefore eliminated all comparisons where two samples had the possibility of barcode swapping, as well as all comparisons that could be affected by "secondary" swapping, where the samples were not directly affected, but an interaction between other samples from the two patients could cause false positives.
While this filtering may discard legitimate transmission events, we believe it is necessary to lower the number of false positives.
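The filtering described above can be sketched as follows. The sample metadata, field layout, and index names are hypothetical. The treatment of "secondary" swapping is one possible reading of the text: if any sample from one patient could swap with any sample from another, every comparison between those two patients is dropped.

```python
from itertools import combinations

# Hypothetical metadata: sample -> (patient, lane, i7 index, i5 index).
samples = {
    "s1": ("A", "lane1", "i7_01", "i5_01"),
    "s2": ("B", "lane1", "i7_01", "i5_02"),  # shares the i7 index with s1
    "s3": ("A", "lane1", "i7_03", "i5_03"),
    "s4": ("B", "lane2", "i7_04", "i5_04"),
    "s5": ("C", "lane1", "i7_05", "i5_05"),
}


def may_swap(meta_a, meta_b):
    """True if two samples ran on the same lane and share at least one index."""
    _, lane_a, i7_a, i5_a = meta_a
    _, lane_b, i7_b, i5_b = meta_b
    return lane_a == lane_b and (i7_a == i7_b or i5_a == i5_b)


def allowed_comparisons(samples):
    """Between-patient sample pairs that survive the swapping filter."""
    # Patient pairs with at least one sample pair at risk of direct swapping.
    risky_patients = set()
    for a, b in combinations(samples, 2):
        pa, pb = samples[a][0], samples[b][0]
        if pa != pb and may_swap(samples[a], samples[b]):
            risky_patients.add(frozenset((pa, pb)))

    keep = []
    for a, b in combinations(sorted(samples), 2):
        pa, pb = samples[a][0], samples[b][0]
        if pa == pb:
            continue  # within-patient comparisons are not transmission calls
        if frozenset((pa, pb)) in risky_patients:
            continue  # drops both direct and "secondary" cases
        keep.append((a, b))
    return keep


allowed = allowed_comparisons(samples)
```

In this toy input, `s1` and `s4` share no index and ran on different lanes, yet their comparison is still removed because `s1` and `s2` link patients A and B. This illustrates how the filter can discard legitimate comparisons in exchange for fewer false positives.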
Previous DNA extraction and short-read sequencing efforts did not follow the randomization constraints above and we cannot guarantee that laboratory contamination did not happen at some point in the process. However, we note that cases of laboratory contamination or barcode swapping would result in the entire microbiome composition of one sample being transferred to another. After our stringent filters, we only discovered one case where patients shared two separate species. As these were both Lactobacillus species, our hypothesis about probiotic consumption is a possible explanation.