Introduction

The gut is inhabited by microbial communities that form an intimate and beneficial association with the host. It is an open microbial ecosystem composed of resident commensals that are continuously exposed to transient exogenous microbes originating from the diet. Many fermented foods, and yoghurt in particular, contain high quantities of live bacteria, typically up to 10⁹ CFU/g. These foods have been major contributors to the human diet since the Neolithic Era1, yet our modern understanding of the impact of food-ingested bacteria on our resident gut microbiome remains limited. To date, a majority of studies have failed to identify significant modulations of the resident human gut microbiota upon consumption of fermented food2,3,4,5,6. This is likely due to the use of methods lacking sufficient phylogenetic resolution at the species level.

During the last two decades, human gut microbiota structure has increasingly been assessed by culture-independent methods based on 16S rRNA gene quantification or sequencing7,8,9,10,11. These methods can assess the overall structure of microbial communities but remain restricted in phylogenetic identification i) to the genus or species-subgroup levels and ii) by the availability of phylogenetically characterized 16S rDNA sequences in public databases. High throughput DNA sequencing-based approaches have emerged as a powerful alternative to study the complex microbial communities12. This deep sequencing permitted the construction of the first human gut microbiota gene catalog13. Subsequently, quantitative metagenomics has achieved species-level resolution, not only for previously known organisms but also for unknown species14,15,16. These technical and analytical advances provide the tools to explore the composition and metabolic capacity of the microbiota with an accuracy heretofore unattainable.

We deployed and extended these advances to identify gut microbial species modulated by a fermented milk product (FMP) in a 4-week intervention study in subjects suffering from Irritable Bowel Syndrome (IBS) with constipation (n = 28)18. IBS is a chronic functional disorder of the intestine with a prevalence of 10% to 25% and where 67% of diagnoses are for women17.

We aimed to use a species-level metagenomic approach to identify specific members of the gut microbiome modulated by the FMP. We found that the FMP potentiated the production of butyrate and other short-chain fatty acids (SCFA) and decreased the levels of the opportunistic pathogen Bilophila wadsworthia. These modifications occurred in parallel with an improvement of IBS symptoms18. Metabolic reconstruction of known and unknown species suggests potential cross-feeding involving the food-ingested bacteria and resident commensals that might be relevant for overall gut homeostasis.

Results

We administered a FMP or an acidified milk product (MP; 125 g/serving) to subjects fulfilling the Rome III criteria for IBS with constipation (IBS-C) for 4 weeks18 (Figure 1). Bacteria contained in the FMP were Bifidobacterium animalis subsp. lactis CNCM I-2494, Streptococcus thermophilus CNCM I-1630, Lactobacillus delbrueckii subsp. bulgaricus CNCM I-1632 and CNCM I-1519, and Lactococcus lactis CNCM I-1631. The consumption of the FMP improved the IBS condition of the subjects compared to the control group (i.e. abdominal distension, acceleration of the oro-caecal and colonic transit times, overall IBS symptom severity)18. In order to assess the impact of the FMP on the gut microbiota of the enrolled subjects, we collected their stools before and after the intervention, then extracted their fecal DNA and sequenced it using a SOLiD v4 Next Generation Sequencing (NGS) device. We compared the sequencing data against the human gut microbiome catalog of 3.3 million genes13.

Figure 1

Study design and overview of the bioinformatic pipeline used in this study.

Genes of FMP species are detected in fecal samples

To validate the sensitivity of the method, we verified that DNA originating from the FMP bacterial species could be efficiently recovered in stool samples. We complemented the human microbiome catalog with 9030 non-redundant genes of the FMP genomes, otherwise absent, to ensure an optimum mapping of the reads against the FMP genes. Abundance of all FMP species was significantly increased in stool samples in the FMP group at the end of the study. The low-level signal detected at baseline and in control subjects is likely due to the presence of endogenous species phylogenetically close to the FMP species (Figure 2 A/B). B. lactis CNCM I-2494 was the dominant FMP species in stool, in agreement with its relatively high survival in the upper GI tract19,20.

Figure 2

The FMP and MP modulate species of the gut microbiota.

(A,B) Relative abundance of the FMP species measured in FMP (A) and MP (B) groups at baseline and after intervention. (C,D) Relative abundance of known or unknown (MGS) species measured in FMP (C) or MP (D) groups at baseline and after intervention. Only MGS that were significantly modulated are depicted. Statistical significance is reported by asterisks (*, p<0.05; ***, p<0.001; Wilcoxon paired test corrected for multiple comparisons using the Benjamini-Hochberg procedure). Results are presented using Tukey box-and-whisker plots as quartiles (25%, median and 75%). Outliers are shown as single dots, as per the Tukey option of the Prism Graph software.

Endogenous microbial species are modulated by the FMP

Starting from a comprehensive gene-centric approach, we aimed to reduce data complexity from whole genome sequencing (WGS) reads to individual species. Our bioinformatics pipeline, summarized in Figures 1 and S1, relied on 2 key principles: i) An increase (decrease) of a species results in enrichment (depletion) of its genes in the fecal microbiome. Consequently, gene counts should lead to identification of species modulated by the intervention. ii) Genes of a single species should have similar co-abundance profiles across individuals since they are physically linked on a DNA molecule (chromosome, plasmid, phage). As a result, co-abundance-based gene clustering leads to identification of clusters of genes corresponding to one bacterial species, as already demonstrated14,15.

We identified 1320 and 641 genes as significantly over- or under-represented after the consumption of the FMP or milk product (MP), respectively. We designated these FMP or MP “modulated genes”. Twenty seven percent of the FMP-modulated and 22% of the MP-modulated genes could be assigned to known species by blastN (Figure S2 A/B/C). Clustering by Spearman correlation was applied to the 1320 and 641 modulated genes using the MetaHIT cohort to increase statistical power and improve species assignment36.

Benchmarking against genes assigned to known species revealed that for clusters retrieved with a Spearman correlation threshold of 0.75, 97% of genes were assigned to one cognate genome (i.e. known species) (Figure S3). This result confirms that the clusters of genes identified can accurately serve as a proxy for intestinal bacterial species. Our analysis revealed that three clusters of modulated genes were derived from known species: Bilophila wadsworthia, Parabacteroides distasonis and Haemophilus parainfluenzae. Eleven additional clusters of genes without species-level identities were retrieved from the 1961 modulated genes (Figure S2D/E). The assignment of genes to unknown species allowed us to increase the percentage of FMP-modulated genes assigned to a species, known or unknown, from 27% to 51% (Figure S2A). Unknown species were assigned a number prefixed with the tag “MGS,” for MetaGenomic Species.

Integrating both known species and MGS results in a significant gain in statistical power through the compression of the number of variables as compared with the full catalog of 3.3 million genes. The identification of co-variant clusters also yields signatures that are more reliable than a single gene target. We computed the relative abundance of each known or unknown species by averaging the frequency of 50 representative genes for each species, as reported previously36 and compared the abundances using a Wilcoxon paired-test corrected for multiple testing. This revealed that consumption of the FMP stimulated MGS126, MGS203, MGS106, MGS109 and Bifidobacterium dentium. Three species, Parabacteroides distasonis, B. wadsworthia and Clostridium sp. HGF_2 were found to be inhibited (p<0.05) (Figure 2C). In the MP group, only two species, Haemophilus parainfluenzae and MGS204 varied significantly (p<0.05) (Figure 2D).
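The paired before/after comparison described above can be illustrated with a small, dependency-free sketch of a two-sided Wilcoxon signed-rank test. It is an illustration under stated assumptions rather than the code used in the study: the function name `wilcoxon_signed_rank` is hypothetical, zero differences are dropped, tied magnitudes receive average ranks, and the p-value is computed exactly from the sign-flip null distribution instead of a normal approximation.

```python
def wilcoxon_signed_rank(before, after):
    """Two-sided exact Wilcoxon signed-rank test for paired samples.

    Returns (W_plus, p_value). Assumptions of this sketch: pairs with a
    zero difference are discarded and ties share their average rank.
    """
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0

    # Rank |differences|; ranks are stored doubled so that average
    # ranks of ties (e.g. 1.5) stay integers.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    doubled_ranks = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            doubled_ranks[order[k]] = i + j + 2  # 2 * mean 1-based rank
        i = j + 1

    w_plus2 = sum(r for r, d in zip(doubled_ranks, diffs) if d > 0)
    total2 = sum(doubled_ranks)

    # Exact null distribution: every rank is positive or negative
    # with equal probability, giving 2**n equally likely outcomes.
    counts = {0: 1}
    for r in doubled_ranks:
        nxt = {}
        for s, c in counts.items():
            nxt[s] = nxt.get(s, 0) + c
            nxt[s + r] = nxt.get(s + r, 0) + c
        counts = nxt

    w_min2 = min(w_plus2, total2 - w_plus2)
    tail = sum(c for s, c in counts.items() if s <= w_min2)
    p_value = min(1.0, 2 * tail / 2 ** n)
    return w_plus2 / 2, p_value
```

With five subjects whose abundance all increase by distinct amounts, this sketch returns the smallest two-sided p-value attainable for n = 5, which shows why small groups limit the significance that a paired rank test can reach before any multiple-testing correction.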

Reconstitution of the genetic repertoire of unknown species

We then attempted to extend the co-variance approach to the full catalog of 3.3 million genes in order to retrieve other genes of the FMP or MP modulated species. A unique gene of a given cluster was used as a “prey” to “chase” the genes of the catalog co-varying with the prey at a given threshold. Sets of co-varying genes from the full catalog were retrieved in a step by step approach and formed metagenomics clusters centered around the initial modulated genes identified above. We benchmarked this approach with 10 known species (Table S1) and retrieved an average of 2554 genes per species with a specificity of >95%, illustrating the applicability of this approach for retrieving genes of unique bacterial species (Table S1, cf. supplementary material). We applied the same procedure to the FMP-modulated MGS and retrieved >1900 genes for all the MGS (Figure S5A) but 2 (MGS106, MGS109), which had sub-bacteria-sized gene repertoires and were not further investigated. Five MGS were identified as Clostridiales including Eubacterium (MGS146), Clostridium (MGS134), Roseburia (MGS204) and unknown genera (MGS203, MGS126) and 2 MGS as Bifidobacterium (MGS109, MGS106) (Table 1, Supplementary Material).
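The step-by-step "prey" expansion and its benchmark can be sketched as follows. This is a minimal sketch under assumptions, not the authors' implementation: the names `chase_genes` and `specificity` are hypothetical, the co-variation test is assumed to have been precomputed into a `neighbours` mapping (each gene mapped to the set of catalog genes correlating with it at or above the chosen threshold), and the number of expansion rounds is an arbitrary parameter.

```python
def chase_genes(prey, neighbours, max_rounds=10):
    """Grow a gene cluster outward from a single 'prey' gene.

    neighbours: dict mapping a gene to the set of genes whose
    co-abundance correlation with it meets the threshold
    (assumed to be precomputed elsewhere).
    """
    cluster = {prey}
    frontier = {prey}
    for _ in range(max_rounds):
        candidates = set()
        for gene in frontier:
            candidates |= neighbours.get(gene, set())
        candidates -= cluster          # keep only newly reached genes
        if not candidates:
            break                      # cluster stopped growing
        cluster |= candidates
        frontier = candidates
    return cluster


def specificity(retrieved, genome_genes):
    """Fraction of retrieved genes that belong to the expected genome."""
    if not retrieved:
        raise ValueError("no genes were retrieved")
    return len(retrieved & genome_genes) / len(retrieved)
```

For a known species, comparing the retrieved set with the genes of its reference genome gives the kind of specificity figure reported in the benchmark; limiting `max_rounds` is one simple way to keep the cluster centered on the initial modulated gene.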

Table 1 Taxonomic assignment of unknown species (MGS)

The genetic repertoire of a MGS should contain the information needed to perform essential bacterial functions (e.g. DNA replication, peptidoglycan synthesis, protein synthesis). Essential genes represented an average 9.5% of MGS genetic repertoires, which is comparable to other gut commensals such as Escherichia coli and Bifidobacterium longum (Figure S5B). By the same token, it is expected that functions encoded by MGS should also cover the major metabolic pathways present in any microbial cell (i.e. carbohydrate, lipid, amino acids and nucleic acids metabolisms). Genes of MGS were assigned putative functions by comparing them to Cluster of Orthologous Groups (COGs)21 using the CD-Search Tool22. Sorting of these COGs by their hierarchical functional categories confirmed that functions encoded by the MGS similarly cover the major metabolic pathways compared to reference commensals (E. coli and B. longum) (Figure S5C). This result was supported by the projection of the MGS and E. coli on the global map of KEGG metabolism (Figure 3A, Figure S6).

Figure 3

Metabolic reconstruction of unknown gut microbial species shows a FMP-mediated increase of potential butyrate producers.

A) Projection on KEGG metabolic pathways of functions encoded by the MGS126 reconstructed genome (in red) using the Ipath tool55. Functions of the KEGG global map are depicted underneath. B) Presence of genes predicted to encode enzymes of the butyrate synthesis pathway (thiolase (THL; EC 2.3.1.9); β-hydroxybutyryl-CoA dehydrogenase (BHCD; EC 1.1.1.35); crotonase (CRO; EC 4.2.1.17); butyryl-CoA dehydrogenase (BCD; EC 1.3.99.2); Electron-Transfer Flavoprotein α and β sub-units (ETFα and ETFβ; EC 1.5.5.1); butyrate kinase (BK; EC 2.7.2.7); butyryl-CoA acetyl-CoA transferase (ButCoA; EC 2.8.3.8)) among the reconstituted Clostridiales MGS. C) Abundance of butyrogenic modules of the human gut microbiota in the IBS cohort at baseline. Mean and standard deviation are represented. Results are presented using Tukey box-and-whisker plots.

We concluded that we retrieved nearly complete genomes of the FMP-modulated species, allowing further metabolic and ecological analyses.

FMP increases potential butyrate producers

The Clostridiales are primary butyrate producers. Our analysis revealed that MGS126, MGS203, MGS204 and Clostridium sp. HGF2 possess the genetic capacity to produce butyrate (Figure 3B).

In order to assess the impact of the FMP on the butyrate-producing community, we reconstituted the butyrogenic modules of the entire human microbiome gene catalog by applying the gene-clustering algorithm to the genes of the catalog encoding the butyrate synthesis pathway (cf. Supplementary Methods). This procedure yielded 39 butyrogenic modules, of which 30 could not be taxonomically assigned at the species level (Figure 3C). The 9 species-assigned modules belong to previously reported, experimentally confirmed butyrate producers: Roseburia intestinalis, Roseburia inulinivorans, Butyrivibrio crossotus, Clostridium L2-50, Faecalibacterium prausnitzii, Eubacterium hallii, Lachnospiraceae bacterium 5_1_63FAA, Coprococcus ART55/1 and Acidaminococcus intestini D21 (references 23,24,25). Among the 30 non-assigned species modules, 27 belong to the Order Clostridiales including 5 Lachnospiraceae, 8 Clostridiaceae, 2 Veillonellaceae and 1 Bacteroidales. Each of these taxa is known to have members capable of butyrate production25,26,27,28. This result validated our approach, as we retrieved butyrogenic modules belonging to major known intestinal butyrate producers. The relative abundances of the butyrogenic modules were calculated across the studied cohort. Module 12, among the 10 most abundant, belongs to MGS126. Modules 44 and 20 belong to MGS203 and MGS204, respectively. As expected, in parallel with their respective MGS, modules 12 and 20 were significantly increased and decreased after FMP and MP consumption, respectively (Wilcoxon paired test; data not shown). A trend (p<0.10) towards an increase of module 44 (MGS203) was also observed in the FMP group, although it did not reach statistical significance (data not shown).
Intergroup comparisons of butyrogenic modules after the interventions indicated a higher abundance of module 4 (Roseburia inulinivorans) in the FMP group as compared to the MP group (p<0.05; ANCOVA without multiple test correction; p>0.05 after Benjamini-Hochberg multiple test correction; Figure S8).

Butyrogenic bacteria are known to produce propionate as an end product of L-fucose degradation29. The presence of L-fucose degrading genes in the genomes of MGS126 and MGS203 might indicate a metabolic potential to also produce propionate (Table 2).

Table 2 Selected shared functions of FMP-modulated species

FMP stimulates the production of SCFA in vitro

In order to further assess the impact of the FMP on short-chain fatty acid production, we used a 5-compartment ex vivo human colonic fermenter (SHIME ® PRODIGEST)30 inoculated independently with fecal microbiota from 2 different healthy donors and kept stable for >5 weeks. We observed that the addition of the FMP led to a significant increase of butyrate, propionate and total SCFA in the 3 vessels mimicking the ascending, transverse and descending colon conditions, as compared with baseline (Figure 4, Figure S7). Butyrate was the SCFA with the largest observed increase, registering an average 1.9-fold augmentation across vessels and individuals (Figure 4).

Figure 4

The FMP stimulates the production of butyrate by human colonic microbiota in vitro.

The SHIME colonic fermenter mimicking conditions of the ascending, transverse and descending colons (reference 30) was inoculated with fecal microbiota of 2 healthy human donors (black and grey columns) and, after 2 weeks (C1 & C2 time points), exposed to FMP for 3 weeks (T1, T2 and T3 time points). Mean fold changes between control and treatment periods are reported when statistical significance was met. n.s.: non-significant; ***: p<0.001 (ANOVA).

Functions encoded by FMP-modulated bacteria

Next, we reasoned that species stimulated (or decreased) by the FMP might have responded to the same environmental intraluminal changes triggered by the ingestion of FMP bacteria. We aimed to gain better insight into the habitat type, cell cycle and energy metabolism of the FMP-modulated species by identifying shared functions of FMP-stimulated species absent in FMP-inhibited species and vice versa. Available genome information of known species and the reconstituted genetic repertoires of MGS were used to perform comparative genomics and identify Clusters of Orthologous Genes (COGs)21 discriminating between FMP-stimulated and FMP-inhibited species (Table 2, Table S2).

Genes encoding pili were present in all FMP-modulated species, indicating a potential for epithelial adhesion. MGS126 and MGS203 possessed enzymes allowing utilization of L-fucose, which is a mucin-derived monosaccharide. These 2 species also carried genes encoding the synthesis of flagellin and spores, which have both been reported to be involved in mucus colonization31,32. Taken together, these data point to a possible adaptation of these species to a mucous environment.

The presence of genes involved in oxidative phosphorylation (i.e. cytochrome bd oxidases) in B. wadsworthia and P. distasonis genomes indicates a shared capacity of these species to inhabit niches where oxygen tension is low (i.e. epithelial interface).

We further investigated whether inhibited species shared common features that would be absent in stimulated species. Genes encoding a putative carbon monoxide dehydrogenase (CODH) were found to be present in B. wadsworthia and Clostridium sp. HGF2 genomes predicting their capacity to use carbon monoxide (CO) as an electron donor33. Heme oxygenase is a known source of intestinal CO and its activity has been shown to be up-regulated during intestinal inflammation34. This suggests that CO utilization might be an advantage in inflamed conditions for organisms able to use it. B. wadsworthia also possesses the set of genes (i.e. HMP0179_00240-00243) allowing anaerobic respiration of nitrate, a substrate that is also enriched during inflammation35.

Discussion

Fermented foods have been a part of the human diet since ~10 000 BC, comprising different matrices such as milk (e.g. kefir, yoghurt, dahi, cheese) or vegetables (e.g. natto, sauerkraut, pickles)1. Beneficial effects of fermented milks were empirically assessed by the Nobel Prize laureate Elie Metchnikoff in 1907, who predicted their role in the enhanced health and longevity of Bulgarian populations36. In the light of our increased understanding of the impact of the gut microbiome on host health, the question arises of how food-ingested microbial communities modulate the autochthonous microbiota.

The emergence of molecular tools based on the 16S rDNA gene brought about a revolution in understanding the immense diversity of gut microbial communities and their links with host health and diet. However, variations of 16S rDNA sequences do not provide a basis for species-level resolution8. Consequently, studies using the 16S gene as a phylogenetic marker and reporting unaltered gut microbiota upon probiotic consumption2,3,4,5,6 might have failed to detect changes of the microbiota due to the lack of species-level resolution. Such a hypothesis can now be tested with the emergence of novel technologies allowing species-level gut microbiota mapping.

In the current work, whole genome sequencing analysis coupled with a gene-centric approach was used to investigate the effect of a FMP on the human gut microbiota at the species-level resolution. Initially, only 27% (356 of 1320) of genes modulated by the FMP could be assigned to known species; illustrating the limited representation of bacterial commensals in publicly available genome databases. By employing gene clustering methodologies14,15,16, the species-level assignment of FMP-modulated genes was increased to 51%. This clustering algorithm was further used to reconstitute the bacterial gene repertoires of the unknown species used for comparative genomics.

Upon FMP consumption, two previously uncharacterized butyrate producers MGS203 and MGS126 were increased. MGS126 was among the 10 most abundant butyrogenic bacteria of the human gut microbiota. This argues for a positive effect of the FMP on the net production of butyrate in the colon. Importantly, this hypothesis was supported by our in vitro results obtained with the SHIME® fermenter and in a mouse model of intestinal inflammation37. MGS203 and MGS126 were likely stimulated by the FMP-mediated increase of Bifidobacterium species resulting in a higher production of acetate and lactate, used by butyrate producers23,38.

Recent studies have shown that B. wadsworthia and P. distasonis were increased by a Western diet enriched in saturated animal fat39,40 or by RS4, a resistant starch used in processed foods41, respectively.

Although an effect of dietary changes between groups, however unlikely, cannot be excluded as an explanation for the decrease of these 2 species in the FMP group, we favored the more likely hypothesis that the FMP-modulated species responded to FMP-mediated signals. We thus reasoned that species inhibited in the FMP group shared common metabolic pathways, which we investigated by comparative genomics.

B. wadsworthia is a Gram-negative δ-proteobacterium first isolated from an appendicitis patient and is often associated with inflamed and/or infected clinical samples42. Its genetic potential to utilize known inflammation-associated metabolites such as nitrate and possibly carbon monoxide suggests that it can be fueled by host-derived products enriched during inflammation, similarly to other pathobionts (e.g. Salmonella enterica)35. By the same token, the carbon monoxide dehydrogenase genes of another FMP-modulated species, Clostridium sp. HGF_2, suggest that it could also take advantage of intestinal inflammation34. Both B. wadsworthia and Clostridium sp. HGF_2 are decreased by the FMP, which could thus have an anti-inflammatory effect as observed in the mouse model of colitis37.

The FMP-induced changes in the GM paralleled improvement of IBS condition in our cohort, suggesting a role of the FMP-modulated species in the amelioration of the clinical parameters18. This hypothesis is compatible with the observations that i) butyrate, known to improve intestinal motility43,44 and visceral sensitivity45, is underproduced by IBS-C gut microbiota compared to healthy controls46 and ii) B. wadsworthia is pro-inflammatory40 and produces sulfide which is toxic47 and nociceptive48.

The MP has reduced effects on gut microbiota compared to the FMP, which is consistent with the absence of an active microbial community in this product. Only 2 species, H. parainfluenzae and the Clostridiales MGS204, were shown to be modulated (i.e. decreased) upon the MP intervention. Changes observed in the MP group might originate from a natural variation of the gut community reported to be higher in IBS compared to healthy controls49. Alternatively, it is also possible that the MP induced changes in the gut microbiota. Since H. parainfluenzae is a γ-proteobacterium considered a gut pathogen and previously associated with IBS in children50, MP-mediated changes might account for the reported MP placebo effect18,51.

In conclusion, our study sheds light on the potential of the bacteria conveyed by fermented milks to stimulate synthesis of beneficial metabolites and decrease abundance of pathobionts. These modifications can potentially improve health and are thus of importance for public health recommendations in western countries. It indicates that the role of food-ingested bacteria in gut homeostasis has been under-estimated, possibly because of methodological limitations that can, today, be overcome. Elucidation of the intricate links between food ingested microbes and human symbionts can thus be addressed.

Methods

Subjects and study design

The study18 was a single-centre, randomized, double-blind, controlled, parallel-group design including women (aged 20–69 years) fulfilling the Rome III criteria for IBS-C. Subjects were asked not to consume probiotic-containing products or fermented dairy products. There was no dietary record. After an 11-day wash-out period, 32 subjects (Per Protocol, PP) consumed (125 g/serving) twice a day either the FMP (n = 17) or an acidified milk product (MP) (n = 15) for 4 weeks. Out of 64 stools collected within the PP population, 56 samples (n = 26 and 30, for FMP and MP, respectively) passed DNA quality control and were used for further analyses. Other dairy products and probiotics were excluded from the diet.

Study products

The FMP contained 1.25 × 10¹⁰ colony forming units per serving (cfu/serving) of Bifidobacterium animalis subsp. lactis (strain number I-2494 in the French National Collection of Cultures of Micro-organisms (CNCM), Paris, France) and 1.2 × 10⁹ cfu/serving of Streptococcus thermophilus (CNCM I-1630), Lactobacillus delbrueckii subsp. bulgaricus (CNCM I-1632 and CNCM I-1519) and Lactococcus lactis (CNCM I-1631). The MP was an acidified milk product with low lactose content. FMP and MP were provided by Danone Research (Palaiseau, France).

Stool collection, storage, fecal DNA extraction and sequencing

Stool samples were collected before and after the 4-week consumption period. Immediately after defecation, a fecal sample was collected and stored in RNAlater solution (Ambion). The fecal suspension was homogenized and the volume of RNAlater was adjusted to achieve a final fecal dilution of 1:10 (wt/vol). 200 µl of the 10-fold dilution were added to 1 ml of Phosphate Buffered Saline (Sigma-Aldrich) and centrifuged for 5 min at 5,000 × g. The supernatant was discarded and the fecal pellet stored at −80°C. Fecal DNA was extracted as previously described52. DNA samples were sequenced on a SOLiD v4 NGS (Life Technologies) following the standard protocol for fragment libraries. The raw SOLiD read data was deposited in the EBI European Nucleotide Archive under the accession number PRJEB7171.

Identification of the species modulated by the interventions

Our bioinformatics pipeline, summarized in Figure S1, aimed to decrease data complexity starting from reads and to identify bacterial species modulated by the intervention. In brief, 50-base tags (reads) were generated with a SOLiD v4 sequencer and mapped on the MetaHIT 3.3 million gene catalog, and gene count profiles were generated for each individual. Genes potentially modulated upon the intervention (intra-group analysis) were identified with a Kruskal-Wallis test, clustered and assigned to known or unknown species. Relative abundance of a given species was computed by averaging the frequency of 50 genes belonging to the species. DNA from the samples (n = 28) was sequenced with a SOLiD v4 sequencer and an average of 3.97 × 10⁷ 50-base tags (reads) per sample were generated and mapped on the MetaHIT 3.3 million gene catalog13 (average 1.71 × 10⁷ reads/sample; S.D. +/− 7.79 × 10⁶) using METEOR, an in-house software53. Only those tags that mapped uniquely (univocally) to a single gene and to the exclusion of FMP genomes were retained (1.30 × 10⁷ reads/sample; S.D. +/− 5.88 × 10⁶). To quantify FMP species, the catalog of 3.3 million genes was complemented by the non-redundant genes (n = 9030) of the FMP genomes4,54. Gene count profiles were generated for each individual and genes significantly enriched or depleted upon the intervention (intra-group analysis) were identified with a Kruskal-Wallis test. Clusters of genes or species counting fewer than 9 (or 5) genes in the FMP (or MP) group were disregarded since these clusters are more likely to have resulted from random variations (see supplementary material).
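The read-filtering and counting step can be sketched as below. This is a hedged illustration, not METEOR: the names `gene_count_profile` and `gene_frequencies` are hypothetical, each read is assumed to arrive as a list of the gene identifiers it aligned to, and converting counts to frequencies by dividing by the total retained reads is an assumption, since the normalization is not detailed in the text.

```python
from collections import Counter


def gene_count_profile(read_hits, fmp_genes):
    """Count reads per gene, keeping only unambiguous, non-FMP reads.

    read_hits: iterable of lists, one per read, holding the gene IDs
               that the read aligned to (an assumed input format).
    fmp_genes: set of gene IDs originating from the FMP genomes.
    """
    counts = Counter()
    for hits in read_hits:
        if len(hits) != 1:
            continue              # unmapped or multi-mapped read
        (gene,) = hits
        if gene in fmp_genes:
            continue              # excluded from the endogenous profile
        counts[gene] += 1
    return counts


def gene_frequencies(counts):
    """Turn raw counts into per-sample frequencies (assumed normalization)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gene: n / total for gene, n in counts.items()}
```

Using the averages reported above, about 43% of the 3.97 × 10⁷ reads per sample mapped to the catalog and about 33% survived the uniqueness filter, so frequencies computed this way rest on roughly one third of the sequenced tags.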

Quantification of mapped reads to the annotated reference was used to identify genes belonging to the endogenous microbiota that were significantly modulated upon intervention. A Mann-Whitney test was performed for FMP and MP groups to identify genes over or under represented after intervention compared to baseline. These are subsequently referred to as “modulated genes.” 1320 and 641 modulated genes were identified for FMP and MP, respectively.

Spearman correlation coefficients were calculated for all gene pairs of this set, using gene abundance as a variable. Clusters of genes were defined as co-varying gene groups at a Spearman correlation threshold of 0.75. To augment the statistical power of the clustering analysis, we used the gene abundance matrix of 292 individuals from the MetaHIT consortium (MetaHIT cohort)15. Genes were assigned to a species at a threshold of 95% identity over 90% of the sequence by blastN14.
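A minimal sketch of co-abundance clustering at a Spearman threshold follows. Several choices here are assumptions made for illustration only: the function names are hypothetical, genes are grouped by connected components (any pair at or above the threshold is linked), which the text does not specify, and the all-pairs loop is quadratic and only practical for a few thousand genes such as the modulated set.

```python
from itertools import combinations


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0                # constant profile: treat as uncorrelated
    return cov / (vx * vy) ** 0.5


def cluster_genes(profiles, threshold=0.75):
    """Group genes whose abundance profiles co-vary across individuals.

    profiles: dict gene -> list of abundances (same individuals, same order).
    """
    parent = {g: g for g in profiles}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]   # path compression
            g = parent[g]
        return g

    for a, b in combinations(profiles, 2):
        if spearman(profiles[a], profiles[b]) >= threshold:
            parent[find(a)] = find(b)       # merge the two groups

    clusters = {}
    for g in profiles:
        clusters.setdefault(find(g), set()).add(g)
    return list(clusters.values())
```

Small clusters returned by such a procedure could then be discarded with a minimum-size filter, in the spirit of the 9-gene (FMP) and 5-gene (MP) cut-offs mentioned in the Methods.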

The genetic repertoire of unknown species was reconstituted through comparison of this study's gene clusters and the co-variation within the 3.3 million gene catalog across the MetaHIT cohort15. Functions for each clustered or reference genome genes were predicted using COG and/or NOG assignments. When available, COG assignment was performed using the Batch Web CD-Search online tool (http://www.ncbi.nlm.nih.gov/Structure/bwrpsb/bwrpsb.cgi). Predicted functions were projected on KEGG metabolic pathways using the Ipath software55 and compared to essential genes identified in the model organism Bacillus subtilis56. When species-level taxonomy could not be assigned, higher taxonomic levels were assigned using MEGAN57. Relative abundance was computed as the average frequency of 50 randomly selected genes of a species. Wilcoxon paired test, corrected for multiple tests by the Benjamini-Hochberg procedure, was used to test the significance of the FMP or MP effects.
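The species-abundance estimate and the multiple-testing correction described above can be sketched as follows. The names `species_abundance` and `benjamini_hochberg` are hypothetical, the fixed random seed is an assumption added for reproducibility, and the second function returns Benjamini-Hochberg adjusted p-values, which is one common way of applying the procedure.

```python
import random


def species_abundance(gene_freq, species_genes, n_genes=50, seed=0):
    """Mean frequency of up to n_genes randomly chosen genes of a species.

    gene_freq: dict gene -> frequency in one sample.
    species_genes: non-empty collection of gene IDs assigned to the species.
    """
    genes = sorted(species_genes)          # sorted for reproducible sampling
    if len(genes) > n_genes:
        genes = random.Random(seed).sample(genes, n_genes)
    return sum(gene_freq[g] for g in genes) / len(genes)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A species would then be called significantly modulated when its adjusted p-value from the paired test falls below the chosen level, for example 0.05.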

Butyrate producers of the human gut microbiome

Genes involved in butyrate production from Roseburia intestinalis or Coprococcus eutactus25 were used as references to retrieve homologs present in the human gut microbiome catalog by using a bi-directional BlastP. Such genes were re-organized into co-varying clusters, hereafter called butyrate modules, which belong to unique species (cf. supplementary material). Relative abundance of the butyrate modules was calculated as the mean of the relative abundance of the genes composing the module. A Wilcoxon paired test was used to test the statistical significance of the variation upon interventions (intragroup analysis). For intergroup comparisons, a one-way ANCOVA (Analysis of Co-Variance) was used on the change from baseline for each butyrate module with the group as factor. The relative abundance at baseline was considered as covariate. A Benjamini-Hochberg multiple testing correction was applied to control the False Discovery Rate.
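The module abundance and the baseline-adjusted group comparison can be illustrated with a short sketch. It is an illustration under assumptions, not the statistical software used in the study: the function names are hypothetical, the ANCOVA is reduced to its point estimate (the difference in mean change between two groups after adjustment with a common within-group slope on baseline), and no F-test or p-value is computed.

```python
from statistics import mean


def module_abundance(gene_abundance, module_genes):
    """Mean relative abundance of the genes composing a butyrate module."""
    return mean(gene_abundance[g] for g in module_genes)


def ancova_adjusted_difference(baseline_a, change_a, baseline_b, change_b):
    """Baseline-adjusted difference in mean change (group A minus group B).

    Uses the pooled within-group slope of change on baseline, which is
    the group effect of a one-covariate, two-group ANCOVA model.
    Requires baseline values that vary within at least one group.
    """
    def centered_sums(x, y):
        mx, my = mean(x), mean(y)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        sxx = sum((xi - mx) ** 2 for xi in x)
        return mx, my, sxy, sxx

    mxa, mya, sxya, sxxa = centered_sums(baseline_a, change_a)
    mxb, myb, sxyb, sxxb = centered_sums(baseline_b, change_b)
    slope = (sxya + sxyb) / (sxxa + sxxb)   # common slope on baseline
    return (mya - myb) - slope * (mxa - mxb)
```

Adjusting for baseline matters when the two groups start from different module abundances: the raw difference in mean change would otherwise mix the intervention effect with the dependence of the change on the starting level.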

FMP increases SCFA production in an in vitro colonic fermentation system

An in vitro colonic fermentation model (SHIME® (Prodigest, Belgium)) mimicking the adult gastro-intestinal tract was used to assess the impact of the FMP on production of gut microbial-derived SCFA58. The system was inoculated with fecal samples from healthy adults (n = 2). After a two-week stabilization period, 50 g of FMP was added daily to the stomach compartment for 3 weeks. The 3 vessels mimicking the ascending, transverse and descending colons were sampled 3 times a week. Concentrations of acetic acid, propionic acid and butyric acid were measured by liquid gas chromatography. A two-way ANOVA (Analysis of Variance) was used to evaluate the impact of FMP consumption on the production of SCFA. The first factor was the period of treatment and the second was the week of sampling nested in the treatment period. Tukey-Kramer post-hoc tests were used to determine which comparisons were significantly different if needed. Normality of residuals and homogeneity of variance were checked to validate the ANOVA assumptions. When the assumption of homogeneity of variance between groups was violated, the variance by week was used instead of a pooled variance to allow the use of ANOVA.