The human gastrointestinal (GI) tract is the habitat for hundreds of microbial species, of which many cannot be cultivated readily, presumably because of the dependencies between species1. Studies of microbial co-occurrence in the gut have indicated community substructures that may reflect functional and metabolic interactions between cohabiting species2,3. To move beyond species co-occurrence networks, we systematically identified transcriptional interactions between pairs of coexisting gut microbes using metagenomics and microarray-based metatranscriptomics data from 233 stool samples from Europeans. In 102 significantly interacting species pairs, the transcriptional changes led to a reduced expression of orthologous functions between the coexisting species. Specific species–species transcriptional interactions were enriched for functions important for H2 and CO2 homeostasis, butyrate biosynthesis, ATP-binding cassette (ABC) transporters, flagella assembly and bacterial chemotaxis, as well as for the metabolism of carbohydrates, amino acids and cofactors. The analysis gives the first insight into the microbial community-wide transcriptional interactions, and suggests that the regulation of gene expression plays an important role in species adaptation to coexistence and that niche segregation takes place at the transcriptional level.
The gut microbiota is generally considered an ecosystem with many biological interactions2,
In vitro co-cultivations of selected species pairs have shown that microbes influence each other's gene expression12,17. Transcriptional interactions have also been observed in a co-inoculation of Bacteroides thetaiotaomicron and Eubacterium rectale in germ-free mice13, which suggests that this may be a mechanism for avoiding or reducing competition and for niche segregation.
Although previous gut microbiome transcriptomics studies have provided a global overview of transcriptional activities18, their variability between individuals19 and the influence of external factors7, they have not studied transcriptional interactions in a complex community.
To study transcriptional activity in the gut we obtained microarray-based metatranscriptomics profiles from the 693,406 most-common gut microbiome genes20 and abundance profiles for 741 metagenomic species21 across 233 previously DNA-sequenced human-stool samples21,22 (Supplementary Table 1). We found significantly expressed genes from 322 species, which represented 86% of the species that occurred in at least 10% of the samples (Student's t-test, q value ≤0.05). Overall, we observed a tendency (Spearman correlation coefficient, r = 0.37) for abundant species to have more genes identified as significantly expressed than the less-abundant species (Supplementary Fig. 1), and that the number of expressed genes was robust to the reduced detection sensitivity of the metagenomics sequencing (Supplementary Fig. 2). In agreement with previous studies7,18,19, we observed that the majority of the transcriptionally active species belonged to the Firmicutes and Bacteroidetes phyla with 231 and 51 species, and with 83 and 12% of all the expressed genes, respectively (Supplementary Fig. 3a), and that the archaeon Methanobrevibacter smithii was among the top 5% of microbial species with the most expressed genes. Furthermore, we confirm that, although the species composition differed between individual samples, the overall distribution of expressed Kyoto Encyclopedia of Genes and Genomes (KEGG) functions is relatively constant across individuals18 (Supplementary Fig. 3b).
However, any two species that coexist in an environment may affect each other's activity in a number of ways, for example, by direct inhibition or activation, by producing or consuming metabolites and by spatial competition. To identify potential interspecies transcriptional interactions, we tested for differential gene expression associated with pairwise species co-occurrence (Fig. 1a). This was done by comparing the expression of a given gene in a potential responder species across samples in which a companion species was either detected or absent (Fig. 1b (details in Methods)). In total, we identified 4,735 genes with transcriptional profiles significantly associated with the coexistence of specific species pairs (ANOVA, q value <0.1 (Supplementary Table 2)). The majority of these transcriptional adaptations (59%) were found in a small subset of the tested species–species pairs and showed significant enrichment in 249 species–species interactions (Fisher's exact test, P < 0.05, Bonferroni corrected). These encompassed 53 responder species and 142 companion species (Fig. 1c (Supplementary Information and Supplementary Table 3 give details)). The average number of genes affected in a species–species interaction was 31 (±4 s.e.m.), and the interaction that affected most genes was observed between Ruminococcus gnavus and the Clostridiales sp. (MGS:41, from Nielsen et al.21), with 542 R. gnavus genes (39% of the measured genes) expressed differentially. To verify these findings, we subsequently designed a series of co-cultivation experiments. In these, 11 out of 13 genes from five different responder–companion species pairs showed expression behaviour similar to that observed in the microbiome (Supplementary Fig. 4 and Supplementary Table 4).
What we observed as microbial transcriptional interactions could, to a large extent, be the consequences of environmental changes caused by activities of a companion organism, which in turn trigger transcriptional adaptation in the responder organism. The species pair Catenibacterium mitsuokai and B. caccae represents such a case. When observed independently, both species significantly expressed starch phosphorylase (EC 2.4.1.1), a gene that is important for starch degradation. The C. mitsuokai orthologue to this gene was, however, silenced during coexistence with B. caccae, possibly because B. caccae, a known specialist in polysaccharide metabolism23, could have depleted the available starch resources. This may serve as an example of a phenomenon that we observed across 102 species pairs (41% of all interacting pairs) in which coexistence-associated expressed genes are significantly enriched for orthologues to genes that are significantly expressed in the companion species (Fisher's exact test, P < 0.05). Importantly, the majority of these orthologous genes (78%) were downregulated during coexistence (Fig. 2 and Supplementary Table 5). This suggests a decreased overlap in expressed functions in the interacting species pairs and indicates that species of the human-gut microbiome undergo niche segregation at the transcriptional level.
Among the transcriptionally modulated genes we observed a series of functions important for anaerobic fermentation, which is central to the colon-energy metabolism and results in incompletely oxidized nutrient substrates and H2. Continued fermentation depends on removal of H2 to stay energetically favourable and accumulation of H2 and CH4 has been associated with bloating and irritable bowel syndrome24. Part of the H2 is excreted through flatus and breath, but a substantial part is anaerobically respired by a small subset of key hydrogenotrophic gut microbes, which include M. smithii and Blautia hydrogenotrophica24,25, which both showed coexistence-associated regulation of their anaerobic respiration pathways (Supplementary Table 2). In M. smithii the expression of genes involved in the methanogenesis was reduced when it was observed together with either of two Lachnospiraceae sp. (MGS:45 and MGS:91). In B. hydrogenotrophica the Wood–Ljungdahl pathway, which yields acetate under H2 and CO2 consumption, was significantly activated during coexistence with Bifidobacterium bifidum (Fig. 3). The change in activity of the Wood–Ljungdahl pathway observed in B. hydrogenotrophica might be driven by a cross-feeding relationship with B. bifidum, as the latter species is a specialized carbohydrate-fermenting species26 that produces the substrates for the CO2 fixation by the Wood–Ljungdahl pathway. In contrast, the coexistence with each of five other species (Clostridium bartlettii, C. leptum, Alistipes sp., B. pseudocatenulatum and B. dorei) repressed the expression of the Wood–Ljungdahl pathway in B. hydrogenotrophica (Fig. 3). Particularly interesting is the relationship with C. bartlettii, which significantly expressed five orthologues from the Wood–Ljungdahl pathway (Fig. 3b), suggesting that C. bartlettii may substitute this activity.
In addition, we observed silencing of pathways for the biosynthesis of short-chain fatty acids, the key end products of anaerobic fermentation, significantly enriched in specific interacting pairs. This was, for example, observed in B. hydrogenotrophica and a Lachnospiraceae sp. (MGS:75) in response to coexistence with Fusicatenibacter saccharivorans (MGS:37) and E. rectale, respectively. Other transcriptional interactions were significantly enriched for nutrient-uptake functions, such as ABC transporters or phosphotransferase systems, as well as for flagella assembly and bacterial chemotaxis, among others (Fisher's exact test, P < 0.05, Bonferroni corrected (Supplementary Fig. 5 and Supplementary Table 6)). The two most-frequent functional annotations across all the coexistence-associated differentially expressed genes (Supplementary Table 7) were the environment-sensing two-component systems27 and the aminoacyl–transfer RNA biosynthesis pathway, which could be responses to altered nutrient availability in the local environment. For instance, the expression of charged tRNA biosynthesis pathways is known to increase in response to low amino acid concentrations to better scavenge the remaining amino acids28,29.
All environments pose a selective pressure on the species that live in them, and hence enrich for species that share properties essential for survival in the given conditions. In the human-gut microbiome this leads to coexistence of microbial species with both complementary and overlapping functional properties4 that may expose coexisting species to symbiotic or antagonistic interactions. In this study we utilized a microarray-based metatranscriptomics data set, which covers an unprecedentedly large set of 233 human-gut microbiome samples, to describe in situ transcriptional interactions by studying differential gene expression associated with coexistence between hundreds of specific gut microbial species. Interestingly, a significant part of the coexistence-associated differentially expressed genes shows a reduced expression when companion species express orthologous genes and, as a consequence, the functional overlap between species is reduced. These observations stress that the activities of different microbes change in association with the community composition and, importantly, show that some microbes undergo niche segregation in the GI tract at the level of gene expression. This may, in turn, explain how closely related species can coexist over prolonged periods of time, rather than being outcompeted and excluded from the ecosystem. This mechanism can be beneficial to the host, as it sustains a diverse and rich gut microbial community with robustness to perturbations, a characteristic that is associated with metabolic health20. Furthermore, transcriptional adaptations may explain why the functional output of the gut microbiome is so consistent across individuals.
Although it is beyond the scope of a mere association analysis to determine causality and the mechanisms behind these adaptations, the bias towards reduced gene expression and enrichment of functional categories such as nutrient uptake or anaerobic respiration suggests that the mechanisms behind many of the observed transcriptional adaptations may be indirect through sensing the availability of local metabolites. The indirect effect of nutrient availability on the gene expression may also explain why two out of five coexistence-associated gene expressions observed in situ failed to reproduce in in vitro co-cultivations under rich nutrient conditions, in which critical nutrients may be in excess. This cross-feeding phenomenon is, however, less tangible as it requires a detailed understanding of metabolic pathways, their interconnections and the participating metabolites, whereas the identification of functional orthology required for our detection of niche segregation is entirely driven by sequence information.
This community-wide mapping of microbial transcriptional interactions was limited to detecting interactions between pairs of species. Therefore, our analysis may miss multispecies interactions (as, for example, is known from soil-biofilm formation) that have been shown to include up to four species30. In addition, we observe a weak tendency for the more-abundant species to have more significantly expressed genes, which suggests that the analysis best describes the more-abundant species. However, the analysis was relatively robust to sensitivity differences between the microarray and the metagenomic sequencing, as shown in Supplementary Fig. 2. Together, these limitations suggest that the numbers of expressed genes and interactions presented here are on the conservative side.
In conclusion, we learn that the expression of some functions is less affected by coexistence (for example, DNA replication), whereas central metabolisms, which include the anaerobic respiratory pathway, environmental sensing and uptake of substrates, vary more with the specific community context. Together with the observed specific species–species interactions, this is an important insight that may help constrain future metabolic network modelling and extend it to include species interactions. This study also adds to our understanding of how probiotics, faecal microbiota transplant and bacterial cocktail inocula may depend on the ability of species to adapt transcriptionally to the community context they are placed in. Finally, the insight that species transcriptionally adapt to each other further complicates microbiome-association analyses in that it highlights that species activities are context specific.
The human-gut microbiome microarray NimbleGen HD2 was designed to characterize transcriptional activity in the human faecal microbiome as described by Le Chatelier et al.20 In short, it contains three probes for each of 693,406 human-gut microbial genes, which were selected to represent genes that were observed in 20 or more of 124 human-stool samples from Spanish and Danish individuals22. We chose microarrays over RNA sequencing because the great majority of the output from sequencing would probably originate from the rRNA of highly abundant species. This would render interrogation of mRNA from less-abundant species impractical. Although rRNA-depletion methods exist, such an approach could introduce biases in the resulting data. In contrast, the DNA microarray only interrogates the mRNA transcripts that are probed for.
Samples, RNA extraction and microarray hybridization
RNA was extracted from 233 human-stool samples (Supplementary Table 1) published in two previous metagenomic studies21,22. The frozen faeces (200 mg) were aliquoted into 2.0 ml microtubes using sterile spatulas. To each sample was added 400 µl of Tris-EDTA buffer (1×), 500 µl phenol-chloroform isoamyl alcohol (in a 5:1 mix of phenol (pH 4 from Eurobio) and chloroform:isoamyl alcohol (24:1 from Bioblock)), 25 µl of SDS (20% (Ambion)), 50 µl of sodium acetate (3 M, pH 4.8 (Sigma)) and 0.6 g of zirconia/silica beads (0.1 mm (BioSpec Product)), and then mixed by vortexing. Then, the mixture was shaken using a FastPrep FP220A (MP Biomedicals) at 5 m s–1 for 40 seconds, cooled on ice for 90 seconds and again shaken at 5 m s–1 for 20 seconds. After centrifugation at 13,000g, 400 µl of the supernatant was mixed with 500 µl of chloroform:isoamyl alcohol (24:1) and homogenized by thoroughly inverting the tube. The homogenate was centrifuged at 13,400g at 4 °C for 15 minutes and 50 µl was transferred to a new 1.5 ml microtube. The following steps were performed using the ‘High Pure Isolation’ kit (Roche) according to the manufacturer's recommendations. Residual genomic DNA was removed in two steps using the RNase-free DNase I (Promega) for 30 minutes at 37 °C, once as recommended in the High Pure Isolation kit and the second time after the elution step. RNA samples with an RNA integrity number over 5 (RIN, RNA Nano LabChip bioanalyzer from Agilent) were reverse transcribed to complementary DNA (Invitrogen, Superscript DS cDNA Synthesis kit protocol with random hexamer primers) and Cy3 labelled, hybridized to custom-designed NimbleGen HD2 arrays, washed and scanned according to the NimbleGen One Color Labeling Protocol for Expression Analysis v 3.0 (24 hours of incubation at 42 °C). The Ethical Committees of the Capital Region of Denmark (HC-2008-017) approved the study.
The microarray data were background corrected and quantile normalized, and gene-level expression indices were calculated. To analyse the expression data in the context of the 3.9M gene catalogue generated from 396 stool samples by Nielsen et al.21, we matched the microarray design gene set to the new gene catalogue. Previously, the 3.9M gene catalogue was annotated with KEGG orthology and was structured into 741 metagenomic species21, referred to as species in the majority of the manuscript. In total, 201,071 genes could be assigned to one of the 741 species.
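As an illustration of the quantile-normalization step named above, the following minimal Python sketch forces every sample to share the same signal distribution by replacing each value with the mean of the values holding the same rank across samples. The function name `quantile_normalize`, the list-of-lists input layout and the tie handling (ties broken by position) are assumptions made for this example; the source does not describe the software or settings actually used.

```python
def quantile_normalize(samples):
    """Quantile-normalize equal-length samples (one list of signals per sample).

    Hypothetical helper for illustration only; ties are broken by position,
    which is a simplifying assumption.
    """
    n_values = len(samples[0])
    if any(len(s) != n_values for s in samples):
        raise ValueError("all samples must contain the same number of probes")

    # Sort each sample, then average across samples at each rank.
    sorted_samples = [sorted(s) for s in samples]
    rank_means = [
        sum(s[rank] for s in sorted_samples) / len(samples)
        for rank in range(n_values)
    ]

    normalized = []
    for s in samples:
        # Indices of the values in ascending order of signal.
        order = sorted(range(n_values), key=lambda i: s[i])
        out = [0.0] * n_values
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]  # value at this rank gets the rank mean
        normalized.append(out)
    return normalized


# Two toy samples with three probes each.
example = quantile_normalize([[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]])
```

After normalization each sample contains exactly the same set of values, only arranged according to that sample's own ranking.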
Significant gene expression
Transcriptionally active genes were inferred using a one-sided Student's t-test by comparing the microarray signal of each gene between samples in which a species that contained the gene was present or absent based on the metagenomics data described above. We tested genes from species that had at least two genes assigned and that were both present and absent in at least 10% of the samples. To fulfil the latter requirement for three species (MGS:6, MGS:9 and MGS:25), their abundance in samples below the tenth percentile was set to zero. In total, we tested 186,231 genes from 375 species and accepted a gene to be significantly expressed at a q value <0.05 (ref. 31). We then summarized the expressions of genes across samples in matrix X by transforming the microarray signal for significantly expressed genes into Z scores: $X_{g,s} = \left(E_{g,s} - M(E_{g,a})\right) / \mathrm{SD}(E_{g,a})$, where E, M and SD stand for microarray signal, mean and standard deviation. Subscript g is the gene index, s is the sample index for samples in which the gene is present and a is the sample index for samples in which the gene is absent. Negative entries and entries for genes absent in a sample were set to 0.
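The Z-score transformation can be sketched in a few lines of Python. This is a minimal illustration that assumes the Z score of a gene in a sample where it is present is its signal minus the mean signal across samples where the gene is absent, divided by the standard deviation across those absent samples; the function name `expression_z_scores` and the use of the sample (n − 1) standard deviation are assumptions for this example.

```python
from statistics import mean, stdev


def expression_z_scores(signal, present):
    """Z scores for one gene across samples.

    signal  : microarray signal of the gene in each sample
    present : True where the species carrying the gene was detected
    Hypothetical helper; uses the sample standard deviation (n - 1).
    """
    background = [e for e, p in zip(signal, present) if not p]
    if len(background) < 2:
        raise ValueError("need at least two 'absent' samples for a background")
    bg_mean = mean(background)
    bg_sd = stdev(background)

    scores = []
    for e, p in zip(signal, present):
        if not p:
            scores.append(0.0)  # gene absent in this sample
        else:
            z = (e - bg_mean) / bg_sd
            scores.append(max(0.0, z))  # negative entries are set to 0
    return scores


# Three background samples (absent) and two samples with the species present.
z = expression_z_scores(
    [10.0, 12.0, 8.0, 20.0, 6.0],
    [False, False, False, True, True],
)
```

In this toy case the background has mean 10 and standard deviation 2, so the signal of 20 maps to a Z score of 5, whereas the signal of 6 would be negative and is therefore set to 0.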
The effect of the metagenomic sequencing depth on the detection of significantly expressed genes was tested by re-running the Student's t-test (described above) at downsizing levels below the original downsizing of 11M sequence reads (that is, from 10M to 1M with an interval of 1M sequence reads), which degraded the species detection signal and increased the noise-to-signal ratio in calling significantly expressed genes.
Identification of coexistence-associated gene-expression changes
To identify gene-expression changes associated with coexistence between specific species pairs we tested each gene from potential responder species in a species pair, but only if the species was represented by 100 or more genes on the microarray (n = 277). Furthermore, only species pairs (consisting of a potential responder species and potential companion species) for which each species was found independently in ten or more samples, both were found together in ten or more samples and both were absent in ten or more samples (4 × 10) were tested (n = 40,608 species pairs). These filtering criteria were selected to ensure sufficient data for the statistical model to capture the microarray-background signal and the independent signal from the two species. Although the tested gene belongs to the responder species, modelling of the effect of the companion species was intended to capture potential cross-hybridization signals from that species. Each responder-gene expression level (the dependent variable) was tested independently in a linear two-way ANOVA model with the presence/absence of the companion and responder species as main factors, and the abundance of the responder species as an error factor. Importantly, this allowed for the inference of a companion/responder-species-interaction effect on the gene expression independently of the responder-species abundance. In other words, the statistical model corrects for the abundance of the responder species. Genes with a false discovery rate31 <0.1 for the interaction effect between the responder and companion factor were considered significantly coexistence associated.
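The 4 × 10 sample filter for species pairs can be illustrated with a short Python sketch. It counts samples in each of the four presence/absence combinations (responder only, companion only, both, neither) and keeps the pair only if every combination is seen in at least ten samples. The function name `passes_pair_filter` and the boolean-list input format are hypothetical; the sketch covers only the filtering step, not the ANOVA model.

```python
def passes_pair_filter(responder_present, companion_present, min_samples=10):
    """Check the 4 x 10 criterion for one responder-companion pair.

    Both arguments are per-sample booleans (True = species detected).
    Returns (passes, counts), where counts maps
    (responder_present, companion_present) to a number of samples.
    Hypothetical helper for illustration only.
    """
    if len(responder_present) != len(companion_present):
        raise ValueError("both species need one entry per sample")

    counts = {(r, c): 0 for r in (False, True) for c in (False, True)}
    for r, c in zip(responder_present, companion_present):
        counts[(bool(r), bool(c))] += 1

    # Every one of the four cells must hold at least min_samples samples.
    passes = all(n >= min_samples for n in counts.values())
    return passes, counts


# Toy design with exactly ten samples in each of the four cells.
responder = [True] * 20 + [False] * 20
companion = ([True] * 10 + [False] * 10) * 2
ok, cell_counts = passes_pair_filter(responder, companion)
```

Requiring all four cells to be populated is what allows a model with two presence/absence factors to separate the background signal, each species' own signal and their interaction.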
Furthermore, each responder–companion species pair was tested for overrepresentation of genes whose expression changes were associated with coexistence (Fisher's exact test, upper tail, P < 0.05, Bonferroni corrected) and significantly interacting species pairs were represented in the network in Fig. 1c using Cytoscape 3.0 (ref. 32).
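An upper-tail Fisher's exact test of this kind is equivalent to the upper tail of a hypergeometric distribution, which the following Python sketch computes directly. The choice of background (all tested genes across pairs), the function names `fisher_upper_tail` and `bonferroni`, and the toy numbers are assumptions for illustration, as the source does not spell out how the contingency tables were built.

```python
from math import comb


def fisher_upper_tail(k, n, K, N):
    """One-sided (upper-tail) Fisher's exact P value.

    N : total number of tested genes (assumed background)
    K : coexistence-associated genes in the background
    n : genes tested for this species pair
    k : coexistence-associated genes observed for this species pair
    Returns P(X >= k) under the hypergeometric null.
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible tables contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(p_values):
    """Bonferroni correction: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy example: 3 of 4 genes in a pair are associated, against 3 of 10 overall.
p_pair = fisher_upper_tail(k=3, n=4, K=3, N=10)
```

With tens of thousands of species pairs tested, the Bonferroni factor is large, so only pairs with a strong excess of coexistence-associated genes remain significant.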
The gut microbial gene catalogue provided by Nielsen et al.21 was annotated with KEGG orthology identifiers that we linked to the respective KEGG modules and pathways using an in-house version of the KEGG BRITE database. We tested significant interactions between microbial species for overrepresentation of functions belonging to KEGG modules or pathways (Fisher's exact test, upper tail P < 0.05, Bonferroni corrected).
To verify the transcriptional interference observed between microbial species in the natural environment of the human gut, controlled experiments were set up for the following responder–companion bacterial pairs: (a) E. siraeum and B. coprocola, (b) B. coprocola and Dialister invisus, (c) B. hydrogenotrophica and Dorea longicatena and (d) B. eggerthii and D. invisus. Bacterial-type strains were obtained from the German Culture Collection (DSMZ). Bacteria were cultured overnight in 10 ml of brain heart infusion (BHI) broth (Fluka) at 37 °C under anaerobic conditions. The sterile BHI broth (2 ml) was placed in both the top and bottom wells of a six-well Transwell plate (0.4 µm polyester membrane, 24 mm insert (Costar)). In triplicate, 20 µl of overnight culture of the companion bacteria was placed in the bottom well, and 20 µl of overnight culture from the responder bacteria was placed in the top well. The same bacteria type (20 µl) was inoculated into both the top and bottom wells of another Transwell plate to serve as a control. Cultures in Transwell plates were grown anaerobically for 18 hours at 37 °C. The responder bacteria culture was then collected, spun down with RNA protector and frozen at −80 °C for later RNA extraction. RNA was extracted using the RNeasy Mini Kit (Cat. No. 74104) from Qiagen. The RNA (200 ng) was used to synthesize cDNA using gene-specific primers and the RevertAid First Strand cDNA Synthesis Kit (#K1621 from Thermo Scientific). Gene expressions for selected genes were analysed by quantitative real-time PCR and differential expression was tested using Student's t-test (Supplementary Table 4). Each sample was run in triplicate. The average threshold cycle number (Ct) values of the samples were obtained from each case. The relative gene expression was calculated using delta Ct (ΔCt) as an exponent of 2 (2^ΔCt).
ΔCt was calculated with the average from the triplicates as follows: ΔCt = average Ct (control sample) – average Ct (co-culture sample).
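The relative-expression calculation can be written out as a small Python sketch: ΔCt is the average Ct of the control minus the average Ct of the co-culture, and the relative expression is 2 raised to ΔCt. The function name `relative_expression` and the example Ct values are hypothetical and are not data from the study.

```python
from statistics import mean


def relative_expression(ct_control, ct_coculture):
    """Relative expression of a gene in co-culture versus control.

    ct_control, ct_coculture : Ct values of the technical replicates
    Returns (delta_ct, fold), with fold = 2 ** delta_ct.
    Hypothetical helper for illustration only.
    """
    delta_ct = mean(ct_control) - mean(ct_coculture)
    return delta_ct, 2 ** delta_ct


# Made-up triplicates: the co-culture crosses the threshold two cycles earlier.
delta, fold = relative_expression([25.1, 24.9, 25.0], [23.0, 23.1, 22.9])
```

A positive ΔCt (lower Ct in co-culture) gives a value above 1, indicating higher expression in the co-culture, whereas a negative ΔCt gives a value below 1, indicating reduced expression.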
Microarray data can be found at the GEO database under accession code GSE76590.
The research leading to these results has received funding from the European Community's Seventh Framework Programme FP7-HEALTH-F4-2007-201052, MetaHIT and FP7-HEALTH-2010-261376: International Human Microbiome Standards. Additional funding was from the Metagenopolis grant ANR-11-DPBS-0001. E.R., A.N.D.M., M.C.R.E. and M.O.A.S. acknowledge funding from the Novo Nordisk Foundation and the Lundbeck Foundation. The Novo Nordisk Foundation Center for Basic Metabolic Research is an independent Research Center at the University of Copenhagen partially funded by an unrestricted donation from the Novo Nordisk Foundation (www.metabol.ku.dk).
Supplementary Tables 1-3 and 5-7