The gut microbiota and inflammatory bowel disease

The gut microbiome is an ecosystem that involves complex interactions. Currently, our knowledge about the role of the gut microbiome in health and disease relies mainly on differential microbial abundance, and little is known about the role of microbial interactions in the context of human disease. Here we construct and compare microbial co-abundance networks using 2,379 metagenomes from four human cohorts: an inflammatory bowel disease (IBD) cohort, an obese cohort and two population-based cohorts. We find that the strengths of 38.6% of species co-abundances and 64.3% of pathway co-abundances vary significantly between cohorts, with 113 species and 1,050 pathway co-abundances showing IBD-specific effects and 281 pathway co-abundances showing obesity-specific effects. We can also replicate these IBD microbial co-abundances in longitudinal data from the IBD cohort of the integrative human microbiome (iHMP-IBD) project. Our study identifies several key species and pathways in IBD and obesity and provides evidence that altered microbial abundances in disease can reflect their co-abundance relationship, which expands our current knowledge regarding microbial dysbiosis in disease.


Introduction
The human gut harbours a diverse community of microorganisms that interact closely with both the host and each other.1][22] Enthusiasm has thus been rising to decipher these microbial interactions in order to detect key microbes in health and disease. 23,24 One way of doing this is to create co-abundance networks based on correlations, a method that has the potential to study interactions between microbes and thereby generate hypotheses for experimental validation at a later stage. 23,246][27][28][29][30][31][32][33][34][35] These studies have identified microbial genera that are potentially key in health and disease, e.g. Porphyromonas and Bacteroides in gestational diabetes. 331][32][33][34] A further limitation of 16S sequencing is that it can only identify microbial networks up to genus level. As different bacterial species can have very different functional properties, analysis at genus level cannot fully capture the biochemical interactions between microbes. In consequence, the importance of metabolic network construction from metagenomics data has recently been highlighted. 24,36,37 Here we present a metagenomics-based network analysis for bacterial species and metabolic pathways in 2,379 individuals from four cohorts from the Netherlands (Figure S1): an IBD cohort (n=496), an obesity cohort (300OB, n=298) and two population-based cohorts (Lifelines-DEEP (LLD, n=1,135) and 500FG (n=450)). We compare the microbial taxonomic and functional networks under different host health conditions and identify potential key species and pathways that shape host-associated microbial networks (Figure 1). We find that the microbial species and pathway co-abundances vary significantly

Microbial co-abundance strength varies between cohorts
We hypothesized that co-abundance strengths could differ depending on host physiological status. We thus assessed to what extent the correlation coefficients were variable across cohorts and characterized variable co-abundance relationships for 38.6% of the species co-abundances and 64.3% of the pathway co-abundances (Cochran-Q test, FDR<0.05, Data S1 & S3).

Differential microbial co-abundances are reflected in abundance levels
When zooming in on the 100 species and 304 pathways that were involved in variable co-abundances, 76% of these species and 84% of these pathways also showed significant differences in their abundance levels among cohorts (ANOVA test FDR<0.05, Data S5 & S6). This implies that the variable co-abundance relationship is largely reflected by differential microbial abundance. We summarized the number of differential co-abundances between species from the same genus or from different genera (Figure 3a). The genus with the most heterogeneous co-abundances was Streptococcus, and a large number of variable co-abundances were observed not only between different Streptococcus species, but also between Streptococcus species and species from other genera such as Eubacterium and Veillonella (Figure 3a). In particular, Streptococcus species were more abundant in the IBD cohort, consistent with the results of previous studies. 14,39 A similar observation was made for the pathway co-abundances, particularly for amino acid biosynthesis pathways, which showed variability not only among themselves but also with respect to various pathways related to nucleoside and nucleotide biosynthesis (Figure 3b).

Specific microbial co-abundances are enriched in disease cohorts
Next, we analysed whether the variable co-abundance relationships were driven by a particular cohort, i.e. whether the co-abundance strength in one cohort was very different from those in the other three cohorts. After correcting for the age and sex differences between cohorts, 120 species co-abundances (Figure S4) and 1,448 pathway co-abundances (Figure S5) still showed cohort-specificity with an FDR of 7.6%, as estimated by permutation (Data S1 & S3). Interestingly, cohort-specific co-abundances were significantly enriched in the disease cohorts compared to the population-based cohorts: 113 (94%) species co-abundances and 1,050 (72%) pathway co-abundances were specifically related to the IBD cohort (Fisher's test P=1.2x10^-5), as compared to only 3 species and 117 pathway co-abundance relationships specific to the population-based cohorts LLD and 500FG (Figure 3c & d). Our results highlight that microbial co-abundances are dependent on host health and disease status. Below we discuss the microbial co-abundance networks in IBD and 300OB in more detail, further replicate our findings in independent cohorts, and assess the relevance of disease subtypes, disease characteristics and medication usage.

Replication of the IBD network in the iHMP-IBD cohort
Of the 2,554 species and 37,699 pathway co-abundances established in our IBD cohort, we were able to assess 2,090 species co-abundances and 37,106 pathway co-abundances in 77 IBD individuals from the integrative Human Microbiome Project (iHMP-IBD). 39 In the baseline samples of the iHMP-IBD cohort, 531 species co-abundances (25.4%) and 21,882 (59.0%) pathway co-abundances could be replicated at P<0.05 (Data S7 & S8). 39 The relatively low replication rate in species co-abundances is largely a power issue, as we also observed that 1,705 (81.6%) species co-abundances and 24,165 (65.1%) pathway co-abundances showed no significant difference in their co-abundance strengths between our IBD cohort and the iHMP-IBD cohort (Cochran-Q test, P>0.05, Figure S6, Data S7 & S8). We then compared the IBD networks between the first and last time points of the iHMP-IBD cohort (~1 year apart) and replicated 90.6% of species co-abundances and 99.6% of pathway co-abundances (Cochran-Q test, P>0.05, Figure S6, Data S7 & S8). This suggests that our estimation of co-abundance strengths in IBD was largely replicable in a different cohort and was stable across time.

Microbial networks of IBD in relation to disease characteristics
Previous studies have shown that observed microbial abundance differences could be explained by certain disease characteristics of IBD. 14 We therefore hypothesized that this could also be the case for co-abundance relationships. We assessed whether IBD co-abundances (including IBD co-abundances at FDR<0.05 and IBD-specific co-abundances) could be related to the disease subtypes [ulcerative colitis (UC, n=189) vs. Crohn's disease (CD, n=276)], disease location [ileum (n=212) vs. colon (n=286)] and disease activity [inflammation (n=121) vs. no inflammation (n=377)] (Table S1). Most of the co-abundance relationships were comparable between disease characteristics, and only a few showed significant differences at FDR<0.05 (Figure S7, Data S9 & S10), namely 16 species co-abundances related to disease subtypes and 8 species co-abundances related to location. For the pathway co-abundances, 91 were related to disease subtypes, 24 to location and 3 to activity (Cochran-Q test FDR<0.05, Figure S7). Out of these, five co-abundance relationships were related to an important butyrate producer, Faecalibacterium prausnitzii, which showed stronger co-abundance relationships in UC compared to CD. One example here was the negative co-abundance relationship of Faecalibacterium prausnitzii with Haemophilus parainfluenzae, a species known to have pathogenic properties. 40

Microbial networks of IBD in relation to medication
We further tested whether drug usage can affect microbial co-abundance, as usage of antibiotics (20.0%) and proton pump inhibitors (PPIs, 26.5%) was higher in patients with IBD than in the general population cohorts (1.1% and 8.4%) (Table S1). Here we detected no significant difference in species co-abundances between antibiotic users and non-users (Cochran-Q test FDR>0.05, Figure S7), while 1,049 out of 37,959 (3.7%) pathway co-abundance relationships showed statistically significant differences between PPI users and non-users, in particular those related to the isoprene biosynthesis and methylerythritol phosphate pathways (Cochran-Q test FDR<0.05, Figure S7, Data S10).

Key species and pathways in IBD
We identified 113 species co-abundances and 1,050 pathway co-abundances that showed significantly different effects compared to the other three cohorts. We then assessed whether these IBD-specific co-abundances were highly connected to a specific pathway or species which may be disease-relevant, and our analysis identified three key species and four key pathways for IBD (Figure 4). Key species included Escherichia coli, Oxalobacter formigenes and Actinomyces graevenitzii.6][47][48][49] Interestingly, Escherichia coli shows positive co-abundance relationships with species with pro-inflammatory properties, like Streptococcus mutans, and negative co-abundance relationships with species with anti-inflammatory properties, like Faecalibacterium prausnitzii (Data S1).
One of the key species we identified for IBD, Actinomyces graevenitzii, is a microbe that is most often identified in the oral cavity or respiratory tract. 41 Key IBD pathways included a C1 compound utilization and assimilation pathway (P23-PWY: reductive TCA cycle I), two vitamin biosynthesis pathways (FOLSYN-PWY: superpathway of tetrahydrofolate biosynthesis and salvage and PWY-6612: superpathway of tetrahydrofolate biosynthesis) and an amino acid biosynthesis pathway (PWY-5505: L-glutamate and L-glutamine biosynthesis) (Figure 4b, Data S6). The top key functional pathway in IBD was the reductive TCA cycle pathway (P23-PWY), which had 76 IBD-specific co-abundances, and 94.7% of these were replicated in the iHMP-IBD cohort (Data S3 & S6). The reductive TCA cycle has been recognized as a primordial pathway for the production of organic molecules for the biosynthesis of sugars, lipids, amino acids, pyrimidines and menaquinone (Figure 5a). 42 For instance, one IBD-specific co-abundance relationship was related to the biosynthesis of menaquinone (PWY-5837), which is also known as vitamin K2. The co-abundance relationship for this pathway in IBD (r=0.1) was weaker than in other cohorts (r=0.3) (Figure 5b), despite the higher abundance of this pathway in IBD (FDR<0.05, Figure 5c, Data S6). Escherichia coli is known to be an important species for the biosynthesis of menaquinone, a growth-promoting factor for a variety of microorganisms in the gut microbiota. 43 In line with this, we found that 18.8% of the menaquinone biosynthesis pathway in IBD patients was contributed by Escherichia coli, two times higher than in the two population-based cohorts (Wilcoxon-test P<3.0x10^-11) (Data S11). This finding suggests Escherichia coli as an important contributor to menaquinone biosynthesis in IBD that may promote the growth of other microorganisms. Indeed, our study also revealed Escherichia coli as a key IBD species, exerting IBD-specific co-abundance relationships with 15 species (Data S5).5][46] Accordingly, higher correlations were observed between menaquinone biosynthesis and Streptococcus species in IBD than in the other cohorts (Figure 5e, Data S12).

The microbial co-abundance network in 300OB
Replication of 300OB network in LLD obese individuals
In total, 1,107 species and 37,886 pathway co-abundances were detected in the 300OB cohort (Figure 2). These estimated co-abundance strengths were largely replicable in 134 obese individuals with matched age and BMI from the LLD cohort, with 991 (89.5%) species co-abundances and 32,963 (87.0%) pathway co-abundances showing no difference (Cochran-Q test P>0.05, Figure S8, Data S13 & S14).

Microbial networks in relation to obesity-related diseases
The 300OB cohort was set up to study cardiovascular disease in obese individuals, including 139 patients with atherosclerotic plaque and 159 obese controls (Table S1). In addition, 35 300OB participants had diabetes. Here we observed only three species co-abundances related to cardiovascular disease, with all three showing stronger co-abundances in patients with plaque than in patients without (Cochran-Q test FDR<0.05, Figure S9, Data S13 & S14). These were a positive co-abundance between Dorea longicatena and Dorea formicigenerans and negative co-abundances of Lachnospiraceae bacterium 9.1.43BFAA with Coprococcus comes and Dorea longicatena.

Key pathways in obesity
When we compared microbial co-abundances in the 300OB to the other three cohorts, we identified 281 pathway co-abundances that showed a significantly different effect, i.e. obesity-specific co-abundances. One key pathway in obesity was degradation of allantoin (PWY0-41, Figure 4b, Data S6), which showed obesity-specific co-abundance relationships with 85 pathways. Allantoin is one of the active principles in various plants, e.g. yams, and is found to enhance insulin secretion and lower plasma glucose. 47,48 Its degradation product, oxamate, plays an inhibitory role in oxaloacetate/aspartate amino acids. 49 In line with this, we found that the allantoin degradation pathway showed stronger negative correlations with the biosynthesis pathways of oxaloacetate/aspartate amino acids (including lysine, homoserine, methionine, threonine and isoleucine) and the biosynthesis pathway of aspartate (PWY0-781, Figure 6), which were both positively associated with fasting glucose level and negatively associated with fasting insulin level (P<0.05, Table S2).

Discussion
This study is a microbial co-abundance network analysis based on metagenomics data, involving 2,379 participants from two population-based cohorts (LifeLines-DEEP and 500FG) and two disease cohorts (IBD and 300OB). We report 3,454 species and 43,355 pathway co-abundance relationships that were significant in at least one cohort. Among them, the effect sizes of 38.6% of species co-abundances and 64.3% of pathway co-abundances were significantly different between cohorts. In particular, 113 species co-abundances and 1,050 pathway co-abundances showed IBD cohort-specific effects and 281 pathway co-abundances had specific effects in the 300OB cohort. Our study provides evidence that microbial dysbiosis can be reflected in alterations in microbial co-abundance.
Our study yielded several findings. We identified three species and four pathways in IBD and one pathway in 300OB that served as key players in disease-specific co-abundance networks. Key IBD-associated species included Escherichia coli and Oxalobacter formigenes. 14,50,511][52][53] Consistent with this, we replicated high abundances of Escherichia coli and low abundances of anaerobic metabolism pathways in IBD.5][46] In contrast, these co-abundances were either weak or negative in our population-based and obesity cohorts. We further identified Actinomyces graevenitzii as a key species in IBD.5][56][57] Two case reports have also suggested that Actinomyces may aggravate the intestinal injuries caused by inflammation. 58,59 The top key functional pathway in IBD was the reductive TCA cycle pathway (P23-PWY), which had 76 IBD-specific co-abundances. Interestingly, the key IBD species Escherichia coli is known to be an important species for the biosynthesis of menaquinone, a growth-promoting factor for a variety of microorganisms in the gut microbiota. 43 In line with this, we found that 18.8% of the menaquinone biosynthesis pathway in IBD patients was attributed to Escherichia coli, which is two times higher than in the two population-based cohorts (Wilcoxon-test P<3.0x10^-11). Another notable IBD key pathway is the tetrahydrofolate pathway, which is responsible for the biosynthesis of folic acid derivatives, and supplementation of folic acid has been shown to reduce the risk of colorectal cancer in IBD patients. 60 Interestingly, previous research has shown that oral intake of L-glutamine attenuates the colitis induced by dextran sulfate sodium in mice. 61 We identified a negative co-abundance between L-glutamine biosynthesis and the biosynthesis of other amino acids like L-isoleucine and L-methionine. Previous research showed that both these amino acids play an important role in the immune system. 62,63 L-glutamine has been tested as a supplement in patients with IBD, but did not show improvements in clinical outcomes like disease activity scores. 64 Our results show large numbers of connections for L-glutamine with other pathways such as the biosynthesis of other amino acids. These pathways might also be of interest when exploring L-glutamine as an intervention for IBD.
In obesity, we identified the allantoin degradation pathway as a key pathway, showing obesity-specific co-abundance relationships with 85 pathways, mainly negative correlations with biosynthesis of oxaloacetate/aspartate amino acids. These pathways are related to insulin secretion and glucose metabolism. However, their co-abundance relationships did not show significant differences between diabetic patients and non-diabetic individuals, which is likely due to a power issue as there were only 35 diabetic patients in 300OB. Instead, we found three species co-abundances related to presence of atherosclerotic plaque, involving Dorea longicatena, Dorea formicigenerans, Lachnospiraceae bacterium 9.1.43BFAA and Coprococcus comes. Notably, D. longicatena and Lachnospiraceae species have been linked to atherosclerotic cardiovascular disease. 9 Altogether, our analyses show that microbial dysbiosis in disease may not be driven solely by differences in abundance level; it may also reflect shifts in microbial interactions that are mirrored in co-abundance analyses. Particularly when applied to metagenomics sequence data, pathway-based co-abundance networks provide further insights into functional dysbiosis in IBD and obesity. However, we also acknowledge several limitations of our study. This is an in-silico network analysis based on correlation in bacterial abundance levels. Even with the large sample size, our study is still underpowered relative to the number of interactions assessed. In recent years, many different network tools have been developed to tackle the statistical challenges in inferring networks for compositional data. In this study, we applied two independent methods, SparCC and SpiecEasi, to establish microbial co-abundance networks based on MetaPhlan and HUMAnN2 annotation. Our analysis can thus be biased due to these annotation tools. Other annotation tools, e.g. mcSEED, may yield different pictures of microbial community and functional profile, thereby identifying different co-abundance networks. 65 Thus, such in-silico-based network inferences require further functional validation.][68] Thus, in order to understand the microbial ecosystem in terms of functional interaction in diseases, we need complementary approaches like meta-proteomics and meta-metabolomics that provide a more direct readout of the functional properties of the gut microbiome. Furthermore, the cross-sectional design of this study makes it hard to assess the stability of our findings over time. However, we did observe similar findings for the iHMP-IBD cohort for 98.2% of species co-abundances and 99.4% of pathway co-abundances between two time points spanning one year (Cochran-Q test P>0.05). This implies that co-abundance relationships are largely consistent over time.
Additionally, due to our study design, we cannot disentangle cause from consequence. Longitudinal studies are therefore warranted and should be combined with functional validation. Moreover, especially in the context of IBD, which is a heterogeneous disease, we had limited ability to pinpoint co-abundance networks to specific disease characteristics like the subtypes CD and UC. This is probably due to the lack of power to detect this by subgrouping our cohorts. Larger cohorts with well-documented disease characteristics are needed in the future.
This study presents a microbial network analysis that examines both microbial species and functional pathways based on metagenomics sequencing. Our data show that dysbiosis of the gut microbial ecosystem in disease can be assessed by the altered abundance level, but can also be seen at the level of microbial interaction, at least in terms of co-abundances. We have also identified IBD-specific and obesity-specific species and pathways that potentially play important roles in regulating the microbial ecosystem in disease, and these disease-specific microbial interactions extend our current knowledge about the role of the microbiome in disease.

Study cohorts
All four cohorts used in this study have been described before. 3,14,69,70 In short, the Lifelines Deep cohort (LLD) is a large prospective cohort study from the north of the Netherlands. 71 LLD contains 58.20% females and 41.80% males, the mean age (SD) of participants is 45.04 (13.60) years and their mean BMI is 25.26 (4.18) (Figure S1). In this study, we included 1,135 LLD individuals for whom there is metagenomics and phenotype data. 3 The 500 Functional Genomics (500FG) cohort consists of 534 healthy adult volunteers from the Netherlands. 69,70 In 500FG, 56.50% are women and 43.50% are men, the mean age of participants is 27.43 (12.35) years and their mean BMI is 22.70 (2.72) (Figure S1).8][79] In total, we detected 698 species and 489 pathways present in at least one of the four cohorts. To deal with sparse microbial data in the network analysis, we focused on species/pathways present in at least 20% of samples in at least one cohort. This provided a confined list of 134 species and 343 pathways for use in the network analysis. Together these accounted for, on average, 86.9% and 99.9% of taxonomic and functional compositions, respectively.
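The prevalence filter described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the toy abundance tables and the convention that "present" means an abundance greater than zero are all assumptions made here.

```python
from typing import Dict, List


def prevalence(values: List[float]) -> float:
    """Fraction of samples in which a feature is present (assumed: abundance > 0)."""
    return sum(v > 0 for v in values) / len(values)


def select_features(
    cohorts: Dict[str, Dict[str, List[float]]],
    min_prevalence: float = 0.20,
) -> List[str]:
    """Keep features that reach min_prevalence in at least one cohort."""
    all_features = set()
    for table in cohorts.values():
        all_features.update(table)
    kept = []
    for feature in sorted(all_features):
        if any(
            feature in table and prevalence(table[feature]) >= min_prevalence
            for table in cohorts.values()
        ):
            kept.append(feature)
    return kept


# Hypothetical toy data: cohort -> feature -> per-sample relative abundance.
toy_cohorts = {
    "cohort_1": {
        "sp_a": [0.1, 0.0, 0.2, 0.0, 0.3],  # present in 3 of 5 samples
        "sp_b": [0.0, 0.0, 0.0, 0.0, 0.0],  # absent in this cohort
    },
    "cohort_2": {
        "sp_a": [0.0, 0.0, 0.0, 0.0, 0.0],
        "sp_b": [0.0, 0.2, 0.0, 0.0, 0.4],  # present in 2 of 5 samples
        "sp_c": [0.0, 0.0, 0.0, 0.0, 0.0],  # never present
    },
}
selected = select_features(toy_cohorts)
```

In this toy example `sp_a` and `sp_b` are retained because each passes the 20% threshold in one cohort, while `sp_c` is dropped.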

Co-abundance network inference
Co-abundance analysis on compositional data is challenging because it is likely to exhibit spurious correlations due to the dependency of fractions (i.e.1][82][83] In particular, the problem can be more serious in a microbial community with low compositionality. 84 We therefore first assessed the inverse Simpson index of microbial composition for the effective number of species (n_eff). Our analysis showed high compositionality in both functional pathway composition (2.09, 2.10, 2.11 and 2.08 in LLD, 500FG, 300OB and IBD, respectively) and species composition (10.74, 11.87, 12.30 and 8.80 in LLD, 500FG, 300OB and IBD, respectively). Following the suggestion of Weiss et al. based on their assessment of the performance of eight different methods (Bray-Curtis, Pearson, Spearman, CoNet, LSA, MIC, RMT and SparCC), we decided to use the SparCC method because it has been proven to be able to infer linear relationships with high precision for high diversity compositions with n_eff lower than 13. 84 Species composition data from MetaPhlan was converted to predicted read counts by multiplying relative abundances by the total sequence counts, and then subjected to a Python-based SparCC tool. 29 For pathway analysis, the read counts from HUMAnN2 were directly used for SparCC. Significant co-abundance was controlled at FDR 0.05 level using 100x permutation. In each permutation, the abundance of each microbial factor was randomly shuffled across samples. To reduce indirect associations, we further applied SpiecEasi (v1.0.6), which infers the graphical model underlying the microbial network using the concept of conditional independence. 38 In this way, we obtained 3,454 species and 43,355 pathway co-abundances that were detectable by both methods (Figure 1).
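As a rough illustration of two preprocessing steps named above, the sketch below computes the inverse Simpson index (the effective number of species, n_eff) for one sample and converts relative abundances into predicted read counts. The function names and toy values are hypothetical, relative abundances are assumed to be fractions of one, and the text does not say how per-sample values were summarized per cohort, so that step is left out.

```python
from typing import List


def inverse_simpson(rel_abund: List[float]) -> float:
    """Effective number of species: 1 / sum(p_i^2) over non-zero proportions."""
    total = sum(rel_abund)
    props = [x / total for x in rel_abund if x > 0]
    return 1.0 / sum(p * p for p in props)


def predicted_counts(rel_abund: List[float], total_reads: int) -> List[int]:
    """Predicted read counts = relative abundance (as a fraction) x total reads."""
    return [round(p * total_reads) for p in rel_abund]


# Hypothetical samples: a perfectly even community and a dominated one.
even_sample = [0.25, 0.25, 0.25, 0.25]
skewed_sample = [0.97, 0.01, 0.01, 0.01]

n_eff_even = inverse_simpson(even_sample)      # four equally abundant species
n_eff_skewed = inverse_simpson(skewed_sample)  # close to a single species
counts_even = predicted_counts(even_sample, 1000)
```

An even community of four species gives n_eff of four, whereas a community dominated by one species gives a value just above one, which is the regime where compositional artefacts are expected to be strongest.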

Co-occurrence network inference
Presence and absence of each bacterial species and metabolic pathway were treated as binary traits. The pair-wise co-occurrence relationship between two microbial factors (species or pathway) in each cohort was assessed using Pearson's Chi-squared test. If the number of co-occurrence pairs was greater than the number of co-exclusion pairs, the two microbial factors were considered to be a co-occurrence. If the number of co-occurrence pairs was less than the number of co-exclusion pairs, the two factors were considered to be a co-exclusion. Permutation (100x) was conducted to determine significance at an

Key microbial species and pathway detection
To assess to what extent cohort-specific microbial relationships were linked to a specific species or pathway, we calculated the number of cohort-specific microbial relationships per species/pathway. To define the key species/pathway, we took the maximum number of false cohort-specific relationships per species/pathway from each permutation and determined the key species/pathway cut-off as the upper bound of the 95% confidence interval based on 100x permutations. At this cut-off, there is a 5% probability that a false enrichment could occur by chance. In this way, a species with at least 13 cohort-specific co-abundances or a pathway with at least 70 cohort-specific co-abundances was recognized as a key species or pathway. For co-occurrence networks, these numbers were 10 for key species and 45 for key pathways. In such a way, we detected 192 cohort-specific species co-abundances and 2,235 cohort-specific pathway co-abundances.
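The cut-off logic described above might be sketched like this. It is an illustrative reading only: the function names and toy numbers are hypothetical, the "upper bound" is taken to be the empirical 95th percentile of the per-permutation maxima, and a feature is called key when its count strictly exceeds that value. The exact convention used by the authors is not stated here.

```python
from collections import Counter
from typing import Dict, List, Tuple

Edge = Tuple[str, str]


def relationships_per_feature(edges: List[Edge]) -> Dict[str, int]:
    """Count how many cohort-specific relationships involve each feature."""
    counts: Counter = Counter()
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


def permutation_cutoff(null_maxima: List[int], percentile: int = 95) -> int:
    """Empirical percentile of the per-permutation maximum counts."""
    ordered = sorted(null_maxima)
    rank = (len(ordered) * percentile + 99) // 100  # integer ceil(n * pct / 100)
    return ordered[rank - 1]


def key_features(edges: List[Edge], null_maxima: List[int]) -> List[str]:
    """Features with more relationships than the permutation cut-off (assumed rule)."""
    cutoff = permutation_cutoff(null_maxima)
    counts = relationships_per_feature(edges)
    return sorted(f for f, c in counts.items() if c > cutoff)


# Hypothetical null distribution: maximum count seen in each of 100 permutations.
toy_null_maxima = [3] * 50 + [5] * 45 + [9] * 5
# Hypothetical observed cohort-specific edges with one highly connected feature.
toy_edges = [("hub", f"sp_{i}") for i in range(7)] + [("sp_0", "sp_1"), ("sp_2", "sp_3")]

toy_cutoff = permutation_cutoff(toy_null_maxima)
toy_keys = key_features(toy_edges, toy_null_maxima)
```

Using the maximum per permutation, rather than the count for each feature separately, controls the chance that any feature in the network is flagged by accident.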

Assessing impact of confounding factors
The age and sex distributions were different between cohorts (Figure S1). To assess the impact of age and sex, we conducted partial correlation analysis (Figure S11). For example, to assess the co-abundance between species A and B, we first assessed the Pearson correlation of A and B to each covariate, say C, respectively. Then, a pairwise correlation matrix of A, B and C was subjected to partial correlation (Figure S11) using the partial correlation function cor2pcor from the R package corpcor (version 1.6.9). This ensured that the partial correlation determined between A and B was independent of the covariate C. To assess the impact, we compared the correlation coefficient between SparCC correlation and partial correlation for all co-abundances and found comparable effect sizes (Figure S12). After regressing out the confounding effects of age and sex on cohort-specific co-abundances, 120 out of 192 (62.5%) species and 1,448 out of 2,235 (64.8%) pathway co-abundances remained cohort-specific.
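For three variables, converting a correlation matrix to partial correlations reduces to the standard first-order formula, which the sketch below implements in Python rather than R. It is only meant to show the arithmetic for one covariate; the function name and the example correlations are made up for illustration and are not values from the study.

```python
import math


def partial_correlation(r_ab: float, r_ac: float, r_bc: float) -> float:
    """Correlation between A and B after removing the linear effect of covariate C.

    r_ab, r_ac and r_bc are the pairwise correlations of A-B, A-C and B-C.
    """
    denominator = math.sqrt((1.0 - r_ac ** 2) * (1.0 - r_bc ** 2))
    return (r_ab - r_ac * r_bc) / denominator


# Hypothetical case 1: A and B are both moderately correlated with the covariate,
# so part of their raw correlation is explained by C.
adjusted = partial_correlation(r_ab=0.5, r_ac=0.5, r_bc=0.5)

# Hypothetical case 2: the covariate is unrelated to A and B,
# so the partial correlation equals the raw correlation.
unchanged = partial_correlation(r_ab=0.4, r_ac=0.0, r_bc=0.0)
```

With more than one covariate (for example age and sex together), the same quantity is obtained from the inverse of the full correlation matrix, which is what a matrix-based routine computes.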

Replication of microbial networks
To replicate microbial networks in IBD, we used data from 77 IBD patients from the Integrative Human Microbiome Project (iHMP-IBD) as a replication cohort. 87 Given the iHMP-IBD's longitudinal study design, we could examine metagenomics data from the first and the last sample collection for each individual. In all, 91% of the species (123 out of 134) and 99% of the pathways (340 out of 343) found in our IBD cohort were also detected in the first sample collection in iHMP-IBD. The differences in co-abundance strength between the IBD cohort and the iHMP-IBD cohort were assessed using the Cochran-Q test. A threshold of P>0.05 (no significant heterogeneity) was applied to define replicable co-abundances. We also investigated the stability of microbial networks in iHMP-IBD by comparing the microbial co-abundances in the first and last sample collection from the same participants using the same approach.
To replicate microbial networks in 300OB, we selected 134 obese individuals from the LLD cohort with matched age and BMI.Here we considered a co-abundance to be replicable if the Cochran-Q test heterogeneity between the discovery and replication cohorts was not significant at P>0.05.
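One common way to test whether a correlation differs between cohorts is a Cochran's Q heterogeneity test on Fisher z-transformed coefficients, sketched below. This is an assumption-laden illustration: the text does not specify how the Cochran-Q test was applied to SparCC correlations, so the Fisher transformation, the n-3 weights, the function names and the example numbers are all choices made here for demonstration.

```python
import math
from typing import List, Tuple


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable with integer df (closed form)."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def cochran_q(correlations: List[float], sample_sizes: List[int]) -> Tuple[float, float]:
    """Return (Q, P) for heterogeneity of correlations across cohorts.

    Assumes Fisher z-transformed correlations with inverse-variance weights n - 3.
    """
    z = [math.atanh(r) for r in correlations]
    w = [n - 3 for n in sample_sizes]
    z_pooled = sum(wi * zi for wi, zi in zip(w, z)) / sum(w)
    q = sum(wi * (zi - z_pooled) ** 2 for wi, zi in zip(w, z))
    return q, chi2_sf(q, len(z) - 1)


# Hypothetical example 1: same correlation in two cohorts, so no heterogeneity.
q_same, p_same = cochran_q([0.30, 0.30], [500, 1000])
# Hypothetical example 2: clearly different correlations in two large cohorts.
q_diff, p_diff = cochran_q([0.10, 0.30], [496, 1135])
```

Under the replication criterion described above, the first pair would count as replicated (P>0.05), whereas the second would be flagged as heterogeneous.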

Assessing the relevance of microbial co-abundances to sub-phenotypes
Patients in the IBD and 300OB cohorts have different disease subtypes, and both cohorts had higher proportions of drug users than our population cohorts. In particular, the IBD cohort contained 276 patients with CD and 189 with UC. Within the IBD cohort, 126 patients took PPIs and 97 took antibiotics. In the 300OB cohort, 53.4% (159 out of 298) had an atherosclerotic plaque detected by ultrasound and 35 were diabetic. 72 To assess the co-abundance related to disease sub-phenotypes, we split the cohorts based on disease subtypes or medication use and inferred microbial co-abundance using SparCC. The Cochran-Q test was applied to assess the differential microbial co-abundances at FDR<0.05.

Species contributions to pathways and species-pathway associations
Since the pathway abundances reported by HUMAnN2 are computed at both community and individual species level, we further looked into the contribution of species to each pathway and reported the top contributor (species). 76 To show the functional relationship between species and pathways (e.g. whether a given pathway has the potential to promote the growth of a species through its metabolic products), we also checked the correlation (Spearman) between microbial species and pathway abundance after adjusting for age, sex and read depth using a linear regression model. 88 FDR was further calculated based on 100x permutation.

Network Visualization
Cohort-specific networks based on cohort-specific co-abundances were visualized using a circle plot or heatmap with hierarchical clustering analysis (ward.D clustering based on Minkowski distance). Both species and pathway networks were visualized using the package igraph (v1.2.4.1) in R. 89 For species networks, species belonging to the same genus were clustered together. For pathway networks, pathways from the same metabolic category were presented in a sub-circle, and categories with a limited number of pathways (fewer than 4) were grouped into an "other" category. Classification of pathways was based on the MetaCyc metabolic pathway database. 78,79n Supplementary Data files.

Figure 3. Differential and cohort-specific microbial co-abundances. a. Differential species co-abundances involved in 45 microbial genera. b. Differential pathway co-abundances involved in 41 microbial metabolic categories. Each dot indicates one microbial genus or metabolic category. Each line represents differential species or pathway co-abundances between species or pathways from either the same or different genera or metabolic categories. The width and darkness of the lines represent the relative number of differential co-abundances. c. Pie chart of 120 cohort-specific species co-abundances showing the proportion of specific co-abundances detected in each cohort. d. Pie chart of 1,448 cohort-specific pathway co-abundances showing the proportion of specific co-abundances detected in each cohort.

Figure 4. Cohort-specific species and pathway co-abundances. a. Cohort-specific co-abundances identified for three key species in the IBD cohort, involving 33 IBD-specific co-abundances. Each dot indicates one species. Red indicates IBD key species. Each line represents one IBD-specific co-abundance relationship. b. Cohort-specific co-abundances identified for four key pathways in IBD and one key pathway in 300OB, involving 385 cohort-specific co-abundances. Each line represents a cohort-specific correlation between two pathways. Yellow lines represent obesity-specific co-abundances. Grey lines represent IBD-specific co-abundances. Each dot indicates one pathway. Pathways belonging to the same metabolic category have the same colour and are clustered as sub-circles. Colour legends are shown in the plot.

Figure 5. Menaquinone biosynthesis related to Streptococcus overgrowth in IBD. a. Menaquinone biosynthesis (PWY-5837) from the reductive TCA cycle (P23-PWY) in bacteria. b. The menaquinone biosynthesis pathway shows IBD-specific interaction with the reductive TCA cycle pathway. c. Both menaquinone biosynthesis and reductive TCA cycle pathway abundance are significantly higher (ANOVA test, FDR<0.05) in the IBD cohort than in the two population-based cohorts. Box plots show medians and the first and third quartiles (the 25th and 75th percentiles) of abundance after correcting for age and sex. The upper and lower whiskers extend to the largest and smallest values no further than 1.5*IQR from the box, respectively. Outliers are plotted individually. (Source data is provided as a Source Data file.) d. Three Streptococcus species show IBD-specific co-abundance with Escherichia coli. e. The menaquinone biosynthesis pathway shows strong positive correlation with three Streptococcus species in IBD. N=2,379 independent samples are involved (N_LLD=1,135, N_500FG=450, N_300OB=298, N_IBD=496). The forest plots show co-abundance strength and direction in each cohort, with a square dot for the correlation coefficient and a bar for the 95% confidence interval.

Figure 2. Microbial co-abundance networks in each cohort. a. Venn diagram of the numbers of species co-abundances detected in each cohort. In total, we identified 3,454 co-abundance relationships significant at FDR<0.05 in at least one cohort by combining SparCC and SpiecEasi. b. Venn diagram of the numbers of pathway co-abundances detected in each cohort. Similarly, at microbial metabolic pathway level, 43,355 co-abundance relationships were detected at FDR<0.05.