Gut microbiomes of wild great apes fluctuate seasonally in response to diet

The microbiome is essential for extraction of energy and nutrition from plant-based diets and may have facilitated primate adaptation to new dietary niches in response to rapid environmental shifts. Here we use 16S rRNA sequencing to characterize the microbiota of wild western lowland gorillas and sympatric central chimpanzees and demonstrate compositional divergence between the microbiotas of gorillas, chimpanzees, Old World monkeys, and modern humans. We show that gorilla and chimpanzee microbiomes fluctuate with seasonal rainfall patterns and frugivory. Metagenomic sequencing of gorilla microbiomes demonstrates distinctions in functional metabolic pathways, archaea, and dietary plants among enterotypes, suggesting that dietary seasonality dictates shifts in the microbiome and its capacity for microbial plant fiber digestion versus growth on mucus glycans. These data indicate that great ape microbiomes are malleable in response to dietary shifts, suggesting a role for microbiome plasticity in driving dietary flexibility, which may provide fundamental insights into the mechanisms by which diet has driven the evolution of human gut microbiomes.

surprising that the vast majority of Mongolian humans, whose diet is characterized by high consumption of animal products 4 , were assigned to the Prevotella enterotype. However, food frequency questionnaire results indicate that in addition to meat and fermented dairy products, this study population also consumes large quantities of fruits, vegetables, ginseng, and other plant-based food items, which, collectively, account on average for over 50% of total diet 4 .
To further contrast the microbiota between great apes, Old World monkeys, and human populations, we evaluated inter-group BC dissimilarity based on the genus-level relative abundance of the microbiota ( Supplementary  Fig. 4c). The microbiotas of all three NHP species (WLGs, chimpanzees, and baboons) were more dissimilar to U.S. humans than Mongolian humans. Despite the closer phylogenetic relationship between WLGs and humans compared to baboons and humans, the WLG microbiota was more dissimilar to both human populations. This was further supported by PCoA based on the unweighted UniFrac metric, which showed greater separation of WLGs from humans along PC1 compared to baboons and other Old World Monkeys (Supplementary Fig.  4d). Sympatric WLGs and chimpanzees had the lowest dissimilarity. The dissimilarity between these two different sympatric animal species was even lower than the dissimilarity between two human populations ( Supplementary  Fig. 4c). While this further supports other findings suggesting that microbial communities in sympatric great apes converge 5 , it additionally indicates that shared environment, and likely diet, drive greater convergence of the microbiota of two phylogenetically distinct African great apes, than is observed between two populations of the same species (H. sapiens) with highly varied diets.

Supplementary Note 4:
In our analyses using PAM clustering based on the rJSD among genuslevel relative abundance distributions, both the CH index and the average SI score most strongly supported four WLG enterotypes and two chimpanzee enterotypes. For the U.S. and Mongolian humans, the CH index most strongly supported three clusters, while the SI score supported two clusters. It has been shown that clustering based on different distance or dissimilarity metrics and taxonomic levels can affect enterotyping, and it has been recommended to use multiple distance metrics to verify the presence of enterotypes 6 . Accordingly, we also performed PAM clustering on each dataset using the BC dissimilarity among genus-level relative abundance distributions (Supplementary Fig. 5e, 5g, 5i, 5k,  5m), and, in the case of any discrepancies between enterotypes identified based on rJSD and BC metrics, we evaluated the validity of the clusters based on rJSD and BC with weighted UniFrac distance among species-level (97% OTU approximating species-level) relative abundance distributions. The only discrepancy between enterotypes identified based on rJSD and BC metrics was for the WLG dataset, with five or nine clusters identified based on the BC metric by CH index and SI score, respectively, compared to the four clusters identified based on the rJSD metric (Supplementary Fig. 5d-e). For all other datasets, the optimal number of clusters was consistent between the two distance metrics and the two measures of cluster scores, with the exception of the two human datasets, where the CH index supported three clusters and the SI score supported two clusters for both distance metrics (U.S. human) or for only the rJSD metric (Mongolian human) (Supplementary Fig. 5j-m). However, this is consistent with the two enterotypes, defined by Bacteroides and Prevotella relative abundance, with an occasional third, defined by the relative abundance of Clostridial genera, that have been previously reported for humans 3,7,8 . As the Bacteroides and Prevotella enterotypes are most consistently reported for humans and because SI score is an absolute measure of clustering quality, while CH index is only a relative measure 6 , we show only the Bacteroides and Prevotella enterotypes supported by the SI score for the U.S. and Mongolian human datasets.
In order to assess the validity of the different clusterings of WLG samples based on rJSD and BC, we further investigated the enterotype clusters identified with each metric by the weighted UniFrac distance metric. Principal coordinate analysis based on weighted UniFrac distance of species-level (97% OTUs) relative abundance distributions for WLGs was performed, and the distances among samples assigned to each enterotype based on rJSD and BC were assessed (Supplementary Fig. 7b-d). Within-cluster UniFrac distances for each enterotype cluster were compared against distances between each other enterotype cluster to assess the validity of the clusters based on rJSD and BC ( Supplementary Fig. 7b-d). The four WLG clusters based on rJSD were significantly supported (Supplementary Fig. 7b). In contrast, neither the five clusters selected based on the CH index from BC nor the nine clusters based on the SI score from BC were validated by consistent significant differences between within-cluster and between-cluster distances (Supplementary Fig. 7c It should be noted that based on PAM clustering, at best, only weak support was provided based on SI scores for enterotype clusters obtained for any of the NHP and human groups evaluated in this study, but these scores are consistent in range with previous evaluations of NHP and human enterotypes 2,8,9 . Thus, these enterotypes should be viewed as the optimal clustering based on Arumugam et al., rather than fitting the definition of enterotypes as statistically supported clusters. Indeed, recent perspective suggests that even in the absence of strong statistical support, enterotyping can provide useful insights 10 . The WLG enterotypes were widely distributed across the geographical sampling area (Supplementary Fig. 8a), and there was no significant association between pairwise BC dissimilarity and distance between locations of sample collection within enterotype groups (Supplementary Fig. 8b), suggesting that, within this WLG population, enterotypes are not significantly associated with geographic location.

Supplementary Note 5:
In humans, long-term dietary patterns are associated with enterotypes 3 . Given that WLGs must respond to fluctuations in the availability of food resources throughout the year, we investigated monthly variation in the composition of the WLG microbiota. LEfSe analysis identified bacterial taxa that were significantly associated with each month or month group even after adjusting for potentially confounding effects of collection year, collection site latitude and longitude, and WLG gender. Bacterial taxa with relative abundance associated with month of sample collection were spread across seven phyla and were represented at taxonomic levels ranging from phylum to species (Fig. 3ab). Three of the top six most abundant phyla in WLGs were associated with month of sample collection ( Fig. 1c and 3a-b). The relative abundance distributions of the top sampling month-associated bacterial taxa (highest LDA score) suggested that many of these, such as Firmicutes, Clostridiales, and unclassified Mogibacteriaceae, were not simply represented by monthly spikes in abundance, but rather, their abundance undulated with peaks and troughs across months that ultimately peaked in the months for which they were identified (Fig.  3c). Such distributions may be more indicative of a pattern defined by seasonal trends rather than rapid monthly increases and decreases in abundance. Perhaps one exception may be Treponema, the top May biomarker that rose dramatically in abundance from March to May and then declined dramatically in June (Fig. 3c, left panel). While no other taxa whose relative abundance defines enterotypes were identified as monthly biomarkers, the undulating distribution of most of the sampling month-associated bacterial taxa suggests that temporal variation may follow trends that are not simply demarcated from one month to the next. Thus, we evaluated the monthly distributions of all genera that defined enterotypes to further investigate patterns of temporal variation ( Supplementary  Fig. 9a). SHD-231 (enterotype 1) was present at high relative abundance across many months, but dropped in abundance in April and May, when Treponema (enterotype 2) abundance spiked, and was also at relatively low abundance in phosphate deaminase (NagB) 14 . Based on our metagenomic analyses, we found that the WLG enterotype 1 was enriched in NagA at the gene-level, compared to the other three enterotypes, but was not significantly enriched in NagB (Supplementary Fig. 12a). Enterotype 1 was also enriched in glycosulfatase that could provide an advantage to this enterotype for the utilization of sulfated carbohydrate compounds (Supplementary Fig. 12b).
In addition to metagenomic analyses indicating enrichment in plant fiber degrading pathways and genes in WLG Treponema enterotype 2, the highest sequence identity between the most abundant WLG Treponema sequences and a characterized species was found for T. bryantii with 86-88% sequence identity. This is intriguing, as T. bryantii has been shown to act in symbiosis with Fibrobacter succinogenes as part of a fibrolytic consortium to accelerate digestion of plant fiber in ruminants 16 , and Fibrobacter was the second highest ranked biomarker of the Treponema enterotype (Supplementary Fig. 5n).
In addition to Prevotella enterotype 3 enrichment in fucose degradation pathway genes, alpha-and beta-N-acetylglucosaminidases, alpha-Nacetylgalactosaminidase, beta-galactosidase, and beta-glucuronidase ( Fig. 8f and Supplementary Fig. 13a-e), enterotype 3 was also enriched in additional pathways and genes associated with mucin degradation. At the pathway level, mannose metabolism was also enriched in enterotype 3 samples (Supplementary Data 6). Mucin desulfation is thought to be a rate-limiting step in mucin degradation, and some strains of Prevotella, such as Prevotella strain RS2, encode a mucin desulfating sulfatase 17,18 . SHD-231-abundant enterotype 1 samples were enriched in total glycosulfatases (Supplementary Fig. 12b). However, in order to be active, sulfatases require a critical post-translational modification of an active-site cysteinyl or seryl residue to C(α)-formylglycine, which is catalyzed by anaerobic sulfatase-maturating enzyme encoded by certain bacteria, including Prevotella strain RS2 17,19 . Accordingly, enterotype 3 samples were enriched in anaerobic sulfatase-maturating enzyme genes ( Supplementary  Fig. 13f). While we did not find significant enrichment in genes required for mucin N-acetylneuraminic acid degradation and utilization (i.e. genes of the Nan operon) in enterotype 3 samples, genes encoding sialic acid transporters were enriched in enterotype 3 samples (Supplementary Fig. 13g), suggesting an enhanced capacity for bacterial import of sialic acid.
Aromatic compounds abound in nature, and plants are their main producers. Plants produce a wide range of aromatic plant secondary metabolites (PSMs; i.e. tannins, flavonoids, coumarins and other phenolics), as well as the structural aromatic polymer lignin. Both aerobic and anaerobic bacteria are important degraders of PSMs and lignin, as plants and animals are limited in this capacity 20 . PSMs can have a variety of adverse effects on herbivorous mammals, and many mammals adapt their feeding behavior to limit exposures [21][22][23] . Solibacillus/Staphylococcus-abundant enterotype 4 is enriched in bacterial aerobic (i.e. beta-ketoadipate pathway, 3,4-dihydroxybenzoate [protocatechuate] biosynthesis, benzoate degradation via hydroxylation, and anthranilate degradation via hydroxylation) and anaerobic (i.e. benzoate degradation via CoA ligation) pathways of PSM degradation as well as general pathways of PSM aromatic compound degradation (i.e. phenol degradation, p-cresol degradation, 4-hydroxyphenylacetate degradation, vanillyl-alcohol degradation, etc.) 20,24 . In addition, pathways for degradation of environmental pollutants including benzene, toluene, xylene, polychlorinated biphenyl, dichloromethane, and atrazine were enriched in enterotype 4 samples. However, some enzymes that allow bacteria to degrade naturally occurring plant aromatic compounds are indiscriminant and can also mediate degradation of structurally similar man-made pollutants (bioremediation) 24 .

Supplementary Note 7:
The majority of classified Urticaceae marker sequences were divided between two genera, Urera and Poikilospermum (Supplementary Data 9). The fruit, leaves, bark, and pith of Urera spp. are reportedly eaten by Eastern lowland gorillas (Gorilla beringei graueri), but are not typically reported as a major dietary constituent of WLGs. It is thus surprising that we have detected high relative abundance of Urticaceae plant sequences in the feces from WLGs with Prevotella-abundant enterotype 3 in this region. Annonaceae and Urticaceae were found at low relative abundance in feces from WLGs with enterotype 2 (low frugivory-associated Treponema abundant), were more intermediate in relative abundance in enterotype 1, consistent with the potential transitional nature of enterotype 1 (SHD-231 abundant), and were completely absent from the rare enterotype 4 (Solibacillus/Staphylococcus-abundant).
It is important to note that plant metagenomic analyses come with some limitations. For many of the plant taxa identified here, WLGs are known to consume different parts of the plant (bark, fruit, flowers, leaves, pith, roots, seeds, shoots, stem, young leaves), and different plant parts that may be consumed seasonally, such as fruit, cannot be distinguished based on DNA sequences alone. Moraceae for example, including Morus spp., are reported as both a seasonally consumed plant and a fallback food of WLGs at nearly every geographical site across West-Central Africa, where WLGs consume the fruit, flowers, seeds, leaves, bark and stem of Moraceae plants 25 . There is also some indication that very little DNA from ripe fruit survives digestion through the gastrointestinal tract (< 1%), while DNA encased in a protein-fiber matrix (such as is found in leaves and other non-fruit plant parts) may be better preserved 26 . If this is indeed the case, then the distribution of plants identified by sequencing in feces may be biased in favor of fibrous plants compared to ripe, succulent fruits even when such fruits dominate the diet. In fact, for one enterotype 3 sample, too few plant marker genes were identified for adequate analysis. Thus, this sample was excluded (note enterotype 3, n=4 in Fig. 9b-d). Additional work in this area will be needed to assess the impact of these potential biases.

Supplementary Discussion: Distinctions between Human and WLG Prevotella species
89% of all human Prevotella sequences were grouped into OTUs that were classified to the species-level, with the majority of sequences identified as P. copri or P. stercorea. In contrast, 92% of Prevotella sequences from WLGs remained unclassified at the species level, indicating that most Prevotella found in WLGs are different species than those found in humans. Consistency in metabolic potential at the genus-level cannot be presupposed 27 . In addition, the distinction between the human and WLG Prevotella enterotype may further be dependent on the relationship between Prevotella and other mucolytic taxa in the microbiome (e.g., Lachnospiraceae, also associated with frugivory in WLGs and chimpanzees 28 ).

Additional Discussion on WLG Solibacillus-Staphylococcus enterotype
The small number of WLGs (n=4, 4.6% of the sampled population) harboring the Solibacillus-Staphylococcus enterotype may reflect atypical feeding behavior. Our findings based on plant sequences in fecal samples suggest that gorillas with enterotype 4 consumed high relative abundance of Moraceae plants. In addition to nutritionally important components, plants produce a wide range of plant secondary metabolites (PSMs), including aromatic compounds, monoterpenes, alkaloids, and phenolics, such as tannins, lignins, and flavonoids. Many of these PSMs aid in plant defense, serving as toxins and antifeedants, but may also have beneficial/medicinal properties. Compared to chimpanzees, WLGs have been shown to have higher prevalence and load of intestinal parasites that may also fluctuate seasonally and promote dietary self-medication behaviors that have been observed in great apes and other herbivorous mammals 29,30 . Indeed, many plants eaten by great apes are used in traditional medicine, including Moraceae members that have been shown to have antihelminthic, anti-malarial, anti-leishmania, anti-bacterial, anti-ulcerogenic, and analgesic effects 29 . WLGs are herbivore generalists, consuming nearly 200 species of plants 25,31 . Such a generalist feeding behavior, where a variety of plants are consumed, is thought to circumvent toxicosis by limiting the dose of ingestion of any single PSM 22,23 . Little is known about aberrant feeding behaviors in great apes, especially relating to quantitative evaluation of individual plant taxa consumed 29 . Based on at least one study of WLGs in Cameroon, Moraceae sp. contained the highest levels of total phenolics and condensed tannins of the 26 plant species evaluated 31 . Animals are limited in their ability to metabolize aromatic compounds, while some bacteria are well adapted for their degradation and may even use them as a growth substrate 20 . Indeed, PSMs can modulate the composition and function of the intestinal microbiome through bacteriostatic, bactericidal, and selective growth promoting mechanisms [32][33][34] . WLG enterotype 4 was quite different in both bacterial composition and metabolic function compared to the other three enterotypes. Our findings demonstrating enterotype 4 enrichment in predicted bacterial metabolic pathways for aromatic compound and xenobiotic degradation including phenolics-, alkaloids-and lignin-associated pathways combined with the high relative abundance of Moraceae plant sequences in their stool add support to this hypothesis. Figure 1. Composition of the fecal microbiota of other nonhuman primates and humans. Box-and-whiskers plots showing the relative abundance distributions of all bacterial phyla (left of dotted line), as well as the 30 most abundant bacterial genera (right of dotted line) in a) chimpanzees from the Republic of Congo, b) baboons from Kenya, c) black-and-white colobi from Uganda, d) red colobi from Uganda, e) red-tailed guenons from Uganda, f) humans from Mongolia, and g) humans from the U.S. Phyla and genera labels in red with asterisks indicate bacterial taxa identified by LEfSe as associated with each host type compared to other host types (see Fig. 2a). Phyla and genera labels preceded by "u" represent taxonomic groups that could only be classified at higher taxonomic levels.   Supplementary Figure 3. Alpha diversity varies among host types and is dependent on taxonomic level. Alpha diversity rarefaction plots based on the genus (a-b) and phylum (c-e) using the observed taxa (c), Chao1 (a and d), and Shannon (b and e) diversity metrics. Mean diversity for each host type is shown with the standard error of the mean.     8-9 9-9 9-1 9-9 9-2 9-9 9-3 9-9 9-4 9-9 9-5 9-9 9-6 9-9 9-7 9- and chimpanzee (right) samples collected after wet periods (>4.23 mm day -1 ) and after dry periods (<4.23 mm day -1 ). Note the difference (0.77 mm day -1 and 0.76 mm day -1 for WLG and chimpanzee samples, respectively) in total rainfall between the sample with the lowest total rainfall in the wet month group and the sample with the highest total rainfall in the dry month group.    4a). Wet and dry months in b and e were defined by average daily rainfall for the 30 days prior to the date of sample collection (>4.23 mm day -1 for wet months, <4.23 mm day -1 for dry months, see Supplementary Fig. 9d). Asterisks in a, b, and e indicate that Box-Cox-transformed bacterial relative abundance was significantly predicted by season of sample collection in ANOVA analyses (a) or by average daily rainfall for the 30 days prior to sample collection in ANCOVA analyses (b and e), adjusting for the effects of potentially confounding variables. *P<0.05, **P<0.01, ***P<0.001. Figure 11. Samples from WLG enterotypes 1-3 cluster based on gradients of SHD-231, Treponema, and Prevotella relative abundance, the latter two of which are associated with the low frugivory (LF) and high frugivory (HF) seasons. Principal coordinates analysis plots based on Bray-Curtis dissimilarity among genus-level abundance distributions of samples from WLG enterotypes 1-3 that were collected during HF or LF periods colored by a) WLG enterotype, b) the ratio of Treponema relative abundance to Prevotella relative abundance, c) the season (HF or LF) in which the sample was collected, and d) the ratio of SHD-231 relative abundance to the cumulative Treponema and Prevotella relative abundance. Box-and-whiskers plots showing e) the ratio of Treponema relative abundance to Prevotella relative abundance in WLG enterotype 1 samples collected from HF periods compared to those collected during LF periods and f) the relative abundance of the HF-and LF-associated bacterial taxa (from LEfSe analysis of the all WLG samples) in WLG enterotype 1 samples collected during HF (red) and LF (green) seasons. Two-tailed P-values or significance levels based on two-tailed P values from Mann-Whitney tests, where *P<0.05 and **P<0.01, are indicated in e and f. n.s., not significant at α=0.05.