Gut microbial communities can be altered by several factors including host lifestyle [1,2,3,4,5,6,7,8,9,10], host species [11,12,13], and geography [14,15,16]. By comparing microbial variation in wild and captive great apes, we can approximate changes that occurred as human microbiomes were influenced by changes in lifestyle, including an increasingly Westernized diet and increased antibiotic usage [1]. Such comparisons are also important for understanding the impact of humans and their activities on great ape health. Anthropogenic threats such as poaching, habitat encroachment, and disease are the primary causes of population decrease in these endangered Hominidae species, and the risk to great ape populations of exposure to human activities continues to increase [15, 17,18,19,20]. A more complete understanding of the native ape microbiome may allow for its use as an indicator of wild ape exposure to human populations [10, 15].

The taxonomic composition of microbiomes of captive primates from the United States, Southeast Asia, and Costa Rica have been shown to cluster with those of geographically separated non-Westernized humans [2]. By sampling non-Westernized humans and sympatric wild apes, along with a geographically distant cohort of captive apes, we can test if their gut microbiomes vary predominantly by host species, lifestyle, or geographic distance [1, 21]. These findings can help guide conservation efforts in endangered animals aimed at promoting microbiome configurations that are linked with optimal host health and metabolism [22]. Such insight will not only inform the well-being of captive populations but can lead to improved conservation outcomes in rehabilitation and reintroduction programs [23].

Antibiotic exposure represents a key lifestyle-associated factor which can acutely and persistently alter the gut microbiome [4, 10, 24]. Because of their ability to kill or inhibit the growth of virtually all microbes, antibiotic treatments provide strong selection for enrichment of antibiotic resistance genes (ARGs), selecting for and leading to an increasingly antibiotic resistant bacterial composition [25, 26]. However, ARGs can also be found in bacteria from pristine environments naive to human-produced antibiotics, most likely due to the selection pressure provided by natural antibiotic secretion from environmental bacteria and fungi over long evolutionary timescales [4, 27]. The discovery, and often indiscriminate usage, of antibiotics in human and agricultural settings over the past century has selected for dramatic enrichment and transmission of ARGs between environmental, commensal, and pathogenic microbes from diverse habitats [4, 28]. Proactive identification of ARGs in environmental and host-associated microbiomes allows for their characterization, tracking, and potential inhibition before they can be exchanged with pathogens and found in the clinic [24, 29, 30]. Characterization of the antibiotic resistome of wild and captive nonhuman primates can help approximate how the human resistome may have changed in the antibiotic era, as well as assess the impact of human contact and captivity on the primate gut resistome [4, 10].

Here we compare the fecal microbiome taxonomic composition, microbial gene families, and antibiotic resistome of wild and captive chimpanzees and gorillas, as well as humans living in the same geographical region as the wild apes. We also compare these cohorts with human cohorts from around the world representing diverse lifestyles and geographies [3, 5, 6, 24, 31]. We hypothesized that the gut microbiomes, gene families, and resistomes of sympatric humans and apes would be similar to each other since sharing of gut microbiota between humans and wild and domestic animals living in geographic proximity has been shown previously [15, 16, 32]. We further hypothesized that captive chimpanzees and gorillas from zoos in the United States would be most similar to Westernized human microbiomes because of human exposure and antibiotic treatment.

In this study, we used 16S rRNA gene sequencing and high-throughput metagenomic sequencing to characterize the fecal microbiota and resistomes of 18 wild central chimpanzees (Pan troglodytes troglodytes) and 28 wild western lowland gorillas (Gorilla gorilla gorilla) from Nouabalé-Ndoki National Park in the Republic of the Congo. These wild animals live in a remote area of the park that was completely naive to human impact before 2003 and identified as the ape population that was least disturbed by humans in 2014 [33, 34]. We also collected samples from 81 humans living just outside of the park, including some individuals that work within the park, in the Republic of the Congo. We compared these cohorts to 18 chimpanzees and 15 western lowland gorillas from the St. Louis Zoo (St. Louis, MO) and the Lincoln Park Zoo (Chicago, IL) in the United States. To determine the contribution of lifestyle, location, and host species, our study includes three lifestyles (wild ape, captive ape, and non-Western human), two locations (USA and the Republic of the Congo), and three host species (human, chimpanzee, and gorilla). To determine if our results are robust to further differences in lifestyles and locations, we compared our samples to published hunter-gatherer, rural agriculturalist, and urban human cohorts from the USA [5, 35], Peru [5, 24], El Salvador [24], Malawi [3], Tanzania [6, 31], and Venezuela [3], and to published wild and captive gorilla cohorts from the Republic of the Congo, the USA, France, and Switzerland [1, 36].

Results and discussion

Taxonomic composition of the gut microbiome was predominantly influenced by lifestyle

To test our hypothesis that the microbiomes of sympatric hosts would be the most similar, we sequenced 16S rRNA genes from the fecal samples of our wild chimpanzee and gorilla cohorts from the Republic of the Congo, Congolese humans living in close proximity to the wild apes, and captive chimpanzees and gorillas from zoos in the USA. Overall, we generated 10.4 million sequences of the V4 region from 160 samples, which we analyzed as exact amplicon sequence variants (ASVs). Through Principal Coordinates Analysis (PCoA) using Bray–Curtis distances, we found that captive chimpanzees and gorillas clustered more closely and had lower Bray–Curtis distances with the Congolese humans than their wild ape counterparts (Fig. 1a; Supplementary Fig. 1a). Similar results were observed using both weighted- and unweighted-UniFrac distances (Supplementary Fig. 2). These results indicate that captive ape microbiomes from the USA were more similar to that of the Congolese humans than to the wild apes despite the ~12,000 km separating them. In contrast, the wild apes and Congolese humans live in close geographic proximity. Despite this close geographic proximity, the gut microbiomes of Congolese humans, wild chimpanzees, and wild gorillas are subjected to varying lifestyles as characterized by host diet and antibiotic usage. Wild chimpanzees and gorillas both eat natural plants and their diets largely overlap with some differences in the relative amount of fruit and browse (leaves and bark) that they consume [37,38,39]. The humans in this study are rural agriculturists and eat non-Westernized diets, high in plant fiber and low in animal protein, and are exposed to antibiotic treatments that can further alter the gut microbiome [8, 40, 41]. Captive apes also consume human agricultural products and receive antibiotic treatments which most likely contribute to the similarities we observed in Congolese human and captive ape microbiomes [42]. These findings are consistent with previous comparisons between captive and wild douc and howler monkeys that found that the microbiomes of captive monkeys became more human-like [2]. Diet and antibiotic treatment altering the gut microbiome has also been shown in studies of non-Westernized and Westernized humans [8, 41].

Fig. 1: Taxonomic composition of the human, chimpanzee, and gorilla gut microbiomes.
figure 1

a Principal-coordinate analysis (PCoA) plot of Bray–Curtis 16S rRNA gene profiles from individual human, chimpanzee, and gorilla fecal samples (adonis, R2 = 0.45, P = 0.001). b Boxplot of ASV richness in host microbiomes (Kruskal–Wallis sum-rank test P < 3.8e−14; Wilcoxon rank sum test with Benjamini–Hochberg (BH) correction, all significant differences P ≤ 7.7e−4). c Relative abundance of major bacteria phyla for all individuals. d Cladogram of discriminatory taxa identified in human, chimpanzee, and gorilla microbiome 16S data sets. Clades significantly enriched in each cohort are indicated by the colors shown in the legend (LEfSe; linear discriminant analysis (LDA) log score >4.5, P = 0.05). Full statistics for all figures are included in the supplementary spreadsheet.

In addition to their higher similarity to human microbiomes, captive chimpanzee and captive gorilla microbiomes were more similar to one another than the wild apes were to each other (Supplementary Fig. 1a). This convergence in captive ape microbiomes happens despite being housed in two zoos ~500 km apart while the wild apes are living sympatrically. This finding is in contrast to a previous study where the gut microbiomes of wild chimpanzees were more similar to sympatric gorillas than to allopatric gorillas [14]. Comparisons of wild sympatric and allopatric apes may be confounded by the underlying differences in plants that are available for consumption in the different locations, while, in this study, captive apes are fed similar diets based on standards provided by the Association of Zoos and Aquariums (AZA). Despite this convergence, the gut microbiomes were still distinguishable by host species and the chimpanzee microbiomes have lower Bray–Curtis distance to humans than the respective gorilla microbiomes (Supplementary Fig. 1a). Chimpanzees are evolutionarily more closely related to humans and consume less browse than gorillas, which may influence their gut microbiomes to be more similar to humans in these findings [11, 12].

To identify differences in alpha diversity, we compared the species richness, Shannon diversity, and Faith’s Phylogenetic Diversity (PD) of microbiomes in our cohort. We found that the captive apes had the highest ASV richness and Faith’s PD (Fig. 1b, Supplementary Fig. 1c). Captive chimpanzees had ~1.5-fold higher richness than wild chimpanzees and captive gorillas had ~1.9-fold higher richness than wild gorillas. However, captivity appeared to have no effect on the Shannon diversity of chimpanzees and gorillas (Supplementary Fig. 1b). Since captive apes have higher species richness but similar Shannon values to wild apes, their microbiomes must contain a less even distribution of ASVs.

Previous studies of monkeys, including doucs and howlers, have reported that captivity is typically associated with a decrease in microbiome richness [2]. However, in the only study to date comparing captive and wild apes, McKenzie et al. found no significant difference in Shannon diversity in the gut microbiomes of captive and wild gorillas [36]. They did not, however, report any other alpha-diversity metrics. This increased richness we found may vary from monkeys due to differences in host-phylogeny and feeding strategies [43]. In addition, captive apes consume a diverse diet in captivity that includes leafy and root vegetable matter, vegetables, high fiber biscuits, and fruit as recommended by the AZA [44, 45]. In comparison, the diet of wild apes over the year is more diverse than that of captive apes but can be severely restricted by seasonality [1]. Our samples were collected in February and March which marks a transition from the end of a dry season to the beginning of wet season. Limited food diversity at this time may have led to decreased gut diversity in the wild apes.

We used linear discriminant analysis effect size (LEfSe) [46] analysis to identify specific taxa that are enriched in each host microbiome and found that captivity is associated with a higher abundance of Prevotella (Bacteroidetes) and Treponema (Spirochetes) for both chimpanzees and gorillas (Fig. 1c, d; Supplementary Fig. 1e, f). Prevotella was also found in high abundance in the Congolese human microbiomes we sampled. Increased colonization of Prevotella and Treponema has been observed previously in captive monkeys and non-Westernized humans and is often attributed to host diet [2, 8, 47]. Species within the genus Prevotella are capable of polysaccharide degradation, and have been correlated with diets rich in plant carbohydrates and fiber [2, 8, 48]. In our analysis, we were able to identify two Prevotella ASVs at the species level: Prevotella copri and stercorea. P. copri in human gut microbiomes is associated with complex carbohydrate degradation in non-Westernized humans and drug metabolism in Westernized humans [49]. Increased abundance of Prevotella in captive apes may be caused by differences in the composition of dietary polysaccharides compared with wild apes [2]. We also identified three Treponema ASVs at the species level: Treponema succinifaciens, bryantii, and berlinense. These species of Treponema are considered nonpathogenic commensals that metabolize complex polysaccharides found in unprocessed plants [50, 51]. Commensal Treponema species are mostly absent in Western humans, but have been found in high abundance in human populations with non-Westernized lifestyles, including Hadza hunter-gatherers [7], Matses hunter-gatherers [5], Tunapuco traditional agriculturists [5], and BaAka pygmies [8]. Due to its presence in non-Westernized human populations and absence in Westernized humans, Treponema has been postulated to be a member of the ancestral human gut microbiome [8]. The much higher abundance in captive apes compared with wild apes in our study indicates that Treponema species may be optimized to metabolize plant polysaccharides from the non-Westernized human diet [5]. However, enrichment of Treponema in our captive apes may also be influenced by seasonality as Treponema has been shown to be less abundant in wild gorillas during dry seasons [1].

In contrast, wild chimpanzees were distinguished by higher relative recovery of Coriobacteriaceae (Actinobacteria) and two families from the phylum Firmicutes: Erysipelotrichaceae and Lachnospiraceae (Fig. 1d, Supplementary Fig. 1e). Wild gorillas had higher relative recovery of Mogibacteriacea (Firmicutes) and SHD-231 (Chloroflexi), an under-characterized bacterial genus found previously in wild western lowland gorilla microbiomes [1] (Fig. 1d, Supplementary Fig. 1f). SHD-231 appears to be found only within the gorilla microbiome, indicating that it preferentially colonizes gorillas which may be due to their dietary niche and physiological adaptations.

Gene family composition was also influenced by host lifestyle but to a lesser extent

To determine how differences in microbiome taxonomy altered functional potential, we performed whole metagenome shotgun sequencing on all 160 fecal samples, rarefied the resulting sequences to 4 million paired-end reads per sample, and annotated UniRef90 [52] gene families using HUMAnN2 [53]. We found that the gene family composition of captive ape microbiomes clustered more closely to human microbiomes than to their wild ape counterparts (Fig. 2a; Supplementary Fig. 4a). The Bray–Curtis distance of captive chimpanzees and humans was lower than the distance between captive chimpanzees and wild chimpanzees. Likewise, captive gorillas had lower Bray–Curtis distances to humans than they did to wild gorillas. However, in contrast to the taxonomic findings, captive gorillas had a lower distance to wild chimpanzees than to humans. Overall, Bray–Curtis distances between captive and wild apes tended to be lower for gene families than for the taxonomic results. Gene families may not be as affected by the differences in host lifestyle since many genes are redundant across microbial taxa [53].

Fig. 2: Gene family and functional pathway analysis of gut microbiota.
figure 2

a PCoA plot of Bray–Curtis distances of the relative abundance of gene families (adonis, R2 = 0.55, P = 0.001). b Relative abundance of functional pathways involved in amino acid synthesis, nucleotide synthesis, and carbohydrate degradation (Kruskal–Wallis sum-rank test for all plots P ≤ 3.2e−9; Wilcoxon rank sum test with BH correction, all significant differences P ≤  0.047).

To better understand the functional differences in host microbiomes we analyzed 308 functional pathways identified using markers from the MetaCyc Metabolic Pathway Database [54]. Using LEfSe, we found that captive and wild apes had higher relative abundance of branched amino acid synthesis pathways (including isoleucine) compared with humans (Fig. 2b; Supplementary Fig. 4b). Congolese humans and captive apes had the highest levels of lysine synthesis. Further, Congolese humans had the highest levels of aromatic amino acid synthesis while the captive ape cohorts had higher levels than their respective wild ape cohorts.

The superpathway of purine synthesis I and adenosine synthesis was highest in the wild gorillas and may be associated with the SHD-231 family, as a previous study also found increased nucleotide sugar biosynthesis and purine nucleotide salvage pathways in gorilla microbiomes dominated by SHD-231 [1]. Congolese human microbiomes were enriched for the degradation of starch, galactose, and stachyose. For starch degradation, both captive ape cohorts had higher relative abundance than their respective wild ape cohorts. For both galactose and stachyose degradation, captive and wild chimpanzees showed no difference in relative abundance, but captive gorillas had higher abundances than wild gorillas. The relative abundance of the starch degradation pathway shows a clear pattern of being influenced by host subsistence. Starch consumption is the main source of calories in humans, while wild chimpanzees and gorillas consume low starch diets [42, 55].

Microbial resistome is primarily influenced by host lifestyle

To probe for novel ARGs, we used functional metagenomics, a method where metagenomic DNA fragments are cloned into expression vectors, transformed into Escherichia coli, and the resulting transformants are tested for phenotypic resistance using antibiotic selection [10, 24, 56]. From our 160 samples, we created 16 functional libraries containing 50–123 Gb of total insert DNA and screened them on 15 different antibiotics or antibiotic combinations resulting in the annotation of 332 ARGs (Supplementary Table 1; Supplementary Fig. 5a).

We combined the resulting ARG protein sequences with ARG protein sequences from the comprehensive antibiotic resistance database (CARD) [57], and used ShortBRED [58] to create unique markers for ARG analysis. We then used these unique markers to identify ARGs in the shotgun sequencing data obtained from each sample. We found that captive ape resistomes, similar to our taxonomic results, clustered more closely to humans than wild apes (Fig. 3a), had lower Bray–Curtis distances to humans than to their wild ape counterparts, and had lower Bray–Curtis distance to each other, signifying a convergence of the resistome in captivity (Supplementary Fig. 5b). To determine if the clustering patterns of microbiome taxa composition correlated with the resistome findings, we performed a Procrustes analysis and found that microbiome taxa is highly correlated with the resistome (PROTEST; Correlation = 0.93; P = 0.001; Supplementary Fig. 6). These results indicate that the distances seen in the resistomes of humans and wild and captive apes are closely linked to the underlying dissimilarity in their microbial taxa. This result indicates that the resistome is largely structured by the community taxonomic architecture, and that antibiotic selection is largely operating on the level of microbial taxa, not necessarily on the resistome as an independent ecological feature.

Fig. 3: The antibiotic resistome of humans, chimpanzees, and gorillas.
figure 3

a PCoA plot using Bray–Curtis distances of ARG RPKM abundance of human, chimpanzee, and gorilla resistomes (adonis, R2 = 0.45, P = 0.001). b Relative abundances of major classes of antibiotic resistance averaged by host cohort. c Sum of all ARG marker abundances per metagenome based on markers generated from the CARD database and functional selections, expressed in log transformed RPKM (Kruskal–Wallis sum-rank test P < 2.2e−16; Wilcoxon rank sum test with BH correction, all significant differences P ≤ 1.2e−5). d Resistome richness per metagenome (Kruskal–Wallis sum-rank test P < 2.2e−16; Wilcoxon rank sum test with BH correction, all significant differences P ≤ 0.049). e Alignment of an ARG containing DNA fragment with high similarity across host cohorts obtained from functional metagenomic selections. f Histogram of amino acid identity to known ARGs for aminoglycoside resistance genes identified through functional metagenomic selections. Colors indicate from which host library each gene was isolated.

Aminoglycoside resistance genes represented more than 50% of the wild ape gut resistomes (Fig. 3b). Aminoglycosides are produced naturally in the environment by several soil bacteria and inhibit protein synthesis by binding to the ribosome [59]. Wild ape microbiomes may acquire these aminoglycoside resistance genes either through selection caused by exposure to naturally produced antibiotics or horizontal gene transfer (HGT) from resistant environmental bacteria. The specific aminoglycoside resistance genes we found enriched in wild apes were identified as AAC(3)-VIIa, an acetyltransferase that inactivates aminoglycosides [59], and FmrO ribosomal RNA methyltransferases [60] that were identified from our functional selections. Ribosomal methyltransferases provide aminoglycoside resistance by methylation of the ribosome that prevents aminoglycoside binding and thus resistance [61]. The overall abundance and richness of ARGs in wild apes was extremely low compared with humans and captive apes (Fig. 3c, d). In general, wild apes are rarely treated with antibiotics [62], so ARG abundance in wild apes may provide insight to sources of antibiotic selection in their environment or diet. For captive apes and humans, tetracycline resistance genes were the most abundant with beta-lactams also making up a large proportion of the resistome (Fig. 3b). Tetracyclines and beta-lactams are both broad-spectrum antibiotics and are popular first-line antibiotics in human infectious disease treatment [63]. Tetracycline resistance in the human and captive ape cohorts was conferred predominantly by six different ribosomal protection genes identified through tetracycline functional selections and beta-lactam resistance was dominated by high recovery of CfxA4, a beta-lactamase. Captive apes had up to ~34-fold higher abundance and up to ~5-fold higher richness of all ARGs compared with wild apes (Fig. 3c, d). The most parsimonious explanation for the higher observed burden of ARGs in captive apes compared with wild apes is the higher selection pressure from antibiotic treatment of the captive apes. We had limited access to antibiotic treatment records for captive apes but the records we have indicate that each captive ape receives on average 1.7 antibiotic treatments per year, providing ample selection for ARGs. Increased ARG abundance in captive apes is more worrisome because it may lead to antibiotic resistant infections that cannot be treated in these endangered apes [64]. Congolese human microbiomes had the highest abundance and richness of ARGs with ~3-fold higher abundance and ~2-fold higher richness compared with the next highest cohort (Fig. 3c, d).

Functional metagenomic selections identified novel antibiotic resistance genes

One major advantage of functional metagenomic analysis is the ability to identify novel ARGs from metagenomic DNA without the need for high sequence identity to known ARGs [10, 24, 56]. This allows for the characterization and surveillance of identified ARGs before they become clinically disseminated [24, 28, 65]. For example, from our functional metagenomic selections of ape microbiome DNA we identified 19 aminoglycoside resistance methyltransferases that belonged to the FmrO family. When compared with known ARGs in the CARD database, we found that they all aligned to 16S rRNA methyltransferases, such as rmtC, rmtD, rmtF, rmtH, or sgm. The percent similarity of these genes to known ARGs ranged from 29 to 98%, indicating that some of these genes have not been identified previously (Fig. 3f).

In addition, we found a total of nine ARG contigs from functional metagenomic selections with high sequence similarity across host cohorts (Fig. 3e; Supplementary Fig. 7). Of these nine contigs, seven were found in both human and ape microbiomes demonstrating that Congolese human microbiomes harbor some of the same ARGs as wild and captive ape microbiomes. One ~2 kb contig that was particularly widespread between different hosts contained a PenP beta-lactamase located between two mobile elements, an integrase and a tnpR resolvase, indicating a high potential for HGT [66, 67]. Highly similar copies (>99% similarity) of this PenP-containing contig were found in functional libraries created from every research cohort in this study except for wild chimpanzees. This gene may be widespread either because it is contained in a widespread bacterial taxon, has spread to several bacterial taxa through HGT, or a combination of these factors. Out of the nine ARG-containing contigs identified, three contained ARGs co-localized with mobile elements. Identification of ARGs in captive and wild apes indicates that ape microbiomes may serve as a zoonotic reservoir of antibiotic resistance that may be exchanged with human associated microbes [30]. Importantly, identification of similar ARGs in human and ape populations also indicates that the human microbiome may serve as a reservoir for ARG acquisition by great ape microbiomes. Transmission of microorganisms between humans and great apes in both directions is well documented, with several pathogens exhibiting zoonotic or anthropozoonotic patterns [15, 17,18,19,20, 32, 68,69,70]. Transmission of microorganisms also results in transfer of the ARGs that those organisms contain. Identification and characterization of reservoirs of ARGs in humans and other closely related Hominidae species, including wild and captive populations, allows for a proactive approach to combat antibiotic resistance via surveillance and targeted infection control measures [71,72,73]. The frequency of these transmission events is likely to rise as humans and great apes are increasingly brought into contact with each other through ecotourism, human population growth near conservation areas, and civil unrest [15, 17, 69, 74, 75].

We also identified an ARG conferring resistance to colistin, an antibiotic of last-resort, from a Congolese human derived functional metagenomic library (Supplementary Fig. 5a). The identified contig contained a gene annotated as a member of the type 2 phosphatidic acid phosphatase (PAP2) superfamily that, when isolated and expressed in a pZE21 plasmid backbone, conferred a minimal inhibitory concentration (MIC) of 4 µg/ml to colistin in E. coli. This represents an ~85-fold increase in resistance compared whit the empty vector control (Supplementary Fig. 8a). An MIC of 4 µg/ml exceeds the 2 µg/ml resistant/susceptible breakpoint in Enterobacteriaceae according to the European Committee on Antimicrobial Susceptibility Testing, indicating this level of resistance could have implications if this gene was transmitted to human pathogens. Colistin works by targeting the lipid A portion of lipopolysaccharide leading to disruption of the outer cell membrane and ultimately cell death [76]. Resistance is typically conferred by modification of lipid A by addition of chemical moieties to LPS, preventing colistin binding. For example, a clinically relevant colistin resistance gene, mcr-1, provides colistin resistance by adding a phosphoethanolamine moiety to LPS [71]. We predict that the PAP2 gene from this study either alters LPS by adding a chemical moiety onto LPS or reduces the amount of LPS in the cell membrane.

Using BLAST, we identified two proteins similar to the PAP2 gene product in the NCBI nonredundant protein database. One was discovered from published functional metagenomic selections on colistin [24] (98.62% identity and 76% coverage) and the other came from a Prevotella sp. genome assembled from metagenomic samples [77] (98.88% identity and 95% coverage) (Supplementary Fig. 8b). After these two similar proteins there is a sharp drop in identity with the next closest proteins also coming from Prevotella hosts (Supplementary Fig. 5b). Bacteria from the genus Prevotella are generally considered to be commensal to humans, so colistin resistance in this genus is not particularly troublesome. However, continued usage of colistin increases the selection pressure and possibility of HGT of this gene to pathogenic strains. Indeed, mcr-1 originated from non-pathogenic species within the genus Moraxella and has since spread to human and animal pathogens in the Enterobacteriaceae family [78,79,80]. The identification of this PAP2 colistin resistance gene from a functional metagenomic selection allows time for the characterization and surveillance of this gene before it is transmitted to human or animal pathogens.

Captive ape microbiomes and resistomes cluster closely with non-Westernized humans

To contextualize the microbiomes from our cohorts within a diverse group of human lifestyles and environments, we compared our samples to publicly available human and ape gut microbiomes. Meta-analyses like this may be influenced by a number of factors including differences in DNA extraction, sample handling, and sequencing runs across the different studies. Despite these limitations, this type of analysis can be used to identify general trends and patterns in microbiome analysis [81, 82]. We included published sequencing data that also targeted the V4 region of the 16S rRNA gene from Westernized and non-Westernized humans. Westernized humans include data from humans in the USA [3] (n = 20) and a peri-urban slum of Lima, Peru [24] (n = 20). The non-Westernized humans can be grouped into a rough gradient of Westernization with the human cohort from this study in the Republic of the Congo (n = 81), rural farmers in El Salvador [24] (n = 20), and farmers in Malawi [3] (n = 20) having intermediate Westernization, and Hadza hunter-gatherers [31] (n = 20) and Guahibo Amerindians [3] (n = 20) being the least Westernized. We also included a study that analyzed captive gorillas from zoos in the USA, France, and Switzerland [36] (n = 8). To normalize for read length and sequencing depth, all 16S DNA sequences were trimmed to 100 bp and samples were rarefied to 9800 single end reads.

Analysis of the microbial composition of the gut microbiome by PCoA using Bray–Curtis distance resulted in Westernized and non-Westernized humans clustering by degree of Westernization despite differences in geographic location (Fig. 4a). Contrary to our initial hypothesis that nonhuman primates would cluster with humans living near them, captive apes clustered near the non-Westernized humans, and wild apes clustered separately. Some samples of the published captive gorilla cohort clustered with our captive gorillas while others were intermediate between captive and wild apes. These clustering patterns are supported by the Bray–Curtis distances between host cohorts (Supplementary Fig. 10). Congolese humans had the lowest distance to other non-Westernized humans, while captive ape distances to non-Westernized humans were lower than the distances to their wild ape counterparts. Alpha diversity, as measured by ASV richness, Shannon diversity, and Faith’s PD, demonstrated no apparent pattern of westernization, but USA humans did have the lowest richness and Faith’s PD, and among the lowest Shannon diversity, as has been demonstrated previously (Fig. 4b; Supplementary Fig. 9a, b) [12, 69, 83]. Overall, we found that captive ape microbiomes were similar to Congolese humans and other non-Westernized human cohorts.

Fig. 4: Comparison of chimpanzee and gorilla microbiomes to human microbiomes across a wide range of lifestyles and environments.
figure 4

a PCoA plot of Bray–Curtis 16S rRNA gene profiles (adonis, R2 = 0.42, P = 0.001). b ASV richness of human, chimpanzee, and gorilla 16S rRNA gene profiles (Kruskal–Wallis sum-rank test P < 2.2e−16; Wilcoxon rank sum test with BH correction, all significant differences P ≤ 0.021). c PCoA plot using Bray–Curtis distances of the relative abundance of gene families (adonis, R2 = 0.48, P = 0.001). d PCoA plot of Bray–Curtis distances of ARG RPKM abundance. Two wild chimpanzee samples and one wild gorilla sample were removed from this PCoA analysis because the lower rarefaction level used for comparisons with public data resulted in no ARGs detected for these samples and no Bray–Curtis distances could be calculated (adonis, R2 = 0.45, P = 0.001). e Sum of all ARG marker abundances per sample expressed in log-transformed RPKM (Kruskal–Wallis sum-rank test P < 2.2e−16; Wilcoxon rank sum test with BH correction, all significant differences P ≤ 0.034). f Resistome richness per sample (Kruskal–Wallis sum-rank test P < 2.2e−16; Wilcoxon rank sum test with BH correction, all significant differences P ≤ 0.049).

We accessed published shotgun metagenomic sequencing data from similar human and gorilla cohorts for gene family and resistome analyses. Westernized human sequencing data were obtained from humans in Norman, USA [5] (n = 20) and the peri-urban slum in Lima, Peru [24] (n = 14). Non-Westernized humans included the Tunapuco community in the Peruvian highlands [5] (n = 10), rural farmers in El Salvador [24] (n = 9), Hadza hunter-gatherers from Tanzania [6] (n = 20), and humans from this study in the Republic of the Congo (n = 81). We also included a cohort of wild gorillas (n = 19) from the Sangha region of the Republic of the Congo [1]. Using these studies, we were able to compare differences in the gut microbiome gene families and resistomes across a gradient of lifestyles and geographic locations. To normalize sequencing reads, all reads were trimmed to 100 bp and rarefied to 3 million single-end reads.

The gene family PCoA showed a similar pattern of clustering to the 16S rRNA analysis with humans and captive apes clustering together by Westernization and wild apes clustering separately (Fig. 4c). At this lower rarefaction of sequencing reads, there was no significant difference between the captive chimpanzee and captive gorilla cohorts (pairwise adonis, P = 0.06) or the wild chimpanzee and wild gorilla cohorts (pairwise adonis, P = 0.35). Wild gorillas from the Sangha region clustered near the wild apes from this study but could still be distinguished as a separate cluster. The gene families present in the microbiomes of Congolese humans had the lowest distance to other non-Westernized humans (Supplementary Fig. 11). The captive ape microbiomes were more similar to non-Westernized humans than to their wild counterparts. However, the distances between wild and captive apes are more pronounced at this lower rarefaction level.

There was no clear separation between the resistome composition of non-Westernized and Westernized humans in the PCoA (Fig. 4d). Captive chimpanzee and captive gorilla resistomes were not significantly different from each other (pairwise adonis, P = 0.23) and the wild chimpanzee, wild gorilla, and Sangha gorilla resistomes were not significantly different from each other (pairwise adonis, wild chimpanzee vs. wild gorilla, P = 0.63, wild chimpanzee vs. Sangha gorilla, P = 0.18, wild gorilla vs. Sangha gorilla, P = 0.10). The Bray–Curtis distance of Congolese human resistomes to USA humans (Bray–Curtis distance of 0.59), was not statistically significant from the Congolese human distance to Tunapuco humans (Bray–Curtis distance of 0.59), and captive gorillas (Bray–Curtis distance of 0.58) indicating that the degree of Westernization was less impactful on the resistome (Supplementary Fig. 12). As with the previous resistome analysis, the captive ape resistomes had lower distance to human cohorts than to their wild counterparts (Supplementary Fig. 12).

Hadza hunter-gatherers had the highest ARG abundance of all groups analyzed, followed by the other human cohorts, captive apes, and then finally wild apes with the lowest abundance (Fig. 4e). For ARG richness, Hadza hunter-gatherers and humans from Lima, the Congo, and Tunapuco had the highest ARG richness. Humans from the USA had similar richness to the captive apes and wild apes had the lowest ARG richness (Fig. 4f). Antibiotic resistance may be high in developing countries for a number of reasons, including: a high burden of infectious disease, poor sanitation, and less regulation of antibiotic compounds [84]. However, we do not have antibiotic treatment data for these subjects so further research is warranted to determine what is driving the high ARG abundance and richness in these developing countries.


Contrary to our hypothesis, sympatric hosts did not have the most similar gut microbiome composition. Host lifestyle appears to be more influential than host geography or host species at structuring taxonomic, gene family, and resistome composition. In addition, captive apes had a much higher richness and abundance of ARGs than their wild ape cohorts. We identified a number of novel ARGs including a novel gene conferring resistance to colistin, an antibiotic of last resort. Further, through comparison with published data, we found that captive ape microbiomes clustered with non-Westernized humans despite living in close proximity to Westernized humans. Our findings indicate that the microbiome of closely related host species may be molded by changes in diet and the degree of antibiotic exposure despite their geographic location.

Materials and methods

Sample collection

Wild ape fecal samples were collected in the Goualougo Triangle region of Nouabalé-Ndoki National Park in the Republic of the Congo in February and March of 2013. “Fresh” fecal samples were collected in the morning from overnight nesting sites of habituated chimpanzees and gorillas. Both species of ape typically defecate upon awakening and emerging from their nests between 5:15 a.m. and 6:00 a.m. Tracking teams usually found the nests by 8:00 a.m. Samples were considered “fresh” if they were judged to be produced within the previous 5 h based on several attributes including intact nature of the sample, color, and appearance of an oily thin shine associated with the outer coat of a fecal sample recently produced [85]. Fecal sample collection also occurred while following individual apes at a distance of 7 m or greater. Human samples were collected from humans living outside of the national park boundaries, the village of Bomassa, and from humans working within the park boundaries. At the time of collection, the species producing the sample was recorded before the samples were immediately transported back to the basecamp, inventoried, and transferred to liquid nitrogen. Captive ape samples were collected from the Lincoln Park Zoo and the St. Louis Zoo in a similar manner; “fresh” fecal samples were collected following the same age definition as in the wild. Collection occurred in the morning after the animal was shifted from its temporary night holding area. Immediately after collection, captive samples were stored in a −80 °C freezer. All sample collection was performed with informed consent for the human participants and in accordance with ethics approval by le Comité d’Ethique de la Recherche en Sciences de la Santé (CERSSA) in the Republic of the Congo. Samples were kept in cold storage either in dry ice, liquid nitrogen, or −80 °C freezers until shipment to Washington University in St. Louis, where they were stored at −80 °C until processing.

DNA extraction and library preparation

DNA was extracted from 100 mg of each fecal sample using a PowerSoil DNA Isolation Kit (MoBio) with a Mini-beadbeater (BioSpec Products) for the lysis step and stored at −20 °C. 16S rRNA gene sequencing libraries were created by amplifying the V4 region of 16S rRNA genes using 515F-806R barcoded primers and PCR protocols described previously [86]. Bar-coded amplicons were pooled and sequenced on an Illumina MiSeq sequencer with 2 × 250-bp paired-end reads. Shotgun metagenomic sequencing libraries were prepared by diluting extracted DNA to 0.5 ng/μl and using a Nextera DNA Library Prep Kit (Illumina) following the modifications described in Baym et al. to generate ~450 bp segments [87]. DNA fragments were then purified using the Agencourt AMPure XP system (Beckman Coulter) and quantified using the Quant-iT PicoGreen dsDNA assay (Invitrogen). For each sequencing lane, 80 samples were pooled together for a final concentration of 10 nM. To reduce stochastic error in sequencing read distribution, we pooled barcoded samples three independent times. The resulting three pools were quantified using the Qubit dsDNA BR Assay and equimolar concentrations were combined into one final pool per lane. Samples were sequenced on an Illumina NextSeq High-Output platform at the Edison Family Center for Genome Sciences and Systems Biology at Washington University School of Medicine in St. Louis, with a target sequencing depth of 4 million 2 × 150-bp paired-end reads per sample.

16S and shotgun metagenomics bioinformatic analysis

Barcoded 16S Illumina reads were demultiplexed and barcodes were removed using QIIME v1.9 [35]. No significant differences were found in sequencing depth between host cohorts, so 16S samples were not rarefied (Kruskal–Wallis, P = 0.18; Supplementary Fig. 1c). Average sequencing depth was as follows: 65,406 sequences for Congolese humans; 73,926 sequences for captive chimpanzees; 65,480 sequences for captive gorillas; 68,281 sequences for wild chimpanzees; and 56,281 sequences for wild gorillas. Phylogenetic analysis was performed using dada2 v1.8 [88] according to the recommended pipeline. The resulting ASVs were assigned taxonomy using the Greengenes database [89] (version 13_8, 97% clusters). For the generation of PCoA plots, we normalized sequence counts by calculating the relative abundance of each taxon or gene for each sample. PCoA significance testing was performed using adonis. UniFrac analysis was implemented using phyloseq [90], based on a phylogenetic tree generated from 16S sequences using phangorn [91]. No normalization was performed before alpha-diversity analyses. Significance testing for alpha diversity was performed using the nonparametric Kruskal–Wallis Rank Sum test and the Pairwise Wilcoxon Rank Sum test using the Benjamini–Hochberg correction for multiple testing. Groups were considered significantly different if the resulting P-value was less than 0.05.

Shotgun reads were trimmed for quality and to remove sequence adapters using Trimmomatic v0.38 [92] with the following parameters: trimmomatic-0.38.jar PE -phred33 ILLUMINACLIP: NexteraPE-PE.fa:2:30:10:1:TRUE LEADING:10 TRAILING:10 SLIDINGWINDOW:4:15 MINLEN:60. Host reads were then removed using DeconSeq v0.4.3 [93, 94] by mapping to the respective host genome (GRCh38 for human; Pan troglodytes, GCA_000001515.5, panTro5 for chimpanzee; Gorilla gorilla gorilla, GCA_900006655.1, gorGor5 for gorilla). To normalize for sequencing depth, all samples were rarefied to 4 million paired-end reads based on species rarefaction curves using MetaPhlAn2 [95] (Supplementary Fig. 3). For gene family and functional pathway analysis, we used HUMAnN2 v0.9.4 [53] to calculate relative abundance of annotated microbial gene families and gene pathways in the MetaCyc database [54]. Since HUMAnN2 does not take paired-reads into account, forward and reverse paired-end reads were combined into one file. The resulting output was normalized to copies per million (humann2_renorm_table) and samples were joined into a combined table (humann2_join_tables).

Downstream statistical analysis was performed in R. Bray–Curtis distances and adonis significance testing were computed using the vegan package [96] and pairwise adonis was computed using pairwiseAdonis [97]. All of the box and whisker plots represent the 1.5× interquartile ranges in the boxes, medians (50th percentiles) are indicated by the bars within the boxes, the whiskers below and above the box indicate the 10th and 90th percentiles, and the outliers beyond the whiskers are indicated with black circles if all data points are not shown. All statistics for boxplot analyses are based on Kruskal–Wallis tests and if significance is indicated two-tailed P-values were calculated using the Wilcoxon rank sum test.

Functional metagenomics and analysis

Due to a limited amount of metagenomic DNA available from individual samples, we pooled samples within host groups to create 16 separate functional libraries (Supplementary Table 1). Functional metagenomic libraries were prepared and created as previously described [56]. Briefly, small-insertion (~3–6 kb) expression libraries were created from 5 μg of pooled metagenomic DNA in vector pZE21 and transformed into E. coli MegaX DH10B electrocompetent cells. We screened libraries on Mueller–Hinton agar plates containing 12 natural and synthetic antibiotics from six different antibiotic classes at concentrations empirically determined to inhibit the growth of non-transformed MegaX cells. PCR was performed on pooled resistant colonies with vector-specific primers, barcoded, and sequenced.

Sequenced reads were filtered, demultiplexed, and assembled into contigs with PARFuMS v1.1 [28]. We removed all contigs from selections where the number of contigs was more than ten times the number of colonies that grew on the selective agar. Contigs smaller than 500 bp were also removed from further analysis and ORFs were annotated using Resfams v1.2 [98]. We identified 887 contigs which resulted in 332 ORFs that could be classified with high confidence as ARGs related to the screened antibiotic class. Using ShortBRED v0.9.4 [58], we clustered identified ARG protein sequences from functional metagenomic selections and ARG protein sequences from the CARD database [57] with a cluster identity of 90% and used Uniref90 as a reference database for marker creation. We generated 3092 unique markers for 907 ARG clusters. To measure the abundance of these markers, relative abundance tables normalized to RPKM were created using the script to identify ARG markers in the shotgun metagenomic reads from all our samples. Procrustes analysis comparing taxonomic and resistome PCoAs was implemented with the procrustes function in vegan.

Microbiome and resistome comparisons to published cohorts

For external analyses, we downloaded public sequencing data from the sources noted in the “Results and discussion” section. After removing all samples from individuals less than 3 years old, we randomly selected up to 20 samples from each cohort. For 16S analysis, all sequences were trimmed to 100 bp. Samples were rarefied to 9800 single-end reads and the same 16S rRNA analysis was performed as described above. For shotgun analysis, we checked for adapters using FastQC v0.11.7 [99]. The sequencing reads appeared to be of good quality and contained no adapters, so we removed host reads using DeconSeq and the human or gorilla reference genomes listed above. The public data contained several samples with less than 4 million reads, the cutoff used for analysis of our samples, so samples were rarefied to 3 million reads and samples with less than 3 million reads were removed from analysis. DNA sequencing reads were trimmed to 100 bp. Some cohorts only had single-end reads so only forward reads were used from paired-end data. Functional analysis was performed using HUMAnN2 as described above. Resistome analysis was performed using ShortBRED with the same parameters as above. A table of the external samples used can be found in the supplementary spreadsheet.

Colistin resistance analysis

The PAP2 gene identified in the functional metagenomic selections was synthesized into the pZE21 vector by Synbio Technologies (Monmouth Junction, NJ). The plasmid was transformed into E. coli DH10B and MIC testing was performed using a Colistin ETEST strip (bioMérieux Inc) and confirmed in triplicate using ComASP Colistin Broth Microdilution Susceptibility Testing plates (Liofilchem).