An integrated analysis of gut microbiota, blood biochemical and metabolome in 52 endurance horses was performed. Clustering by gut microbiota revealed the existence of two communities mainly driven by diet as host properties showed little effect. Community 1 presented lower richness and diversity, but higher dominance and rarity of species, including some pathobionts. Moreover, its microbiota composition was tightly linked to host blood metabolites related to lipid metabolism and glycolysis at basal time. Despite the lower fiber intake, community type 1 appeared more specialized to produce acetate as a mean of maintaining the energy supply as glucose concentrations fell during the race. On the other hand, community type 2 showed an enrichment of fibrolytic and cellulolytic bacteria as well as anaerobic fungi, coupled to a higher production of propionate and butyrate. The higher butyrate proportion in community 2 was not associated with protective effects on telomere lengths but could have ameliorated mucosal inflammation and oxidative status. The gut microbiota was neither associated with the blood biochemical markers nor metabolome during the endurance race, and did not provide a biomarker for race ranking or risk of failure to finish the race.
Endurance athletes endure physiological and biochemical disruptions that require the body to undergo several adaptations to maintain homeostasis1,2,3,4,5, which results in neuromuscular and contractile functions in muscles6, electrolyte imbalance6, decreased glycogen storage6, increased mitochondrial biogenesis in muscle tissue7, increased oxidative stress4, activation of the sympathetic-adrenomedullary and hypothalamus-pituitary-adrenal (HPA) axes, increased intestinal permeability as well as inflammatory responses (reviewed by Clark and Mach1). Consequently, adaptations to endurance exercise are influenced by the transcriptional and translational regulation of genes that encode the proteins and metabolites controlling these processes4,8. Recent metagenomic studies have revealed that intense physical exercise also induces changes in the gut microbiome9,10,11,12,13.
The equine microbiota reaches a concentration of approximately 109 microorganisms per gram of ingesta in the cecum14, and consists of about 108 bacterial genera15,16,17 and at least seven phyla17,18,19,20,21,22. The variation in the gut microbiota composition and functions in endurance equine athletes is not clear. For example, it is unknown whether inter-individual variation results from a continuum of different bacterial compositions or whether individual gut microbiota gather around some stable bacterial communities that can be associated with athletic performance. Studying such questions is complicated due to the complexity of varying physiological (e.g. breed, age, sex, host kinship), nutritional and environmental conditions between individuals. In fact, diet dramatically modulates gut microbiota composition and metabolism in horses16,19,23,24,25,26,27,28,29. Endurance horses often consume diets high in fermentable carbohydrates that provide a continuous source of energy during sustained physical exercise28,30. However, the over consumption of readily fermentable carbohydrates can reduce microbiota diversity and function (e.g. less synthesis of byproducts such as short chain fatty acids (SCFA) and secondary bile acids), which may play a role in the development of common equine intestinal diseases28,30.
Given the gut microbiota’s fundamental role in maintaining host homeostasis during endurance exercise, we first aimed to study similarities and differences between the gut microbiota among endurance athletes. We then sought to identify the possible relationship between blood biochemical and metabolic profiles and the gut microbiota composition to (i) reveal unique biomarkers of the energy and stress response to endurance exercise, as well as athletic performance and (ii) provide insights into the molecular control of this response. We therefore performed an integrated analysis of the blood biochemical and metabolic profiles in 52 endurance horses (racing over 90, 120 and 160 km) before and after an endurance competition as well as analysis of the gut microbiome before the race while considering diverse physiological and nutritional factors.
A total of 52 endurance horses (out of 248) racing over 90, 120 or 160 km were recruited in a volunteer base at an endurance competition in Fontainebleau (France; Supplementary Table S1 and Fig. S1). Fecal samples with time-matched blood and questionnaire data (with emphasis on estimated nutrient consumption) were analyzed. Encompassing an equal range of race distance, age, sex, breeds, heart rate, speed data, athletic performance ranking, genetic background and environmental factors, our cohort was expected to be representative of the average gut microbiota composition of endurance horses. Nevertheless, the endurance rules impose a minimum age of 7 years for 120 km and 8 years for 160 km races, therefore excluding young horses (<7 years old) in these two distance groups. For detailed metadata, see Supplementary Table S1 and Supplementary Data.
Only one Arabian female of 10 years (namely “EcaOmic.201”) was excluded from the cohort as it presented very heterogeneous and distinctive gut microbiota and distorted ordination solutions. This single sample was removed from the study after considering various measures of richness, evenness, diversity, dominance, rarity, divergences and abundance (Supplementary Table S2).
Description of the gut microbiota in endurance equine athletes
Gut phylogenic composition and core microbiota in horse
Fecal samples were obtained before the race to profile the gut microbiome using 16S ribosomal RNA gene amplicon sequencing. The overall phylogenic composition of the sequenced samples confirmed that the Firmicutes (60.1 ± 3.70%) outranked Bacteroidetes (19.3 ± 4.28%), Fibrobacteres (9.6 ± 5.11%) and Spirochaetes (8.9 ± 2.26%) phyla. Lachnospiraceae (27.4 ± 2.51%), Ruminococcaceae (24.78 ± 4.29%), Fibrobacteraceae (9.7 ± 5.26%), Spirochaetaceae (9.1 ± 2.33%) and Prevotellaceae (8.37 ± 2.02%) were the most abundant families across samples. At the genus level, Clostridium XIVa (16.2 ± 2.34%), Fibrobacter (11.0 ± 5.81%) and Treponema (10.3 ± 2.57%) accounted for most of gut microbiota composition. The relative abundance of each genus in our cohort is shown in Supplementary Table S3.
The endurance horse core microbiota (e.g. the genera shared by 99% of samples with a minimum detection threshold of 0.001%) composed of 23 genera, including Clostridium XIVa, Fibrobacter, Treponema, Ruminococcus, and members of the yet unclassified Lachnospiraceae family (Supplementary Fig. S2A,B). Alloprevotella, Anaerovibrio, Desulfovibrio, Paludibacter, Streptococcus, and members of the unclassified Ruminococcaceae family were not detected in the core microbiota although they were between the top 30 most abundant genera in our cohort (Supplementary Table S3).
Inter-individual variation: the gut microbiota was structured into two-community types
The Principal Coordinate Analysis (PCoA) based on weighted UniFrac distance (Fig. 1A) showed a distinct two community structure between samples, numerically confirmed by the partitioning around medoids (PAM) silhouette coefficient using weighted UniFrac distance of the normalized open-reference operational taxonomic (OTUs) counts (Fig. 1B). PAM algorithm with alternative dissimilarity measures produced the exact same communities, assessing the robustness of the community structure of the microbiota. The robustness of the community types was further studied through alternative reduction dimension approaches, which led to identical observations (Supplementary Fig. S3A–D).
The cohort characteristics and the phylogenetic and functional bases of the two gut microbiota community types were substantially different
The first community (community type 1), which contained 15 individuals, was predominantly composed of Arabian geldings that raced over 160 km, individuals with and average age of 9.4 ± 2.41 years, average speed for the entirety of the race of 16.7 ± 1.68 km/h and pulse rate on arrival at the veterinary check of 56.14 ± 3.57 beats/min (Supplementary Fig. S4A–F). Four participants were classified as the top 25% athletes, and 33% of them were eliminated from the race (Supplementary Fig. S4G). In contrast, community type 2, consisting of 36 individuals, mainly comprised females and geldings that primarily raced 90 km or 120 km. The type 2 individuals were 9.2 ± 2.08 years old, and their average speed and pulse rate on arrival were 17.2 ± 1.60 km/h and 54.53 ± 5.7 beats/min, respectively (Supplementary Fig. S4A–G). Eleven individuals were classified within the top 25%, and 16% were eliminated from the race. Trainers, stables and breeding establishments were equally represented between community types.
At the OTUs level, the community type 1 presented significantly reduced microbiome richness, diversity and most of the evenness indices but higher dominance and rarity of non core and rare species (adjusted p-value < 0.05, Mann-Whitney U test) compared with the community type 2 (Supplementary Fig. S5). For example, Chao 1 (Fig. 1C) and Pielou index (Fig. 1D), which is an excellent measure of community structure, were significantly lower in community type 1 relative to community type 2. The significant differences in taxon evenness and rarity were found up to the genera level. By contrast, the Chao1 index based on the genera level was higher (p-value = 3.10−6, Mann-Whitney U test) in the community type 1 relative to the community type 2.
Following the findings of evident differences in the ecosystem diversity, we further investigated the composition and functionality of the gut microbiota through the fecal SCFA and pH. The relative abundance of 76 genera differed significantly between gut microbiota community types (adjusted p-values < 0.05 with Fisher exact test and Mann-Whitney U test combined; Supplementary Table S4). Community type 1 was characterized by a lower proportion of Bacteroidetes and Proteobacteria-associated significant taxa but higher proportions of Actinobacteria and Firmicutes phyla (Fig. 1E). Type 1 individual’s harbored a lower proportion of cellulolytic bacteria (members of the Ruminococcaceae and Lachnospiraceae families) proteolytic bacteria (e.g. Prevotella) and butyrate producer (e.g. Butyrivibrio), but higher proliferation of amylolytic bacteria such as Streptococcus (Supplementary Table S4). Moreover, genera such as Dietzia, Escherichia/Shigella, Saccharopolypsora, Ureibacillus and Weissella appeared exclusively in the community type 1 (Fig. 2A), whereas other genera such as Actinoplanes, Bacteroides, Caryophanon, Corynebacterium, Rummeliibacillus, Staphylococcus and members of the Micrococcineae family were present in less than 3% of the individuals from the community type 2 (Fig. 2A). These aforementioned genera were also detected as discriminant between community types when using the partial least squares discriminant analysis (PLS-DA; Fig. 2B). Supporting the later results, the adjusted p-values of the Fisher exact test and the Mann-Whitney U test combined appreciably decreased with the increased loading values in the PLS-DA model (Supplementary Fig. S6A).
The sparse PLS-DA in a cross-validation framework suggested that 7 genera were sufficient to predict the two gut communities, namely Cellulosilyticum, Corynebacterium, Escherichia/Shigella, Fibrobacter, Flavonifractor, Prevotella and members of the family Peptococcaceae (Supplementary Fig. S6B), while the sparse k-means suggested that Cellulosilyticum, Corynebacterium and Escherichia/Shigella were enough to almost recover the two gut microbiota communities. Nevertheless, when applying an alternative clustering classification method, which removed the k genera with the highest absolute loading values in the PLS-DA, we observed that the exact two community types were maintained even if the 19 genera with the highest loading values were removed (Supplementary Fig. S6C). The PCoA performed on relative genera abundances after deleting the top-rank genera (from k = 10 to k = 50) was able to discriminate the same two community types (Supplementary Fig. S6D).
The phylogenetic differences among the two community types were reflected on the co-occurrence networks (based on partial correlation and information theory, PCIT). The co-occurrence of the community type 1 was more complex and dense (139 associations among 46 genera; Fig. 2C) than that of the community type 2 (64 associations among 31 genera; Fig. 2D). While both networks roughly followed a scale-free degree distribution (0.94 and 0.70 for community type 1 and 2, respectively), the betweenness centrality scores for community type 1 was higher relative to community type 2 (Supplementary Table S5), as well as other node-level topological measures for each node, including degree and cluster coefficient. Such distinct topological features were additionally used to highlight key genera in co-occurrence networks. The genera showing the largest degree and betweenness centrality values in the community type 1 were Prevotella outranked by members of the family Lachnospiraceae, Anaerosporobacter, Anaerovibrio, Gordonia and Coprococcus (Fig. 2C). On the other hand, a large fraction of the putative key genera in the community type 2 were related to the phylum Firmicutes, including members of the family Ruminococcaceae, Eubacteriaceae and Lachnospiraceae, as well as Anaerovorax and Clostridium XIVa (Fig. 2D).
When focusing on the microbiota functionality, we observed that propionate (adjusted p-value = 0.02, Mann-Whitney U test, Fig. 3A) and butyrate (adjusted p-value = 0.05, Mann-Whitney U test, Fig. 3B) proportions in feces were higher in individuals from community type 2, while acetate proportion was higher (adjusted p-value = 0.024; Mann-Whitney U test, Fig. 3C) in the community type 1 relative to community type 2. Iso-butyrate, valerate and iso-valerate were at similar concentrations between communities, as well as the feces pH. More information on gut physiology parameters variations can be found in Supplementary Table S6.
The study of concomitant microbial populations in the gut showed that the anaerobic fungal loads in feces tended (p-value = 0.07, Mann-Whitney U test) to be greater in community type 2 (Fig. 3D), while the protozoan (Fig. 3E) and bacteria (Fig. 3F) loads remained unchanged (p-value = 0.14, Mann-Whitney U test) between gut community types.
The relationship between gut microbiota, host variables and environmental factors
Gut community types were not associated with host variables but rather estimated nutrient intake and composition
Horse sex (p-value = 0.36), breed (p-value = 0.17) and age of the athletes (p-value = 0.91) were neither associated with gut community types (Fisher exact test for categorical variables and Mann-Whitney U test for continuous variables) nor with the relative genera abundance matrix (Fig. 4A–C). Post hoc power analysis showed that for the variable sex and breed, a sample size of 330 and 100 individuals, respectively, should be necessary to achieve a type II error of 0.1 for the observed effect size in our cohort, whereas a sample size of more than 1,000 individuals should be required to reveal a significant association between age and gut microbiota community types with a type II error of 0.1, given the empirical effect size. For the age variable, the sample size of our cohort would allow detecting an effect size 2 times larger than the observed effect for a type II error of 0.1.
The telomere lengths (p-value = 0.58, Fig. 4D) and the host kinship (p-value = 0.67) were unchanged between the two gut community types (Mann-Whitney U test).
Unlike host variables, the subsequent permutational multivariate analysis of variance (PERMANOVA) demonstrated that the factor most strongly associated with community variation was the estimated diet composition (e.g. protein, fiber, fat and ash intake corrected by the horse net energy value of feeds (UFC); p-value = 0.04, PERMANOVA). At the level of single nutrients, estimated protein (adjusted p-value = 0.05, Fig. 4E), fiber (adjusted p-value = 0.05, Fig. 4F) and ash (adjusted p-value = 0.05) daily intakes were significantly higher in community type 2 relative to community type 1 (Mann-Whitney U test, Supplementary Table S6). They were all significant probably because individuals with the community type 2 had a greater forage intake (adjusted p-value = 0.05, Mann-Whitney U test, Fig. 4G) compared to individuals from community type 1, which consequently accounted for a large portion of UFCs (adjusted p-value = 0.05, Mann-Whitney U test, Fig. 4H) in their diet. Because diet composition and intakes might be impacted by the fact that training intensity, duration and frequency were different between endurance horses racing over 90, 120 or 160 km, we assessed whether the differences observed between community types and nutrient and energy intakes were subject to be influenced by the race distance. None of the nutrients or the UFC intakes was significantly associated with race distance covered when corrected by the gut microbiota community type effect (p-value = 0.6, two-way ANOVA).
Other environmental factors such as hours traveled to arrive at the competition site (p-value = 0.49, Mann-Whitney U test), the stable from where they were coming (p-value = 0.74, Fisher exact test), trainer (p-value = 0.74, Fisher exact test) and the breeding establishment where they were born and kept together with their mothers until they were weaned (p-value = 0.13, Fisher exact test) were not significantly related to gut microbiota community types.
The gut microbiota was related to the biochemical and metabolomic profiles before the endurance race but had no role in the endurance performance
The basal profiles of total bilirubin, creatine kinase, aspartate transaminase, serum amyloid A, as well as non-esterified fatty acids and β-hydroxy-butyrate were not affected by the microbiota gut community types (p-value = 0.85; PERMANOVA test, Supplementary Table S7). Similarly, these biochemical variables were not associated with the genus-level community ordination (non-metric multidimensional scaling (NMDS) based on Bray-Curtis dissimilarity) when using the envfit function in the “vegan” R package (10,000 permutations, Benjamini-Hochberg multiple test correction <0.05).
Conversely, the basal metabolome profile (determined using the untargeted broad-window nuclear magnetic resonance; 1H NMR) was significantly associated (p-value = 0.024, PERMANOVA test) with the gut community types. At the level of single metabolites, we observed that blood alanine (adjusted p-value = 0.03) and valine (adjusted p-value = 0.05) significantly co-varied with gut community types, while the adjusted p-value for choline containing compounds (adjusted p-value = 0.07) and isoleucine (adjusted p-value = 0.06) were very close to the defined threshold of 0.05 (Mann-Whitney U test). All these metabolites were higher in the community type 1 relative to the community type 2 (Fig. 5A). Similarly, the blood basal levels choline containing compounds, alanine, valine, tyrosine and isoleucine were associated with the genus-level community ordination (NMDS based on Bray-Curtis dissimilarity) when using the envfit function (10,000 permutations, BH multiple test correction <0.05; Fig. 5B). The aforementioned associations were also confirmed by PLS-DA (Fig. 5C). Further information on blood metabolite profiles is depicted in Supplementary Data, Fig. S7 and Table S8.
Given that basal metabolome profiles are likely related to racing experience, racing training intensity and duration, we assessed whether those differences observed between gut community types on the basal molecules represented an indirect means by which race distance conditioned gut microbiota composition. None of the basal variables were significantly associated with distance, except for the choline containing compounds concentration (p-value = 0.04, two way ANOVA; Supplementary Table S9).
Unexpectedly, the biochemical (p-value = 0.90, PERMANOVA test) and metabolic (p-value = 0.35, PERMANOVA test) profiles during the endurance race were not associated with the gut microbiota communities, not with the genus-level community ordinations.
Beyond the lack of the biochemical and metabolomic regulations of the gut microbiota during the endurance race, we also observed that the microbiota community types were neither linked to the risk of being eliminated during the competition (p-value = 0.26, Fisher exact test; Supplementary Fig. S8A) nor the endurance race ranking (p-value = 0.62, Fisher exact test, Supplementary Fig. S8B). A sample size of 340 (for elimination) to 700 (for race ranking) would be necessary to achieve a type II error of 0.1 with the observed size effect and a probability of 0.9 (Supplementary Fig. S9A-B). Moreover, the sample size of our cohort would allow detecting an effect size 2 to 3 times larger than the observed effect for a type II error of 0.1 (Supplementary Fig. S9C-D). Detailed calculations on statistical power and effect size are reported in Supplementary Data.
Results showed that the constituent phyla, families, and genera within the gut microbiota were congruent with other studies performed in horses15,16,17,18,19,20,21,31,32,33,34. Clostridium XIVa, Fibrobacter and Treponema genera were the most abundant genera, which is in agreement with a study in Standardbred racehorses by Janabi et al.35 that showed that training led to an increase in Treponema and Clostridium spp.
Microbiome analyses showed that the endurance horses’ gut microbiota composition clustered into two functionally distinct groups, revealing possible diverse metabolic adaptations to endurance exercise between the two gut communities. Community type 1 showed lower richness and diversity at the OTUs level, but appeared more specialized to produce acetate as a mean of maintaining the energy supply as glucose concentrations fell during the race. The lower diversity and richness have been associated with increased energy harvest from food in obese humans36. Therefore, it is likely that community type 1 benefits the host during the endurance race proving significant amounts of acetate, the main substrate from fiber fermentation in horses (reviewed by Jansson and Lindberg, 201237).
Conversely, type 2 community presented higher richness and diversity at the OTUs level, anaerobic fungal loads, as well as elevated levels of metabolites that provided an alternate energy source (e.g. propionate and butyrate). Anaerobic fungi, which degrade fiber thanks to their rhizoids that have the ability to physically penetrate plant structural barriers23, could have led to higher concentrations of propionate, the main precursor for gluconeogenesis in animals, as well as butyrate. The higher N-butyrate concentration could have ameliorated gut mucosal inflammation and oxidative status38, common occurrences among athletes as body temperature increases and blood pools away from the gastrointestinal tract to periphery muscles and organs such as the heart and lungs during intense physical activity (reviewed by Clark and Mach1). Nevertheless, N-butyrate was not associated with protective effects on telomere length.
After discovering that the gut microbiota composition and function were clearly divided between the two communities, we investigated the major drivers influencing gut microbiota communities. The measured host properties (e.g. age, sex, breed, host kinship and telomere length) and athletic performance parameters (e.g. pulse rate, average speed during the race, distance covered, ranking and elimination) were not significantly associated with the gut community types. These results may constitute false-negative associations (type II error) as our current study lacks power to properly explain these associations due to the limited number of participants. A greater sample size is required to achieve statistical significance with host and endurance performance outcome measures.
However, a strong correlation occurred between estimated forage and daily energy intake and gut community types. These findings should be interpreted with caution, as nutrient intake estimation could be inaccurate due to owners or riders not remembering the consumption of specific ingredients or incorrectly estimating portion sizes, and the fact that nutrient composition was estimated from INRA food composition tables. The estimated lower forage intakes in community type 1 was in line with the lower relative abundance of important fibrolytic bacteria, including members of the family Ruminococcaceae, Eubacteriaceae and Lachnospiraceae39 (identified as hub-based key taxa in the co-occurrence network in type 2 individuals). However, other fibrolytic bacteria like Fibrobacter along with the Treponema, a hydrogen utilizing microorganism that work with fibrolytic bacteria23, were much higher in community type 1 compared to type 2 and were in line with results from other studies40, suggesting that the community type 1 harbored less complex but more specialized microbiota to produce acetate as a mean of maintaining the energy supply as the glucose concentrations fell during the race. Pan and colleagues41 showed that acetic acid enhanced endurance performance in exercise-trained mice by inducing enzymes involved in fatty acid oxidation.
At basal time, metabolites such as choline containing compounds, alanine and branched chain amino acids positively correlated with the community type 1’s microbiota composition. The link between community type 1 and choline containing phospholipids levels in blood, the most abundant lipids within the cell membrane, further supported that type 1 community metabolic capacity could enhance host energy metabolism during the endurance race. The modulation of the host energy metabolism was also reflected by the relationship between community type 1 and alanine, whose metabolism is involved in the glycolysis pathway. Community type 1 relationship with the branched chain amino acids, which can be converted in a complex mixture of metabolic end products, including ammonia, and SCFA, reinforces its role in regulating the energy balance during the endurance race. The elevated levels of branched chain amino acids (leucine, isoleucine, valine) may have also reduced endurance exercise- induced muscle damage and fatigue42.
Despite the complex relationship between the gut microbiota and energy production during endurance exercise, we found no significant association between blood metabolites (especially those associated with mitochondrial oxidative phosphorylation, fatty acid β-oxidation and gluconeogenesis) and gut microbiota composition during the endurance race.
The lack of association between blood metabolites during the race and the gut microbiota was probably due to some possible limits of the 16S rRNA gene sequencing for microbiota profiling or the 1H NMR-based approach for metabolomics. On one hand, the relatively short read length of Illumina-generated sequences along with the specific combination of primer pairs could substantially affect the proportion of single merged sequences as well as the accuracy and sensitivity of taxonomic annotation against the reference databases43. The resolution of taxonomical classification of sequences based on a limited segment of the 16S rRNA gene, such as the V3-V4 region, is relatively low44. One of the approaches to increase the resolution of taxonomical classification would be the use of whole metagenome shotgun sequencing, which assure non-amplification bias and unbiased estimation of taxonomic profiles44. On the other hand, the 1H NMR approach only detects those metabolites with high concentrations (e.g. ketone bodies, amino acids, glucose, pyruvate, choline, etc.). Additionally, because most of the metabolites had multiple peaks in the 1H NMR spectra, many of which overlapped each other, the estimation of the measure proportional to the concentration of compound could be ambiguous.
Research about the potential role the gut microbiota plays in athletic performance is in its infancy. Further investigation is needed to better understand the role diet plays in modulating the gut microbiota in endurance athletes, and consequently metabolic pathways that attenuate exercise-induced inflammation, oxidative stress, intestinal permeability and energy homeostasis.
Fifty-two horses (8 males, 22 mares and 22 geldings; mean ± SEM age: 9 ± 1.75 years) were recruited on a volunteer basis from a cohort of 248 participants. We obtained informed consent from horse owners or riders. For successful participation in the study, we required the following criteria: (i) written informed consent; (ii) blood sample collection before and after the race; (iii) feces collection before the race; (iv) absence of gastrointestinal disorders during the four months prior to enrollment; (v) absence of antibiotic treatment during the four months prior to enrollment and absence of anthelmintic medication within 60 days before the race, and (vi) a complete questionnaire about diet composition and intake. Prior to completing the questionnaires, owners and riders were provided with a written description of the procedures. The number of volunteers enrolled in the study was limited because a high percentage of potential volunteers declined to participate due to they worried about the mechanical phlebitis or chemical phlebitis after blood sampling (specially before the endurance race).
We categorized participants based on the race distance: 90 km, 120 km and 160 km (Supplementary Fig. S1). Out of the 91 horses ascribed to the 90 km race, 19 horses met all inclusion criteria. For the 120 km category, we enrolled only 17 horses out of 101, whereas a total of 16 out of 56 horses that raced over 160 km met all inclusion criteria.
For each horse included in this study, we detailed the age, sex, breed, distance covered during the race, whether the horse completed the race and reason for elimination if any (metabolic, lameness, tiredness or other), pulse rate on arrival at the veterinary check point, the average speed during the race, and the race ranking over all animals racing the same distance. The athletic performance ranking of the horse was calculated by taking into account the average speed, recovery time, riding time and the horse and rider’s ability to finish the race. We also recorded environmental factors such as the hours traveled to arrive at the competition site, the stable from where they were coming, the trainer and the breeding establishment where they were born and kept together with their mothers until they were weaned. Lastly, the estimated macronutrient intakes per horse were also detailed (see the section Nutrient intake estimation for further details).
Since it has been previously established that gut microbiota profiles might be shaped by host genetics45, we used the “kinship2” R package to create the numerator relationship matrix, which estimates the genetic parameters and predicts breeding values between animals. Both pedigree tree and correlation structure matrix are depicted in Supplementary Figs S10 and S11, respectively.
The weather conditions, terrain difficulty and altitude were the same for all participants enrolled in the study as all races (90, 120 and 160 km) took place during the second weekend of October 2015 in Fontainebleau (France). The average air temperature was 15 °C, with a maximum of 20 °C and a minimum of 11 °C. The average air humidity was 88%. No rain was recorded.
All individuals passed the International Equestrian Federation (FEI)’s compulsory examination before the start. During the endurance competition, all animals underwent veterinary checks every 30- to 40-km during the race, followed by recovery periods of 40 to 50 minutes, which is in accordance with the FEI rules on endurance riding. The veterinarians ensured that horses were not suffering from lameness, metabolic troubles or dehydration, and that their heart rate was low enough to start racing another 30 to 40 km. Horses that failed to complete the race were considered as non-finishers or eliminated.
The local animal care and use committee reviewed and approved the study protocol (ComEth EnvA-Upec-ANSES; reference: 11-0041, dated July 12th 2011). All the protocols were conducted in accordance with EEC regulation (no 2010/63/UE) governing the care and use of laboratory animals, which has been effective in France since the 1st of January 2013. In all cases, the owners and riders provided their informed consent prior to the start of study procedures with the animals.
Blood sampling and measurements
Blood samples were collected the day before the event (Basal, T0) and immediately after the end of the competition (T1). Whole blood samples before and after the race were taken in EDTA and lithium heparin tubes (BD Vacutainer®, 10 mL) for biochemical assays and in fluoride-oxalate tubes for metabolomics analysis. Serum and plasma samples were maintained at +4 °C until analysis. Plasma samples collected for 1H NMR analysis were frozen immediately after centrifugation and stored at −80 °C.
Additionally, at basal time, blood samples for telomere length profiling were obtained from each animal using EDTA tubes (BD Vacutainer®). After centrifugation, the packed blood cells were maintained at −80 °C until DNA extraction.
Blood biochemical assays
Sera were assayed for total bilirubin, conjugated bilirubin, total protein, creatinine, creatine phosphokinase, aspartate amino transferase, gamma glutamyl transferase, non-esterified fatty acids, beta hydroxy-butyrate and serum amyloid A levels on a RX Imola analyzer (Randox, UK).
Telomere length measurement
The telomere length measurement assay was adapted from the method described by Cawthon and colleagues46. For each DNA sample, the ratio between telomere repeat copy number and the single copy interferon γ (IFGM) was calculated. The ratio was proportional to the average telomere length. The forward primer for the telomere PCR was Tel1F [5′-CGGTTTGTTT GGGTTTGGGTTTGGGTTTGGGTTTGGGTT-3′] and the reverse primer was TelR [5′-GGCTTGCCTTACCCTTACCCTTACCCTTACCCTTACCCT-3′]. Values reported correspond to ∆∆Ct between Ct value of telomeric region (T) amplification and IFGM (S) amplification for each sample relative to a random control sample, here the value of the animal “EcaOmic.213”. Further details on the protocol are reported elsewhere17.
1H Nuclear magnetic resonance (NMR) data acquisition
The plasmas were thawed at room temperature. In the 5 mm NMR tubes, 600 µL of plasma was added with 100 µL deuterium oxide for field locking. The 1H NMR spectra were acquired with a Bruker® Advance III spectrometer (Bruker BioSpin, Wissembourg, France) and a 5 mm reversed QXI Z-gradient high-resolution Bruker probe. Proton spectra were acquired at 500 MHz. The water signal was suppressed with a pre-saturation pulse (3.42 × 10−5 W) during relaxation delay (3 s) at the water resonance frequency. The signal processing was performed in an automatic routine with an in-housing code using NMRpipe. The processing method included an integral calculation of 0.001 ppm regions from 9.5 to 0 ppm. Each region was referred as bins. The region between 4.5 and 5 ppm corresponding to residual water signal was withdrawn. The 1H NMR data was normalized according to the spectra using the probabilistic quotient method47 and the bins, corresponding to variables in the statistical analysis, were scaled to unit variance. Further explanation on sample preparation, data acquisition, data quality control, spectroscopic data-pre-processing, and data pre-processing including bin alignment, scaling and normalization are broadly explained elsewhere48.
The metabolite identification was then performed by using structure message of metabolites acquired from other available biochemical databases, such as human metabolome database (HMD), http://www.hmdb.ca; KEGG, http://www.genome.jp/kegg/; METLIN, http://metlin.scripps.edu/; Chemical Entities of Biological Interest (http://www.ebi.ac.uk/Databases/); and Lipidmaps (http://www.lipidmaps.org/) and literature4,48. Metabolite assignment of each peak was considered when chemical shifts of peaks in the samples were the same as in the publicly available reference databases or literature (with a shift tolerance level of ± 0.005 ppm), in order to counter-act the effects of measurements and pre-processing variability introduced by factors such as pH values and solvents. A manual curation for identified compounds was done by an expert in horse metabolomics4,48. Afterwards, the relative abundance of each metabolite was calculated as the area of the peak49.
Feces sampling and measurements
Fresh fecal samples were obtained from all horses while monitoring them before the race (no more than 24 h before starting the race, in all cases). One fecal sample from each animal was collected off the ground immediately after defecation as described by Mach et al.17. Since most of the horses experienced dehydration after the race, the gastrointestinal emptying was significantly delayed and consequently we were not able to recover the feces after the race.
Feces pH and short chain fatty acids (SCFA) concentration
The feces pH was immediately determined after 10% fecal suspension (wt/vol) in saline solution (0.15 M NaCl solution). SCFAs were measured in fecal samples as previously described in Mach et al.17.
Microorganisms DNA extraction
Total DNA was extracted from aliquots of frozen fecal samples (200 mg; 52 samples at rest), using E.Z.N.A.® Stool DNA Kit (Omega Bio-Tek, Norcross, Georgia, USA). The DNA extraction protocol was carried out according to the manufacturer’s instructions (Omega- Bio-Tek, Norcross, Georgia, USA). DNA was then quantified using the Qubit dsDNA HS assay kit (Thermo Fisher Scientific®; Waltham, MA USA).
V3–V4 16S rRNA gene amplification
The V3-V4 hyper-variable regions of the 16S rRNA gene were amplified as previously reported in with two rounds of PCR as previously described in Mach et al.17 and Clark et al.40. The concentration of the purified amplicons was measured using Nanodrop 8000 spectrophotometer (Thermo Scientific) and the quality of a set of amplicons was checked using DNA 7500 chips onto a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). All libraries were pooled at equimolar concentration in order to generate equivalent number of raw reads with each library. The final pool had a diluted concentration of 5 nM to 20 nM and was used for sequencing. Amplicon libraries were mixed with 15% PhiX control according to the Illumina’s protocol. For this study, one-sequencing run was performed using MiSeq. 500 cycle reagent kit v2 (2 × 250 output; Illumina, USA).
V3–V4 16S rRNA gene sequencing and data preprocessing
First, forward and reverse paired-end sequence reads were collapsed into a single continuous sequence according to the ‘fastq-join’ option of the ‘join_paired_ends.py’ command in QIIME. The fastq-join function allowed a maximum difference within overlap region of 8%, a minimum overlap setting of 6 bp and a maximum overlap setting of 60 bp. The reads that did not overlap (~20% of the total) were removed from the analysis. Anomalously joined reads–reads that were too short or too long were excluded according to the expected size of each targeting region (438–469 bp for region V3–V4). The retained sequences were then quality filtered. De-multiplexing, primer removal and quality filtering processes were performed using the ‘split_libraries’_fastq.py command in QIIME. We applied a default base call Phred threshold of 20, allowing maximum three low-quality base calls before truncating a read, including only reads with >75% consecutive high-quality base calls, and excluding reads with ambiguous (N) base calls52.
Subsequently, the sequences were clustered into OTUs against the GreenGenes database (release 2013-08: gg_13_8_otus)53 by using the uclust54 method at a 97% similarity cutoff. The filtering of chimeric OTUs was performed by using Usearch version 6.155 against the GreenGenes reference alignment53. A phylogenic tree was generated from the filtered alignment using FastTree56. Because the relatively short read length of Illumina-generated sequences could reduce the resolution of taxonomic annotation against the GreenGenes reference database, the resulting OTU representative sequences were then searched against the Ribosomal Database Project naïve Bayesian classifier database (RDP 10 database, version 6)58, using the online program SEQMATCH (http://rdp.cme.msu.edu/seqmatch/seqmatch_intro.jsp). A confidence threshold of 0.80 was required. Singletons, defined here as taxa observed in only one sample, were removed as low-frequency reads offered no meaningful information in our experiment and instead added noise to the statistical models used. They were more likely to represent sequencing errors, contaminants, or transient organism without a biological role at the niche under study57. Singletons were discarded from the dataset using the ‘filter_otus_from_otu_table.py’ script in QIIME. Using OTU abundance and the corresponding taxonomic classifications, feature abundance matrices were calculated at different taxonomic levels, representing OTUs and taxa abundance per sample. The “Phyloseq”59, “Vegan”60 and “microbiome” R packages were used for the detailed downstream analysis on abundance matrix.
In the end, a total of 4,426,099 paired-end 250 bp reads were obtained, 3,611,320 of which were retained as high-quality sequences (Supplementary Table S10). On average, a total of 59,256 sequences per sample were achieved in the study, with a mean length of 441 ± 15 bp. These sequences were clustered into 8,229 non-singleton OTUs using the reference-based OTU-picking process. Among them, 7,185 were classified taxonomically down to the genus level (Supplementary Table S11). OTU counts per sample and OTU taxonomical assignments are available in Supplementary Table S11.
Relative abundance normalization was applied, which divides raw counts from a particular sample by the total number of reads in each sample.
Real-time quantitative PCR (qPCR) analysis of bacterial, anaerobic fungal and protozoan loads
Concentrations of protozoa, anaerobic fungi and bacteria in fecal samples were quantified using a QuantStudio 12 K Flex real-time instrument (Thermo Fisher Scientific, Waltham, USA). Primers for real-time amplification of protozoa (FOR: 5′-GCTTTCGWTGGTAGTGTATT-3′; REV: 5′-CTTGCCCTCYAATCGTWCT-3′), anaerobic fungi (FOR: 5′-TCCTACCCTTTGTGAATTTG-3′; REV: 5′-CTGCGTTCTTCATCGTTGCG-3′) and bacteria (5′-CAGCMGCCGCGGTAANWC-3′; REV: 5′-CCGTCAATTCMTTTRAGTTT-3′). Details on thermal cycles and the creation of standard curve for absolute quantification are reported in Mach et al.17 and Clark et al.40. To generate quantification curves, purified DNA was quantified using the Qubit dsDNA HS assay kit (Thermo Fisher Scientific®; Waltham, MA USA). This DNA was subsequently diluted serially by copy number and amplified using the 16S rRNA, the 18S rRNA or ITS1 qRT-PCR assays. The standard curve was included in each run. After each run, melting curve analysis was performed to the presence of the desired amplicon and to confirm the lack of primer dimers. In all cases, the melting curves analysis did not reveal any contamination due to genomic DNA or to non-specific amplification. Gel electrophoresis analysis of the PCR products also showed a single band of the expected size. The qPCR efficiencies covering the three amplicons were calculated in each run. For the ITS1 amplicon, efficiency values ranged from 1.902 to 1.988 with R2 value (square regression coefficient) > 0.95. In the assays targeting the 16S rRNA gene, efficiency values ranged from 1.861 to 1.918 with R2 value > 0.95, whereas in the assays targeting the 18S rRNA gene, efficiency values ranged from 1.888 to 1.959 with R2 value > 0.95.
Nutrient intake estimation
The detailed dietary records for one month prior to fecal collection were performed through an in-depth interview with the horse’s owner or rider (similar to those reported in human microbiota studies)61,62. Briefly, for each animal, the proportion and type of hay, cereals, and commercial feed supplements were recorded and carefully reviewed by the research staff. The mean intakes of macro- and micronutrient intake from forage (mainly alfalfa-hay) and the different type of cereals consumed were then calculated based on the INRA food composition database for horses63. Additionally, the quantity of macro and micronutrient listed in the commercial feed supplements consumed according to the detailed descriptions of the respondents were obtained. The cereals (mainly based on barley and soybean meal) and the commercial feed supplements were considered as concentrates.
Then, the nutrient intake estimates of fat, protein, fiber, ash, as well as the MADC (g/d)64 and the total amount of UFC64 were estimated from the questionnaire by multiplying the frequency of the ingredient consumption by weight of an estimated average portion and nutrient content of the ingredient in question (Supplementary Table S12).
Alpha diversity indices of the fecal microbiota
The α-diversity indexes were calculated using the “Phyloseq” and “microbiome” R packages from OTU abundance and relative genera abundance tables. The “microbiome” R package allowed us to study other global indicators of the gut ecosystem state, including measures of diversity, evenness, dominance, rarity, divergences and abundance. All samples were normalized using the rarefy_even_depth function in the “Phyloseq” R package, which is implemented as an ad hoc means to normalize microbiome counts that have resulted from libraries of widely differing size. During the alpha diversity step, singletons were kept in the analysis. Subsequently, singletons were removed.
Beta diversity of fecal microbiota
To estimate β-diversity, un-weighted and weighted UniFrac distances, as well as Bray-Curtis dissimilarity, were calculated from the OTU and the genera relative abundance tables. The β-diversity was visualized using PCoA, correspondence analysis (CA) and the NMDS with the “Phyloseq” and “ggplot2” R packages.
Description of the core microbiota
The core microbiome of individual samples was calculated using a detection threshold of 0.001% and a prevalence threshold of 99.9% (e.g. a given genera must be present in 99.9% of individuals with a relative abundance of at least 0.001%) in the “microbiome” R package.
Clustering of feces metagenomic samples into community types
The inter-individual variations in the gut microbiota composition was studied based on the conceptual framework of enterotypes, or more generically, community types65. According to this framework, the samples were clustered into bins based on their taxonomic similarity66. Briefly, clustering was performed with PAM67 using weighted UniFrac distance of the normalized OTUs counts. The optimal number of clusters or communities was chosen by the maximum average silhouette width, known as the silhouette coefficient (SC)68. The quality of those clusters or community types was assessed by the same measure, following the accepted interpretation that SC values above 0.5 indicate a reasonable clustering structure69. The graphical representation was performed through PCoA, CA and the non-negative matrix factorization (NMF) with the Kullback-Leibler dissimilarity70. To reinforce the robustness of the communities, we implemented the PAM with un-weighted UniFrac distance, Euclidean distance and Bray-Curtis dissimilarity on the normalized genera counts.
The contribution of each genus to the community type structure
The PLS-DA was used to identify the key genera responsible for the differences in the gut community types using “mixOmics” R package, coupled with the smallest p-values obtained from the Mann Whitney U test or the Fisher exact test. It is important to note that these p-values are only indicative, since p-values were computed on the same data used to define the two clusters. Furthermore, we explored the community type composition structure through two additional approaches: (i) the sparse-PLS-DA in a 10-fold cross-validation framework to identify the minimum significant discriminant genera that led to the same microbiota community clustering. The performance of PLS-DA, in selecting relevant predictors, was then being investigated by means of the area under the curve (AUC) of a receiver operating characteristic (ROC) curve. A test with a ROC-AUC of 1.0 is perfectly accurate, because the sensitivity is 1.0 (meaning that all relevant predictors were correctly identified, without irrelevant predictors wrongly assigned to the positive class); and (ii) the sparse k-means with the standardized Euclidean distance to identify the small fraction of genera that led to almost identical community clusters. In contrast to sparse PLS-DA, sparse k-means algorithm did not make use of predetermined clusters.
To explore whether the microbiota community structure was supported by a large number of genera, a cluster analysis based on Bray-Curtis distance was performed after removing the k highest rank genus defined by the PLS-DA loadings values. The genera with the highest-ranking cluster-discriminatory value among individuals were removed, starting from k = 1 until k = 100. For each k, the number of misclassified samples from the defined gut microbiota community types and the silhouette index were computed.
Network inference at the genus level for each community type
Networks at the genus level were inferred between community types. In order to prevent the compositional effects bias typical of the classical correlations methods, we calculated the correlations among genera using the PCIT approach, which identifies significant co-occurrence patterns through a data-driven methodology71. The genera with <0.1% mean relative abundances were excluded to acquire the results for the taxa that met the statistical conditions for correlation estimations. Nodes in the network represent the genera and edges that connect these nodes represent correlations between genera. Based on correlation coefficient and p-values for correlation, we constructed co-occurrence networks. The cutoff of p-values was 0.05. The cutoff of correlation coefficients was determined as r ≥ |0.35|. Network properties were calculated with the NetworkAnalyzer plugin in Cytoscape. We used the Cytoscape to visualize the network. Strong and significant correlation between nodes (r ≥ |0.60|) were represented with larger edge width in the network.
Association between gut microbiota, host, gut and environmental variables
The association between microbiota, host, gut and environmental variables was tested using both the microbiota community types and the relative genera abundance table.
First, we tested the hypothesis-driven associations between the gut community types and host, gut parameters and environmental variables using two different statistical approaches, namely: (i) the PERMANOVA test (a non-parametric method of multivariate analysis of variance based on pairwise distances) implemented in the adonis2 function from “Vegan” R package. PERMANOVA allows testing the global association between community types and a group of variables, and (ii) the nonparametric Mann-Whitney U test for continuous variables and the Fisher exact test for categorical variables followed by BH multiple test correction. An adjusted p-value < 0.05 was considered as significant.
The host variables tested included: age, breed, sex, telomere length, host kinship, as well as endurance performance parameters such as biochemical and metabolome parameters at basal time and post-race, distance race covered, pulse rate on arrival at the veterinary check, the average speed during the race, the athletic performance ranking and whether the horse completed the race or was eliminated. The athletic performance ranking was categorized into the top 25% ranked performers, the individuals that arrived between the top 25% and 75% positions, the individuals that arrived in the last 25% positions and those eliminated from the race. The relative athletic performance ranking was computed among all ranked animals for each distance race (90, 120 and 160 km). To eliminate the inter-individual variability, the Δ values of biochemical and metabolic profiles (T1–T0) were considered.
The gut variables analyzed were the feces pH, the SCFA proportions in feces as well as the anaerobic fungal, bacterial and protozoan loads in feces, whilst the environmental variables included daily nutrient intakes corrected by UFC as well as hours traveled to arrive at the competition site, trainers, the stable from where they were coming, and the breeding establishment where they born and were kept together with their mothers until they were weaned. Additionally, we used two-way ANOVA to evaluate the relative effects of microbiota community types and race distance on the total observed variability in host, gut and environmental variables.
Second, covariates of microbiome variation (based on the relative genera abundance table) were identified by calculating the association between continuous host and gut phenotypes and genus-level community ordination (NMDS based on Bray-Curtis dissimilarity) with envfit function72 in the “vegan” R package, with 10,000 permutations73 and BH multiple testing correction. An adjusted p-value < 0.05 was considered as significant. This method enabled the selection of combined covariates with strongest correlation to microbiota variation.
Power analysis and effect size
A post hoc power analysis was conducted for the association between host and performance phenotypes and the gut microbiota community types. The power analysis for sex, breed, and risk of being elimination during the race was tested via Fisher exact test, whereas the Mann-Whitney U test was applied for age and athletic performance ranking. Fist, the power analysis was computed for an increasing sample size, while preserving the observed proportion of each gut community type among the cohort. Second, for the Fisher exact test with two classes (e.g. risk of being eliminated from the race) and for the Mann-Whitney U test applied to variables such as age and athletic performance ranking, the statistical power was computed for an increasing effect size, while preserving the observed sample size. For Mann-Whitney U test, we considered the skewed uniform distribution. The effect size considered was the ratio of the distribution mean over the two community types for Mann-Whitney U-test and the proportion of success over the two community types for Fisher exact test.
The metabolomic data is available at the NIH Common Fund’s Data Repository and Coordinating Center (supported by NIH grant, U01-DK097430) website, http://www.metabolomicsworkbench.org), where it has been assigned a Metabolomics Workbench Project ID: (UrqK1489). The data is directly accessible at: http://dev.metabolomicsworkbench.org:22222/data/DRCCMetadata.php?Mode=Study&StudyID=ST000945.
The targeted locus study project has been deposited at DDBJ/EMBL/GenBank under the accession KBTQ00000000. The version described in this paper is the first version, KBTQ01000000. The bioproject described in this paper belongs to the BioProject PRJNA438436. The corresponding BioSamples accession numbers were SAMN08715709 to SAMN08715760.
Clark, A. & Mach, N. Exercise-induced stress behavior, gut-microbiota-brain axis and diet: a systematic review for athletes. J. Int. Soc. Sports Nutr. 13, 43 (2016).
Clark, A. & Mach, N. The Crosstalk between the Gut Microbiota and Mitochondria during Exercise. Front. Physiol. 8 (2017).
Mach, N. & Fuster-Botella, D. Endurance exercise and gut microbiota: A review. J. Sport Heal. Sci. 6, 179 (2017).
Mach, N. et al. Understanding the response to endurance exercise using a systems biology approach: combining blood metabolomics, transcriptomics and miRNomics in horses. BMC Genomics 18, 187 (2017).
Mach, N. et al. Integrated mRNA and miRNA expression profiling in blood reveals candidate biomarkers associated with endurance exercise in the horse. Sci. Rep. 6, 22932 (2016).
Knuiman, P., Hopman, M. T. E. & Mensink, M. Glycogen availability and skeletal muscle adaptations with endurance and resistance exercise. Nutr. Metab. 12, 1–11 (2015).
Radak, Z., Chung, H. Y., Koltai, E., Taylor, A. W. & Goto, S. Exercise, oxidative stress and hormesis. Ageing Res. Rev. 7, 34–42 (2008).
Mukherjee, K. et al. Whole blood transcriptomics and urinary metabolomics to define adaptive biochemical pathways of high- intensity exercise in 50–60 year old masters athletes. PLoS One 9 (2014).
Barton, W. et al. The microbiome of professional athletes differs from that of more sedentary subjects in composition and particularly at the functional metabolic level. Gut 67, 625 (2017).
Denou, E., Marcinko, K., Surette, M. G., Steinberg, G. R. & Schertzer, J. D. High-intensity exercise training increases the diversity and metabolic capacity of the mouse distal gut microbiota during diet-induced obesity. Am. J. Physiol. - Endocrinol. Metab. 310, E982–E993 (2016).
Estaki, M. et al. Cardiorespiratory fitness as a predictor of intestinal microbial diversity and distinct metagenomic functions. Microbiome 4, 1–13 (2016).
Matsumoto, M. et al. Voluntary Running Exercise Alters Microbiota Composition and Increases n-Butyrate Concentration in the Rat Cecum. Biosci. Biotechnol. Biochem. 72, 572–576 (2008).
Allen, J. M. et al. Voluntary and forced exercise differentially alters the gut microbiome in C57BL/6J mice. J. Appl. Physiol. 118, 1059–1066 (2015).
Mackie, R. I. & Wilkins, C. A. Enumeration of anaerobic bacterial microflora of the equine gastrointestinal tract. Appl. Environ. Microbiol. 54, 2155–2160 (1988).
Steelman, S. M., Chowdhary, B. P., Dowd, S., Suchodolski, J. & Janečka, J. E. Pyrosequencing of 16S rRNA genes in fecal samples reveals high diversity of hindgut microflora in horses and potential links to chronic laminitis. BMC Vet. Res. 8, 231 (2012).
Venable, E. B. et al. Effects of Feeding Management on the Equine Cecal Microbiota. J. Equine Vet. Sci. 49, 113–121 (2017).
Mach, N. et al. The effects of weaning methods on gut microbiota composition and horse physiology. Front. Physiol. 8 (2017).
Costa, M. C. et al. Comparison of the fecal microbiota of healthy horses and horses with colitis by high throughput sequencing of the V3–V5 region of the 16s rRNA gene. PLoS One 7 (2012).
Shepherd, M. L., Swecker, W. S., Jensen, R. V. & Ponder, M. A. Characterization of the fecal bacteria communities of forage-fed horses by pyrosequencing of 16S rRNA V4 gene amplicons. FEMS Microbiol. Lett. 326, 62–68 (2012).
Costa, M. C. et al. Characterization and comparison of the bacterial microbiota in different gastrointestinal tract compartments in horses. Vet. J. 205, 74–80 (2015).
Weese, J. S. et al. Changes in the faecal microbiota of mares precede the development of post partum colic. Equine Vet. J. 47, 641–649 (2015).
Salem, S. E. et al. Variation in faecal microbiota in a group of horses managed at pasture over a 12-month period. Sci. Rep. 8, 1–10 (2018).
Julliand, V. & Grimm, P. The Impact of Diet on the Hindgut Microbiome. J. Equine Vet. Sci. 52, 23–28 (2017).
Daly, K. et al. Alterations in microbiota and fermentation products in equine large intestine in response to dietary variation and intestinal disease. Br. J. Nutr. 107, 989–995 (2012).
Hansen, N. et al. High nutrient availability reduces the diversity and stability of the equine caecal microbiota. Microb. Ecol. Health Dis. 1, 1–8 (2015).
Dougal, K. et al. Characterisation of the faecal bacterial community in adult and elderly horses fed a high fibre, high oil or high starch diet using 454 pyrosequencing. PLoS One 9 (2014).
Fernandes, K. A. et al. Faecal microbiota of forage-fed horses in New Zealand and the population dynamics of microbial communities following dietary change. PLoS One 9, e112846 (2014).
Willing, B. et al. Changes in faecal bacteria associated with concentrate and forage-only diets fed to horses in training. Equine Vet. J. 41, 908–914 (2009).
Harlow, B. E., Donley, T. M., Lawrence, L. M. & Flythe, M. D. Effect of starch source (corn, oats or wheat) and concentration on fermentation by equine faecal microbiota in vitro. J. Appl. Microbiol. 119, 1234–1244 (2015).
Daly, K., Stewart, C. S., Flint, H. J. & Shirazy-Beechey, S. P. Bacterial diversity within the equine large intestine as revealed by molecular analysis of cloned 16S rRNA genes. FEMS Microbiol. Ecol. 38, 141–151 (2001).
Dougal, K. et al. Identification of a Core Bacterial Community within the Large Intestine of the Horse. PLoS One 8 (2013).
Destrez, A., Grimm, P. & Julliand, V. Dietary-induced modulation of the hindgut microbiota is related to behavioral responses during stressful events in horses. Physiol. Behav. 202, 94–100 (2019).
Antwis, R. E., Lea, J. M. D., Unwin, B. & Shultz, S. Gut microbiome composition is associated with spatial structuring and social interactions in semi-feral Welsh Mountain ponies. Microbiome 6, 207 (2018).
Biddle, A. S., Tomb, J.-F. & Fan, Z. Microbiome and Blood Analyte Differences Point to Community and Metabolic Signatures in Lean and Obese Horses. Front. Vet. Sci. 5, 12–14 (2018).
Janabi, A. H. D., Biddle, A. S., Klein, D. & McKeever, K. H. Exercise training-induced changes in the gut microbiota of Standardbred racehorses. Comp. Exerc. Physiol. 12, 119–130 (2016).
Rasmussen, S. et al. Richness of human gut microbiome correlates with metabolic markers. Nature 500, 541–546 (2013).
Jansson, A. & Lindberg, J. E. A forage-only diet alters the metabolic response of horses in training Animal. 1939–1946, https://doi.org/10.1017/S1751731112000948 (2012).
Donohoe, D. The warburg effect dictates the mechanism of butyrate-mediated histone acetylation and cell proliferation. Mol. Cell. 48, 612 (2012).
Biddle, A., Stewart, L., Blanchard, J. & Leschine, S. Untangling the genetic basis of fibrolytic specialization by lachnospiraceae and ruminococcaceae in diverse gut communities. Diversity 5, 627–640 (2013).
Clark, A. et al. Strongyle infection and gut microbiota: Profiling of resistant and susceptible horses over a grazing season. Front. Physiol. 9 (2018).
Pan, J. H. et al. Acetic acid enhances endurance capacity of exercise-trained mice by increasing skeletal muscle oxidative properties. Biosci. Biotechnol. Biochem. 79, 1535–1541 (2015).
Kim, D.-H., Kim, S.-H., Jeong, W.-S. & Lee, H.-Y. Effect of BCAA intake during endurance exercises on fatigue substances, muscle damage substances, and energy metabolism substances. J. Exerc. Nutr. Biochem. 17, 169–180 (2015).
Whon, T. W. et al. Data Descriptor: The effects of sequencing platforms on phylogenetic resolution in 16S rRNA gene profiling of human feces. Sci. Data 5, 1–15 (2018).
Jovel, J. et al. Characterization of the gut microbiome using 16S or shotgun metagenomics. Front. Microbiol. 7, 1–17 (2016).
Lozupone, C., Stomabaugh, J., Gordon, J., Jansson, J. & Knight, R. Diversity, stability and resilience of the human gut microbiota. Nature 489, 220–230 (2012).
Cawthon, R. M. Telomere measurement by quantitative PCR. Nucleic Acids Res. 30, 47e–47 (2002).
Dieterle, F., Ross, A., Schlotterbeck, G. & Senn, H. Probabilistic Quotient Normalization as Robust Method to Account for Dilution of Complex Biological Mixtures. Application in 1H NMR Metabonomics. Anal. Chem. 78, 4281–4290 (2006).
Le Moyec, L. et al. Protein catabolism and high lipid metabolism associated with long-distance exercise are revealed by plasma NMR metabolomics in endurance horses. PLoS One 9, 1–10 (2014).
Zheng, C., Zhang, S., Ragg, S., Raftery, D. & Vitek, O. Identification and quantification of metabolites in 1H NMR spectra by Bayesian model selection. Bioinformatics 27, 1637–1644 (2011).
Caporaso, J. G. et al. correspondence QIIME allows analysis of high- throughput community sequencing data Intensity normalization improves color calling in SOLiD sequencing. Nat. Publ. Gr. 7, 335–336 (2010).
Rideout, J. R. et al. Subsampled open-reference clustering creates consistent, comprehensive OTU definitions and scales to billions of sequences. PeerJ 2, e545 (2014).
Navas-Molina, J. A. et al. Advancing our understanding of the human microbiome using QIIME. Methods Enzymol. 531, 371 (2013).
DeSantis, T. Z. et al. Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Appl. Environ. Microbiol. 72, 5069–5072 (2006).
Edgar, R. C. Search and clustering orders of magnitude faster than BLAST. Bioinformatics 26, 2460–2461 (2010).
Edgar, R. C., Haas, B. J., Clemente, J. C., Quince, C. & Knight, R. UCHIME improves sensitivity and speed of chimera detection. Bioinformatics 27, 2194–2200 (2011).
Price, M. N., Dehal, P. S. & Arkin, A. P. FastTree 2 - Approximately maximum-likelihood trees for large alignments. PLoS One 5 (2010).
Cole, J. R. et al. The Ribosomal Database Project: Improved alignments and new tools for rRNA analysis. Nucleic Acids Res. 37, 141–145 (2009).
de la Cuesta-Zuluaga, J. & Escobar, J. S. Considerations For Optimizing Microbiome Analysis Using a Marker Gene. Front. Nutr. 3, 1–12 (2016).
Mcmurdie, P. J. & Holmes, S. Phyloseq: a bioconductor package for handling and Analysis of High-Throughput Phylogenetic Sequence Data. Pac. Symp. Biocomput. 1, 235–246 (2012).
Dixon, P. VEGAN, a package of R functions for community ecology. J. Veg. Sci. 14, 927 (2003).
De Filippo, C. et al. Impact of diet in shaping gut microbiota revealed by a comparative study in children from Europe and rural Africa. Proc. Natl. Acad. Sci. 107, 14691–14696 (2010).
Fernandes, J., Su, W., Wolever, T. M. S. & Comelli, E. M. Adiposity, gut microbiota and faecal short chain fatty acids are linked in adult humans. Nutr. Diabetes 4, 121 (2014).
Martin-rosset, W. Equine Nutrition: INRA nutrient requirements, recommended allowances and feed tables. Acad. Publ. Wageningen (The Netherlands) 696 (2015).
Martin-Rosset, W., Vermorel, M., Doreau, M., Tisserand, J. L. & Andrieu, J. The French horse feed evaluation systems and recommended allowances for energy and protein. Livest. Prod. Sci. 40, 37–56 (1994).
Arumugam, M. et al. Enterotypes of the human gut microbiome. Nature 473, 174–180 (2013).
Ding, T. & Schloss, P. D. Dynamics and associations of microbial community types across the human body. Nature 509, 357–60 (2014).
Kaufman, L. & Rousseuw, P. J. Finding Groups in Data: An Introduction to Cluster Analysis. Biometrics 47, 788 (1991).
Rousseeuw, P. J. Silhouettes: A graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987).
Calinski, T. & Harabasz, J. A dendrite method for cluster analysis. Commun. Stat. - Theory Methods 3, 1–27 (1974).
Brunet, J. P., Tamayo, P., Golub, T. R. & Mesirov, J. P. Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci USA 101, 4164–4169 (2004).
Reverter, A. & Chan, E. K. F. Combining partial correlation and an information theory approach to the reversed engineering of gene co-expression networks. Bioinformatics 24, 2491–2497 (2008).
Clarke, K. R. & Ainsworth, M. A method of linking multivariate community structure to environmental variables. Mar. Ecol. Prog. Ser. 92, 205–219 (1993).
Benjamini, Y. & Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple. Testing. J. R. Stat. Soc. Ser. B 57, 289–300 (1995).
We are grateful to Marine Beinat, Julie Rivière, Emmanuelle Rebours, Jordi Estellé, Caroline Morgenthaler and Arni Janssen for participating in the sample collection and organization during the project. We also thank Alizé Nevot for helping compiling information about individual’s training intensity, duration and frequency as well as diet, and Valentine Ballan for helping gathering data associated with the genetic relativeness between individuals and metabolite identification. Lastly, we are grateful to the INRA MIGALE bioinformatics platform (http://migale.jouy.inra.fr) for providing computational resources.
The authors declare no competing interests.
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.