Gut microbiota (GM) have an indisputable effect on different host phenotypes, including immunity, digestion efficiency and cognitive functioning (Turnbaugh et al. 2006; Round and Mazmanian 2009; Cryan and Dinan 2012). As such, host–GM symbiosis can represent an important extension of current evolutionary theory (Zilber-Rosenberg and Rosenberg 2008). Most host vs. microbiota interaction studies over the past two decades have focused on the microbiota of animals held in captivity or humans. Such a narrow focus limits possibilities for gaining insights into the nature of such interactions from an evolutionary perspective (Hird 2017). For example, both captive-bred animals and westernised human populations harbour highly derived microbial communities (Clayton et al. 2016; Gomez et al. 2016; McKenzie et al. 2017; Rosshart et al. 2017), not necessarily reflecting mutual adaptations that have taken place during the co-evolutionary history of the GM and its host. Relaxed natural selection in captivity also restricts measurements of GM fitness effects.

To overcome these limitations, a growing number of studies have been describing the composition and function of symbiotic microbiota in wild animal populations. It has now been shown that GM are involved in modulation of a number of key ecophysiological and life-history traits in wild animal hosts and have the potential to increase their adaptation capacity (Alberdi et al. 2016). For example, GM allows for the utilisation and detoxification of a range of dietary components (García-Amado et al. 2007; Wang et al. 2014; Gomez et al. 2015; Kohl and Dearing 2016; Kohl et al. 2018; Kartzinel et al. 2019), and thus affects dietary niche width. Correlated changes between GM and other variables, such as host physiology (Dill-McFarland et al. 2014; Weldon et al. 2015; Amato et al. 2019), stress level (Vlčková et al. 2018; Stothart et al. 2019) and pathogenic infection (Kreisinger et al. 2015; Vasemägi et al. 2017; Aivelo and Norberg 2018) have brought valuable insights into interactions between the host, its GM and the environment under natural conditions. Some of these patterns have been further validated by transplantation of wild GM into experimental animal models (Sommer et al. 2016; Miller et al. 2016; Warne et al. 2017; Rosshart et al. 2017). As GM dysbiosis can have serious negative consequences on the viability of wild populations, there is an increasing need to integrate monitoring and maintenance of wildlife-associated microbial diversity into management and conservation practices (Trevelline et al. 2019).

Microbiota composition varies along the length of vertebrate gastrointestinal tracts (GIT; Zhao et al. 2015; Suzuki and Nachman 2016; Ericsson et al. 2016; Li et al. 2017; Yan et al. 2019; Tang et al. 2019) in connection with the different functions of each gut compartment (Donaldson et al. 2016). Intestinal compartments rich in diet-digesting bacteria are usually chosen for GM analysis, with the luminal contents or the caecum wall preferred for rodents, lagomorphs, most birds and herbivorous reptiles. However, GM sampling from live animals is invasive and, apart from cloacal/rectal swabs, difficult to manage in the field (Tang et al. 2020), while the killing of animals to sample their gut content is often illegal, e.g. in endangered species, and generally questionable from an ethical point of view (Zemanova 2020). Further, post-mortem sampling does not allow for repeated sampling of the same individual, which is advantageous when studying GM changes during ontogeny or when comparing with temporarily changing environmental variables (Kreisinger et al. 2017; Björk et al. 2019). Sampling of faecal microbiota (FM), on the other hand, has potential as a widely applicable non-invasive substitute of wildlife GM sampling (Zemanova 2019, 2020), assuming that it is feasible in the field and that FM constitutes a good proxy of GM.

While truly non-invasive collection of abandoned faeces is feasible for some large vertebrates (Menke et al. 2015), the majority of small animals need to be trapped to obtain faecal material. In such cases, the animals can be left to defecate after capture, or rectal/cloacal swabs can be taken, both of which, however, prolong handling and cause additional stress to the animal. Collecting faecal material directly from traps clearly represents a less invasive and time-consuming option; however, the main concerns with this approach are contamination of the faeces by microbes from non-sterile traps (e.g. from previously trapped animals) and progressive changes in microbiota after defecation, as addressed by Kohl et al. (2015).

A number of recent studies have focused on the comparison of FM and GM in various mammals (Stearns et al. 2011; Zhao et al. 2015; Yasuda et al. 2015; Li et al. 2017; Ingala et al. 2018; Stalder et al. 2019), including house mouse Mus musculus (Gu et al. 2013; Weldon et al. 2015; Suzuki and Nachman 2016; Tanca et al. 2017), as well as in birds (Stanley et al. 2015; Videvall et al. 2017; Yan et al. 2019). In those studies that included multiple intestinal compartments, the general trend appears to be that both alpha diversity and composition of microbiota from faeces tend to be more similar to those from the lower GIT (colons and caeca) than those in the small intestine (Stearns et al. 2011; Gu et al. 2013; Zhao et al. 2015; Yasuda et al. 2015; Suzuki and Nachman 2016; Li et al. 2017; Yan et al. 2019). This can be explained by the closer physical proximity of faeces and lower GIT compartments. Moreover, faecal and lower GIT microbiota tend to share the majority of their bacteria; hence, FM would appear to be suitable for detecting bacterial taxa present in the intestines. On the other hand, FM and lower GM appear to differ in taxonomic composition as they show substantial abundance shifts in bacteria at various taxonomic levels. In line with this, house mouse FM and GM are characterised by distinct functional profiles (Tanca et al. 2017).

In-depth systematic analyses of GM–FM abundance shifts, using high numbers of individuals, have been performed on chickens (Stanley et al. 2015; Yan et al. 2019) and (with fewer samples) ostriches and rhesus macaques (Yasuda et al. 2015; Videvall et al. 2017). Overall, there was a moderate bacterial abundance correlation between FM and GM, with the strength of correlation varying significantly between individually analysed bacterial taxa and the taxonomic level examined.

A profoundly important aspect of FM–GM resemblance for wildlife studies is that differences in microbiota composition and diversity between individuals must follow the same pattern when analysing faecal and intestinal microbiota. Wildlife microbiome research tends to be explorative in nature and is usually based on observing correlations between inter-individual differences in microbiota composition or diversity and variation in the traits of interest. Some studies across or within wild species have evaluated consistency in FM vs. GM changes in response to several variables (Weldon et al. 2015; Ingala et al. 2018; Stalder et al. 2019); however, their results do not address the general consistency of inter-individual GM and FM patterns. Indeed, there have been no straightforward analyses to date evaluating correlation strength between inter-individual differences in FM vs. GM (including composition and alpha-diversity measures), possibly due to the relatively high sample sizes needed for such comparisons.

Under some circumstances, such as when sampling road kills or during targeted decreases in population density (e.g. invasive species or animal pests), direct sampling of wildlife intestinal microbiota is appropriate, or at least justifiable (Dubois et al. 2017). GM may often be sampled as part of larger research projects with specific aims requiring an invasive approach, e.g. when sampling the inner organs or during parasitological examination. In live-trapped animals, GM samples can be taken immediately after sacrifice; however, it remains unknown whether changes occur in GM due to stress induced by the trapping process. If the animal is killed during the trapping process, as with snap-traps commonly used on rodents, the time delay between the animal’s death and sampling can be as long as that between setting the trap and dissection of the animal. The few studies investigating links between ante-mortem and early post-mortem GM in humans indicate stability in GM composition for at least 2 days after death, with the caecal microbiome changing more quickly than rectal (Tuomisto et al. 2013; Pechal et al. 2018). However, animals with smaller body sizes are likely to show different timing of post-mortem changes compared to humans, e.g. a more rapid decline in corpse body temperature, which may substantially influence microbial dynamics (Brooks 2016). In line with this, Heimesaat et al. (2012) documented an early onset of post-mortem ileal microbiota changes in mice, with some bacterial taxa starting to overgrow as soon as 3 h after death. On the other hand, Lawrence et al. (2019) showed that rabbit caecal microbiota (CM) remained stable for 48 h after death at two different temperatures.

In this study, we conduct a controlled experiment to evaluate potential bias in GM due to (i) the trapping process and (ii) delayed post-mortem sampling, and to examine the suitability of FM as a non-invasive proxy for GM. In particular, we assess differences in CM profiles among house mice from a conventional breeding facility that were (i) unexposed to the live-trap environment, (ii) kept alive in the trap overnight and sampled the next morning or (iii) euthanised, left lying on the trap and sampled with a one-night delay. Moreover, faeces from individuals included in the second treatment group were collected from the traps and used for subsequent comparison of FM vs. CM.

Material and methods

Experimental animals

This experiment used 52 house mice (Mus musculus) from eight families and four different genetic backgrounds (strains C3Ha, C3Hb, PWD and PWK [Laukaitis et al. 1997; Gregorová and Forejt 2000]) kept at the breeding facility of the Institute of Vertebrate Biology in Studenec (licences for keeping and experimental work 61,974/2017‐MZE‐17214 and 62,065/2017‐MZE‐17214, respectively). For three families, we included breeding pairs (parents) and their offspring (brothers and sisters) into the experiment, while the other families were represented by offspring only. The breeding pairs were co-housed in a single cage before the experiment, while the offspring were caged separately after weaning at 21–24 days. Parental age ranged between 272 and 274 days, while offspring ranged between 34 and 45 days.

In a previous study, we showed that GM alpha-diversity and GM inter-individual variability were similar in wild mice and mice from our breeding facility, despite the two groups clustering separately based on GM composition (Kreisinger et al. 2014). Note, however, that this compositional difference was not due to differences in the presence/absence of bacterial taxa, but rather to moderate shifts in abundance. Importantly, the experimental setup allowed us to control other variables, which would be hard to achieve in the wild. In particular, we introduced defined variability for biologically relevant traits, i.e. genetic background, family relationships and age, and the experiment was performed over a single night under stable conditions to minimise environmental variability. Despite this, we could not exclude the effect of some other factors, such as decreased stress susceptibility in captive animals, which could have resulted in an underestimation of GM changes due to stress induced by the trapping process. It is also possible that under environmental conditions different from those set in the experiment (e.g. temperature), there could be differences in the magnitude of microbiota changes.

Experimental procedures

The mice were divided into three treatment groups, (1) alive-in-trap (21 ind.), (2) dead-on-trap (22 ind.) and (3) control-in-cage (9 ind.). Each family comprised one to several alive-in-trap/dead-on-trap pairs and 1–2 control-in-cage individuals (detailed in Table S1). Each of the alive-in-trap mice was kept at room temperature overnight (16–18 h) in a wooden live trap previously used for trapping rodents. The mice were provided with ‘breeding facility chew’ but no water in order to mimic live-trapping conditions. After an alive-in-trap individual was put into the trap, its dead-on-trap counterpart was euthanised by cervical dislocation and left lying on the trap to simulate snap-trapping conditions. The traps were then placed on shelves in a single breeding facility box in a random order and ca. 40 cm apart. The control-in-cage individuals were left in their breeding cages. After 16–18 h, the alive-in-trap and control-in-cage mice were euthanised. Mice of all three groups were then dissected and the caecum tips flash-frozen in liquid nitrogen and stored at −80 °C. Faeces were collected from each trap immediately after dissection of the trapped individual was completed. This resulted in a mixture of faeces ranging from 0 to 18 h old, as would be expected for overnight trapping in the wild. The faecal samples were then stored in the same manner as the caecum tips. The experiment was performed over a single night in order to avoid any effect of temporal changes in the microbiota.

DNA was extracted from the caecum tips and from five randomly chosen faecal droppings per individual using the DNeasy PowerSoil HTP 96 Kit (Qiagen, Netherlands), following the manufacturer’s instructions. Bacterial 16S rRNA amplicon sequencing libraries were constructed using two-step PCR, with technical duplicates prepared to account for noise due to PCR and sequencing stochasticity. A barcoding PCR was performed with the 16S rRNA primers S-D-Bact-0341-b-S-17 (CCTACGGGNGGCWGCAG) and S-D-Bact-0785-a-A-21 (GACTACHVGGGTATCTAATCC) amplifying the V3–V4 variable region. The primers were extended at the 5′ ends with inline barcode sequences to increase multiplexing capacity and with ‘tail’ sequences serving as priming sites for the second PCR. The 10-µl PCR reactions consisted of 1× KAPA HiFi HotStart ReadyMix (Kapa Biosystems, USA), each primer at 0.2 µM and 4.6 µl of DNA template. These were incubated at 95 °C for 3 min, followed by 28 cycles of 95 °C/30 s, 55 °C/30 s, 72 °C/30 s and a final extension at 72 °C/5 min. Dual-indexed Nextera sequencing adapters were reconstructed during the second PCR, which followed the first PCR conditions except that it was performed in 20-µl volume, the concentration of each primer was 1 µM, 1.5 µl of the first PCR product diluted × 12.5 was used as a template and the number of PCR cycles was 12. The products of the second PCR were quantified by 1.5% agarose gel electrophoresis, pooled equimolarly, purified with SPRIselect beads (Beckman Coulter, USA) and size-selected with Pippin Prep (Sage Science, USA) at 520–750 bp. The pool of libraries was sequenced using MiSeq (Illumina, USA) and v3 chemistry (i.e. 2 × 300-bp paired-end reads) at the CEITEC Genomics Core Facility (Brno, Czech Republic).

Bioinformatic and statistical analysis

We used dada2 (Callahan et al. 2016) for quality filtering (reads with >1 expected error eliminated) and denoising of fastq files. The resulting Amplicon Sequence Variants (hereafter ASV) were checked for presence of chimeric variants using uchime (Edgar et al. 2011) and only those that were detected in both technical duplicates were retained (e.g. Pafčo et al. 2018). Taxonomic assignment was based on dada2 implementation of the RDP classifier (Wang et al. 2007; posterior confidence threshold set to 80%) and the Silva reference database version 132 (Quast et al. 2013). The final dataset included 1468 ASVs and 788,238 reads (average coverage per sample: 10,798; range: 7269–18,533).
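The expected-error criterion used for quality filtering can be illustrated with a minimal Python sketch. This is not the dada2 code used in the study; it only shows the underlying arithmetic, in which each Phred score Q is converted to an error probability 10^(−Q/10) and the probabilities are summed over the read. The function names and the Phred+33 quality encoding are assumptions made for illustration.

```python
def expected_errors(quality_string, phred_offset=33):
    """Sum of per-base error probabilities for one read.

    Each quality character encodes a Phred score Q (assumed Phred+33),
    and the corresponding error probability is 10 ** (-Q / 10).
    """
    return sum(10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string)


def passes_filter(quality_string, max_expected_errors=1.0):
    """Keep a read only if its expected errors do not exceed the threshold."""
    return expected_errors(quality_string) <= max_expected_errors


# A read of 100 bases at Q40 ('I') has 100 * 1e-4 = 0.01 expected errors,
# whereas two bases at Q2 ('#') already contribute about 1.26.
good_read = "I" * 100
poor_read = "##" + "I" * 98
kept = [q for q in (good_read, poor_read) if passes_filter(q)]
```

Under these assumptions, only the first read would be retained, as the second exceeds one expected error.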

We rarefied the resulting abundance matrix to account for uneven sequencing depth (rarefaction threshold: 7269 reads per sample) and calculated Shannon diversities and number of observed ASVs in each sample for further alpha diversity comparisons. Differences in microbiota composition were analysed based on pairwise Bray–Curtis (accounting for ASV abundance) and binary Jaccard (accounting for ASV prevalence) dissimilarities, both being calculated for the rarefied dataset. We decided to use rarefaction for these analyses as (i) it is the simplest and one of the most transparent means of data normalisation, (ii) it is probably the only valid standardisation method for prevalence-based beta diversities and (iii) it is unlikely to be associated with any adverse effect on either Type I or Type II statistical errors (at least in the case of our data). The latter claim was supported by a nearly perfect correlation between Bray–Curtis dissimilarities for rarefied data vs. unrarefied data with read numbers converted to ASV proportions (Mantel test: r = 0.9988, p = 0.0001). The same was true for alpha diversity estimates, where diversity of the rarefied dataset correlated almost perfectly with that of the unrarefied data if log-transformed sequencing coverage was included as a covariate (linear regression: estimate [±S.E.] = 1.0104 [±0.0064], F(2,70) = 13010, R² = 0.9973, p < 0.0001 for ASV richness, and estimate [±S.E.] = 0.9919 [±0.0073], F(2,70) = 10880, R² = 0.9968, p < 0.0001 for Shannon diversity).
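For readers less familiar with these measures, the following Python sketch shows one way rarefaction, the two alpha diversity indices and the two dissimilarities can be computed from per-sample ASV read counts. It is an illustration under simple assumptions (subsampling without replacement, natural-logarithm Shannon index) and not the R code used in our analyses; all function names are hypothetical.

```python
import math
import random


def rarefy(counts, depth, rng):
    """Subsample `depth` reads without replacement from a vector of ASV counts."""
    pool = [asv for asv, n in enumerate(counts) for _ in range(n)]
    if len(pool) < depth:
        raise ValueError("sample has fewer reads than the rarefaction depth")
    rarefied = [0] * len(counts)
    for asv in rng.sample(pool, depth):
        rarefied[asv] += 1
    return rarefied


def observed_asvs(counts):
    """Number of ASVs with at least one read."""
    return sum(1 for n in counts if n > 0)


def shannon(counts):
    """Shannon diversity, H = -sum(p * ln p) over ASVs with non-zero counts."""
    total = sum(counts)
    return -sum((n / total) * math.log(n / total) for n in counts if n > 0)


def bray_curtis(a, b):
    """Abundance-based dissimilarity: sum|a - b| / (sum(a) + sum(b))."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


def binary_jaccard(a, b):
    """Prevalence-based dissimilarity: 1 - shared ASVs / ASVs present in either."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


# Toy example with two samples and four ASVs (made-up counts).
rng = random.Random(0)
sample_1 = rarefy([120, 0, 30, 50], depth=100, rng=rng)
sample_2 = rarefy([10, 90, 0, 40], depth=100, rng=rng)
dissimilarity = bray_curtis(sample_1, sample_2)
```

Because both toy samples are rarefied to the same depth, the Bray–Curtis denominator is constant across pairs, which is one reason why rarefied counts and proportions tend to give closely matching dissimilarities.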

Differences in GM alpha diversity between treatment groups were analysed using linear mixed-effect models (LMM), where the treatment group or sample type (i.e. CM vs. FM) were considered as categorical explanatory variables and family membership or individual identity as random effects. Observed ASV numbers were log-scaled to achieve normal distribution of residuals. Variation in GM composition between the treatment groups, genetic background and families was visualised using unconstrained Nonmetric Multidimensional Scaling (NMDS). Next, we applied constrained distance-based redundancy analysis (db-RDA; Legendre and Anderson 1999), where treatment group identity was included as a predictor and family membership and genetic background were considered as conditional variables. This means that GM differences due to families and genetic background were removed prior to statistical testing of the predictor variable (i.e. the treatment group) and their effect was also eliminated from the resulting ordination plots. In this way, even subtle GM differences caused by the treatment could be revealed. It is worth noting that this kind of model cannot be fitted by more commonly used community microbiology methods, e.g. PERMANOVA.

Candidate ASVs, whose abundances differed among treatment groups, were identified based on generalised LMM (hereafter GLMM) with negative binomial distribution. GLMMs included per-sample read count for each ASV as a response and the same fixed and random effect structure as described above. GLMMs were fitted for unrarefied ASV read counts to avoid inflation of Type II statistical errors (McMurdie and Holmes 2014). To account for variation in sequencing depth between samples, log-transformed sequencing depth was specified as an offset. In the case of GLMM, the method of Benjamini and Hochberg (1995) was used for multiple testing corrections.
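The multiple testing correction can be sketched as follows. This is a minimal Python illustration of the Benjamini and Hochberg (1995) step-up adjustment applied to a list of per-ASV p-values, not the R code used in the study; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    The i-th smallest of m p-values is scaled by m / i, and monotonicity is
    enforced by taking a running minimum from the largest p-value downwards.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up p-values for four ASVs; an ASV would be reported as a candidate
# if its adjusted value fell below the chosen false discovery rate.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
candidates = [i for i, q in enumerate(adjusted_p) if q < 0.05]
```

The offset mentioned above plays a separate role: with log-transformed sequencing depth as an offset, the model effectively describes read counts relative to the total number of reads in each sample.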

Correlations between FM and CM alpha diversity sampled in the same individual were tested using Pearson correlations (i.e. the number of observations equalled the number of individuals sampled). Correlations between microbiota compositions of the two sample types were tested by Mantel’s tests, which directly compared composition-based distance matrices (either Jaccard or Bray–Curtis) and therefore the number of observations equalled (n² − n)/2, where n = number of individuals sampled. All statistical analyses were conducted in R version 3.4.4 (R Core Team 2018). Annotated scripts used for data analysis are available in Supplementary materials S3 and S4.
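The logic of the Mantel test can be illustrated with a short Python sketch: the (n² − n)/2 pairwise dissimilarities from each matrix are correlated, and significance is assessed by jointly permuting the rows and columns of one matrix. This is a simplified illustration with hypothetical function names and a one-sided test, not the R implementation used for the analyses.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the (n*n - n)/2 pairwise values above the diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(d1, d2, permutations=999, seed=0):
    """One-sided permutation Mantel test for two square distance matrices."""
    n = len(d1)
    rng = random.Random(seed)
    x = upper_triangle(d1)
    r_observed = pearson(x, upper_triangle(d2))
    hits = 0
    labels = list(range(n))
    for _ in range(permutations):
        rng.shuffle(labels)  # relabel individuals in the second matrix
        permuted = [[d2[labels[i]][labels[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= r_observed:
            hits += 1
    return r_observed, (hits + 1) / (permutations + 1)


# Toy matrices for five individuals: the second is a rescaled copy of the first.
points = [0, 1, 3, 7, 12]
caecal = [[abs(a - b) for b in points] for a in points]
faecal = [[2 * abs(a - b) for b in points] for a in points]
r, p = mantel(caecal, faecal)
```

Note that the smallest p-value attainable with this procedure is 1/(permutations + 1), so the resolution of the reported p-value depends on the number of permutations used.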


Results

Effect of sampling delay and trapping on CM

Consistent with most previous studies on house mouse GM, the CM of all experimental individuals was dominated by bacteria from Clostridia, Bacteroidia and Campylobacteria (Fig. S1).

There were no significant differences in caecal alpha diversity between control-in-cage individuals vs. alive-in-trap or dead-on-trap individuals (LMM: ΔD.f = 2, χ2 = 0.9271, p = 0.629 for observed number of ASVs, and ΔD.f = 2, χ2 = 4.2078, p = 0.122 for Shannon diversity). However, both Shannon diversity and number of ASVs exhibited a slight, non-significant increase in dead-on-trap mice (Fig. 1).

Fig. 1: Alpha diversity variation of caecal microbiota between control-in-cage (control), alive-in-trap (alive) and dead-on-trap (dead) mice.

Alpha diversity comparisons were based on (A) number of observed ASVs and (B) Shannon diversity index. All differences between treatment groups were non-significant (p > 0.05) according to linear mixed-effect models.

According to unconstrained NMDS, variation in GM composition between mouse families and genetic backgrounds prevailed over systematic differences between treatment levels (Figs. 2 and S2). However, if we applied db-RDA to eliminate GM variation given by families/genetic backgrounds, we detected significant differences between the three groups (db-RDA: pseudo-F(2,64) = 1.7499, p = 0.005 for prevalence-based dissimilarity, and pseudo-F(2,64) = 1.8875, p = 0.005 for abundance-based dissimilarity). The first db-RDA axis tended to separate control samples from both alive and dead groups (F(1,64) = 2.2338, p = 0.012 for prevalence-based dissimilarity, and F(1,64) = 2.4706, p = 0.007 for abundance-based dissimilarity), while the second db-RDA axis was non-significant (F(1,64) = 1.2865, p = 0.113 for prevalence-based dissimilarity, and F(1,64) = 1.3145, p = 0.122 for abundance-based dissimilarity). GLMMs identified two ASVs from the genus Helicobacter and an unassigned Prevotellaceae, whose abundance increased in alive-in-trap mice compared to controls (Fig. S3). However, there was no ASV-level difference between dead-on-trap vs. alive-in-trap mice.

Fig. 2: Compositional variation of caecal microbiota between control-in-cage (control), alive-in-trap (alive) and dead-on-trap (dead) mice.

Analysis was based on unconstrained ordinations (NMDS) and constrained ordinations (db-RDA) that maximise gradients between treatment groups and eliminate variation due to family/strain membership. Calculations were based on community divergence accounting for ASV prevalence (Jaccard) or abundance (Bray–Curtis).

Comparison of faecal and CM

Alpha diversity of caecal samples did not differ significantly from that of faecal samples (LMM: ΔD.f = 1, χ2 = 2.7641, p = 0.0964 for ASV richness, and ΔD.f = 1, χ2 = 3.1185, p = 0.07741 for Shannon diversity). However, the observed number of ASVs did not correlate between the faeces and caecum of the same individual (Pearson r = 0.2909, p = 0.2007), while the Shannon index did show a significant, though weak, correlation (Pearson r = 0.4821, p = 0.0269; Fig. 3).

Fig. 3: Microbiota diversity and composition in caeca vs. faeces.

Correlation in alpha diversity (A, B) and composition (C, D) between faecal and caecal samples obtained from the same individual (A, B) and from the same pair of individuals (C, D). Observed number of ASVs and Shannon diversity were used for alpha diversity analysis, while prevalence- (Jaccard) and abundance-based (Bray–Curtis) dissimilarities were used for compositional analysis.

Composition of faecal samples was tightly correlated with composition of CM (Mantel’s test: r = 0.8473, p = 0.001 for abundance-based dissimilarity, and r = 0.8933, p = 0.001 for prevalence-based dissimilarity; Fig. 3). Importantly, both faecal and caecal samples showed divergence in microbiota composition between mouse families and strains, with the effect size consistent between the two sample types (Fig. S5). Similarly, divergence calculated with a mixed dataset (i.e. pairwise dissimilarities between caecal vs. faecal samples, excluding caecum-faeces pairs from the same individuals) was consistent with divergence in caecal or faecal samples only (Fig. 4).

Fig. 4: Dissimilarity in microbiota composition between and within mouse families.

Analysis was based on Bray–Curtis and Jaccard dissimilarities between individuals in (A) caecal microbiota, (B) faecal microbiota and (C) a combination of both sample types (excluding dissimilarities between the same sample types and the same individual). Also shown are actual observations, means and 95% bootstrap-based confidence intervals.

Despite FM and CM composition being tightly correlated, and unconstrained NMDS ordination not suggesting any pronounced compositional divergence (Fig. S5), db-RDA accounting for within-individual covariance revealed systematic differences in microbial content between the two sample types (db-RDA: pseudo-F(1,20) = 7.9361, p = 0.001 for abundance-based dissimilarity and pseudo-F(1,20) = 4.6582, p = 0.001 for prevalence-based dissimilarity; Fig. S5). Based on GLMM, we identified five ASVs from the genera Lactobacillus, Bacteroides and Anaeroplasma that were over-represented in faecal samples (Fig. S4).


Discussion

To achieve reliable high-throughput profiling of host-associated microbial communities, considerable research effort has been dedicated to the development and optimisation of suitable wet-laboratory, bioinformatic and statistical tools (reviewed in Pollock et al. 2018). Our study extends this endeavour by searching for optimal strategies for host-associated microbiota sampling in wild host populations.

Our data suggest that short-term exposure of live animals to the trap environment has a limited, but still measurable, effect on microbiota structure. In particular, two ASVs belonging to the genus Helicobacter and the Prevotellaceae family (Fig. S3) increased in animals that stayed in the traps overnight compared to those left in their cages. The fact that the difference was due to an abundance shift in bacteria already present in the mouse gut indicates that it was caused by altered physiology, possibly due to stress induced by enclosure in the trap, rather than exposure to a ‘dirty’ trap environment. In line with this reasoning, it has been shown that members of the Prevotellaceae family are involved in many aspects of host physiology, including homoeostasis of energy metabolism, development of autoimmune diseases, response to hypoxia and acute and chronic stressors (De Filippo et al. 2010; Wu et al. 2011; Maslanik et al. 2012; Scher et al. 2013; Palm et al. 2014; Gorvitovskaia et al. 2016; Amaral et al. 2017; Suzuki et al. 2019; Stothart et al. 2019; Iljazovic et al. 2020). Nevertheless, to exclude the possibility of environmental transfer (though this is rather improbable for obligate anaerobes such as Prevotella), it would be necessary to include samples taken directly from the traps. To better distinguish between contaminating and resident gut bacteria, it would also be of great benefit to employ recent metabarcoding protocols based on the whole 16S rRNA region and long-read sequencing technologies (Earl et al. 2018; Callahan et al. 2019; Matsuo et al. 2020). Overall, it appears that overnight trapping with non-sterile live-traps has a negligible influence on the reliability of GM profiling.

Similarly, delayed microbiota sampling after the animal’s death (16–18 h at room temperature) did not affect GM profiles. Clustering of GM samples followed the natural structure of the dataset (family membership, genetic background), with no visible effect of sampling delay. At the same time, there was no ASV over- or under-representation in the overnight-dead animals compared to those kept alive in traps. It follows, therefore, that sampling from fresh cadavers or snap-trapped individuals should provide reliable information on their GM content. When choosing between kill traps or live traps for the purposes of invasive GM sampling, animal welfare should be the main point considered as GM sample quality appears to be comparable under both settings. In this sense, use of snap traps prevents undue suffering from long-term enclosure in the trap, the resulting stress of which can sometimes prove fatal. Furthermore, snap trapping can be more efficient in terms of trapping success for many animal species. On the other hand, live traps allow for selection of the animals to be invasively sampled, i.e. non-target species, juveniles or redundant individuals can all be released unharmed. Consequently, each study should carefully consider which traps to use based on their particular circumstances.

As in previous studies (Zhao et al. 2015; Yasuda et al. 2015; Weldon et al. 2015; Li et al. 2017; Ingala et al. 2018; Yan et al. 2019), FM composition differed from microbiota found in the caecum, which plays a crucial role in microbial fermentation processes in many mammalian species (Karasov and Douglas 2013), with faeces displaying enriched levels of the genera Lactobacillus, Bacteroides and Anaeroplasma (Fig. S4). Importantly, despite this difference, FM and CM were tightly correlated, both for ASV abundance and prevalence (Fig. 3). The proportionality of inter-individual composition differences between faeces and the caecum was further documented by the similarity in their association patterns. As with caecal samples, faeces followed the same biologically relevant variation patterns of the host, especially as regards divergence within and between social groups and genetic backgrounds, with divergence being similar for both sample types (Fig. S5). Somewhat surprisingly, similarity was even retained for mixed samples (i.e. divergence using faecal–caecal dissimilarity), suggesting that, for this type of analysis, the two sample types may be interchangeable (Fig. 4).

The average alpha-diversity of caecal samples, measured as number of observed ASVs and as Shannon index, did not differ significantly from that of faecal samples. However, at the individual level, alpha diversity in the caecum showed only weak or no correlations with alpha diversity in faeces. Taken together, these results suggest that faecal alpha diversity says little about caecal alpha diversity in the same individual, despite the population means of both communities being comparable.

Overall, we were able to show that faecal samples collected several hours after defecation were a suitable proxy for inter-individual differences in CM composition, but were a much worse proxy for differences in caecal alpha-diversity. Also, FM showed significant shifts in bacterial abundance compared to CM. Researchers should be aware of these two caveats if using the FM as a proxy of GM, and should consider their possible influence, depending on the type of analysis performed. Considering that FM sampling does not require the death of the animals studied, however, we encourage researchers to use FM, at least in correlative, divergence-based GM studies in wild animals similar to the house mouse. We further suggest that future studies should pay more attention to comparisons of FM–GM alpha-diversities in other wild-living species.