Persistence of functional microbiota composition across generations

Holobionts are defined as a host and its microbiota, however, only a fraction of the bacteria are inherited vertically and thus coevolve with the host. The “it’s the song, not the singer” theory proposes that functional traits, instead of taxonomical microbiota composition, could be preserved across generations if interspecies interaction patterns perpetuate themselves. We tested conservation of functional composition across generations using zooplankton, mosquito, and plant datasets. Then, we tested if there is a change of functional microbiota composition over time within a generation in human datasets. Finally, we simulated microbiota communities to investigate if (pairwise) interactions can lead to multiple stable community compositions. Our results suggest that the vertically transmitted microbiota starts a predictable change of functions performed by the microbiota over time, whose robustness depends on the arrival of diverse migrants. This succession culminates in a stable functional composition state. The results suggest that the host-microbiota interaction and higher order interactions in general have an important contribution to the robustness of the final community. If the proposed mechanism proves to be valid for a diverse array of host species, this would support the concept of holobionts being used as units of selection, including animal breeding, suggesting this has a wider applicability.

microbiota composition through a constant crosstalk that involves its innate and adaptive immune system as well as the epithelial cells in direct contact with symbionts 13 . The interactions or lack thereof between symbionts are typically described with competition models borrowed from ecology, that incorporate or omit time structure 14 . For instance, the generalized Lotka-Volterra models use a matrix to represent the interactions between every pair of species in a community (pairwise species interactions) and describe the population dynamics over time of each species with a differential equation 14 . The self-organized instability (SOI) model also describes interspecies interactions using a matrix, but the population dynamics are stochastic and represented with a mechanistic set of rules that allows the simulation of a community 15 . Such descriptions do not capture higher order interactions nor account for the migration between hosts 12 . The process of maturation of a community within a host sometimes displays clear sequences, such as the transition from facultative to strict anaerobic bacteria in humans 16 ; this process is analogous to an ecological succession, which in this case leads to a stable community 17,18 .
Whether or not the concept of holobionts is needed or redundant is still debated 2,4,12,[19][20][21] . Given the diverse array of symbionts within a host, selection at the holobiont level cannot account for all the host-symbiont interactions, nor the interactions between symbionts, as the requirements for partner fidelity are unlikely to be met for every interaction 20 . Yet, there are developmental, physiological, anatomical, and immunological traits that rely on host-symbiont interactions 12 . Given that the microbiota shows redundance at the functional level 22,23 ; Doolittle and Booth 19 proposed that interspecies interaction patterns can perpetuate themselves over time through "recurrence" (as opposed to replication), regardless of the specific species that form part of the interaction. The mechanism of mutual perpetuation between the interaction patterns and the specific traits of the interacting species is called the "It's the song, not the singer" (ITSNTS) theory 24 , where their focus is on novel mechanisms of recurrence across generations.
The ITSNTS theory applied to holobionts 19 assumes that the interaction patterns within a single generation remain constant across time, but it is possible to extend the idea through a developmental approach. Chang et al. 25 and Shaw et al. 26 showed that if we consider microbiota as a dynamical system, then the communities display metastability: there are several possible discrete stable configurations, that correspond to alternative late succession communities. For instance, coral reefs can display bi-stability 27 and semiarid/arid communities are two configurations of a bi-stable system that depends on the aridity level 28 . The coexistence of alternative late successional states is also displayed by ecological systems with asymmetrical competition 29 and by metacommunities far from equilibrium (steady state) or with variation in local community quality or species traits 30 . Given that the attractor (i.e. a stable discrete configuration) reached by a community depends on the initial state, as shown e.g. for coral reefs 27 , metastability combined with the aforementioned ecological succession could account for the conservation of functional composition (the biochemical functions performed by the symbionts) across generations predicted by Doolittle & Booth 2 : The conserved functional composition would correspond to an attractor, which is stable because of the interaction patterns (Fig. 1a). The vertically inherited fraction of the community could define a starting configuration, from which the community converges to a specific attractor in a functional space, regardless of the taxa of the migrants. Therefore, our objectives were to investigate to what extent functional composition is preserved across generations in experimental lab conditions (1), and to propose a mechanism for this preservation based on the ITSNTS theory. For the latter, we tested if there is an ecological succession functional space, and if it depends on the initial community composition (2); we then searched for metastability in the functional composition of communities (3); and tested if pairwise interactions can account for that metastability (4).
These objectives were addressed by investigating different publicly available datasets that matched the requirements for one or more of those objectives. For (1), we compared the functional composition of zooplankton, mosquito, and plant microbiota across generations. For (2), we compared the functional composition of monthly faecal samples from babies born either vaginally or with caesarean section. All datasets can display metastability and are used for (3). We simulated communities based on stochastic mechanistic models for (4). Whereas (1) tests directly functional recurrence in holobionts, (2) and (3) together test a possible mechanism of transmission across generations and (4) narrows down the set of conditions needed by that mechanism.

Datasets.
To test functional persistence across generations, we used the zooplankton dataset from 31 , the mosquito dataset from 32 , and the plant dataset from 11 (Table 1). The first dataset comprises of five generations where offspring of the control group was placed in the control group, and offspring of the antibiotic treatment to remain in the antibiotic group or recover in the control group (leading to a total of four recovery groups). The second dataset comprises generations 0, 5 and 10 of two isolated mosquito populations bred with lab or field water, originally used to test if the microbiota is preserved or not under different lab breeding conditions (with the different water sources). The third dataset comprises generations 0, 1 and 2 of plants cloned via stolons used to infer a core set of taxa transmitted vertically common to 10 representative ecotypes of the species. Hence, horizontal transmission was completely inhibited and both roots and stolons were sampled.
To test the ecological succession on the functional landscape, we used the human datasets of 33,34 . The first one was originally used to describe how different perturbations (antibiotics, birth mode, diet) affect the early community development and the second one is part of the DIABIMMUNE project, aimed at testing the role of the microbiota in the development of autoimmune diseases. These two human datasets comprise the faecal microbiota of approximately 40 and 200 human babies (Table 1) repeatedly sampled from birth and up to three years, born either naturally or with a caesarean section.
Data processing and analysis. All the raw samples were processed using the DADA2 pipeline 35  www.nature.com/scientificreports/ 250 and 200 bp respectively and both forward and reverse plant samples were truncated at 250 bp. Every other filter and trimming parameter was kept at the standard configuration for all samples (maxN = 0, maxEE = 2, truncQ = 2, rm.phix = TRUE). The unique sequences were inferred without pooling across samples. The corresponding amplicon sequence variant (ASV) tables were functionally annotated using Tax4Fun2 using the default reference database and parameters 36 . The second human dataset was already processed through the MGnify pipeline 37 version 4.1 38 , and the Gene Ontology (GO) composition of each community was used directly.
Landscape analysis of the functional composition. The functional predictions were then used to run the landscape model 25 . The frequencies of the predicted functions, instead of the frequencies of ASVs, were used to calculate the Jensen Shannon divergence between communities. The resulting dissimilarity matrices were then used to run a Principal Coordinate Analysis (PCoA). The two first Principal Coordinates (PCos) were used as the filter functions for the Mapper algorithm ( Fig. 1a) 39 , moreover we calculated the percentage of variance explained by the PCos (Table 2) for all datasets. The Mapper algorithm was performed with the follow- Each node is assigned a k-nearest neighbours (kNN) value (inversely proportional to the density of points) that is later modified to get a potential. The arrows are then directed from higher to lower potential. The nodes with the lowest potentials correspond to attractors, and the nodes that lead to them form their corresponding basin of attraction (all the community states that eventually converge to the attractor). The outcome shows two attractors that correspond to either X or Y with high abundance and the other excluded. Table 1. Datasets included in this study. *All the samples belong to a single generation, ** 5 individuals were pooled, *** Number of treatment replicates instead of individuals. www.nature.com/scientificreports/ ing hyperparameters (kept exactly as in 25 ) for all datasets: Number of intervals for rank = 15 for both PCos, % overlap = 70% and number of bins = 10. Finally, the mapper output was transformed in a directed graph to find the attractors and corresponding basins ( Fig. 1b) 25 . We are running the Mapper algorithm to find attractors in the functional space, and therefore different communities that perform the same functions will belong to the same attractor. Hence, the conservation of the attractors through generations is an indirect test of functional recurrence.

Simulations of the Hubbell and SOI model. Simulations were run to test if pairwise interactions can
account for metastability in the taxonomical community composition. The communities were simulated using either the SOI model 15 or the Hubbell model 40 using the R-package seqtime 14 . While the Hubbell model simulates communities without pairwise interactions, the SOI model includes the fraction of all the possible pairwise interactions as a parameter (connectivity). The chosen connectivity values were 0 (the Hubbell model), 0.01 (as used in the simulations of 14 ), and 0.1 (as used in 15 ). In addition, a fraction of individuals (hosts) from the initial composition was fixed: every simulation started with a randomly chosen and a fixed group of individuals. Either 0, 100, or 200 of all 500 simulated individuals were fixed. 450 replicates were simulated for every parameter combination. All replicates from a given parameter combination shares the interaction matrix, the initial fixed individuals, and the migration probabilities. When generating the interaction matrices, the positive edge percentage (the percentage of non 0 values of the interaction matrix that are greater than 0) was kept at most at 30%, switching randomly the signs of the matrix elements 14 . All communities were simulated for 600 timesteps. For all the communities the metacommunity species number was kept at 50, the migration probability per species was drawn from a standard uniform distribution.
The similarity between pairs of communities with the same parameters was measured using the Morisita index (which decreases with higher beta diversity and approaching 0 for completely different communities) 41 . From the 450 replicates per treatment, we grouped them in pairs, and each community was used only once, resulting in 225 pairs. The effect of the connectivity, the number of fixed individuals and the interaction between both on the Morisita index was tested using a generalized linear model with an inverse link and a gamma distribution, using the package stats from base R: where c i is the ith connectivity value (0, 0.01 or 0.1), x j is the jth number of individuals (0, 100 or 200) fixed at the start of the simulation, cx ij is the interaction term and y ijk is the Morisita index of the kth replicate. The main and nested models were ranked according to their Akaike information criterion corrected for small samples (AICc), using the MuMIn package 41 . That is, we ranked them based on how well they fit the data, according to their log-likelihood given the data, while penalizing models with more parameters.

Results
Persistence of functional traits across generations. Results from the plant, zooplankton, and mosquito mapper plots show metastability ( Fig. 2A-C), i.e. there are groups of nodes with different colours, and each colour represents a different attractor (Fig. 2D-F). Additionally, populations show an increase in the diversity of occupied attractors across generations (Fig. 2G-I), even when accounting for the plant tissue, antibiotics, or the water type, respectively ( Fig. 2D-I). The antibiotics treated zooplankton in generation 1 resulted in a single attractor in the fifth generation compared to two attractors for the zooplankton in the control group (Fig. 2E,H). The lab water-bred mosquitoes belong to a single attractor, or do not belong to an attractor at all (Fig. 2F,I). There are fewer counts on generation 5 of lab water mosquitoes and not a single count from generation 10 mosquitoes, because every corresponding vertex is a singleton (has a single sample) and thus is removed from the mapper plot. The increase of diversity of occupied attractors and the increase of singletons in the datasets imply that the

Intraspecies interactions as a source of metastability.
Having found evidence suggesting metastability in the plant and mosquito datasets, we simulated communities to investigate if pairwise interspecies interactions can account for it. As the amount of interactions increases, the differing starting conditions could either lead to convergence or divergence of community composition between pairs of communities. This would be reflected as a change in the Morisita index explained by the interaction between the two parameters (connectivity and initial fixed individuals). After the simulations, most Morisita index values lie between 0 and 0.75 (Fig. 5), and all are negative. Hence, the alpha (within communities) diversity is higher than the beta (between communities) diversity for all simulations, i.e. the diversity within hosts is higher than the diversity between hosts. The best model (according to the AICc ranking) only includes connectivity (Table 3) and the complete model only appears in third place; in other words, connectivity alone explains the differences in the index values. As there is not a significant difference between the best and complete models (Deviance = 1.2672, df = 2, P = 0.2853), there is no evidence that the interaction effect predicts the beta diversity. Therefore, the results of the simulations do not support metastability. In other words, the different initial conditions do not converge to a limited number of community states over time, even when there is high connectivity. www.nature.com/scientificreports/

Discussion
The ITSNTS theory 24 proposes that interaction patterns in the microbiome behave as units of selection. In other words, such patterns 'recur' across generations, there is variation between patterns, and selection can act upon those variations. Here, we tested the recurrence across generations, and we will conclude by proposing a possible mechanism. Functional annotations of metagenomic and 16S samples are used as a proxy of interactions; This assumes that the predicted functions of the genes found in the microbiota reflect the underlying metabolic networks. Our approach has two main limitations: not all the functionally annotated genes play a direct role in the interactions, and 16S samples have limited resolution. We acknowledged this by focussing on community level comparisons rather than specific taxa/functions, although an intrinsic similarity can be expected because only published genomes are used. Nevertheless, the one WGS dataset does corroborate the findings of the 16S datasets, i.e. preservation of functional composition. These afore mentioned limitations may still persist in the results and the interpretation thereof, because we used publicly available data wet-laboratory validation is very difficult to perform. However, in our study we focused on mathematical and statistical approaches of proving the underlying theory. We carried out three different methods: first, we tested if there is conservation of microbiota functional composition across generations using zooplankton, mosquitoes and plants as model organisms. Thereafter, we explored possible mechanisms to explain this conservation across generations. Two human datasets were used to investigate if the functional composition can be understood as an ecological succession; in other words, if www.nature.com/scientificreports/ there is a robust change of functional composition over time, common to every host. Finally, we simulated communities to observe if pairwise interactions between bacteria can generate the metastability needed to explain the succession. This section starts with discussing the implications of the results of each of those individual analyses. Thereafter, we make a synthesis of our results, which allows to propose a mechanism that could explain the inheritance of the functional composition of microbiota across generations. We end with a brief discussion of possible implications for animal breeding.

Conservation of functional composition across generations.
Contrary to our expectations, neither plants nor mosquitoes showed conservation of functional composition across generations, thus failing to support the ITSNTS hypothesis. This could imply that there is functional recurrence across generations only when there is a wide enough metacommunity (several communities connected by migration) to draw migrants from, as zooplankton, mosquitoes, and plants were bred under conditions with decreased or even no horizontal transmission. Similarly, plant monocultures show decreased taxonomical diversity 42 , and decreased vertical and horizontal transmission is leading to cumulative microbial extinctions in humans, affecting the phenotype and development of the immune system 43 . In neutral metacommunities, as the migration rates decrease, beta diversity increases at the cost of alpha diversity 44 . That increase in beta diversity could explain the functional divergence if stochastic processes can override deterministic processes in zooplankton, mosquito, and plant populations. The publicly available datasets that we used in our study, did not allow to test if horizontal transmis- An alternate explanation of our results is that only a core subset of functional traits gets preserved across generations. Assuming competition between functionally similar bacteria, Jiang et al. 45 showed that there is a subset of genes that shapes the structure of a community, denoted as community structure and shaping genes. Given that those genes are carried by a minority of the community 45 , the communities across different generations would not belong to the same attractors. Finally, (microbial) gene composition and community structure do not define function on their own 46 . There could be conservation at a metabolome or gene expression level instead of higher biological levels (i.e. by using KEGG pathways and GO terms). Thus, the actual proteins and/ or metabolites produced by the microbiota in a host could be preserved even if the gene composition or the predicted functions of the genes change. This could be tested by using metabolome or gene expression datasets (instead of the KEGG or GO annotations used here) to build the dissimilarity matrices used to find the attractors.
Ecological succession in the functional space. The initial composition of human microbiota is vertically transmitted, through contact with the mother 33 . Hence, we used human samples to test if that initial composition leads to an ecological succession. Both human datasets showed an ecological succession across the functional phase space, in agreement with the bacterial traits based succession from 47 . This succession can be understood as an extension of the development of the host 46 . This succession was not disrupted by the differing modes of delivery, suggesting that it is robust for a wide array of initial community compositions, which is also true for most traits in the succession from 47 . Similarly, the phylum level composition of the gut microbiota during the first weeks has a trajectory over time that is independent of the delivery mode and is at least partially driven by interphylum interactions 48 . Given that breastfeeding affects microbiota alpha and beta diversity 33 , as well as functional composition 49 , it could be argued that it contributes to the stability of the ecological succession. Since both datasets used here contain breastfed and non-breastfed individuals and all samples still converge to a single attractor at the end, this contribution is not the only stabilizing mechanism. Additionally, the crosstalk between innate immune system, the epithelial layer, and the microbiota control the community composition 13 . In other words, both the resilience and resistance of the community 46 and host mediated mechanisms could contribute to the stability of the ecological succession.  www.nature.com/scientificreports/ Pairwise interactions as a cause of metastability. The SOI simulations did not display metastability, which suggests that pairwise interactions alone do not account for community metastability. The generalized Lotka-Volterra (gLV) models also emphasize first order interspecies interactions 14 , so the analytical results of gLVs should coincide with the results of the SOI model. It has been shown that as community richness increases, the probability of gLVs of having fixed points (community configurations that remain constant over time) decreases exponentially 50 . Moreover, the probability of stability of the fixed points increases asymptotically to one 50 , which means that the communities will be able to reach those points. If instead of drawing parameters for the species within a community they are drawn for a pool of species from a metacommunity, the communities display metastability for specific regions of the parameter space, particularly for large metacommunity species pools 51 . Furthermore, those attractors are non-fixed points, and the community composition is history dependent and can be changed with perturbations 51 . Nevertheless, gLV models can fail to capture the qualitative dynamics of microbiota community when the mechanisms of interaction cannot be represented by additive pairwise effects in the fitness of the populations 52 . Goyal et al. 53 developed a model that only assumes that each species will only consume one resource at a time and prefers some nutrients over others, and it also displays metastability and transition between local attractors. Experimentally, the fruit fly (Drosophila melanogaster) gut microbiota displays metastability, caused by the different colonization strategies of each symbiont (e.g. the preferential attachment to a tissue) and stochasticity 54 . Therefore, metastability and history dependence can arise from pairwise interactions if there is a large metacommunity species pool or with higher order interactions, as with nutrient preference or preferential attachment to a tissue.
A proposed mechanism for functional recurrence applied to holobionts. Taking all the results together, we propose the following mechanism (Fig. 6) to explain inter-generational functional recurrence 19 . Given a population of holobionts and a rich metacommunity, the vertically inherited fraction of the microbiota defines the initial composition of a host. From that starting composition, there is a robust trajectory in the functional space analogous to an ecological succession, showed in the human datasets. This succession requires horizontal transmission and relies on the (high) richness of the metacommunity, as suggested by the mosquito and plant datasets that lacked those and did not show succession. In our simulations the pairwise interactions alone were insufficient to achieve stability of the trajectory and the mature community, suggesting that this additionally requires higher order interactions between symbionts and with the host. If the trajectory always converges to the same attractor in the functional space, there would be conservation of functional composition across generations. Further experiments are needed to test the complete mechanism, particularly if the predicted gene functions are the adequate level of resolution to look for functional convergence and if there is a robust trajectory in the functional space leading to that convergence for other organisms. The robustness of the functional succession and the final composition need to be tested experimentally. Moreover, there is not yet a model that explains how and when the host could induce or promote the community level metastability and robustness. From an ecological perspective, it remains to be seen if there is a characteristic function-abundance Figure 6. A mechanism for the "It's the song, not the singer" theory applied to holobionts, created with biorender (https:// biore nder. com/). The vertically transmitted microbiota initiates an ecological succession in the functional space, which requires the arrival of horizontally transmitted symbionts. The final stable state reached is the same as in the previous generation, which implies the conservation of the functional composition across generations. www.nature.com/scientificreports/ distribution, and if there is, how it could be influenced by stochastic and deterministic assembly processes. From an evolutionary perspective, that function-abundance distribution and the (functional) composition could also be affected by demographic (host population growth and decrease), microevolutionary (host selection, drift, migration and mutation) and holobiont specific (symbiont mutation, horizontal gene transfer, horizontal transmission) processes.
Implications for animal breeding. The holobiont concept has practical implications for animal breeding 4 .
If the functional (instead of the taxonomical) composition is preserved across generations, then the mixed models used in animal breeding could be adapted to account for that, hence potentially increasing the explained phenotypic variance. A similarity matrix between individuals has been calculated based on the taxonomical composition 5,55-57 , therefore, individuals with different bacteria that perform similar functions will still have a high dissimilarity. Nevertheless, if the functional composition is vertically inherited according to the proposed mechanism, then using similarity calculated based on the functional composition could increase the progress over time. Finally, if the community of a given holobiont follows a clear ecological succession, then the optimal moment to sample microbiota composition would be when this equilibrium is reached. The trait turnover in the succession from Guittar et al. 47 decreased before the taxonomical turnover, suggesting that the functional composition could also reach a mature state earlier than the taxonomical composition.

Conclusions
The main limitation of considering holobionts as units of selection is that it does it not directly account for the recurrence across generations of horizontally transmitted bacteria, which gave rise to the idea that the functions performed by the symbionts could self-perpetuate over host generations regardless of who is performing the function. In this study we propose that vertically inherited symbionts start an ecological succession that reconstructs the final functional composition of the community, which requires the arrival of horizontally transmitted migrants to be robust. This cannot be explained with pairwise interspecies interactions only, hence host-microbiota interactions and higher order interactions are required to explain the stability of the final community. To establish whether the concept of holobionts as units of selection have a wider applicability, including animal breeding, requires investigating this mechanism for a wide array of host species and applicability.