Assessment of congruence between co-occurrence and functional networks: A new framework for revealing community assembly rules

Describing how communities change over space and time is crucial to better understand and predict the functioning of ecosystems. We propose a new methodological framework, based on network theory and modularity concept, to determine which type of mechanisms (i.e. deterministic versus stochastic processes) has the strongest influence on structuring communities. This framework is based on the computation and comparison of two networks: the co-occurrence (based on species abundances) and the functional networks (based on the species traits values). In this way we can assess whether the species belonging to a given functional group also belong to the same co-occurrence group. We adapted the Dg index of Gauzens et al. (2015) to analyze congruence between both networks. This offers the opportunity to identify which assembly rule(s) play(s) the major role in structuring the community. We illustrate our framework with two datasets corresponding to different faunal groups and ecosystems, and characterized by different scales (spatial and temporal scales). By considering both species abundance and multiple functional traits, our framework improves significantly the ability to discriminate the main assembly rules structuring the communities. This point is critical not only to understand community structuring but also its response to global changes and other disturbances.

A fundamental question in ecology is whether and which assembly rules determine the structure of natural communities 1 . This knowledge is essential to understand processes and drivers structuring spatio-temporal distribution of communities (i.e. deterministic processes versus stochastic processes [2][3][4] . This understanding is particularly important in the current context of the ever-increasing pressure exerted by human activities at both local/regional (e.g. land-use modifications, pollution) and global scale (e.g. acidification of oceans, global warming) which imperils integrity of most ecosystems and capacity to deliver services to people 5 .
Nowadays, assembly rules of communities are mainly assessed in taking into account the functional diversity of organisms (i.e. the value and range of functional traits of the organisms in a given ecosystem 6 ). Indeed, some studies have demonstrated that functional traits strongly contribute to determine species distribution in a complex environment [7][8][9] . Currently, three methodological approaches based on the use of functional diversity indices are mainly used in the literature for the common aim of assessing the relative influence of structuring (stochastic or deterministic) processes on communities: (i) FD index based on the construction of dendrograms from the distance matrix between species pairs [10][11][12] , (ii) FRic index based on the computation of a convex hull 13,14 , (iii) the n-dimensional hypervolume index 15,16 . However, while all these approaches have greatly improved our understanding of functioning of communities, they also suffer from strong limitations [17][18][19][20] . For example, FD index of

Materials and Methods
Methodological development. Our approach is based on the comparison of two networks: the functional network (based on the functional trait values of species) and the co-occurrence network (based on the number of co-occurrence between species). In the two networks the edges among groups do not represent interactions, such as species interactions used for most traditional networks in ecology, but the degree of functional complementary/redundancy (i.e. individuals sharing or not the same combination of values of functional traits), and species co-occurrence, for the functional and co-occurrence networks, respectively. Considering two distinct networks (i.e. functional and co-occurrence networks) and comparing then should allow to assess to which extent the species belonging to a given functional group also belong to the same co-occurrence group. When the two networks are similar, it means that deterministic processes such as environmental filtering are dominant 33 . Indeed, if functionally closed species tend to live in the same place, environmental conditions are expected to act as a major filter on species distribution 34,35 . In contrast, when the two networks are very different, it means that other deterministic processes occur, such as competition (i.e. limiting similarity process, following the theory of Mac Arthur & Levins 36 ). Indeed, competitive exclusion will result in a pattern where species that are functionally similar are negatively associated 33,37,38 . Finally, if the distribution of species within both networks is not different from a random pattern of distribution, it means that either stochastic processes (i.e. neutral) are the main factors structuring the community 33,39,40 or we have no clear dominance of one of the two deterministic processes (i.e. environmental filtering and competition).
The methodology of our framework consists of three successive steps illustrated in Fig. 1 and detailed thereafter.
Step 1: Detection of the functional groups and the co-occurrence groups within each network The first step of our approach consists in computing two different networks: the functional network and the co-occurrence network. Here, a network represents species as nodes, and the degree of species functional differences (i.e. if they share more or less the same functional trait values) or co-occurrence (i.e if they tend to co-exist) as edges. More precisely, the computation of these two different networks is done as follows: • We defined the "functional network" by computing the functional trait resemblance matrix between species (computed from the species trait matrix, i.e. species in row, functional traits in column) with (1-standardized Euclidean distance) or (1-Gower index), according to the nature of variables, i.e. quantitative continuous or quantitative and qualitative mixed, respectively 17,41 . The values of this functional matrix are used to weight edges between species in the functional network where each species represent one node. • We defined the "co-occurrence network" by computing the co-occurrence similarity matrix. This matrix is obtained by transposing the abundance matrix (i.e. the sites * species matrix) and then, by computing the similarity of Bray-Curtis between samples (i.e. 1-Bray-Curtis dissimilarity index, the number of times that these species have been seen together, weighted by their abundance 42 ). The values of this similarity matrix are used to weight edges between species in the co-occurrence network. It is important to note that our method is not restricted to the use of Bray-Curtis index. Here we choose to use the similarity of Bray-Curtis for these desirable properties in ecology (e.g. combining the structural information on presence/absence with quantitative counts of species, non-consideration for double zero 43,44 ) but other metrics could also be used according to the nature of available data and the objectives of each study (e.g. Jaccard, Canberra or Cao indices, etc.).
www.nature.com/scientificreports www.nature.com/scientificreports/ We first built the "functional network". We computed the functional trait resemblance matrix among species, which embeds the degree of difference from a functional point of view among species pairwise. This matrix was computed based on the species trait matrix (i.e. species in rows, functional traits in columns) using the 1-standardized Euclidean distance or 1-the Gower index according to the nature of variables, i.e. quantitative continuous or quantitative and qualitative mixed, respectively 17,41 . The values of this matrix were then used to weight edges and their strength among species in the functional network where each species represents one node. Secondly, we built the "co-occurrence network". We computed the co-occurrence similarity matrix using the similarity Bray-Curtis on the species abundance matrix (i.e. species in column, sites in rows). 1-Bray-Curtis values represent the number of times that these species have been seen together, weighted by their abundance 42 . The values of this similarity matrix are used to weight edges and their strength among species in the co-occurrence network. Note that our method is not restricted to the use of the Bray-Curtis index. Here we choose to use the similarity of Bray-Curtis for these desirable properties in ecology (e.g. combining the structural information on presence/absence with quantitative counts of species, non-consideration for double zero 43,44 ) but other metrics could also be used according to the nature of available data (e.g. species presence/absence data) and the objectives of each study (e.g. Jaccard, Canberra or Cao indices, etc.).
Finally, to define the respective groups of species for functional and co-occurrence networks, we use the concept of modularity in searching to optimize this modularity via the optimization algorithm of Louvain 45 . The modularity of a partition (Q) is a scalar value between −1 and 1 that measures the density of edges inside groups compared to edges between groups 32,46 . Thus, the higher value of modularity, the better the classification of species (or individuals) into groups is. We chose the algorithm of Louvain because it unveils hierarchies of groups and allows the discovery of sub-groups, sub-sub groups, etc. within groups 45 . This algorithm has been identified as one of three best algorithms for modularity detection (according to a comparative analysis that included 12 different algorithms 47 ) and this method is currently used with success for several networks of different types, e.g. human brain network 48 , social network 49 , mobile phone networks 50 and for large-sized networks (e.g. 4 M nodes and 100 M edges 50 , 21 M nodes 51 ).
It is important to note that if no modular structure is found in the co-occurrence network, it means there is no significant structure in the dataset (i.e. in the co-occurrence matrix). Such a result assumes that no determinist processes prevails in the community (i.e. we have a dominance of stochastic processes or strong influence both environmental process and interspecies competition). Otherwise, if no modular structure is found in the www.nature.com/scientificreports www.nature.com/scientificreports/ functional network, it means that the choice of assessed functional traits is not accurate because it does not allow to differentiate the species/functional units between them.
Step 2: Assessing whether species affiliated to a given group of functional network also belong to the same group in the co-occurrence network.
The second step of this framework is to assess the congruence (i.e. the similarity) between the two networks (co-occurrence and functional networks, Fig. 1). This allows to assess whether the species affiliated to a defined functional group tend to be also in the same co-occurrence group. For this purpose, we measure an index of module diversity for each functional group g f (derived from Gauzens et al. 25 Where g s is the number of functional entities (or species) of the functional group that belong to the co-occurrence group S and |g f | is the number of species in the functional group g f . We divide this index by − 1 g 1 f in order to range our index between 0 and 1. Then, we computed the Dg M index which represents the mean of all Dg f indices. Dg f is 0 if all species of a functional group g f belongs to the same co-occurrence group, and is 1 when all species of the functional group g f belongs to different co-occurrence groups.
Step 3: Comparison of Dg M index with null models To investigate assembly rules driving the structure of communities, we compare the observed values of Dg M to values distribution from a null model (Fig. 1). Null models allow comparing the deviation of empirical networks from random expectations and are expected to provide a more mechanistic understanding of the factors shaping ecological networks 52 . In null models, the partition into functional groups is identical to that obtained with our model (i.e. keeping the same number of co-occurrence groups and their respective sizes as the original dataset), but species are randomly distributed among co-occurrence groups. We then calculate a p-value from the iterations of null models (999 iterations in our study 40 ). More specifically, we derive the p-value as the proportion of the null distribution of Dg M index that is more extreme than the observed Dg M . If deterministic factors are lowly represented or if we have not a clear dominance of one of these factors, the observed Dg M is expected to range between 5% and 95% of the null distribution 53 . In contrast, if niche-based processes (e.g. environmental filtering processes) prevail, the observed Dg M should be significantly different from the null distribution. Dg M index should be lower than expected at random (lower than 5% of the null distributions) if environmental filtering dominates. In contrast, if limiting similarity is the dominant process, we expect the Dg M index to be higher than 95% of the null distribution 40 .
All computations implemented in this study were performed with R software (R Development Core Team, 2018) and in particular with 'louvain' function of the 'modMax' package. Randomizations were performed using the 'sample' R function. Code used to compute Dg M index is available in Supplementary Material.

Study cases.
We illustrate the application of our approach on two datasets available in the literature. The datasets concern two different taxa (bee's communities and aquatic invertebrate communities) and represent two different ecological situations frequently assessed (i.e. variation of communities' structure along different spatial scales and after a disturbance event respectively).
Detection of assembly rules along different spatial scales. We used the data provided by Forrest et al. 54 dealing with bee communities present in three types of habitats: four conventional farms (C), five organic farms (O) and seven natural areas (N) on the western slope of the Sacramento Valley, California. For each species, six functional traits were analyzed: two of them were continuous (intertegular distance and median day of year of flight season) and four of them were categorical (nest location, nesting behaviour, sociality and parasitic lecty). All these traits are known to affect the life-history of bees (see 54 for more details regarding the sampling design). We applied our framework at regional scale (i.e. in considering all the types of habitats in the analysis) then at local scale (i.e. in considering each habitat separately) in order to assess which type of community-assembly factors dominates according to the spatial scale considered.
Temporal variation of assembly rules in response to a disturbance. We used published dataset of Bogan & Lytle 55 , also used in Boersma et al. 56 . An aquatic invertebrate community in a small and isolated stream (French Joe Canyon) in southeast Arizona was sampled before and after a severe drought and resultant stream drying event (8 years separating the two sampling periods). Seven categorical functional traits were selected that are associated with biological responses to drought in arid-land streams: body size, functional feeding group, dispersal ability, locomotion, voltinism, respiration and diapause [56][57][58] . We assessed the impact of this drying event on the community structure and composition in applying our framework on the invertebrate communities before and after the disturbance. The orders of magnitude of species abundance being strongly different (from few individuals to several thousand individuals), we applied a log(n + 1) transformation on the abundance dataset.

Sensitivity analyses.
We first assessed the robustness of our framework to the number of traits by rerunning 1000 times all analyses using all combinations of two to N-1 traits for each dataset. We did not reduce the number of traits below two because we might have missed important dimensions of the functional space defining www.nature.com/scientificreports www.nature.com/scientificreports/ species niche. Secondly, we determined how number of species influences our framework in varying the number of species by steps of 10 species for invertebrate communities and by steps of 20 species for bee communities.

Results
Detection of assembly rules along different spatial scales. At regional scale (i.e. inter-habitats). We obtain a functional network composed of 3 groups (containing respectively 62, 55 and 23 species) and a co-occurrence network composed of 10 groups (algorithm of Louvain, Table 1, Fig. 2). The Dg M index is 0.78. It is inferior to 5% of values obtain with null model (p-value = 0.04, Table 1), meaning that environmental filtering play a leading role in structuring bee communities at regional scale (i.e. inter-habitats).

At local scale (i.e. intra-habitat).
For natural habitats, we obtain a co-occurrence network composed of 6 groups (containing 20, 24, 12, 22, 15 and 14 species, respectively). In combination with functional network (computed only from species present in natural habitat), the Dg M index observed is 0.83 and not different from null model ( Table 1). For both types of farms, the Dg M index is respectively 0.81 for organic farms and 0.76 for conventional farms. As for natural habitats, these Dg M values are not different from those obtained under null models (Table 1). These results mean that, at local scale, the dominance of one of deterministic processes is not highlighted. temporal variation of assembly rules in response to a disturbance. Before the drying event. We obtained a functional network composed of 6 distinct groups (containing 11, 7, 5, 5, 3 and 1 species). For the co-occurrence network, it is composed of 4 groups (containing 12, 8, 6 and 6 species, see Fig. 3). The Dg M index associated to these two networks (i.e. functional and co-occurrence networks) is 0.72 (see Table 2). It is inferior to 5% values obtained with null models (p-value = 0.04) supporting the hypothesis of a strong dominance of environmental filtering process.
After the drying event. We obtain a functional network composed of 6 groups (containing 11, 10, 7, 4, 4 and 3 species). In contrast, we found no modular structure for the co-occurrence network (see Table 2 and Fig. 3). This  Sensitivity analysis. The sensitivity analyses related to the species number and the number of functional traits highlighted no relationship between these two parameters and the modularity of the network obtained, ie. no effect of species number and traits on the network modularity (all p-values are non-significant, see details in Supplementary Material). These results are valid for both datasets analyzed (i.e. bee communities and invertebrate communities). The variation observed for the network's modularity for each level of each parameter (i.e. species richness or number of functional traits) is due to the fact that the communities slightly differed at each randomization. After the drying event Functional network 0.08 6

Discussion
Assessment of ecological networks can provide important insights into the relative importance of environmental filtering and coexistence mechanisms behind community assembly 4,59-63 . The framework developed here, based both on a co-occurrence and functional networks, allows to improve the assessment of processes of assembly occurring at different spatial and temporal scales. First, compared to the traditional approaches using functional diversity indices (e.g. convex hull volume of Cornwell et al. 13 , FD index of Petchey & Gaston 10 ), the use of a co-occurrence network based on species abundance allows to take into account quantitatively the frequency of occurrence of different species that is particularly innovative.). This aspect worth emphasizing, because taking into account and quantifying the co-occurrences of species along the spatial and/or temporal dimensions is essential for understanding how the whole community is organized and functions 59 . Secondly, compared to approaches using networks to analyze community structure, our framework allows several complementary functional traits of species to be taken into account, which is very rare so far (but see 31 ). However, our approach does not consider differences between individuals of the same group that can hide some ecological patterns (although these differences were taking into account during the step of network building). It can also be noted that a long time of computations can be observed when the dataset is large (e.g. important number of species) but this limitation only occurs during the computation of the algorithm of Louvain.
Detection of assembly rules along different spatial scales. Through the assessment of bee communities in three types of habitats (i.e. natural habitat, organic farm and conventional farm), we highlight the fact that, at regional scale (i.e. between habitats), the environmental filtering process dominates (i.e. species with similar functional traits tends to co-occur in the same habitat). Conversely, at smaller scales (i.e. intra-habitat or local scale), we do not observe the predominance of determinist patterns. Thus, our results support the idea that the spatial scale has a great impact on the process dominating the community structure 4,64 . More precisely, these results support the theory according to which environmental filtering will constrain the pool of species co-occurring in a given region 65,66 . Sharing ecological traits, like adaptations to particular environmental conditions, is often a perquisite for two species to interact 9 . Thus, species turnover between habitats with different environmental conditions is recognized to be responsible for a large fraction of variation in community composition in space 67 . For this study case, we assume that the land-use modifications (due to the agricultural human activity) selected the bee species according to their trait values. Our results also support the results of other studies showing that, at local scales, interspecific competition is added to niche filtering and species tend to be more functionally distant 22 . By instance, Stubbs & Wilson 38 have demonstrated the presence of limiting processes at local scale where some plants should differ in their traits values for the uses of water to persist and avoid strong competition between them. The lack of evidence concerning niche filtering process for both farms (i.e. organic and conventional farms) could be due to the degradation of natural habitat. Disturbances may modify assembly rules and blur assembly patterns. In disturbed habitats, the pressure applied by environmental conditions can be strong enough to allow the production of real competition patterns 68 . Unfortunately, in this case of the non-differentiation with the null model, it is not possible to know if it is due to stochastic processes (e.g. dispersal, natural disasters) or an overlap of determinist process. However, this point is not exclusive to our approach and concerns a great majority of methods based on the comparison of values with those a null model 33,69 . Developing an important methodological effort will be needed to resolve this drawback in the near future (but see 70 for noticeable advances for presence-absence matrix). temporal variation of assembly rules in response to a disturbance. We also demonstrate that our approach allows distinguishing patterns due to the impact of environmental disturbances on communities. In assessing the assembly rules structuring stream invertebrate communities before and after a severe drought in southeast Arizona for eight years 55 , we highlight differences in the structure of these communities. This difference could be explained by the fact that the drying event generates a more restricted access to resources for the species. Thus, the effects of competition between species are added to the effects of environmental conditions and modify the structure of communities. Indeed, changes of environmental conditions may strongly influence the identity and strength of species interactions by altering species' spatial distribution 22 .
Overall, we believe that our framework should pave the way for a better understanding of the spatial and temporal structure of communities while considering co-occurrence, abundance and functional traits. Furthermore, our approach is applicable at multiple spatial scales that allow a more complete vision of patterns structuring the community following the different scales considered. Moreover, our results strongly support those of many studies arguing that the relative importance of determinist processes (i.e. environmental filtering and limiting similarity processes) structuring the communities vary according to the spatial scale considered 71 . Indeed, it is considered that the environmental filtering tends to predominate at larger spatial scale in constraining the establishment and the persistence of species according to their traits values in a given environment 34,72,73 . Conversely, the limiting similarity process is believed to occur at smaller spatial scale because this process translates the competition occurring between species and is added to the niche-filtering process 22 .
The approach proposed here, could be used for a wide range of situations, taxa and contexts, in both marine and terrestrial realms. It should contribute to several lively and emerging issues in community ecology. For instance, in considering each node as functional entities in the functional network (i.e. individuals sharing the same combination of values of functional traits) and not as species, it could be possible to integrate intraspecific functional information in our framework highlighting mechanism of co-existence between and within species. This is a critical point that enriches the potential of use of our approach in a context where a lot of studies have demonstrated that intraspecific traits variation (e.g. due to life stage, adaptation, etc.) can affect specific interactions such as competition, as well as overall ecological dynamics 9,74,75 . www.nature.com/scientificreports www.nature.com/scientificreports/ Identification of rare functions and their roles in ecosystems is another emerging issue 76 . In assessing the number of species (or functional entities) and the composition of each functional module, our approach enables us to identify the species (or functional entities) which play rare functions in ecosystems. Moreover, analyses of positions of rare species and functions within co-occurrence and functional networks could provide insight on how rare functions shape structure of species co-occurrence 76,77 . It thus may help to predict the consequences of rare species extinction on ecosystem functioning [77][78][79] .
The choice of traits used is of primary importance for assessing functional diversity because it is dependent on both the ecological question addressed and the characteristics of the community studied 80 . It has thus to be done as a first step according to the knowledge and expertise of researchers. Then, it might be of ecological interest to assess the contribution of one particular trait on module patterns. An approach already used in functional diversity studies is to rerun analyses using all combinations of N-1 traits out of N 19 which allow identification of relative trait contributions.
Finally, global change imposes modifications in the structure of species co-occurrence network for many ecosystems on Earth 4,81,82 . Our framework reconciling the approaches based on co-occurrence and those that consider functional traits is the first step towards the foundation of a unified framework. By improving our understanding of consequences of human-driven global changes on community assembly, network approaches can provide valuable insights in the assessment of processes and assembly rules structuring ecological communities. Network tools could also improve conservation sciences by bridging the gap between ecology and social sciences, allowing mix of different data in a unified framework.

Data availability
The datasets during the current study are available in