Unraveling Amazon tree community assembly using Maximum Information Entropy: a quantitative analysis of tropical forest ecology

In a time of rapid global change, the question of what determines patterns in species abundance distribution remains a priority for understanding the complex dynamics of ecosystems. The constrained maximization of information entropy provides a framework for the understanding of such complex systems dynamics by a quantitative analysis of important constraints via predictions using least biased probability distributions. We apply it to over two thousand hectares of Amazonian tree inventories across seven forest types and thirteen functional traits, representing major global axes of plant strategies. Results show that constraints formed by regional relative abundances of genera explain eight times more of local relative abundances than constraints based on directional selection for specific functional traits, although the latter does show clear signals of environmental dependency. These results provide a quantitative insight by inference from large-scale data using cross-disciplinary methods, furthering our understanding of ecological dynamics.


Contents of Supplementary Material
1. Box S1-S3
2. Figures S1-S8
3. Tables S1-S2
4. Ecological interpretation of the MEF results (S-A)
5. List of packages used
6. References for SI

BOX S1
Box S1. Different ingredients necessary for analyses using MEF. Definitions of the most important terms used in the MEF analyses and throughout the main text to provide the necessary framework of understanding, adapted from [1].

Entities:
The basic unit of the MEF model which can exist in different states. Here, this constitutes a collection of genera existing at a site, hence each entity can be considered a single genus.

States:
Classification of different ways any entity can exist. Within the system, states of each entity describe their specific abundance at that site. Microstates constitute the spatial and temporal composition for the states of the entities in the system. Macrostates depict the entities of a system independent of the spatial or temporal composition, e.g., the overall relative abundance distribution but not including processes leading to this distribution (such as dispersal).

Traits or properties:
The measurable attributes of each entity, whose values can differ between entities. For example, genera differ in average wood density, seed mass, height, etc. Here these are the functional traits described in the main text.

Maximally uninformative prior:
All the information regarding the states before specific constraints are introduced. It is described as maximally uninformative because all empirical information should be introduced in the form of constraints, so as to quantify the maximal gain of information from the different constraints (e.g. traits or prior distribution in this case).

Prior distribution:
The expected states for the entities, here either constituted by the observed relative abundance of each entity in the summed sample (i.e. summed abundances describing the metacommunity) or by a maximally uninformed (uniform) distribution (see above). The former would be a neutral prior (expected local abundance is equal to the abundance in the larger metacommunity).

Community-weighted mean or variance:
The mean or variance of the genus-level trait values (for each entity, taken over all constituent species present), weighted by the relative abundance of each entity at a specific site.
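The definitions of the prior distribution and of the community-weighted mean and variance can be illustrated with a small numerical sketch. This is only an illustration of the definitions above, not the code used for the analyses; the function names and the toy abundance and wood density values are hypothetical.

```python
def relative_abundances(counts):
    """Convert raw abundance counts of the entities (genera) to relative abundances."""
    total = sum(counts)
    return [c / total for c in counts]


def metacommunity_prior(site_counts):
    """Neutral prior: relative abundance of each genus in the summed sample."""
    summed = [sum(column) for column in zip(*site_counts)]
    return relative_abundances(summed)


def community_weighted_mean(rel_abund, trait_values):
    """Mean of a genus-level trait weighted by relative abundance at one site."""
    return sum(p * t for p, t in zip(rel_abund, trait_values))


def community_weighted_variance(rel_abund, trait_values):
    """Variance of a genus-level trait weighted by relative abundance at one site."""
    mean = community_weighted_mean(rel_abund, trait_values)
    return sum(p * (t - mean) ** 2 for p, t in zip(rel_abund, trait_values))


# Hypothetical toy data: two sites, three genera, one trait (wood density).
site_counts = [[10, 30, 60], [30, 10, 60]]
wood_density = [0.4, 0.6, 0.8]

site_1 = relative_abundances(site_counts[0])
print(metacommunity_prior(site_counts))
print(community_weighted_mean(site_1, wood_density))
print(community_weighted_variance(site_1, wood_density))
```

A uniform (maximally uninformative) prior for the same toy data would simply assign 1/3 to each of the three genera.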

BOX S2
The Maximum Entropy Formalism as applied here is based on a conceptual model called CATS (Community Assembly by Trait Selection [1][2][3]) and makes use of three inputs:
i) A trait matrix containing the measured functional traits of each of the S total genera in the total regional pool; these can be of either discrete or continuous form.
ii) A vector of n community-weighted trait values, estimating the average trait value over all individuals in the local community for each of the traits.
iii) A prior probability distribution specifying the regional abundance distribution, quantifying potential contributions of the regional pool of recruits to the structure of local communities.

Using these three sources of information, the model predicts relative abundances ($p_i$) in the form of Bayesian probabilities for each genus in each local community without assuming any a priori relations or processes. This is achieved by finding the vector of relative abundances maximizing the relative entropy (RE):

$$RE = -\sum_{i=1}^{S} p_i \ln\frac{p_i}{q_i}$$

with $q_i$ the regional species pool abundance of species i, subject to the known constraints for j traits and i species:

$$\sum_{i=1}^{S} p_i t_{ij} = \bar{t}_j \quad (j = 1, \dots, n), \qquad \sum_{i=1}^{S} p_i = 1$$

where $t_{ij}$ is the value of trait j for species i and $\bar{t}_j$ the community-weighted value of trait j. The solution is a generalized exponential distribution where the λ values measure the importance of each trait when all other traits are constant:

$$p_i = \frac{q_i \exp\left(\sum_{j=1}^{n} \lambda_j t_{ij}\right)}{\sum_{l=1}^{S} q_l \exp\left(\sum_{j=1}^{n} \lambda_j t_{lj}\right)}$$

Note that when all λ values are zero, i.e. there is no trait-based selection, $p_i = q_i$.

The final step is to measure the proportion of total deviance accounted for between observed and predicted relative abundances for each step of the four-step solution. These are the $R^2_{KL}$ values, a generalization of the classic $R^2$ index of maximum likelihood estimation using the Kullback-Leibler index [4,5]:
i) $\bar{R}^2_{KL}(u)$: fit of model bias, the model null hypothesis given a uniform prior (i.e. equal distribution in the regional pool of recruits).
ii) $R^2_{KL}(u, t)$: fit using again a uniform prior but including traits as constraints.
iii) $\bar{R}^2_{KL}(m)$: fit using the metacommunity prior but excluding traits as constraints.
iv) $R^2_{KL}(m, t)$: fit using the metacommunity prior and including traits as constraints.

The general form of the $R^2_{KL}$ divergence is calculated by:

$$R^2_{KL} = 1 - \frac{\sum_{k}\sum_{i} O_{ik} \ln\left(O_{ik}/P_{ik}\right)}{\sum_{k}\sum_{i} O_{ik} \ln\left(O_{ik}/Q_{i,0}\right)}$$

With the following parameters:
$O_{ik}$ the observed relative abundance of the i-th genus in the k-th community, $P_{ik}$ the accompanying predicted value for the specific model of the four-step solution as described in the main text, and $Q_{i,0}$ the predicted relative abundance given only the maximally uninformative prior.
Further details on the calculation of all separate $R^2_{KL}$ values and the accompanying pure trait, pure metacommunity, joint information and biologically unexplained information can be found in Box S3.
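One way to see how the λ values arise is to solve the constrained maximization numerically. The sketch below is not the implementation used for the analyses; it assumes plain gradient descent on the convex dual of the problem (minimizing ln Z(λ) minus the sum of λ_j times the observed community-weighted trait value, where Z is the normalizing sum of the exponential solution). The function names, step size, tolerance and toy data are all hypothetical.

```python
import math


def cats_probabilities(traits, prior, lambdas):
    """Exponential solution: p_i proportional to q_i * exp(sum_j lambda_j * t_ij)."""
    weights = [
        q * math.exp(sum(lam * t for lam, t in zip(lambdas, row)))
        for q, row in zip(prior, traits)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def fit_cats(traits, prior, cwm, step=0.5, tol=1e-10, max_iter=100000):
    """Estimate lambdas so that predicted community-weighted means match `cwm`.

    Assumption: fixed-step gradient descent, which is adequate for small,
    roughly standardized trait values but is not a general-purpose solver.
    """
    n_traits = len(cwm)
    lambdas = [0.0] * n_traits
    for _ in range(max_iter):
        p = cats_probabilities(traits, prior, lambdas)
        # Gradient of the dual: predicted minus observed community-weighted mean.
        grad = [
            sum(p_i * row[j] for p_i, row in zip(p, traits)) - cwm[j]
            for j in range(n_traits)
        ]
        if max(abs(g) for g in grad) < tol:
            break
        lambdas = [lam - step * g for lam, g in zip(lambdas, grad)]
    return lambdas, cats_probabilities(traits, prior, lambdas)


# Hypothetical toy data: three genera, one trait, metacommunity prior q.
traits = [[0.0], [1.0], [2.0]]
prior = [0.5, 0.3, 0.2]
lambdas, predicted = fit_cats(traits, prior, cwm=[1.2])
print(lambdas, predicted)
```

With all λ values set to zero the function `cats_probabilities` returns the prior unchanged, which mirrors the statement that without trait-based selection the predicted abundances equal the prior.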

Box S2. Mathematical description of the Maximum Entropy Formalism for the four-step solution. Left panel shows the necessary ingredients and formulation of the Maximum Entropy Formalism.
Right panel shows the decomposition of the proportion of total deviance accounted for between observed and predicted relative abundances for each step of the four-step solution, adapted from [5].
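The proportion of deviance accounted for can be sketched in a few lines. The code below assumes the general form described in Box S2: one minus the ratio of the Kullback-Leibler deviance between observed and predicted relative abundances to the deviance between observed abundances and the maximally uninformative (uniform) prior. It is an illustration only; the function names and toy values are hypothetical.

```python
import math


def kl_deviance(observed, predicted):
    """Sum over communities and genera of O * ln(O / P).

    Terms with O == 0 contribute nothing (the limit of x * ln(x) at zero).
    """
    total = 0.0
    for obs_row, pred_row in zip(observed, predicted):
        for o, p in zip(obs_row, pred_row):
            if o > 0:
                total += o * math.log(o / p)
    return total


def r2_kl(observed, predicted, uniform_prior):
    """Proportion of total deviance accounted for by the predicted abundances."""
    # The uninformative prior is the same for every community.
    null_predicted = [uniform_prior for _ in observed]
    return 1.0 - kl_deviance(observed, predicted) / kl_deviance(observed, null_predicted)


# Hypothetical toy data: two communities, three genera.
observed = [[0.5, 0.3, 0.2], [0.6, 0.3, 0.1]]
predicted = [[0.4, 0.3, 0.3], [0.5, 0.3, 0.2]]
uniform = [1 / 3, 1 / 3, 1 / 3]
print(r2_kl(observed, predicted, uniform))
```

A perfect prediction gives a value of one, and a prediction equal to the uniform prior gives zero, which is why the index reads as a generalization of the classic R 2.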
The purpose of using MEF is to decompose the deviance between observed and predicted relative abundances using the four-step solution as described in the main text. The values generated are described below. The $R^2_{KL}$ value is a generalization of the classic $R^2$ index of maximum likelihood estimation using the Kullback-Leibler index for a non-linear regression including a multinomial error structure [2,4,5]. In essence, it measures the proportion of total deviance accounted for by the specific model from one of the four steps:

$\bar{R}^2_{KL}(u)$: fit of model bias, the model null hypothesis given a uniform prior and permuted traits
$R^2_{KL}(u, t)$: fit using a uniform prior but including observed traits as constraints
$\bar{R}^2_{KL}(m)$: fit using the metacommunity prior but excluding observed traits as constraints
$R^2_{KL}(m, t)$: fit using the metacommunity prior and including observed traits as constraints

1) The increase in explained deviance due to traits can be calculated by either:

$\Delta R^2_{KL}(t|\varphi) = R^2_{KL}(u, t) - \bar{R}^2_{KL}(u)$
Increase in explained deviance due to traits beyond that due solely to model bias
or
$\Delta R^2_{KL}(t|m) = R^2_{KL}(m, t) - \bar{R}^2_{KL}(m)$
Increase in explained deviance due to traits beyond contributions made by the metacommunity

2) The increase in explained deviance due to dispersal mass effects via the metacommunity can be calculated by either:

$\Delta R^2_{KL}(m|\varphi) = \bar{R}^2_{KL}(m) - \bar{R}^2_{KL}(u)$
Increase in explained deviance (if any) due to the metacommunity beyond that due to model bias
or
$\Delta R^2_{KL}(m|t) = R^2_{KL}(m, t) - R^2_{KL}(u, t)$
Increase in explained deviance due to the metacommunity given traits, relative to the explained deviance due only to the traits, i.e. information unique to the neutral prior
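The four differences can be computed directly from the four fits. The sketch below is illustrative only: the function and key names are hypothetical, the input values are made up, and the expression for the metacommunity effect beyond model bias is assumed by symmetry with the trait effect (fit with the metacommunity prior and permuted traits minus fit with the uniform prior and permuted traits).

```python
def decompose_deviance(r2_u, r2_ut, r2_m, r2_mt):
    """Differences in explained deviance between the four model fits.

    r2_u  : uniform prior, permuted traits (model bias)
    r2_ut : uniform prior, observed traits
    r2_m  : metacommunity prior, permuted traits
    r2_mt : metacommunity prior, observed traits
    """
    return {
        # Traits beyond model bias.
        "traits_beyond_bias": r2_ut - r2_u,
        # Traits beyond the contribution of the metacommunity.
        "traits_given_metacommunity": r2_mt - r2_m,
        # Metacommunity beyond model bias (assumed form, see lead-in).
        "metacommunity_beyond_bias": r2_m - r2_u,
        # Metacommunity given traits.
        "metacommunity_given_traits": r2_mt - r2_ut,
    }


# Hypothetical values, chosen only to show the arithmetic.
effects = decompose_deviance(r2_u=0.05, r2_ut=0.15, r2_m=0.60, r2_mt=0.70)
for name, value in effects.items():
    print(name, round(value, 3))
```

Under these definitions the two routes from the bias-only fit to the full fit add up to the same total: traits beyond bias plus metacommunity given traits equals metacommunity beyond bias plus traits given metacommunity.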

S-A Ecological interpretation of the MEF results
Signals of quantitative environmental selection were highest for podzol forests, whereas their counterpart, the dispersal mass effect from the regional pool of genera, had the second lowest value. Podzol forests, with their extremely nutrient-poor soils, could reflect a much stronger selective environment than any of the other forest types. Terra firme forests, presumably reflecting a weaker selective environment in terms of resource availability, showed the opposite, with approximately half the pure trait effect of podzol forests (even when rarefied to account for different sample sizes). Traits associated with protection against herbivores, such as latex [7] and high leaf carbon content, showed higher values associated with greater abundance on podzol soils, whereas traits indicative of investment in growth and photosynthetic ability, such as high foliar concentrations of P and N [8], showed strong negative associations on nutrient-poor soils. The ability to accumulate aluminium was also strongly positively associated with relative abundance in igapó forests, which can potentially be richer in aluminium. Lambda values for wood density were strongly negative in swamp and forests, consistent with high tree mortality and the many individuals belonging to pioneer species, especially in the western Amazonian swamp forests. Várzea and Pebas terra firme forests showed a similar response. As the Pebas consists mainly of Andean sediments, it has a higher nutrient content, which promotes lower wood density, as supported by our results, whereas várzea forests are also often flooded. There were also traits that showed no specific (strong) signal of selection in certain forest types (either positive or negative), such as latex in igapó and ectomycorrhiza in terra firme forests (see Fig. S1 for all lambda values). Interestingly, terra firme forests in general showed the smallest lambda values overall (positive or negative).
This may be indicative of either more pronounced demographic stochasticity or ecological drift eliminating the association between traits and relative abundance. Lower effects of selection in general, or more (random) variation due to the larger species pool in comparison with other forest types, could, however, also be the result of mixing heterogeneous microenvironments into a single environmental class. Support for such heterogeneity within terra firme forests influencing the distribution of functional traits on valleys or plateaus has recently been found [9]. In addition, natural but also anthropogenic [10] disturbance history affects biotic community composition and can lead to changes in the tree community through time, blurring relationships between traits and relative abundances. It should further be noted that, although for terra firme forests we were able to make a distinction by subregion, true heterogeneity within forest types was not taken into account. This might cause an underestimation of the deterministic effect, but it cannot yet be corrected for at this scale and merits investigation in future studies. In addition, podzol forests have a smaller connected surface area and an accompanying smaller number of genera in comparison with terra firme forests, adding to the calculated stronger trait effects [11,12]. More detailed understanding and knowledge of these functional traits would most likely increase the explanatory power of the MEF. The fact, however, that we do not have very specific knowledge of these interactions and specific traits is precisely the reason why the MEF can provide additional insight.
It should be noted that for species-level analyses, microenvironmental gradients might also show (stronger) selection at local scales [13,14], as it has been shown that most variation in community composition due to selection through habitat filtering and niche conservatism is found at lower taxonomic levels, such as between species within genera [15,16]. In contrast, it has been shown theoretically and tested that immigration numbers are very robust across taxonomic scales [17], validating our results on the importance of the metacommunity using genus-level taxonomy. Spatial patterns of metacommunity effects showed the shallowest declines in the centre, supporting the suggestion that the high diversity of the Amazonian interior could be explained by influx of recruits due to large (overlapping) ranges. This mid-domain effect [18], however, would also predict lower species richness for the edges due to lower range overlap, assuming a closed community. This is not the case, as there is a strong species richness gradient from Western (rich) to Eastern (poor) Amazonian forests [19]. The lower metacommunity effect for the edges is then most likely not due to less absolute influx of genera, but rather to less influx from the Amazonian tree community. Influx from the species-rich Andes could account for the high diversity [20], yet low Amazonian metacommunity effect, of Western Amazonian forests. In contrast, south-eastern parts of Amazonia receive influx from tree species-poor biomes (i.e. the Cerrado), resulting in lower diversity but also a low metacommunity effect for Amazonian trees in this region.