Network science applied to forest megaplots: tropical tree species coexist in small-world networks

Network analysis is an important tool for studying the structure of complex systems such as tropical forests. Here, we use network science to infer spatial proximity networks in tropical forests. First, we focus on tree neighborhoods to derive spatial tree networks from forest inventory data. In a second step, we construct species networks to describe the potential for interactions between species. We find remarkably similar tree and species networks among tropical forests in Panama, Sri Lanka and Taiwan. Across these sites, only 32 to 51% of all possible connections between species pairs were realized in the species networks. The species networks show the common small-world property and constant node degree distributions not yet described and explained by network science. Our application of network analysis to forest ecology provides a new approach in biodiversity research to quantify spatial neighborhood structures and thereby better understand interactions between tree species. Our analyses show that details of tree positions and sizes have no important influence on the detected network structures. This suggests the existence of simple principles underlying the complex interactions in tropical forests.

Consequently, the small-world behavior of the species interaction network does not depend on the size of the chosen interaction zone.
Testing the scale-free property. Scale-free networks are characterized by a node degree distribution that follows a power law. To test for this behavior, we fit a power law with exponential cut-off to the logarithmically binned frequencies of node degrees and compared it to a pure power-law distribution. For the truncated power-law fit we apply the method proposed by Barabási et al. 7 to find the fitting parameters kmin, kcut and γ. This method combines a maximum log-likelihood estimate of γ for fixed kmin and kcut with the identification of the kmin and kcut for which the Kolmogorov-Smirnov statistic is minimal. We require a minimum of five binned data points in the fitting range and compute the root-mean-square error (RMSE). To compare the fit with a pure power law, we fit a power-law function with the same kmin value.

Software used for the analyses. Different software was used for this study. The construction of all networks and the calculation of network measures were done in C++ (Embarcadero RAD Studio XE5). With Matlab we created the network visualizations and adjacency matrices (Fig. 1 and Supplementary Figs. S4, S10). The truncated power-law fit and analysis were done with the Matlab packages of Virkar et al. 8. For plotting the results we used Matlab, R and Microsoft Excel.
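The fitting step described under "Testing the scale-free property" can be illustrated with a minimal Python sketch. It is not the Matlab package used in the study. It assumes the functional form p(k) proportional to k^(-γ) exp(-k/kcut) for kmin ≤ k ≤ k_upper, a plain grid search for γ, and unbinned integer degrees. The function names and the `k_upper`, `step` and `gamma_max` parameters are hypothetical.

```python
import math
from collections import Counter


def log_pmf(k_min, k_upper, gamma, k_cut):
    """Log of p(k) proportional to k**-gamma * exp(-k / k_cut),
    normalized on the integer support k_min..k_upper (assumed form)."""
    log_w = [-gamma * math.log(k) - k / k_cut for k in range(k_min, k_upper + 1)]
    m = max(log_w)
    # log-sum-exp for a numerically stable normalization constant
    log_z = m + math.log(sum(math.exp(v - m) for v in log_w))
    return [v - log_z for v in log_w]


def fit_gamma(degrees, k_min, k_cut, k_upper, step=0.01, gamma_max=6.0):
    """Grid-search maximum-likelihood estimate of gamma for fixed k_min, k_cut.

    Returns (gamma_hat, ks), where ks is the Kolmogorov-Smirnov distance
    between the empirical and the fitted cumulative distributions.
    """
    counts = Counter(k for k in degrees if k_min <= k <= k_upper)
    n = sum(counts.values())
    best_gamma, best_ll, best_lp = None, -math.inf, None
    for i in range(int(round(gamma_max / step)) + 1):
        gamma = i * step
        lp = log_pmf(k_min, k_upper, gamma, k_cut)
        ll = sum(c * lp[k - k_min] for k, c in counts.items())
        if ll > best_ll:
            best_gamma, best_ll, best_lp = gamma, ll, lp
    # KS distance: largest gap between empirical and model CDF
    ks = emp = mod = 0.0
    for idx, v in enumerate(best_lp):
        emp += counts.get(k_min + idx, 0) / n
        mod += math.exp(v)
        ks = max(ks, abs(emp - mod))
    return best_gamma, ks
```

To mimic the outer step of the procedure, one would call `fit_gamma` for each candidate pair (kmin, kcut) and keep the pair with the smallest returned KS distance.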

Supplementary Results
Results of the directed networks. By construction, the average node degree <k> and network density D of the directed tree networks correspond to half of the values in the undirected case. For the in-degrees ('overshadow indices') we obtain a clustering coefficient of C ≈ 0.35 and for the out-degrees ('shadow indices') a coefficient of C ≈ 0.16; both values were similar for all forest sites. The outgoing node degree distributions decrease monotonically because there are many small trees that overshadow only a few other trees, while a few large trees overshadow many small trees (Supplementary Fig. S9a). Consequently, the proportion of shade-tolerant species tends to decline at the BCI forest with increasing out-degree (Supplementary Fig. S9c). In contrast, the incoming node degree distributions rather follow Poisson distributions (Supplementary Fig. S9b). Note that a maximum value of 0.5 for the clustering coefficient C results from the fact that the directed tree network is acyclic. The difference between the values for the out-degrees and in-degrees can be explained by the typically decaying tree size distribution of undisturbed forests 9,10. The 'deeper' we look into the forest from above, the more small trees, and thus trees with smaller interaction zones, occur. As a result, more nodes with an out-degree of kout = 1 or 0 and with a local clustering coefficient Ci = 0 are detected (concerning the 'shadow index'), which results in a lower global clustering coefficient C. Considering the species networks, there is no functional relation between the average node degrees of the undirected and the directed networks. The directed species network shows an average node degree of <k> = 57.55 at BCI (50 ha), <k> = 49.21 at Sinharaja and <k> = 31.27 at Fushan. For the directed species networks, we obtain a clustering coefficient between C = 0.66 and C = 0.74 for the out-degrees ('shadow index') and between C = 0.72 and C = 0.86 for the in-degrees ('overshadow index').
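The statement that the directed tree network has half the average node degree of the undirected one can be checked with a small Python sketch. The construction rule here is an assumption made for illustration only: two trees are linked when their circular interaction zones overlap, and the directed edge points from the taller tree to the shorter one. The function name and the tuple layout are hypothetical.

```python
import math


def tree_degrees(trees):
    """Degrees of an undirected and a directed proximity network.

    trees: list of (x, y, zone_radius, height) tuples (hypothetical layout).
    Assumed rule: two trees are connected if their circular interaction
    zones overlap; the directed edge runs from the taller to the shorter tree.
    Returns (k, k_out, k_in) as lists indexed by tree.
    """
    n = len(trees)
    k = [0] * n       # undirected node degree
    k_out = [0] * n   # 'shadow index': trees overshadowed by this tree
    k_in = [0] * n    # 'overshadow index': trees overshadowing this tree
    for i in range(n):
        xi, yi, ri, hi = trees[i]
        for j in range(i + 1, n):
            xj, yj, rj, hj = trees[j]
            if math.hypot(xi - xj, yi - yj) < ri + rj:
                k[i] += 1
                k[j] += 1
                # equal heights are resolved by index, so each undirected
                # edge always yields exactly one directed edge
                upper, lower = (i, j) if hi > hj else (j, i)
                k_out[upper] += 1
                k_in[lower] += 1
    return k, k_out, k_in
```

Because every undirected edge contributes 2 to the sum of k but only 1 to the sum of k_out (and 1 to k_in), the mean out-degree and mean in-degree both equal half the mean undirected degree. Ordering the trees by height also rules out directed cycles.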
Node degree distribution of the tree networks. As expected for node degree distributions that resemble Gamma (or log-normal) distributions, the analyzed tree networks can be considered thin-tailed but not scale-free 7. An additional analysis showed that a power law with exponential cut-off approximates the degree distribution better than a pure power-law distribution (Supplementary Fig. S3), similar to observations in other geometric networks (e.g., 11,12).
Over 30 years, the tree inventory data of BCI show on average a tree mortality of 11.6% and around 11.4% tree recruits every five years (standard deviations s = 0.009 and s = 0.015, respectively). Nevertheless, there is no significant change in the node degree distributions (Supplementary Fig. S7).
Influence of plot size on network characteristics. When plot size was changed (from 50 ha to 25 ha), most network characteristics remained unchanged in the example of BCI.
Some global network properties of the tree network scale in a predictable way with plot size: the number N of nodes (i.e., trees) is proportional to plot size, and so is the number E of edges, since connections between nodes are local. As a consequence, network density D scales proportionally to 1/plot size (see equation (3) in Methods). However, local neighborhood properties such as the mean node degree <k>, maximal node degree kmax, node degree distribution and clustering coefficient C are independent of plot size (Supplementary Tables S1 and S2, Supplementary Fig. S14a), although the probability of finding a node with a higher degree is larger in plots of larger size. The average path length L and the diameter d of a network scale approximately with the increase of the maximal possible distance among points, as shown in Supplementary Table S1.
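The 1/plot-size scaling of the network density can be made explicit with a short derivation. It assumes the standard density of an undirected simple graph, D = 2E/(N(N-1)), which may differ in notation from equation (3) in Methods. The tree density ρ (trees per unit area) and the plot area A are symbols introduced here for illustration.

```latex
\begin{align}
N &= \rho A, \qquad E = \tfrac{1}{2}\,\langle k \rangle\, N, \\
D &= \frac{2E}{N(N-1)} = \frac{\langle k \rangle}{N-1}
   \approx \frac{\langle k \rangle}{\rho A} \;\propto\; \frac{1}{A}.
\end{align}
```

Since the mean node degree <k> is independent of plot size, halving the plot from 50 ha to 25 ha is expected to roughly double D under these assumptions.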
The scaling with plot size of characteristics of the species network, which is constructed on top of the tree network, is difficult to predict, except for the number of nodes N, which scales with the species-area relationship. A doubling of plot size caused only a slight increase in network size (the number of nodes N and edges E, Supplementary Tables S1 and S2) and in node degrees (<k>, kmax and node degree distribution). However, the shape of the node degree distribution remained constant (Supplementary Fig. S14b). More connections between the species resulted in a slightly lower average path length L and a slightly higher clustering coefficient C (Supplementary Table S1). The characteristics that change with plot size do not affect our general conclusions about the small-world and scale-free properties in the networks of tree individuals and of tree species.

Power-law fit with exponential cut-off (kmin = 16, kcut = 48 and γ = 3.408, RMSE = 3.08e-04) to the logarithmically binned frequencies of node degrees Pt(k) in the tree network at BCI (50 ha). The truncated power law with exponential cut-off fits the node degrees significantly better than a power law with the same starting value kmin (likelihood ratio = -4.52, Vuong test, p = 0.0335, see Supplementary Methods for details). For graphical purposes only, frequencies are normalized (with regard to network size and bin width of node degrees).

Supplementary Fig. S4
Adjacency matrices of the directed tree and species network for BCI (50 ha). The rows and columns stand for the existing trees or species (nodes). Each blue dot represents a directed connection between a pair of trees or species. Consequently, the number of dots in one row represents the out-degree ('shadow index') of the corresponding node. Nodes in the tree network in a are ordered by tree size, starting from the smallest tree (low tree rank). Nodes in the species network in b are ordered by species abundance, starting from the species with the lowest number of trees (low species rank). The small panels along the y-axis show the node degrees of a individual trees and b species.

Supplementary Fig. S13
Node degree distributions with noisy interaction zones and tree heights at BCI (50 ha). In a and b noise was added to parameters of the allometric relationship for the interaction zones per tree species of a undirected tree networks and b undirected species networks. In c-f noise was added to parameters of the allometric relationship for tree heights of c-d directed tree networks (out-degrees and in-degrees) and e-f directed species networks (out-degrees and in-degrees).

Trees of a species must interact with a minimum number of trees of another species to be considered as interacting. With increasing minimum number, the network size N and average node degree <k> become smaller, while the clustering coefficient C, the average path length L and especially the small-world property remain unchanged.
*CER and LER: clustering coefficient and average path length of random graphs following the ER model of the same network size.

average node degree, kmax,in/kmax,out: maximal node degrees of the directed networks, Cin/Cout: clustering coefficients of the directed networks. Subscripted characters denote network attributes with regard to the in-degrees ('overshadow indices') and out-degrees ('shadow indices').
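The minimum-number criterion for species interactions mentioned above can be sketched in Python. This is an illustration under an assumption: the "minimum number of trees" is interpreted here as the number of tree-to-tree connections between two species, which may not match the counting rule used in the study. The function name and the `min_links` parameter are hypothetical.

```python
from collections import Counter


def species_network(tree_edges, species_of, min_links=1):
    """Derive an undirected species network from a tree network.

    tree_edges: iterable of (tree_i, tree_j) pairs of connected trees.
    species_of: mapping from tree id to species label.
    min_links:  assumed threshold, the minimum number of tree-tree
                connections required to link two species.
    Returns a set of frozenset({species_a, species_b}) edges.
    """
    pair_counts = Counter()
    for i, j in tree_edges:
        a, b = species_of[i], species_of[j]
        if a != b:  # conspecific neighbors do not create a species link
            pair_counts[frozenset((a, b))] += 1
    return {pair for pair, count in pair_counts.items() if count >= min_links}
```

Raising `min_links` can only remove species pairs, so the number of edges and the average node degree cannot increase as the threshold grows.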