Diversity and carbon storage across the tropical forest biome

Tropical forests are global centres of biodiversity and carbon storage. Many tropical countries aspire to protect forest to fulfil biodiversity and climate mitigation policy targets, but the conservation strategies needed to achieve these two functions depend critically on the tropical forest tree diversity-carbon storage relationship. Assessing this relationship is challenging due to the scarcity of inventories where carbon stocks in aboveground biomass and species identifications have been simultaneously and robustly quantified. Here, we compile a unique pan-tropical dataset of 360 plots located in structurally intact old-growth closed-canopy forest, surveyed using standardised methods, allowing a multi-scale evaluation of diversity-carbon relationships in tropical forests. Diversity-carbon relationships among all plots at 1 ha scale across the tropics are absent, and within continents are either weak (Asia) or absent (Amazonia, Africa). A weak positive relationship is detectable within 1 ha plots, indicating that diversity effects in tropical forests may be scale dependent. The absence of clear diversity-carbon relationships at scales relevant to conservation planning means that carbon-centred conservation strategies will inevitably miss many high diversity ecosystems. As tropical forests can have any combination of tree diversity and carbon stocks, both require explicit consideration when optimising policies to manage tropical carbon and biodiversity.

Plots were obtained from a global dataset of forest inventory plots [1] surveyed using standardised field methods [2]. Plots were 1 ha (except for four that were 0.96 ha), and were all located in old-growth, closed-canopy, terra firme forests, with mean annual temperature of ≥20°C and mean annual precipitation of ≥1300 mm. Thus, montane, swamp, peatland and seasonally flooded forest were excluded. Plots known to have been subject to anthropogenic disturbance were also excluded. This enabled us to focus on carbon-diversity relationships within lowland terra firme tropical forest, avoiding major climatic, anthropogenic and hydrological factors that could confound these relationships. Having accurate measures of diversity was important for the purposes of this study, so plots were only included if >80% of trees were identified to genus level and >60% of trees were identified to species level. Identification rates were similar amongst continents (median identification rates to species level: South America = 92.5%, Africa = 93.5%, Asia = 93.1%). We excluded transects >500 m in length or <20 m in width, and any plot known to contain more than one soil type, and only included non-contiguous samples if within 500 m of each other. In each plot all stems ≥100 mm diameter were measured, and identified to species level where possible. Where a plot had been surveyed multiple times we normally used the initial census, as these were typically accompanied by botanists so were expected to have the highest proportion of identified stems, except where there was a specific reason (e.g. failure of first census to meet selection criteria) to use a later census.

Environmental variables
We used soil data from 0-30 cm depth, and used total exchangeable bases (TEB; measuring soil fertility), carbon:nitrogen ratio (C:N ratio; a useful proxy of available phosphorus) and soil texture as explanatory variables in analysis. Plots were assigned a reference soil group according to the World Reference Base soil classification system [3], using data from published sources (e.g. [4,5]) where available. When these data were not available, the reference soil group as mapped in the Harmonised World Soil Database (HWSD) [6], or SOTER [7] for the Democratic Republic of the Congo, was used. Results are similar when only dominant soil groups are used (Supplementary Fig. 20). Then, the particle size and TEB data for the nearest soil unit of the same reference soil group were extracted from the HWSD or SOTER. C:N data were extracted from the Digital Soil Map of the World, or SOTERLAC or SOTER where available.

We extracted mean annual precipitation (MAP) and mean annual temperature (MAT) from the WorldClim database [8] at 30 arc-second (≈ 1 km) resolution. Temperature data were corrected using the lapse rate Δ temperature = 0.005 °C m⁻¹ to account for differences between plot elevation and the mean elevation of WorldClim grid-cells. We also calculated cumulative water deficit (CWD), a measure of water stress experienced in the dry season. This was done using mean monthly precipitation from WorldClim and mean monthly potential evapotranspiration (PET, 1980-2010 average) from CRU TS3.22 [9]. The water balance for each month (t) was calculated as

CWD_t = min(0, CWD_{t-1} + Precipitation_t - PET_t).

This model was run recursively over a period of 12 months, starting in the wettest month of the year, with the starting water balance assumed to be zero. The minimum CWD_t value across the year represents the greatest drought stress experienced by plants, and is referred to as CWD.
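The recursion above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the function name and the January-first input ordering are assumptions, and the first step is taken to be the wettest month itself with a preceding balance of zero, which is one reading of the description.

```python
def cumulative_water_deficit(precip, pet):
    """Return annual CWD (mm, always <= 0) from 12 monthly means.

    precip, pet: sequences of 12 monthly values in mm, January first
    (the ordering is an assumption of this sketch).
    """
    if len(precip) != 12 or len(pet) != 12:
        raise ValueError("expected 12 monthly values for each input")
    # Start in the wettest month with the water balance set to zero.
    start = max(range(12), key=lambda m: precip[m])
    cwd_prev = 0.0
    cwd_min = 0.0
    for step in range(12):
        m = (start + step) % 12
        # CWD_t = min(0, CWD_{t-1} + Precipitation_t - PET_t)
        cwd_t = min(0.0, cwd_prev + precip[m] - pet[m])
        cwd_min = min(cwd_min, cwd_t)
        cwd_prev = cwd_t
    return cwd_min
```

Under this sketch an ever-wet site, where precipitation exceeds PET in every month, returns zero, matching the statement that some plots experienced no CWD different from zero.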
Estimating diversity

Although we applied stringent selection criteria to ensure that the diversity measures included in this study were largely based on fully identified taxa, it was seldom possible to fully identify all taxa in a plot, as local species pools frequently exceed 1000 tree taxa in the tropical forest domain [10]. Some unidentified stems could safely be considered to be additional taxa and added to richness estimates as botanists had assigned them to morphospecies, or had identified them to a higher taxonomic level not otherwise represented in the plot. We assigned remaining unidentified stems to discrete taxa using the ratio of taxa per stem among stems that were fully identified to a given taxonomic level. This procedure was necessary to ensure that richness estimates did not simply reflect the proportion of stems that could be fully identified.

Where I = richness of stems identified to a given taxonomic level, Ms = morphospecies richness, a = richness of stems unidentified to genus level but unique representatives of a particular family, b = richness of stems unidentified to species level but unique representatives of a particular genus, U = number of stems remaining unidentified at a given taxonomic level, P = number of taxa (at given taxonomic level) per identified stem, and s, g and f subscripts denoting species, genus and family respectively. [] denotes rounding to the nearest integer.

These formulas give richness per unit area. Richness per n stems was estimated using individual-based rarefaction at both plot (1 ha) and subplot (0.04 ha) scales. At plot scale, richness was expressed per 300 stems, while at subplot scale richness was expressed per 10 stems.
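Individual-based rarefaction to a fixed number of stems can be illustrated with the standard analytical expectation, E[S_n] = Σ_i [1 - C(N - N_i, n) / C(N, n)], where N_i is the number of stems of taxon i and N the total number of stems. The text does not state whether the analytical form or repeated random subsampling was used, so the sketch below is one possible implementation, and the function name is hypothetical.

```python
from math import comb


def rarefied_richness(abundances, n):
    """Expected number of taxa in a random draw of n stems.

    abundances: stem counts per taxon in the plot or subplot.
    n: number of stems to rarefy to (e.g. 300 per plot, 10 per subplot).
    """
    counts = [a for a in abundances if a > 0]
    total = sum(counts)
    if n < 0 or n > total:
        raise ValueError("n must lie between 0 and the number of stems")
    denom = comb(total, n)
    # A taxon is missed when all n stems come from the other taxa;
    # comb() returns 0 when that is impossible (total - a < n).
    return sum(1 - comb(total - a, n) / denom for a in counts)
```

For example, rarefying a subplot with four singleton taxa to two stems gives an expected richness of 2.0, since two stems drawn from four singletons always belong to two taxa.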
We calculated diversity metrics representing the three most commonly used Hill numbers [11], richness (⁰D), Shannon diversity (¹D = exp(H'), where H' = -∑ p_i log p_i, with p_i the proportion of stems belonging to species i) and Simpson diversity (²D = 1/λ, where λ = ∑ p_i²), as these give different weightings to rare versus dominant taxa, with higher Hill numbers giving greater proportional weight to dominant taxa. In addition, we calculated Fisher's α, as it is commonly used to explore diversity in tropical forests. Fisher's α is a constant derived from the log series S = α ln(1 + N/α), where S is the number of species in the sampled community and N is the number of individuals sampled. Analyses with taxon richness (⁰D) and Fisher's α have been presented in the main text, with analysis of ¹D and
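The diversity metrics defined in this section can be sketched directly from their formulas. The code below is illustrative only: function names are hypothetical, and Fisher's α is obtained by bisection on S = α ln(1 + N/α), which is one of several ways to solve that equation numerically.

```python
from math import exp, log, log1p


def hill_numbers(abundances):
    """Return (0D, 1D, 2D) for a list of stem counts per species."""
    counts = [a for a in abundances if a > 0]
    n = sum(counts)
    p = [a / n for a in counts]
    d0 = len(counts)                             # richness
    d1 = exp(-sum(pi * log(pi) for pi in p))     # exp(Shannon H')
    d2 = 1.0 / sum(pi * pi for pi in p)          # inverse Simpson
    return d0, d1, d2


def fishers_alpha(s, n, tol=1e-10):
    """Solve S = alpha * ln(1 + N / alpha) for alpha by bisection."""
    if not 0 < s < n:
        raise ValueError("requires 0 < S < N")

    def f(a):
        # Increasing in a, from about 0 (a -> 0) towards N (a -> infinity).
        return a * log1p(n / a) - s

    lo, hi = 1e-9, 1.0
    while f(hi) < 0:
        hi *= 2.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return (lo + hi) / 2.0
```

In a perfectly even community all three Hill numbers equal the number of species; they diverge as dominance increases, with ²D falling fastest.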

Analysing beta diversity
We used the Sørensen index to quantify beta diversity between pairs of plots. These pairwise similarities were related to the geographic distance between pairs of plots using a generalised linear model with binomial errors and a log-link function, following [12]. Fitting exponential distance decay models as generalised linear models in this way avoids the problem of log-transforming zero similarity values, with a binomial error structure appropriate as similarity values are bounded to vary between zero and one [12]. Models were constructed for each continent. The significance of parameter estimates was assessed by resampling the data 10000 times with replacement. Following [12] we excluded identical site pairs with zero geographic distance and identical tree communities from bootstrap samples as these lie outside the original sampling frame.

We also investigated how Fisher's α in each continent increased with the number of samples or the distance around a plot, repeating the methods of [13] on our dataset to investigate whether the patterns of diversity accumulation over space they observe are also evident in our data.

Incomplete species identifications pose a challenge to the calculation of beta diversity as it means that not all of the species pool has been sampled. A wide range of beta diversity metrics, including the Sørensen index, show an approximately linear relationship between undersampling of taxa and bias in the beta diversity metric [14]. Because of this we excluded sites with <90% of stems identified to species level from our analysis of beta diversity; this threshold was a compromise between maintaining a large sample of plots and reducing bias caused by undersampling of taxa. This threshold gave a sample size of 99 plots in South America, 105 plots in Africa and 23 plots in Asia.
Synonymous species names pose a further challenge, as treating two synonyms as separate species would inflate beta diversity, and no universal adjudicated list exists for all tropical plants. We used the R package Taxonstand [15] to compare species names with those in The Plant List (www.theplantlist.org) and remove identified synonyms. However, 28.5% of identified stems remained unresolved (i.e. the species name was present in The Plant List but it was uncertain whether the species name was a synonym) in Asia after using Taxonstand, compared to 0.3% in South America and 0.6% in Africa, indicating that further botanical work is required in Asia to resolve these synonyms. We compared unresolved species in Asia against The Asian Plant Synonym Lookup (phylodiversity.net/fslik/synonym_lookup.htm) in a further attempt to remove synonyms. Following this, 5.2% of identified stems in Asia remained unresolved.
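As an illustration of the quantities used in this section, the sketch below computes an incidence-based Sørensen similarity for a pair of plots and the similarity predicted by a log-link distance decay model. It assumes the presence-absence form of the index, and the function names and coefficient values are hypothetical; it does not reproduce the model fitting or the bootstrap.

```python
from math import exp, log


def sorensen_similarity(species_a, species_b):
    """Sørensen similarity, 2 * shared / (richness A + richness B)."""
    a, b = set(species_a), set(species_b)
    if not a and not b:
        raise ValueError("both plots are empty")
    return 2 * len(a & b) / (len(a) + len(b))


def predicted_similarity(intercept, slope, distance_km):
    """Expected similarity under a log link: exp(intercept + slope * d).

    With a negative slope this is an exponential decay in similarity
    with distance; the intercept is the log similarity at zero distance.
    """
    return exp(intercept + slope * distance_km)
```

Two plots holding species {A, B, C} and {B, C, D} share two of six recorded names, giving a similarity of 2 × 2 / 6 ≈ 0.67. Because the index depends only on name matches, an unresolved synonym pair counted as two species would lower this value, which is why synonym resolution matters here.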

Statistical analysis
We conducted analyses at three spatial scales, firstly comparing carbon and diversity among continents, secondly assessing relationships between carbon and diversity between 1 ha plots within each continent, and finally assessing carbon-diversity relationships between 0.04 ha subplots within 1 ha plots.

Differences in carbon storage and diversity metrics between continents were assessed by modelling each response variable of interest as a function of continent in a linear modelling framework, where continent was a factor with three levels. Area-based taxon richness values are count data, so were modelled using generalised linear models with negative binomial errors (due to overdispersion) and a log link function. ¹D (species level), ²D (species level) and Fisher's α were square root transformed prior to modelling to homogenise variances and ensure normality of residuals. We tested for significant differences between continents using Tukey's all-pair comparisons, implemented in the R package multcomp [16].

We then conducted Kendall's tau correlations between carbon and each diversity metric to assess univariate relationships, using plot level data from each continent in turn. Kendall's tau was chosen as it is non-parametric, so does not assume bivariate normality, and can handle ties. This analysis involved computing 13 tests for each continent, so there is some risk of significant relationships appearing by chance. We used false discovery rate control to adjust P values for multiple testing, and present both corrected and uncorrected P values. We performed power analysis using the R package pwr [17] to assess the smallest effect size (Pearson's r) that could be detected with 80% power given the sample size in each continent. Values of r were converted to τ using the lookup table

The univariate correlations examined whether diversity metrics were spatially congruent with carbon.
However, other environmental variables acting on carbon or diversity metrics could enhance or obscure any underlying mechanistic relationship. We therefore conducted a multivariate analysis where carbon was modelled as a function of diversity metrics, climate and edaphic variables. This analysis was performed separately for each continent. Diversity metrics were highly correlated with each other (mean Pearson's r = 0.833), so one model was constructed per diversity metric. We included cumulative water deficit (CWD), mean annual temperature (MAT) and mean annual precipitation (MAP) as climate variables; we did not include other variables relating to precipitation seasonality as they were strongly correlated with CWD. No plots in Asia experienced CWD different from zero, so CWD was not included in models for Asia. We used Principal Component Analysis to collapse variation in soil texture into two orthogonal axes, which collectively explained 95.4% of variation in soil texture. Axis one (PCA1) was positively correlated with the amount of sand, while axis two (PCA2) was correlated with the amount of silt and negatively correlated with the amount of clay (Supplementary Table 1). We also included the sum of total exchangeable bases (TEB) and the carbon:nitrogen ratio (C:N). Explanatory variables were centred and scaled to have a mean of zero and a standard deviation of one. The basic equation for these models was thus

log(carbon) = a + β₁ Diversity metric + β₂ CWD + β₃ MAP + β₄ MAT + β₅ PCA1 + β₆ PCA2 + β₇ TEB + β₈ C:N + ε

We used MuMIn [19] to fit all valid simplifications of this global model. Each model was ranked based on AICc, from which the Akaike weight of each model i was calculated (w_i).
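For readers unfamiliar with the ranking step, AICc and Akaike weights follow standard formulas: AICc = -2 ln L + 2k + 2k(k + 1)/(n - k - 1), and w_i = exp(-Δ_i/2) / Σ_j exp(-Δ_j/2), where Δ_i is the difference between the AICc of model i and the lowest AICc in the set. The sketch below illustrates these formulas in Python with hypothetical function names; it is not the MuMIn implementation.

```python
from math import exp


def aicc(log_lik, k, n):
    """Small-sample corrected AIC for a model with k parameters, n plots."""
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined when n <= k + 1")
    return -2.0 * log_lik + 2.0 * k + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(aicc_values):
    """Relative support for each candidate model; weights sum to one."""
    best = min(aicc_values)
    rel_lik = [exp(-0.5 * (a - best)) for a in aicc_values]
    total = sum(rel_lik)
    return [r / total for r in rel_lik]
```

Two models whose AICc values differ by 2 receive weights of roughly 0.73 and 0.27.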
The parameters of the best supported models (defined as the models required for the cumulative sum of w_i to reach 0.95, known as the 95% confidence set) were averaged, while the support for individual explanatory variables was assessed by summing the w_i of models in which that variable appeared.

Spatial autocorrelation in residuals of these OLS models was examined by plotting correlograms using the R package ncf [20]. Positive short-range and negative long-range residual autocorrelation was evident in South America, suggesting the presence of strong environmental gradients. Residual spatial autocorrelation was less strong in Africa, and weakest in Asia, but was present in all continents. We repeated the above modelling procedures using simultaneous autoregressive error models (SAR), implemented in spdep [21]. These were selected because of good performance in evaluations by [22], with error models selected, as opposed to lag or mixed SARs, as [23] found they performed better regardless of the mechanism generating spatial autocorrelation. We selected the best neighbourhood distance for each global model by fitting models with maximum neighbourhood distances varying in 20 km increments from 20 km to 1000 km, and selecting the neighbourhood distance that gave the lowest AICc. Although all SAR models had lower AIC values than OLS models, we present results from both OLS and SAR models, as it has been argued that spatial models are not necessarily more correct than non-spatial models [24].

We assessed fine-scale relationships between diversity and carbon by using multiple regression to model ln(carbon) in 0.04 ha subplots as a function of diversity and the number of stems in the subplot, with a second order polynomial used for the number of stems to capture potentially saturating relationships. Explanatory variables were natural log transformed to allow comparison with results of [25].
We ran these models in each 1 ha plot where subplot-level data were available (n = 266). We tested whether the mean coefficient was different from zero using one-sample Wilcoxon tests, and calculated 95% confidence intervals from 10000 bootstrap resamples with replacement. Running separate models for each plot allowed us to capture variability in fine-scale relationships between plots. However, the overall mean relationship between diversity and carbon at subplot scale could be more robustly assessed using mixed effects models with random coefficients. This assumes that coefficients in plot j come from a normal distribution, β_j ~ Normal(μ_β, σ²_β), where μ_β is the mean value of the coefficient across plots, and σ²_β is the variance of the coefficient across plots. We relax the assumption of independence between coefficients, so that pairs of coefficients in the same plot are assumed to come from a multivariate normal distribution with correlations between coefficients estimated in a variance-covariance matrix. Mixed effects models were implemented using the R package lme4 [26].

Our results show a weak positive relationship between diversity and carbon storage at small spatial scales (among 0.04 ha subplots within 1 ha plots), but no pan-tropically consistent relationship among 1 ha plots, even after controlling for potentially confounding environmental variation and spatial autocorrelation. These results pose two questions. Firstly, which mechanisms underlie the positive diversity-carbon relationship between 0.04 ha subplots? And secondly, why do these mechanisms appear to only operate at small spatial scales? These questions are best investigated with long-term experiments in tropical forests; however, we can evaluate whether correlative results from our observational dataset are consistent with the operation of niche complementarity and selection effects at 0.04 ha and 1 ha scales.
Evidence for niche complementarity

Positive relationships between biodiversity and ecosystem function have been hypothesised to arise through two general mechanisms, niche complementarity and the selection effect. The niche complementarity hypothesis proposes that differences in resource use by species allow diverse communities to use available resources more efficiently than less diverse communities [27]. For example, in low diversity temperate forests, complementary canopy architecture has been found to drive a positive relationship between diversity and productivity [28]. In tropical forests attempts to assess the role of niche complementarity have focused on relating above-ground live carbon storage to the functional diversity of tree communities [29,30], with the expectation that more functionally diverse species assemblages should be able to partition resources more effectively. However, these studies found no relationship between carbon storage and functional diversity [29,30].

Quantifying functional diversity in tropical forests is challenging due to the shortage of available trait data. We used two approaches to quantify functional diversity, (i) the standard deviation of wood density (SDWD) in a subplot or plot, and (ii) a multivariate functional diversity metric (FDM) using both the wood density and the maximum diameter of each species in a subplot or plot.
For the SDWD we used published wood density values [31,32], and commonly used methods to select genus-level wood density in cases when literature values for a given species were unavailable [33-35]. It would be preferable to use local trait data [36] but as these are not available for many plots it is necessary to use literature values for pan-tropical studies [29]. Wood density provides a proven proxy for life history strategy in tropical forests [37], since denser wooded trees tend to be slower growing, less light demanding and potentially larger than species with lower wood density [38,39], and variation in wood density is closely related to variation in leaf traits [40] and demographic traits [41]. We therefore expect that the potential for niche complementarity is greater in species assemblages with more variation in wood density.

The relationship between carbon storage and SDWD at the 0.04 ha scale within 1 ha plots was variable but significantly negative overall. At the 1 ha scale the relationship was significantly negative in all three continents (Fig. S1). At both scales, SDWD was negatively related to mean wood density (Fig. S2), indicating that the more variable plots were increasingly composed of species with 'fast' life history strategies. These plots potentially have high rates of stem turnover, and thus shorter biomass residence time [42]. When we included community weighted mean wood density as a covariate to account for this, the negative relationship between carbon storage and SDWD among 0.04 ha subplots was weaker but still significantly negative (P < 0.001). Negative relationships among 1 ha plots also weakened in all continents (non-significantly negative in South America and Africa (P ≥ 0.177), significantly negative in Asia (P = 0.004)). SDWD was also negatively related to the community weighted mean of maximum diameter at both scales (Fig. S3), indicating that plots with a greater variety of tree life history strategies were increasingly composed of smaller tree species.

We then estimated functional diversity using the FDM, calculated following [43]. For this, we follow Cavanaugh et al. [29] and define functional diversity in terms of the wood density and maximum diameter of each species in an assemblage (they worked at genus level). Thus, following Fauset et al.

used for species that occurred too infrequently to estimate species level maximum diameter, and family-level estimates used when there was no genus-level estimate. We estimated maximum diameter for each genus and family using the same methods as for species-level estimates. We used these trait data to construct a functional dissimilarity matrix, where the dissimilarity of pairs of species based on their traits was quantified using Gower distance. This dissimilarity matrix was converted into a dendrogram using average linkage. FDM was calculated as the sum of branch lengths of a dendrogram containing all species in a plot or subplot divided by the sum of branch lengths of a dendrogram containing all species in the potential source pool, defined as all species in our dataset found in a given continent. Thus, FDM is equal to one when all the trait diversity in the source pool of species is found in the subset of species in a subplot or plot, and decreases towards zero as increasingly large amounts of trait diversity are missing from the subset of species.

Our FDM metric showed an overall weak positive relationship between functional diversity and carbon storage at the 0.04 ha scale (Fig. S1), which remained when community-weighted mean wood density was included as a covariate (β = 2.6, P < 0.001). However, at the 1 ha scale, FDM and carbon storage were unrelated, even when community-weighted mean (CWM) wood density was included as a covariate (P ≥ 0.118).
This is consistent with results of previous studies at this scale [29,30], which found no relationship between functional diversity and carbon storage. At both scales FDM was weakly negatively related to community-weighted mean wood density (

diversity is quantified either as the standard deviation of wood density among stems within a plot/subplot (SDWD), or using a dendrogram-based method where species are clustered according to their wood density and maximum diameter traits (FDM). Relationships are shown for 1 ha plots in each continent (data from South America are shown by green circles, Africa by orange squares, and Asia by purple triangles; regression lines are shown for significant relationships (P < 0.05)), and for 0.04 ha subplots in 1 ha plots (regression lines shown for each 1 ha plot, colour scheme same as before).

Relationships between wood density SD and carbon are: South America 1 ha, β = -5.0, P < 0.001; Africa 1 ha, β = -2.3, P = 0.006; Asia 1 ha, β = -9.1, P < 0.001; 0.04 ha mixed effects model, β = -1.1,

Previous attempts to evaluate whether selection effects occur in tropical forests have tested the prediction that carbon storage is related to the functional dominance of species with large maximum diameters or dense wood [29,30]. To begin with, we therefore repeated this approach with our larger pan-tropical dataset (360 plots), using the community weighted mean of wood density and maximum diameter as a measure of functional dominance. We found that at both scales carbon storage increased with the community weighted mean of maximum diameter, as found by previous studies at 1 ha scale [29,30], and also that it increased with the community weighted mean of wood density (Fig. S4), which the previous studies did not detect as a driver of carbon storage [29,30]. However, while this approach is useful and interesting, strictly it is a test of the biomass ratio hypothesis, by which ecosystem function is related to the traits of dominant taxa [46], rather than a test of the selection effect per se.

between continents, so we also estimated the probability of samples of different species richness containing a potentially large species by sampling the tree species in our dataset 3000 times for each species richness increment, with each sample restricted to contain species from a single continent (1000 samples for each continent). We also repeated this procedure with the probability of sampling a species weighted by that species' frequency in a continent. Both approaches gave a similar rapidly saturating curve (Fig. S4), with a slightly higher probability of sampling large species when species frequency was maintained.
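The unweighted version of this resampling procedure can be sketched as follows. The pool composition, function names and number of draws below are hypothetical, and the frequency-weighted variant is not shown. For unweighted draws without replacement, the simulated probability can be checked against the exact hypergeometric result 1 - C(M - L, k) / C(M, k), where M is the pool size, L the number of potentially large species and k the species richness of the sample.

```python
import random
from math import comb


def simulated_prob_large(pool_is_large, richness, n_samples=1000, seed=0):
    """Share of random species samples holding at least one large species.

    pool_is_large: one boolean per species in a continent's pool,
    True where the species can reach the chosen size threshold.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(pool_is_large, richness)  # without replacement
        hits += any(sample)
    return hits / n_samples


def exact_prob_large(n_pool, n_large, richness):
    """Hypergeometric probability of drawing at least one large species."""
    return 1 - comb(n_pool - n_large, richness) / comb(n_pool, richness)
```

The exact form makes the saturation explicit: the probability of missing every large species shrinks multiplicatively with each added species, so the curve rises steeply at low richness and flattens quickly.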
Importantly, the probability of a sample containing a potentially large species increases substantially through the inter-quartile range of 0.04 ha species richness values, but for the whole inter-quartile range of 1 ha species richness values samples were almost certain to contain a potentially large species (Fig. S5). Similar inferences were obtained when we modelled the probability of subplots containing potentially large tree species as a function of species richness using binomial generalised linear mixed models (with plot identity as a random effect): the probability of sampling a large tree species in a 20 × 20 m subplot increased with species richness (

We find similar results when evaluating the probability of sampling a species with high wood density. Thus, the probability of sampling a species with wood density ≥ 0.8 g cm⁻³ increases with species richness through the range of species richness values found in 0.04 ha subplots, but saturates by the species richness values found in 1 ha plots (Fig. S7). All but one 1 ha plot contains a species with wood density ≥ 0.8 g cm⁻³; however, at 0.04 ha scale there is a positive relationship in all continents (Fig. S8).

Although the choice of 70 cm as a threshold for maximum diameter is supported by previous work demonstrating the contribution of trees of this size class to overall biomass [45], the thresholds chosen for both maximum diameter and wood density are essentially arbitrary. To assess sensitivity to this choice we also explored the effects of using other, substantially different, thresholds. Setting a lower threshold naturally means that the probability of sampling a high functioning species saturates at lower species richness, while setting a higher threshold means that it saturates at higher species richness (Fig. S9, Fig. S10).
However, for all the thresholds which we investigated, the probability of sampling a high functioning species increased more rapidly with species richness through the range of species richness values found in 0.04 ha subplots than the range of species richness values found in 1 ha plots (Fig. S9, Fig. S10).

Our results are consistent with the weak positive relationship between diversity and carbon storage resulting from niche complementarity and/or selection effects, as at this scale we found a weak positive relationship between carbon storage and functional diversity (Fig. S1, consistent with niche complementarity) and between species richness and the probability of sampling a large tree (Fig. S6, consistent with selection effects). We note that positive diversity-carbon relationships at fine scales could also result from density dependent effects, which could arise if pests and pathogens incur a reduced cost on species with low local densities. We found no evidence of either selection effects or niche complementarity operating at the 1 ha scale, which is consistent with both mechanisms being scale dependent. For selection effects this potential scale dependency could arise through the greater number of species as spatial scale increases, as we show that 1 ha plots are already sufficiently diverse for plots to be almost certain to contain a potentially large tree species (Fig. S5).

Carbon storage was related to the dominance of wood density and maximum diameter traits in species assemblages (Fig. S3), consistent with the biomass ratio hypothesis where ecosystem function is related to the traits of the dominant taxa.
Our results are therefore consistent with previous studies in showing that carbon storage in 1 ha plots is related to functional dominance but not to functional diversity [29,30], and extend these by firstly showing that selection effects potentially saturate so are unlikely to explain functional dominance at 1 ha scales, and secondly by reporting correlations consistent with the operation of both niche complementarity and selection effects at the 0.04 ha scale. Overall, we find support for the operation of niche complementarity and selection effects at 0.04 ha scale but no evidence for their operation at 1 ha scale, although as firm causal inferences cannot be

Expected values were generated by using a null model that randomly shuffles individual trees among plots within a sample area, while maintaining the number of stems in each plot and the overall gamma diversity and relative abundance of species in the sample area [47]. β deviation was estimated for each plot, with the sample area defined as a 50 km radius around that plot. The null model was run for 1000 iterations for each plot. Beta deviation differed significantly amongst continents (Kruskal-Wallis, χ² = 13.7, P = 0.001). Different letters indicate significant differences between continents (pairwise Mann-Whitney tests with false discovery rate correction, P < 0.05). Beta deviation in all continents was significantly lower than zero (one sample Wilcoxon tests, P < 0.001).

Table S4.

as a function of ln(diversity metric) and ln(stem density) with plot identity as a random effect. Coefficients were assumed to vary between plots, with SD showing the estimated standard deviation of this variation. The effect of doubling a diversity metric on carbon storage was calculated as (2^β - 1) × 100. ⁰D is species richness, ¹D is Shannon diversity and ²D is Simpson diversity (see SI methods).
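The doubling effect follows from the log-log form of the model: if ln(carbon) = a + β ln(diversity), then doubling diversity multiplies carbon by 2^β, a percentage change of (2^β - 1) × 100. A minimal sketch, with a hypothetical function name:

```python
def percent_change_on_doubling(beta):
    """Percentage change in carbon when a diversity metric doubles.

    beta: coefficient of ln(diversity) in a model of ln(carbon).
    """
    return (2.0 ** beta - 1.0) * 100.0
```

A coefficient of 0.1, for instance, corresponds to an increase in carbon of about 7.2% per doubling of the diversity metric, while a coefficient of zero implies no change.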