Study area

We exemplify our analysis using the mountain rainforest of south-eastern Ecuador, which belongs to the tropical Andes biodiversity hotspot, globally the most important hotspot for conservation support64. We use data from a single ecosystem, the mountain rainforest of the eastern Andean slopes65. Restricting the analysis to one ecosystem is essential to avoid confounding effects from multiple biogeographic units, which affect most other studies on the relationship between geo- and biodiversity. The field data were collected between 2007 and 2017 along an elevational gradient between 1000 and 3000 m a.s.l. (Fig. 3), with study sites equally distributed along the gradient66,67,68.

Figure 3 Location of the study area in the southern part of Ecuador (left) and its three main elevational levels from 3000 to 1000 m a.s.l. (white boxes from left to right). Map created in QGIS 3.14.16-Pi (https://qgis.org/). Background imagery (right): Landsat-8 true-color composite (courtesy of USGS, https://earthexplorer.usgs.gov/).

Species diversity and ecosystem functions

We assessed the species diversity of four taxa and four associated ecosystem functions using standardized methods. The sampling procedure for each taxon and function has mostly been described in the original publications. We sampled species diversity data for trees40,69, testate amoebae70,71, ants72 and birds73. For ants, species abundance data from two field campaigns were pooled. For birds, nine point count records per plot were pooled to obtain the abundance of all species and individuals per plot. Species abundance for trees and testate amoebae was measured based on distinct samples. To compare the Shannon diversity of the four taxa in a standardized way, we used a rarefaction and extrapolation framework based on sample coverage, implemented in the R package iNEXT74,75. In this framework, it is statistically valid to extrapolate the observed Shannon diversity values to the maximal sample coverage (the asymptote of the rarefaction and extrapolation curve). We therefore set the extrapolation ratio for the calculation of the standardized species diversity values to one, corresponding to complete sample coverage for all taxa76.
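The two quantities underlying this standardization, Shannon diversity and sample coverage, can be illustrated with a minimal Python sketch. It is not the iNEXT implementation and does not perform the extrapolation to the asymptote; it only computes the observed Shannon entropy, its exponential (the effective number of species) and the coverage estimate of Chao and Jost based on singletons and doubletons. The function names and the example abundance vector are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(abundances):
    """Observed Shannon entropy H' = -sum(p_i * ln p_i) over species with non-zero counts."""
    counts = [n for n in abundances if n > 0]
    total = sum(counts)
    return sum((n / total) * math.log(total / n) for n in counts)


def effective_species(abundances):
    """Exponential of Shannon entropy (Hill number of order 1)."""
    return math.exp(shannon_entropy(abundances))


def sample_coverage(abundances):
    """Coverage estimate from the number of singletons (f1) and doubletons (f2)."""
    counts = [n for n in abundances if n > 0]
    n = sum(counts)
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    if f1 == 0:
        return 1.0  # no singletons: the sample is treated as complete
    if f2 > 0:
        correction = ((n - 1) * f1) / ((n - 1) * f1 + 2 * f2)
    else:
        # Assumption: bias-corrected variant commonly used when there are no doubletons.
        correction = ((n - 1) * (f1 - 1)) / ((n - 1) * (f1 - 1) + 2)
    return 1.0 - (f1 / n) * correction


# Hypothetical abundance vector for one plot (individuals per species).
plot_abundances = [10, 5, 2, 1, 1, 1]
print(shannon_entropy(plot_abundances), effective_species(plot_abundances))
print(sample_coverage(plot_abundances))
```

A coverage value below one indicates that part of the community remained undetected, which is why the observed diversity is extrapolated before taxa with different sampling designs are compared.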

Four important ecosystem functions that are associated with the four taxa were measured in the study area. C-sequestration was quantified as the aboveground net primary production77, deduced from annual wood production and annual fine litter production (tree diversity/C-sequestration, partial overlap of plots, Pearson r = 0.35, p < 0.01, n = 52). The decomposition of soil organic matter was measured using a Tea Bag experiment78, which records the weight loss of tea bags after 21 days (testate amoebae diversity/decomposition, distinct plots). Predation was assessed as the number of ant bite marks on artificial caterpillars72 (ant diversity/predation, Pearson r = 0.40, p < 0.05, n = 27). Seed dispersal was measured as the number of observed seed removal events by frugivorous birds on fleshy-fruited plants79 (bird diversity/seed dispersal, Pearson r = 0.62, p < 0.01, n = 15).

Environmental variables

In this study, we investigated a set of 13 environmental variables describing conditions and resources within three groups: climate, habitat and soil (Table 1). We did not include geology, which has often been used in other geodiversity studies, because our study area is geologically homogeneous66. The predictors of the three groups were either available as spatial datasets or were spatially predicted, at varying spatial resolutions (see Table 1). The following predictors were considered for the three groups:

Climate: We used climate stations in combination with a Landsat-8 image classification (recorded on 20/11/2016, see also ref.77) to derive gridded forest temperature and humidity data at a spatial resolution of 30 m following previous approaches80,81. We calculated four climatic predictors: the annual mean, maximum and standard deviation of mean monthly air temperatures over the year as well as the mean annual relative air humidity.

Habitat: Based on the pre-processed Landsat-8 scene, we calculated the Normalized Difference Vegetation Index (NDVI) and its textural metric ‘correlation’ using all directions in the ‘glcm’ package82 in R, as well as the Red-Blue Difference Vegetation Index (RBVI, see refs.77,83) and the percentage of forest cover per pixel. These habitat variables are known to be related to habitat structure and productivity in the study area77,83. In addition, we used the Sentinel-2 Leaf Area Index (LAI) product, available at 10 m spatial resolution. To characterize topography, we calculated the Topographical Position Index (TPI) based on an airborne high-spatial-resolution digital elevation model from the SIGTIERRAS campaign83.
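Two of these habitat variables follow simple definitions that can be sketched in a few lines of Python. NDVI is the normalized difference between near-infrared and red reflectance, and TPI is the elevation of a cell minus the mean elevation of its neighbourhood. The 3 x 3 neighbourhood, the function names and the toy values below are assumptions made for illustration; the window size used for the TPI is not stated here.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); returns None if both bands are zero."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else None


def tpi(dem, row, col):
    """Elevation of the centre cell minus the mean of its eight neighbours.

    Assumption: a 3 x 3 window; valid for interior cells only.
    """
    neighbours = [
        dem[r][c]
        for r in range(row - 1, row + 2)
        for c in range(col - 1, col + 2)
        if (r, c) != (row, col)
    ]
    return dem[row][col] - sum(neighbours) / len(neighbours)


# Hypothetical reflectance values and a toy elevation grid (m a.s.l.).
print(ndvi(0.5, 0.1))
toy_dem = [
    [2000, 2000, 2000],
    [2000, 2040, 2000],
    [2000, 2000, 2000],
]
print(tpi(toy_dem, 1, 1))  # positive value: the centre cell lies above its surroundings
```

Positive TPI values indicate ridges or hilltops, negative values indicate valleys or depressions, and values near zero indicate flat areas or constant slopes.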

Soil: Because no spatial soil data products were available at a moderate spatial resolution, we modeled and spatially predicted a suite of soil variables sampled in the study area84. These soil variables were related to phosphorus and nitrogen availability and pH (in H2O) in the mineral soil horizon. We used multiple linear regression models with stepwise predictor selection (forward and backward) based on spatial predictors (Extended Data Table E1). Only three models explained more than 45% of the variance: those for organic layer depth, phosphorus content in the Ah horizon and pH in the Bv horizon. The explained variance of these models ranged between 45 and 69% (Extended Data Table E2). Their spatial predictions were subsequently used as soil predictors.

Environmental variables and geodiversity indices

Our study aims to compare models of species diversity and ecosystem functions that use combinations of environmental predictors from the climate, habitat and soil groups with models that use a single compound geodiversity index, defined as the summed spatial diversity of the same selected environmental variables. For this purpose, one condition or resource from each group (climate, habitat, and soil) was selected as a predictor. The choice of a predictor within each group (e.g., temperature or rainfall within climate) was guided by the findings of prior studies (Extended Data Table E3, Supplementary Methods), so that the resulting combination of environmental variables covers the requirements of the corresponding organisms for establishment and survival. A second criterion was collinearity in the predictor space: we changed the predictor combination if the correlation between two or more predictors was greater than r = 0.6. Where different statistics were available, such as the mean, maximum, and standard deviation in the climate group, they were compared, and the environmental variable whose model fit performed best according to the Akaike information criterion was chosen. For all response variables, we used the pixel values of the selected environmental variables that overlapped with either the point or shape geometries of the corresponding plots. The resulting predictor set thus assesses the central tendency of three environmental factors at each plot.
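The collinearity criterion can be illustrated with a minimal Python sketch that flags predictor pairs exceeding the threshold of r = 0.6. Applying the threshold to the absolute value of Pearson's r is an assumption, as are the function names and the example plot values.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant predictor")
    return sxy / math.sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.6):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) > threshold:  # assumption: the threshold applies to |r|
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical plot-level values of one candidate predictor per group.
candidate = {
    "temperature_mean": [1, 2, 3, 4, 5],
    "ndvi": [2, 4, 6, 8, 10.5],
    "soil_ph": [3, 1, 4, 1, 5],
}
for name_a, name_b, r in collinear_pairs(candidate):
    print(f"{name_a} and {name_b} are collinear (r = {r:.2f})")
```

A combination with a flagged pair would be replaced by another candidate from the affected group before model fitting.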

The geodiversity index, in contrast, is computed as the summed spatial diversity of the same three environmental factors measured within each plot and its surroundings, and follows the recent plea to consistently use diversity indices commonly applied in biodiversity studies7. For this purpose, we classified the spatial grids of the three selected predictors into five classes using intervals following the Fisher style85 in the ‘classInt’ R package86. After the classification, the central pixel of each plot and all of its adjacent pixels were extracted, resulting in nine pixels per plot. For these pixels, the frequency of each class present was calculated. From an ecological perspective, each class was treated as analogous to a species present at the corresponding plot, and its frequency as that species' abundance. For the resulting abundance matrix of each of the three predictors, Shannon diversity was calculated using the diversity function of the ‘vegan’ package87,88. This results in three predictors addressing the spatial variation of the chosen environmental variables. As in some ecological studies where the diversity of multiple taxa has been summed or averaged to study multi-taxa diversity51, we here treated each of the three predictors like a single taxon. Thus, analogously to multi-taxa approaches, we calculated the geodiversity index by summing the spatial diversity of the three environmental variables (for further explanation see also Supplementary Fig. S1).
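The computation of the index for a single plot can be sketched in Python as follows. The sketch assumes that the three grids have already been classified into five classes, so the Fisher-style classification itself is not reproduced, and that the plot lies in the interior of each grid. It uses the natural logarithm for Shannon diversity. The function names and the toy grids are hypothetical.

```python
import math
from collections import Counter


def window_values(grid, row, col):
    """Central pixel plus its eight adjacent pixels (nine values); interior cells only."""
    return [grid[r][c] for r in range(row - 1, row + 2) for c in range(col - 1, col + 2)]


def shannon(class_labels):
    """Shannon diversity, treating each class as a species and its frequency as abundance."""
    counts = Counter(class_labels)
    total = sum(counts.values())
    return sum((k / total) * math.log(total / k) for k in counts.values())


def geodiversity_index(classified_grids, row, col):
    """Sum of the per-variable Shannon diversities in the 3 x 3 window around a plot."""
    return sum(shannon(window_values(grid, row, col)) for grid in classified_grids)


# Hypothetical classified grids (class labels 1 to 5) for climate, habitat and soil.
climate = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]  # homogeneous: diversity 0
habitat = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]  # three equally frequent classes: ln 3
soil = [[4, 4, 4], [4, 4, 5], [5, 5, 5]]     # two classes with frequencies 5 and 4

print(geodiversity_index([climate, habitat, soil], 1, 1))
```

A plot surrounded by a single class in all three grids receives an index of zero, while the maximum for nine pixels and five classes per variable is reached when the classes are as evenly represented as possible.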

Statistical analyses

We used generalized additive models (GAMs) to model species diversity and ecosystem functions. To address the (non-)linear relationships between the geodiversity index and species diversity or ecosystem functions, univariate GAMs with the index as the single predictor were used. Multivariate GAMs were used to address the relationships with the three environmental variables representing conditions and resources of the three groups climate, habitat, and soil. For all models, GAMs were fitted with a Gaussian error distribution, chosen based on the exploration of residuals and the generalized cross-validation score (GCVS) of the models. Non-linear relationships were modeled using cubic regression splines with degrees of freedom permitted to vary between one and three. The GAMs nevertheless retained the flexibility to model linear relationships wherever present. We performed all GAM analyses using the ‘mgcv’ package in R89. We then compared the performance of the models based on combined environmental variables with that of the models based on the geodiversity index (Fig. 1).

For the combination of environmental variables, we further explored the driving conditions and resources within the climate, habitat and soil variables by considering their pure and shared proportions in explaining species diversity and ecosystem functions. For that purpose, we used Variance Partitioning (VP) implemented in the ‘modEvA’ R package90 in combination with GAMs. All combinations of environmental variables, from one to three predictors, were considered, resulting in seven different GAMs (three models with a single predictor, three models with two predictors and one model with three predictors). Model parameters were fixed as in the model using all three climate, habitat, and soil predictors to enhance model comparability.
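The arithmetic behind partitioning the explained variance of seven nested models into pure and shared fractions can be sketched in Python. This is a generic illustration of the standard three-set partition, not the ‘modEvA’ code. The labels A, B and C stand for the climate, habitat and soil predictors, the input values are hypothetical, and the measure of explained variance taken from each GAM (for example adjusted R² or deviance explained) is an assumption.

```python
def partition_variance(r2):
    """Split explained variance into pure and shared fractions.

    r2 maps the keys 'A', 'B', 'C', 'AB', 'AC', 'BC', 'ABC' to the explained
    variance of the model fitted with the corresponding predictor set.
    """
    full = r2["ABC"]
    return {
        "pure_A": full - r2["BC"],
        "pure_B": full - r2["AC"],
        "pure_C": full - r2["AB"],
        # Fractions shared by exactly two predictors
        "shared_AB": r2["AC"] + r2["BC"] - r2["C"] - full,
        "shared_AC": r2["AB"] + r2["BC"] - r2["B"] - full,
        "shared_BC": r2["AB"] + r2["AC"] - r2["A"] - full,
        # Fraction shared by all three predictors
        "shared_ABC": r2["A"] + r2["B"] + r2["C"] - r2["AB"] - r2["AC"] - r2["BC"] + full,
        "unexplained": 1.0 - full,
    }


# Hypothetical explained variance of the seven models (A = climate, B = habitat, C = soil).
explained = {"A": 0.20, "B": 0.18, "C": 0.13, "AB": 0.29, "AC": 0.26, "BC": 0.21, "ABC": 0.31}
for fraction, value in partition_variance(explained).items():
    print(f"{fraction}: {value:.2f}")
```

The seven explained fractions sum to the explained variance of the full model. Shared fractions can become negative, for example when predictors act as suppressors, so they are best read as a descriptive decomposition rather than as proportions in a strict sense.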