Introduction

Tropical forests are the most biologically diverse biome on Earth, encompassing an estimated 96% of all tree species1. Although categorized as a single biome, tropical forests differ substantially within and across continents. Differences in ecosystem structure and function corresponding to variation in species composition directly influence ecosystem processes and will likely influence tropical forest responses to climate change. For example, field observations indicate that the carbon sink in tropical Africa has been stable for three decades (1983–2011), which is in stark contrast with a long-term decline in the Amazon2. However, a study using Orbiting Carbon Observatory (OCO-2) satellite data revealed net carbon emissions across Africa, Asia, and the Americas following the 2015 El Niño event, albeit with unique drivers3. The ability to map and characterize distinct tropical forest types is thus critical for identifying where further research can examine mechanistic controls on differences in structure and function, and how different forest types within the tropical forest biome will respond to a changing planet. This is of particular importance as tropical forests are threatened by deforestation, degradation, and climate change, and are critical for carbon sequestration and many other important ecosystem services.

Networks of tropical forest inventory plots offer valuable ground-based insights into mechanisms and processes. However, remote sensing data can be used to scale these insights to entire landscapes and regions and highlight previously unexplored areas for further investigation of mechanism and process. Remote sensing can thus serve as a powerful tool to measure and map functionally distinct forest types across the tropics that may be inaccessible for ground-based investigation. The recent surge in ecologically oriented satellite remote sensing missions, including spaceborne LiDAR (i.e., light detection and ranging) and imaging spectroscopy (i.e., hyperspectral remote sensing), makes this a critical moment to evaluate tropical forest function mapping at spaceborne resolutions (e.g., ~20–30 m) and assess the relative importance of forest structure and canopy leaf traits for characterizing tropical forest function.

Forest types in the tropics differ in canopy structure and function, which vary with climate4, topography5, soil biogeochemistry6,7, and natural and anthropogenic disturbance histories and regimes8,9. Airborne imaging spectroscopy and LiDAR have enabled the measuring, mapping, and monitoring of tropical forest functional and structural diversity at large spatial scales, in ways that inform ecological understanding10,11, support conservation efforts12, and constrain terrestrial biosphere models13. In the tropics, airborne imaging spectroscopy has been used to map patterns of diversity across forests in Borneo11 and the Amazon10,14. Airborne imaging spectroscopy has also been used to characterize leaf traits across the sunlit portions of tropical forest canopies that are detectable by a satellite or airborne sensor (which we refer to here as canopy traits) and identify relationships between these traits and underlying environmental drivers. For example, Asner and coauthors12 identified relationships between imaging spectroscopy-derived canopy traits and variation in geology, topography, hydrology, and climate across the Peruvian Amazon, sorting the region into 36 distinct forest types. In Malaysia, airborne imaging spectroscopy and LiDAR data showed a strong influence of fine-scale topography on forest structure, composition and diversity5, and the role of interacting geomorphology and topography on canopy foliar traits across larger elevation gradients15.

Expanded research opportunities will be made possible by the newly available ecologically oriented hyperspectral and LiDAR satellite missions, including the operational PRISMA16 and DESIS17 spectrometers, NASA’s Global Ecosystem Dynamics Investigation spaceborne LiDAR18, and the planned NASA Surface Biology and Geology (SBG)19 and European Space Agency Copernicus Hyperspectral Imaging Mission for the Environment (CHIME)20 spaceborne spectrometers. Satellite instruments overcome the expense and limited spatial extent of airborne campaigns by providing extensive coverage over tropical forest regions. However, the data from these sensors are or will be at spatial resolutions of ~20–30 m (400–900 m2), far coarser than the 1–5 m (1–25 m2) resolutions of airborne remote sensing data used in the studies described above. In addition, a lack of knowledge about which traits to prioritize when distinguishing between tropical forest types hinders satellite sensor and algorithm development. Here, we hypothesize that functionally distinct forest types can be mapped using a combination of canopy foliar traits and canopy structure information at moderate (30 m) spatial resolution, equivalent to the resolution available via forthcoming satellite sensors (H1). We evaluate which canopy traits or structural attributes are most critical for mapping distinct forest types. We hypothesize that mapped forest types, distinguishable at the 30 m spatial resolution, exhibit distinct ecosystem function (H2) related to carbon stocks (H2a), growth (H2b), and mortality (H2c).

In this study, we combine airborne imaging spectroscopy-derived canopy trait measurements with airborne LiDAR-derived measurements of canopy structure, resampled to resolutions corresponding to new satellite missions (30 m) to (1) identify, characterize, and map structurally and functionally distinct tropical forests across two landscapes in Malaysian Borneo; (2) determine the key canopy traits and canopy structural attributes that distinguish different forest types; and (3) compare mapped forest types with inventory plot data to explore differences in carbon stocks, growth, and mortality within each forest type. We used canopy trait maps developed by Martin et al.21 that used co-aligned LiDAR and imaging spectroscopy data collected by the Global Airborne Observatory from an aircraft in April 2016 (ref. 22), and structural metrics that we calculated directly from the airborne LiDAR data (Supplementary Figs. 1 and 2). Canopy traits and structural attributes were selected to capture two main forest type trait axes, which explain roughly half of the global trait variation. These axes include plant stature and resource acquisitiveness, which are linked to tree growth and mortality tradeoffs23,24. The canopy traits and structural attributes that we evaluated included leaf economic spectrum traits indicative of resource acquisition (leaf mass per area (LMA), leaf nitrogen (N), and leaf phosphorus (P)), as well as attributes indicative of stature (maximum canopy height, and crown and canopy architecture). To more directly evaluate canopy photosynthetic capacity, we estimated the maximum rate of Rubisco carboxylase activity (Vcmax) from canopy N, canopy P, and the ratio of N:P using a statistical equation described in ref. 25. This enabled exploration of an additional axis of variation related more directly to the impacts of nutrient co-limitation on growth across forest types. However, it is important to note that this is a statistical estimate, and not a process-based model. In addition, we evaluated leaf area index (LAI) and variation in the height above ground of peak LAI in the canopy, given that LAI is an important ecophysiological attribute that is widely used in terrestrial ecosystem and biosphere models to upscale estimates of leaf-level processes26,27.

To map functionally distinct forest types across all pixels, testing hypothesis one, we conducted a principal component analysis (PCA) to reduce dimensionality of all ten canopy traits and structural attributes (hereafter canopy properties), and ran a k-means cluster analysis28 on the first two principal components to categorize pixels into functionally distinct forest types. To first evaluate the influence of spatial resolution on forest type mapping, we conducted analyses on canopy traits and structural attributes at their original resolution (4 m) and 15 coarse-scale resampled resolutions from 8 to 200 m. We used three methods to determine the appropriate number of clusters (k): (1) the gap statistic (Gapk), which defines the number of clusters based on the first local and global maxima; (2) the elbow approach using the within group sum of squares (Wk); and (3) the between cluster sum of squares (BSS) divided by the total sum of squares (TSS), where a higher value of BSS/TSS indicates improved fit of the cluster analysis to the data. To test hypothesis two and evaluate whether clustered forest types exhibited distinct functional dynamics, we compared our forest functional composition maps to data from 4 ha inventory plots established within three distinct forest types at Sepilok and one large 50 ha inventory plot at Danum. We demonstrate that forest types can be successfully mapped using data at the 30 m spatial resolution corresponding to new hyperspectral satellite missions, revealing biologically relevant heterogeneity in tropical forest structure and function. Two canopy traits, LMA and canopy foliar phosphorus, were found to be critical for accurately distinguishing between forest types, while structural characteristics were found to be of secondary importance.
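The workflow described above (PCA, then k-means on the first two components, then BSS/TSS as a fit measure) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis code used in this study: it assumes NumPy is available, replaces the ten canopy properties with synthetic data, z-scores each variable before the PCA (an assumption of this sketch), and uses hypothetical function names (`pca_scores`, `kmeans`, `bss_over_tss`).

```python
import numpy as np


def pca_scores(X, n_components=2):
    """Project z-scored variables onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each column (assumption)
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ vt[:n_components].T


def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every point to every centroid, then nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids


def bss_over_tss(points, labels, centroids):
    """Between-cluster sum of squares as a fraction of the total sum of squares."""
    tss = ((points - points.mean(axis=0)) ** 2).sum()
    wss = ((points - centroids[labels]) ** 2).sum()
    return (tss - wss) / tss


# Synthetic stand-in: 600 "pixels" x 10 "canopy properties" from 3 made-up groups.
rng = np.random.default_rng(42)
group_means = rng.normal(0.0, 3.0, size=(3, 10))
X = np.vstack([m + rng.normal(0.0, 1.0, size=(200, 10)) for m in group_means])

scores = pca_scores(X)                    # first two principal components
labels, centroids = kmeans(scores, k=3)   # cluster pixels into 3 types
ratio = bss_over_tss(scores, labels, centroids)
```

Repeating the last two lines over a range of k gives the BSS/TSS curve used to judge where additional clusters stop improving the fit. A production analysis would more likely rely on an established library implementation with multiple random starts.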

Results

Our analysis identified between two and four distinct forest types in Sepilok, and a single forest type at Danum that could be further distinguished as three distinct forest types (Fig. 1). The influence of spatial resolution on the degree of variance explained for the first 2–4 principal components and the k-means BSS/TSS saturated around 20–40 m resolution for both Sepilok and Danum (Supplementary Fig. 3). Here, we present results from analyses at the 30 m spatial resolution that corresponds to existing and forthcoming hyperspectral satellite missions.
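The effect of coarsening the 4 m maps can be illustrated with a simple block average. The resampling method is not specified in this section, so the sketch below is an assumption: it averages square blocks of pixels, ignores masked (NaN) pixels, and only handles integer coarsening factors (for example, 4 m to 8 m with a factor of 2). The function name `block_mean` and the input values are hypothetical.

```python
import numpy as np


def block_mean(raster, factor):
    """Coarsen a 2D raster by averaging factor x factor blocks, ignoring NaNs."""
    rows, cols = raster.shape
    rows -= rows % factor  # trim edges that do not fill a whole block
    cols -= cols % factor
    trimmed = raster[:rows, :cols]
    blocks = trimmed.reshape(rows // factor, factor, cols // factor, factor)
    return np.nanmean(blocks, axis=(1, 3))


# Made-up 6 x 6 trait map at 4 m resolution with one masked pixel.
trait_4m = np.arange(36, dtype=float).reshape(6, 6)
trait_4m[0, 0] = np.nan
trait_8m = block_mean(trait_4m, factor=2)  # 3 x 3 map at 8 m resolution
```

Target resolutions that are not integer multiples of the source pixel size (such as 30 m from 4 m) would require a proper resampling routine from a geospatial library rather than this block average.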

Fig. 1: Mapped forest types.

Mapped forest type results from PCA and k-means clustering of 10 variables across forest ecosystems in Sepilok Forest Reserve for k = 2, 3, and 4, and in Danum Valley Conservation Area around the 50 ha Smithsonian ForestGEO plot for k = 2 and 3. Here, we omit the Danum k = 1 map, which would show a single color for the entire area, and instead highlight the difference between k = 2 and k = 3 cluster results. At Sepilok, the partitioning of the alluvial forest into alluvial and mudstone forest types is revealed with k = 4. The canopy height shading scale indicates the top-of-canopy height information from the airborne LiDAR. Black areas in the clustered forest type maps indicate No Data where pixels were omitted, or cloud, cloud shadow, and water masked.

At Sepilok, the Gapk metric identified three clusters (BSS/TSS = 68.5%). However, the Wk elbow and BSS/TSS metrics suggest that Sepilok can also be characterized as two (BSS/TSS = 51.9%) or four (BSS/TSS = 76.7%) distinct forest types based on the magnitude of the decline in Wk, and gains in BSS/TSS before the values of both metrics level off with increasing k (Supplementary Figs. 4 and 5). Cluster analysis results for differing values of k indicated nested forest types at Sepilok (Fig. 2a–c): the highest level (k = 2) distinguished an alluvial forest type from a combined sandstone and kerangas forest type; k = 3 distinguished sandstone forests from kerangas forests; and k = 4 partitioned the alluvial forest into two forest types, revealing a mudstone forest type as distinct. At Danum, we identified a single cluster using the Gapk metric (BSS/TSS = 0.0%); however, the Wk elbow and BSS/TSS methods both indicate that Danum can be characterized as three distinct forest types (BSS/TSS = 61.3%; Figs. 1 and 2; Supplementary Figs. 4 and 5). Two of these forest types (Danum 1 and 2) were found within the 50 ha ForestGEO inventory plot when k = 2 and k = 3 (Fig. 1). The 50 ha inventory plot appears to be dominated by one forest type (Danum 2), although the northeast corner was identified as distinct (Danum 1) when k = 2 and k = 3 (Fig. 1; Supplementary Fig. 8).
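For reference, the three cluster-selection quantities can be written out explicitly. The expressions below are the standard textbook definitions, assuming squared Euclidean distance in principal component space; the notation ($x_i$ for a pixel's principal component scores, $\bar{x}$ for the overall mean, $C_j$, $n_j$, and $\mu_j$ for the members, size, and centroid of cluster $j$) is introduced here for illustration and may differ from the implementation used.

```latex
\begin{align}
\mathrm{TSS} &= \sum_{i=1}^{n} \lVert x_i - \bar{x} \rVert^{2}, \\
W_k &= \sum_{j=1}^{k} \sum_{i \in C_j} \lVert x_i - \mu_j \rVert^{2}, \\
\mathrm{BSS} &= \sum_{j=1}^{k} n_j \, \lVert \mu_j - \bar{x} \rVert^{2} = \mathrm{TSS} - W_k, \\
\mathrm{Gap}_k &= \mathbb{E}^{*}\!\left[ \log W_k \right] - \log W_k .
\end{align}
```

Here $\mathbb{E}^{*}$ denotes the expectation of $\log W_k$ under a reference distribution with no cluster structure. With a single cluster ($k = 1$), $\mu_1 = \bar{x}$, so $\mathrm{BSS} = 0$ and BSS/TSS is 0%, which is why the one-cluster solution at Danum is reported with BSS/TSS = 0.0%.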

Fig. 2: Principal component analysis results.

The first two loadings from the principal component analysis at Sepilok (a–c) and Danum (d). a–c illustrate the partitioning of pixels into k = 2, 3, and 4 clusters at Sepilok. d illustrates k = 3 clusters at Danum. Cover20: the percent of canopy cover at 20 m height above ground; H peak LAI: height of peak leaf area index; LAI: leaf area index; LMA: leaf mass per area; max H: maximum height; N: nitrogen; N:P: nitrogen to phosphorus ratio; P: phosphorus; P:H: the vertical distribution of plant foliage (P) relative to the total canopy height (H).

Distinguishing characteristics of forest types

The first principal component (PC1) corresponded to variation in remotely sensed LMA in grams of dry mass per square meter (LMA, g DM m−2), percent canopy leaf nitrogen (N, %), and percent canopy leaf phosphorus (P, %). The second principal component (PC2) reflected variation in three groups of properties: remotely sensed canopy stature, namely maximum canopy height in meters (Max H, m) and the percent of canopy cover at 20 m height above ground (Cover20, %), which corresponds to field measurements of basal area; a measure of canopy architecture indicating the vertical distribution of plant foliage (P) relative to the total canopy height (P:H ratio); and estimated photosynthetic capacity (Vcmax, µmol m−2 s−1). These patterns were consistent at Danum and Sepilok (Fig. 2; Supplementary Fig. 6). Remotely sensed LAI (LAI, m2 m−2) explained little variation across the forest types, with weak loading values (PC3 at Sepilok, PC4 at Danum; Supplementary Fig. 6). Figure 3 shows variation in canopy properties across forest types for the largest number of forest types identified at each landscape (i.e., k = 3 at Danum and k = 4 at Sepilok; see Supplementary Figs. 7–9 for results from other values of k). The sandstone and kerangas forests had the lowest mean canopy leaf nutrient concentrations and estimated photosynthetic capacity (Fig. 3a – N, P, Vcmax). Despite having lower canopy height than other forest types, the sandstone and kerangas forests had the highest fraction of canopy cover above 20 m height, high P:H values, and the highest peak height of LAI (Hpeak LAI, m) (Fig. 3a – Cover20, P:H, Hpeak LAI).

Fig. 3: Canopy trait and structural attribute distributions.

Trait distributions by cluster for Sepilok k = 4 and Danum k = 3. Forest types are ordered based on their median leaf mass per area (LMA) to illustrate differences in traits for forest types that vary along the leaf economics spectrum. Identical letters represent clusters where there is no significant difference between forests based on one-way ANOVA tests (p < 0.01). ** = traits that varied significantly between all seven forest types. * = traits that varied significantly between at least five forest types (a). Vertical leaf area index (LAI) profiles for all pixels within each inventory plot (b) and forest types identified based on k = 3 clusters at Danum and k = 4 clusters at Sepilok (c). Shading around lines in b, c indicates the 95% confidence interval.

Strong gradients in LMA, canopy N, and canopy P were observed across all forest types. The highest canopy leaf nutrient concentrations and the lowest average LMA were observed in the three Danum forest types, and the Sepilok mudstone and alluvial forests (Fig. 3a – LMA). These patterns were consistent across different values of k (Supplementary Figs. 7 and 8). Average canopy N and P in the mudstone forest were equivalent to or higher than in the alluvial forest, yet the mudstone forest had significantly lower estimated Vcmax. Significantly lower maximum canopy heights (max H) and greater foliage density near the ground (lower P:H) also distinguished the mudstone and Danum 1 forests from the alluvial and Danum 2–3 forests. The Danum 1 forest (when k = 2 or 3) was structurally similar to the mudstone forest; however, the two forest types differed in LMA, canopy N, and canopy P (Fig. 3a).

While average canopy LAI was similar across forest types (Fig. 3a – LAI), ranging from 5.4 to 6.3 (coefficient of variation (CV) = 0.05), the average height of maximum LAI (Hpeak LAI), canopy architecture (P:H), and canopy cover at 20 m (Cover20) all exhibited much greater variation across forest types (CV = 0.48, 0.12, and 0.25, respectively). Vertical LAI patterns further illustrated differences in structure across forest types despite similar total LAI (Fig. 3b, c, Supplementary Fig. 9), with strong clumping in the understory and the upper canopy at the alluvial and Danum forests. Vertical LAI profiles indicated less height heterogeneity in the sandstone and kerangas forests (Fig. 3b, c). Maximum canopy height, which varied significantly across clusters, was correlated with estimated Vcmax between the different forest types (R2 = 0.72, p = 0.017) and at the pixel scale (R2 = 0.24, p < 0.0001) (Supplementary Fig. 10). Given that Vcmax was estimated based on the linear relationship between canopy leaf N and P (i.e., not derived from leaf-level measurements as the other remotely sensed traits were; see Methods), it was thus surprising to find variation in the estimated canopy trait that was uncorrelated with either N or P.
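The coefficient of variation used above is the standard deviation divided by the mean, computed across forest-type averages. The following sketch shows the calculation; the LAI values are made up to span the reported 5.4 to 6.3 range and are not the study's data, and the use of the sample (rather than population) standard deviation is an assumption.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (unitless)."""
    return stdev(values) / mean(values)


# Hypothetical mean LAI for seven forest types (illustrative values only).
lai_by_type = [5.4, 5.6, 5.8, 5.9, 6.0, 6.1, 6.3]
cv_lai = coefficient_of_variation(lai_by_type)
```

Because the CV is unitless, it allows variation in LAI to be compared directly with variation in properties measured in meters or percent, such as Hpeak LAI and Cover20.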

Remotely sensed estimates of aboveground carbon density, an emergent property of ecosystem function, differed significantly across clustered forest types, with high values on average in sandstone forests and widely varying values in the alluvial and Danum forest types (Fig. 4a). Aboveground carbon, in megagrams of carbon per hectare (MgC ha−1), within the inventory plot boundaries generally corresponded to aboveground carbon distributions derived from the entire forest type (Fig. 4a). The one exception was the alluvial forest. When three forest types were distinguished at Sepilok (k = 3), the alluvial forest inventory plot had significantly higher aboveground carbon than the cluster-derived alluvial forest extent (Fig. 4a, p < 0.001). However, when the mudstone and alluvial forests were differentiated (k = 4), the inventory plot aboveground carbon distribution was comparable to aboveground carbon in the clustered alluvial forest extent. As a result, the mudstone forest encompassed significantly lower aboveground carbon on average. Annual relative growth and mortality rates calculated from forest inventory plot data differed significantly across forest types (Fig. 4b). Growth rates corresponded inversely to mean aboveground carbon at the sandstone (232 MgC ha−1), alluvial (223 MgC ha−1), and Danum 50 ha (194 MgC ha−1) inventory plots (Fig. 4a, b). The kerangas forest did not follow this trend, exhibiting an intermediate plot-level growth rate despite lower average aboveground carbon (180 MgC ha−1). Mortality rates were similar in the alluvial and Danum 50 ha plots, and significantly higher than the mortality rates in the sandstone or kerangas plots.

Fig. 4: Functional dynamics across forest types.

a Aboveground carbon density for each field inventory plot (solid line) compared to aboveground carbon for the entire forest type based on cluster results where k = 1 for Danum and k = 3 for Sepilok (dotted line) and k = 3 for Danum and k = 4 for Sepilok (dashed line). b Annual relative growth (gray) and mortality (black) rates for each forest type calculated from forest inventory plot data. Growth and mortality rates could not be calculated for the Mudstone forest type due to the lack of an inventory plot with repeat census measurements. Identical letters represent inventory plots with no significant difference in terms of carbon, mortality rates, and growth rates respectively, based on one-way ANOVA tests (p < 0.01). Error bars in b indicate 95% confidence interval.

The relative importance of canopy traits and structural attributes

Cluster analyses conducted with only structural attributes, only canopy traits, or reduced combinations of canopy traits and structural attributes, indicated that leaf P, LMA, maximum canopy height and Cover20 are critical for capturing the observed forest types (Fig. 5). Clustering with LMA, leaf P, Cover20, and maximum height resulted in similar forest types to those identified when ten canopy properties were used (overall accuracies (OA) of 96.0% and 86.0% for k = 2 and k = 4 respectively) at Sepilok (Fig. 5a; Supplementary Fig. 11a), as well as higher BSS/TSS values at both Sepilok (Supplementary Fig. 12a) and Danum (Supplementary Fig. 12b). At Danum, LMA, leaf P, and Cover20 alone yielded the strongest similarity to the cluster results with all ten variables (OA = 88.0%; Fig. 5b, Supplementary Fig. 11b). The highest overall accuracy for k = 3 at Sepilok was achieved with LMA, canopy leaf P, and N, equal to 85.9%, although the combination of maximum height, LMA and leaf P (OA = 84.8%), and just LMA and leaf P (OA = 84.7%) yielded similar results (Fig. 5a). We were unable to obtain distinct mapped forest types using structural attributes alone. The inclusion of leaf P improved output in all cases in terms of correspondence with plot locations and noise (speckling) reduction.
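Comparing a reduced-variable clustering with the full ten-variable clustering requires care because k-means cluster labels are arbitrary: cluster 0 in one run may correspond to cluster 2 in another. One way to compute an overall accuracy, sketched below, is to try every relabeling of the candidate clusters and keep the best agreement. This matching step is an assumption of the sketch, since the section does not state how labels were aligned; the function name and the label vectors are hypothetical.

```python
from itertools import permutations


def overall_accuracy(reference, candidate, k):
    """Fraction of pixels that agree under the best one-to-one relabeling."""
    best = 0
    for perm in permutations(range(k)):
        # perm[c] is the reference label assigned to candidate cluster c.
        agree = sum(1 for r, c in zip(reference, candidate) if perm[c] == r)
        best = max(best, agree)
    return best / len(reference)


# Made-up labels for ten pixels: full model (reference) vs reduced model.
full_labels = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
reduced_labels = [2, 2, 2, 0, 0, 1, 1, 1, 1, 1]
oa = overall_accuracy(full_labels, reduced_labels, k=3)
```

Exhaustive search over permutations is practical for the small values of k considered here (at most 24 relabelings for k = 4); for many clusters an assignment algorithm would be a better choice.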

Fig. 5: Trait importance in forest type mapping.

Change in overall accuracy for reduced k-means clustering models using forest structure variables (purple), canopy leaf trait variables (orange), and combinations of structural and leaf trait variables (blue) for k = 2, 3, and 4 for Sepilok and k = 3 for Danum. All results are compared to the full 10-variable k-means clustering analysis for Sepilok (a) and Danum (b). LES traits: leaf economic spectrum traits which include leaf mass per area (LMA), foliar nitrogen (N) and foliar phosphorus (P).

Discussion

We hypothesized that functionally distinct forest types can be mapped at moderate spatial resolutions, using a combination of canopy foliar traits and canopy structure information. Our analysis of LiDAR and imaging spectroscopy data at spatial resolutions ranging from 4 to 200 m (16 m2–40,000 m2), with an emphasis on the 30 m (900 m2) spaceborne hyperspectral spatial resolution, reveals that few remotely sensed canopy properties are needed to successfully identify ecologically distinct forest types at two diverse tropical forest sites in Malaysian Borneo. In testing our second hypothesis that mapped forest types exhibit distinct ecosystem function, we found that forest types identified using remotely sensed leaf P, LMA, Max H, and canopy cover at 20 m height (Cover20) closely align with forest types defined from field-based floristic surveys29,30,31,32,33 and inventory plot-based measurements of growth and mortality rates (Fig. 4b). Our approach, however, enables mapping of their entire spatial extent (Fig. 1) and reveals important structural and functional variation within areas characterized as a single forest type in previous studies (Fig. 3). Current and forthcoming satellite hyperspectral platforms, including PRISMA (30 m), CHIME (20–30 m), and SBG (30 m), have or will have comparable spectral resolution, higher temporal revisits, and much greater geographic coverage. The ability to conduct this type of analysis using remote sensing measurements at 30 m resolution suggests that our method can be applied to these emerging spaceborne imaging spectroscopy data to reveal important differences in structure and function across the world’s tropical forests.

Nested functional forest types revealed

To test our first hypothesis, rather than making an a priori decision about the number of k-means clusters (k), we explored the capacity of remotely sensed data to reveal ecologically relevant variation in forest types. Baldeck and Asner took a similar unsupervised approach to estimating beta diversity in South Africa34. Because the choice of k directly influences analysis outcomes, careful selection of k is required. Different approaches for identifying the number of clusters, using the Gapk and Wk elbow metrics35, yielded varying optimal numbers of clusters for the Sepilok and Danum landscapes (Fig. 1, Supplementary Figs. 4 and 5). However, at both sites, a comparison of results based on different values of k revealed ecologically meaningful structural and functional differences and graduated transitions between forest types (Fig. 2, Supplementary Figs. 7 and 8), indicating that the exploration of traits that aggregate or separate forest types as k changes is a valuable exercise. Overlap between the remotely sensed forest type boundaries and inventory plots within distinct forest types indicates that the series of clustered forests aligns closely with forest types defined based on in situ data on species composition and ecosystem structure. Exploring varying numbers of clusters thus yielded valuable insights into the biologically meaningful categorization of forest types. Extending this method to other parts of the tropics will require similar decision-making, through either user input or the development of robust automated algorithms for selecting k.

Forest types capture differences in ecosystem dynamics

We further evaluated the canopy traits and structural attributes that were most critical for mapping distinct forest types, hypothesizing that mapped forest types exhibit distinct ecosystem function. Forest types revealed by the cluster analyses were distributed along the leaf economic spectrum, which characterizes a tradeoff in plant growth strategies36. LMA, which can covary strongly with leaf N and P, is a key indicator of plant growth strategies along the spectrum37. At the slow-return end of the leaf economics spectrum, plants in nutrient-poor conditions with low leaf nutrient concentrations invest in leaf structure and defense, expressed as high LMA, favoring longer-lived, tougher leaves with slower decomposition rates. This strategy comes at the cost of slower growth. At the quick-return end of the spectrum, plants in nutrient-rich environments with higher leaf nutrient concentrations invest less in structure and defense, enabling faster growth and more rapid leaf turnover, i.e., shorter leaf lifespans. This quick-return growth strategy supports higher photosynthetic rates and more rapid carbon gain36.

In this study, the principal components and clustering results yielded forest types that are indicative of community-level differences associated with leaf economic spectrum differences. The nutrient-rich sites (Danum 1 and Danum 2, Supplementary Fig. 8) show high canopy N and P and low LMA compared to the nutrient-poor and acidic sites (sandstone and kerangas), which contributes to higher leaf photosynthetic capacity (Vcmax) and growth (Fig. 4b). Foliar N:P also increased with site fertility, consistent with evidence that tropical forests are primarily limited by phosphorus, and not nitrogen38,39, with large implications for carbon sequestration in these forests. Orthogonal differences in canopy structure and architecture between Danum forest types and Sepilok sandstone and alluvial forests could be indicative of ecosystem-scale differences in the sensitivity of these forests to endogenous disturbance processes40.

The significant differences in aboveground carbon stocks and growth and mortality rates between forest types further suggest strong differences in ecosystem dynamics. In general, growth rates varied inversely with aboveground carbon, and higher aboveground carbon corresponded to lower mortality rates. As an example, the Sepilok sandstone forests, which are composed largely of slow-growing dipterocarp species29,33, had the highest median aboveground carbon (236 Mg C ha−1), with lower canopy P and N, and higher LMA. The taller canopy and low canopy leaf nutrient concentrations are consistent with the low growth and mortality rates found in the sandstone forest, indicating a slow-growth strategy yielding larger trees and higher aboveground carbon stocks. In contrast, alluvial forests exhibit high turnover, with higher mortality and growth rates than sandstone forests, corresponding to lower aboveground carbon on average. Kerangas forests exhibited low aboveground carbon despite an intermediate plot-level growth rate, and mortality rates that were significantly lower than in the Danum or alluvial forest types. Kerangas forests, which were characterized by the highest LMA, lowest foliar P and N (Fig. 2a), and the lowest plot-level aboveground carbon density (186 Mg C ha−1; Fig. 4a), are known to have higher stem densities, lower canopy heights, and long-lived leaves5,32,41, suggesting well-developed strategies for nutrient retention42. Interestingly, despite significantly different aboveground carbon and demography, the kerangas and sandstone forests did not differ in LAI or canopy architecture (P:H), although maximum height, Cover20, and Hpeak LAI were significantly higher in the sandstone forest, highlighting the need to account for differences beyond LAI when scaling processes from leaves to ecosystems.

In addition, when three forest types were distinguished at Sepilok, the alluvial inventory plot had significantly higher aboveground carbon than the remote sensing-derived alluvial forest extent (Fig. 4a, p < 0.001). Only when the mudstone and alluvial forests were differentiated (k = 4) did the inventory plot and clustered alluvial forest areas exhibit similar aboveground carbon distributions, with significantly lower carbon in the mudstone forest. Although Sepilok mudstone and alluvial forests are often characterized as a single forest type5,43, independent research first identified mudstone hills as unique based on differences in soil cation exchange capacity, pH, and nutrient concentrations that translated into intermediate plant growth rates in mudstone forests44. More recently, higher clay fractions and higher exchangeable Mg, Ca, and K were found at varying soil depths in Sepilok mudstone forests compared to alluvial forests, although alluvial forests exhibited higher foliar N, P, K, and Mg concentrations compared to mudstone forests30. Our remote sensing findings independently support the uniqueness of mudstone forests based on both leaf traits and structural attributes (Figs. 2 and 3). The lower aboveground carbon in the mudstone forest may be due to lower leaf nutrient concentrations and higher soil acidity, as well as differences in hydrology associated with seedling and sapling responses to flooding that influence the species assembly45,46. Because the mudstone forests in Sepilok are also closer to anthropogenic forest edges than alluvial forests, edge effects, which have been shown to significantly increase large tree mortality and lower aboveground carbon, may also be a factor47,48,49.

At Danum, our results indicate that the region comprises one to three forest types that differ in LMA, foliar N and P, canopy height, and vertical structure (Figs. 1 and 2; Supplementary Figs. 7 and 8). Our finding that two of these forest types (Danum 1 and 2) occur within the 50 ha Smithsonian ForestGEO inventory plot aligns with recent, independent research. Differences in species composition and soil characteristics have been identified between the northeast corner and the remainder of the 50 ha plot30. A recent study also identified the northeast corner (Danum 1) as having lower species richness, diversity, stem density, and basal area compared to the rest of the plot (Danum 2), linked to less acidic soils with a higher cation exchange capacity and higher Ca, Mg, and Ni content31.

Implications for modeling tropical forest biomes

We did not find significant variation in total ecosystem LAI across forest types in this study. In contrast, vertical variation in structure was more strongly linked to differences between functionally distinct forest types. LAI is considered one of the most important ecophysiological attributes of vegetation, and is widely used in terrestrial ecosystem and biosphere models to upscale estimates of leaf-level processes to ecosystems and to model land–atmosphere interactions26,27. LAI varies significantly among the world’s biomes50. Within tropical forests, previous studies have shown that variation in LAI is correlated with maximum water deficit, minimum temperature, and forest protection status51. However, as shown here, LAI does not vary significantly across the different lowland forest types (alluvial, sandstone, mudstone, and kerangas) found at Sepilok and Danum. Average canopy LAI (estimated from the airborne LiDAR data using the method described in ref. 52) was similar across forest types, ranging from 5 to 6. A recent study argued that total LAI may not be directly relevant for many processes in ecosystems (e.g., productivity) beyond a value of three53.

However, our finding of significant variation in vertical LAI profiles, despite no substantial variation in average ecosystem LAI, provides important evidence that vertical foliar distributions may be more important than the absolute amount of leaf area for characterizing differences across ecosystems, underscoring the importance of evaluating additional LiDAR-derived metrics and leaf traits. In particular, the vertical distribution of leaf area is important for many canopy processes since the total amount of leaf surface area and its vertical organization can vary independently54,55. Although terrestrial biosphere models differ in their representation of vertical forest strata, there has been a growing effort to incorporate vertical variation more directly in many models56. Several recent model developments are at the cutting edge of representing vertical variation: they implement vertical gradients of irradiance, water content, and leaf temperature in ways that better enable models to capture differences in function within and between forest ecosystems57,58,59.

In conclusion, we explored the ability to map forest types at the spatial resolution of forthcoming hyperspectral satellite sensors and evaluated the ability of those forest types to capture differences in three aspects of ecosystem dynamics: aboveground biomass stocks, growth rates, and mortality rates. It was beyond the scope of this study to conduct ground validation of the entire extent of the mapped forest types. An important next step for further research therefore entails a more detailed analysis of differences in forest composition, structure, and ecosystem dynamics at the larger landscape scale, beyond the inventory plots evaluated in this study. Establishing 1 ha plots and repeatedly censusing them across the extent of each mapped forest type would yield more robust accuracy assessments of the forest type maps reported here and will be critical for disentangling the mechanisms and processes underpinning differences in structure and function. Despite this need for further in situ analyses, this study provides the first step toward reliably characterizing differences in forest types over large areas, where forest inventory plots are not readily available. This type of mapping alone offers invaluable insights into differences across vast areas of tropical forest that are otherwise often characterized as a single biome in ecosystem models. The ability to capture this variation within the globally critical tropical forest biome sets a new bar for terrestrial biosphere modeling. Results from this type of analysis can be used to parameterize and benchmark earth system models, further constraining high uncertainty in the future dynamics of these ecosystems.

Since the main axes of variation in canopy properties correspond to quantities measurable from spaceborne LiDAR and imaging spectroscopy, our approach offers a framework for large-scale mapping of functionally distinct forest types that can be employed across highly diverse tropical forest ecosystems at regional and global scales. Canopy leaf phosphorus (P) and leaf mass per unit area (LMA) were critical for distinguishing between forest types and will thus be essential to accurately map from spaceborne sensors for ecological applications. Maximum canopy height, and the fraction of canopy cover taller than 20 m, were important for distinguishing forest types, although variation in structure alone was insufficient to capture differences. These types of analyses at pan-tropical scales will be invaluable for improving understanding of ecosystem variation across a biome that is both incredibly threatened by and critical to mitigating global climate change. Our findings also underscore synergies between ground-based and remote-sensing ecological analyses, whereby landscape-scale remote surveys can efficiently pinpoint locations that can be targeted as high priority for discovery-oriented fieldwork and ground-based measurements.

Methods

Study landscapes

The study landscapes are in Sabah, Malaysian Borneo, encompassing forests in Danum Valley with the tallest trees in the tropics60, and nutrient-poor kerangas forests with stunted canopies and unique floristic composition61. The first landscape is Sepilok, a 4500 ha reserve of lowland mixed dipterocarp forests spanning varying topography and soil nutrients5,32,41,44. The second landscape is Danum, a 44,000 ha conservation area with predominantly lowland, intact tropical rain forest. In this study, we focus on the 50 ha ForestGEO inventory plot located in the eastern part of Danum.

Both landscapes exhibit differences in structure, function, and composition that correspond to underlying soil and geologic substrate5,33,41,43,44. Sepilok is characterized by three forest types: alluvial forests on fertile ultisols along alluvial flats and gentle slopes; sandstone forests on well-drained, nutrient-poor ultisols along steep ridges; and kerangas forests that dominate acidic, extremely nutrient-poor podosols along lower dip slopes of cuesta landforms33,62. Total phosphorus, nitrate, and base cations are significantly higher in alluvial soils than in the sandstone and more acidic kerangas forest soils, influencing forest type differences in species composition, leaf traits, and stand structure29,32. An earlier field study also identified mudstone hills within the alluvial forests as being further distinguishable in terms of soil chemistry and plant growth44, although mudstone and alluvial areas in Sepilok are typically characterized as a single forest type5,43.

Airborne remote sensing data

To measure forest structure and foliar traits, we used co-aligned LiDAR and imaging spectroscopy data collected by the Global Airborne Observatory in April 2016. The data and their processing are described in detail in the Supplementary Methods and in ref. 22. We examined ten forest structure variables and canopy foliar characteristics that are strongly linked to ecosystem function and have demonstrated measurability with high accuracy using airborne remote-sensing techniques (Supplementary Tables 1 and 2, Supplementary Figs. 1 and 2). Variation in canopy structure was characterized using five metrics: 99th percentile of total canopy height (Max H, m), LAI (LAI, m2 m−2), the peak height of LAI (Hpeak LAI, m), a measure of canopy architecture indicating the vertical distribution of plant foliage (P) relative to the total canopy height (P:H ratio), and the fraction of canopy cover taller than 20 m height above the ground (Cover20, %). Variation in canopy leaf traits was analyzed based on differences in leaf mass per area (LMA, g DM m−2), foliar nitrogen (N, %) and phosphorus (P, %) concentrations, and foliar N:P ratios. To assess differences in maximum photosynthetic capacity, Vcmax was estimated from remotely sensed maps of leaf N and P using the equation in Table 3, model 1 of ref. 25. Estimated Vcmax was not validated with in situ ecophysiology measurements and is thus simply meant to provide an additional axis of functional variation in this study. To examine the feasibility of conducting these analyses at coarser resolutions, we resampled data and ran analyses at resolutions ranging from 16 m2 to 40,000 m2.
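
As an illustration of how two of the structural metrics could be recomputed at coarser pixel sizes, the following is a minimal Python sketch, not the processing chain used in this study (which is described in the Supplementary Methods). It assumes a hypothetical gridded canopy height raster `chm` in metres and derives Max H and Cover20 for each coarse pixel. The function name `aggregate_canopy_metrics` and the parameter `factor` are introduced here for illustration only.

```python
import numpy as np

def aggregate_canopy_metrics(chm, factor):
    """Aggregate a fine-resolution canopy height raster into coarser pixels.

    chm    : 2-D array of canopy heights in metres (hypothetical input)
    factor : number of fine pixels per coarse pixel along each axis
    Returns (max_h, cover20), one value per coarse pixel.
    """
    rows, cols = chm.shape
    # Trim edges so the raster divides evenly into factor x factor blocks.
    rows -= rows % factor
    cols -= cols % factor
    blocks = chm[:rows, :cols].reshape(rows // factor, factor,
                                       cols // factor, factor)
    # Flatten each block: shape becomes (n_row_blocks, n_col_blocks, factor**2).
    blocks = blocks.transpose(0, 2, 1, 3).reshape(rows // factor,
                                                  cols // factor, -1)
    max_h = np.percentile(blocks, 99, axis=-1)        # 99th percentile height (m)
    cover20 = 100.0 * (blocks > 20.0).mean(axis=-1)   # % of block taller than 20 m
    return max_h, cover20
```

If the pixels are square, the stated range of 16 m2 to 40,000 m2 corresponds to pixel sides of 4 m to 200 m, so a 4 m raster would be aggregated with `factor=50` to reach the coarsest resolution.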

Characterizing functionally distinct forests

We mapped forest types across Sepilok and Danum. At Danum, we restricted our analysis to the 50 ha ForestGEO plot location and a 1 km buffer around the plot. To characterize functional and structural diversity across all pixels, we (1) conducted a PCA to reduce dimensionality of all ten canopy leaf traits and structural attributes (hereafter canopy properties), and (2) ran a k-means cluster analysis28 on the first two principal components to categorize pixels into functionally distinct forest types. PCA and k-means cluster analysis data processing is described in Supplementary Methods.
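
The two-step workflow above (PCA followed by k-means on the first two principal components) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in this study: it assumes the canopy properties are z-scored before the PCA, uses a plain Lloyd's algorithm with random initialization, and takes a hypothetical pixel-by-property matrix `X` as input. The function names are introduced here for illustration only.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Scores of each pixel on the leading principal components.

    X : (n_pixels, n_properties) array; columns are z-scored first (assumption).
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes of the standardized data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:n_components].T

def nearest_center(points, centers):
    """Index of the closest center for every point."""
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(points, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest_center(points, centers)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return nearest_center(points, centers), centers
```

Calling `kmeans(pca_scores(X), k)` assigns each pixel to one of `k` clusters. In practice k-means is usually run from several random starts, because a single start can converge to a poor local optimum.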

The primary metric for identifying the appropriate number of clusters (k) was the gap statistic (Gapk), which defines the number of clusters based on the first local and global maxima35. We also evaluated output for k = ks + 1 and for k = ks – 1, where ks represents the number of clusters selected using Gapk. Two secondary cluster metrics were also considered: (1) the elbow approach using the within-group sum of squares (Wk), and (2) the between-cluster sum of squares (BSS) divided by the total sum of squares (TSS). A higher value of BSS/TSS indicates improved fit of the cluster analysis to the data63. Because BSS/TSS increases monotonically as k increases, we evaluated the k at which increases in BSS/TSS flattened, in addition to Gapk and the Wk elbow approach35.
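
The two secondary metrics can be computed directly from a set of cluster assignments. The sketch below is illustrative only and assumes hypothetical inputs `points` (pixel scores) and `labels` (cluster assignments). It relies on the identity TSS = Wk + BSS for sums of squares around cluster means, and it does not implement the gap statistic, which additionally requires clustering reference datasets.

```python
import numpy as np

def cluster_fit_metrics(points, labels):
    """Return (W_k, BSS/TSS) for one clustering of `points`."""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    grand_mean = points.mean(axis=0)
    tss = ((points - grand_mean) ** 2).sum()    # total sum of squares
    wss = 0.0
    for lab in np.unique(labels):
        members = points[labels == lab]
        # Squared distances of cluster members to their own cluster mean.
        wss += ((members - members.mean(axis=0)) ** 2).sum()
    bss = tss - wss                             # between-cluster sum of squares
    return wss, bss / tss
```

Evaluating this function for clusterings with increasing k gives the Wk curve used for the elbow approach and the BSS/TSS curve whose flattening is assessed.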

We visually evaluated cluster results against inventory plot data from forest ecosystems that have been studied extensively in the field and exhibit clear differences in structure and function. Significant differences in canopy properties between clusters were calculated based on one-way ANOVAs using the aov and TukeyHSD functions from the stats base package in R64. To explore the minimum number of canopy properties required to capture differences in forest types, we evaluated cluster results using only LiDAR variables (structural attributes), only imaging spectroscopy variables (leaf traits), and reduced combinations of canopy properties. To evaluate these reduced models, we calculated overall accuracy as the proportion of pixels mapped the same as the full 10-variable model.
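
The overall accuracy of a reduced model relative to the full model can be sketched as below. Because k-means cluster numbers are arbitrary, this example assumes that the reduced model's clusters are first relabeled to best match the full model; that matching step is an assumption of this sketch, as the text does not state how clusters were paired. The function name `map_agreement` is hypothetical.

```python
from itertools import permutations

import numpy as np

def map_agreement(full_labels, reduced_labels, k):
    """Fraction of pixels given the same forest type by two clusterings,
    after relabeling the reduced model's clusters to best match the full model.

    Labels are integers in 0..k-1. All k! relabelings are tried, which is
    only practical for small k.
    """
    full_labels = np.asarray(full_labels)
    reduced_labels = np.asarray(reduced_labels)
    best = 0.0
    for perm in permutations(range(k)):
        relabeled = np.array(perm)[reduced_labels]
        best = max(best, float((relabeled == full_labels).mean()))
    return best
```

For the small numbers of forest types considered here, the exhaustive search over relabelings is inexpensive. For larger k, an assignment algorithm on the confusion matrix would be a more efficient choice.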

Inventory plot data

To evaluate cluster analysis performance, we compared our forest type maps to inventory plot data at Danum and Sepilok. Our plot dataset consisted of nine existing 4 ha forest inventory plots distributed across alluvial (n = 3), sandstone (n = 3), and kerangas (n = 3) forests at Sepilok, and one 50 ha plot at Danum. Data from the nine 4 ha Sepilok plots and the Danum 50 ha plot were from the ForestPlots.net online repository65 and the ForestGEO online repository66, respectively. The datasets include stem diameter measurements and taxonomic identification to species level for every tree ≥1 cm and ≥5 cm in diameter in the ForestGEO and ForestPlots.net plots, respectively. Census years from each plot were as follows: alluvial—2001, 2009, 2014; sandstone—2001/03, 2008/09, 2013/14; kerangas—2001, 2008/10, 2014/15; Danum—2011/15, 2019. The Global Airborne Observatory campaign in Sabah was conducted in 2016.

Observed differences in ecosystem properties

In lieu of direct measurements of ecosystem function at the study locations (e.g., net primary productivity), we quantified differences in three related ecosystem properties: aboveground carbon, growth, and mortality. We compared stand-level growth and mortality rates calculated from forest inventory data and remotely sensed estimates of aboveground carbon density (ACD, Mg C ha−1), which includes stem, branch, and leaf biomass, at plot locations within the inventory plots. We also compared ACD across all mapped pixels within each forest type to examine differences in aboveground carbon beyond the plots. ACD at 30 m resolution was estimated from the Global Airborne Observatory top-of-canopy height (TCH) and Cover20 data following ref. 67, as described in ref. 68. The method involves estimating ACD from a network of 0.25 to 1 ha field plots using the BIOMASS workflow in conjunction with the pan-tropical biomass allometry69,70. Equations from ref. 71, modified by ref. 68, were used to estimate ACD from the TCH data, and were further modified based on ref. 67 to incorporate Cover20 as a proxy for stand-level basal area. Annual relative DBH growth rates and annual mortality rates were calculated from plot data (stems ≥ 10 cm) following ref. 72. When calculating growth rates, we excluded trees with broken or resprouted stems, stems that grew >7.5 cm yr−1, and stems that shrank by >25% of their initial DBH, following ref. 72. Negative growth rates corresponding to shrinkage of less than 25% of initial DBH were set to zero.
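
The growth-rate screening rules can be expressed compactly in code. The sketch below is a hedged illustration, not the procedure of ref. 72: the exact rate formulas are not given in the text, so it assumes relative growth is the annual DBH increment divided by initial DBH and that mortality follows the commonly used exponential form (ln N0 − ln S) / t. Both formulas and all function and parameter names are assumptions, and the exclusion of broken or resprouted stems is left out because it depends on field codes.

```python
import numpy as np

def annual_growth(dbh1, dbh2, years):
    """Annual relative DBH growth per stem; NaN marks excluded stems.

    dbh1, dbh2 : initial and final DBH in cm
    years      : census interval in years
    Assumed definition: (dbh2 - dbh1) / years / dbh1.
    """
    dbh1 = np.asarray(dbh1, dtype=float)
    dbh2 = np.asarray(dbh2, dtype=float)
    abs_growth = (dbh2 - dbh1) / years            # cm per year
    rel_growth = abs_growth / dbh1                # fraction of initial DBH per year
    too_fast = abs_growth > 7.5                   # grew more than 7.5 cm per year
    shrank_too_much = (dbh1 - dbh2) > 0.25 * dbh1 # lost more than 25% of initial DBH
    rel_growth = np.where(too_fast | shrank_too_much, np.nan, rel_growth)
    # Remaining negative rates are set to zero; NaN entries stay NaN.
    return np.where(rel_growth < 0, 0.0, rel_growth)

def annual_mortality(n_start, n_survivors, years):
    """Assumed exponential mortality rate: (ln N0 - ln S) / t."""
    return (np.log(n_start) - np.log(n_survivors)) / years
```

Stand-level growth could then be summarized with `np.nanmean` over the stems in each plot, after restricting the input to stems with DBH ≥ 10 cm.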