Introduction

Biomes are biological communities sharing a climate. Ecotones are their transition zones. It is accepted that anthropogenic climate change has already affected the global distribution of biomes, disrupting ecological functions and causing biodiversity loss, which has resulted in more vulnerable environments1,2,3. Changes in ecotones have been more severe because these regions are not simple transitional zones but have unique ecological characteristics defined by the interaction of neighboring biomes4,5,6. A particular combination of habitats tends to create new ecological niches, which has increased biodiversity through evolution.

Biomes and ecotones are also considered areas of primary interest for climate change studies. They are especially sensitive to fluctuations, thus acting as an early warning about impacts of anthropogenic climate change7,8,9. The process can be explained in purely mechanistic terms since biomes and ecotones account for the fluxes of matter and energy that affect the biota (e.g., through nutrient cycling). The impact of climate change is particularly acute in ecotones, where species are pushed to their limits in a context of unstable equilibrium5.

The climate is the main factor that shapes the distribution of biomes10. Soil, latitude, anthropic pressure and existing flora are secondary factors11,12,13. There is agreement that biomes can be characterized by a few abiotic factors such as temperature and rainfall14,15,16 and that these ecological units are a robust dimensional reduction of complex plant-specific physiological thresholds of heat and water demand17,18,19. In particular, Holdridge’s life zones (HLZ) scheme provides a comprehensive classification system to describe both life zones and ecotones from those environmental factors. The latter feature is not present in the original formulation of the method but is an enhancement obtained through a minor adaptation20,21 (Fig. 1a).

Fig. 1: Holdridge life zones classification scheme and class agreement for CMIP6 models and reference dataset.

a The HLZ system shows the location of each life zone according to biotemperature, potential evapotranspiration ratio, and annual precipitation; the original 38 zones were aggregated into 13 core zones and 27 ecotones (TLZs) following Monserud & Leemans80. b Class agreement chart: the upper-left triangle shows the Kappa coefficient matrix (k; 1 perfect agreement; 0 no agreement) and the bottom-right triangle displays the correlation ellipsoids between classes. The bars at the top of the figure are the k and R2 coefficients of each model when compared with the reference dataset (CRU).
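As a minimal sketch of how the three HLZ axes in Fig. 1a can be derived from monthly climatologies, the Python snippet below computes biotemperature, annual precipitation and the PET ratio. It assumes the commonly cited Holdridge conventions (monthly temperatures below 0 °C or above 30 °C count as 0 °C, and PET = 58.93 × biotemperature in mm yr−1), which are not spelled out in this article; the function name and inputs are illustrative only.

```python
import numpy as np

def holdridge_axes(tas_monthly_c, pr_monthly_mm):
    """Return (biotemperature [degC], annual precipitation [mm], PET ratio).

    Assumptions (not taken from the article): months outside 0-30 degC
    contribute 0 degC to biotemperature, and PET = 58.93 * biotemperature.
    Inputs have the 12 months along the first axis.
    """
    tas = np.asarray(tas_monthly_c, dtype=float)
    pr = np.asarray(pr_monthly_mm, dtype=float)
    # Months too cold or too hot for plant growth count as 0 degC.
    effective = np.where((tas < 0.0) | (tas > 30.0), 0.0, tas)
    biotemperature = effective.mean(axis=0)
    annual_pr = pr.sum(axis=0)
    pet = 58.93 * biotemperature
    # Rainless cells get an infinite PET ratio instead of a division error.
    with np.errstate(divide="ignore", invalid="ignore"):
        pet_ratio = np.where(annual_pr > 0, pet / annual_pr, np.inf)
    return biotemperature, annual_pr, pet_ratio
```

The three returned quantities would then be compared against the class boundaries of the chosen HLZ variant to assign a life zone or ecotone to each grid cell.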

Biome classification using Holdridge’s system can be carried out with climate model outputs. Current Earth System Models (ESMs) have evolved beyond Global Climate Models (GCMs) and now include the main physical and biogeochemical processes of the Earth22. Such enhancements build confidence that these models are better able to account for biological factors, and in particular the distribution of life on Earth. The combination of ESM climate outputs with classifications such as the HLZ scheme defines climate envelopes in an objective way.

The ability of any method to characterize biomes and ecotones can be evaluated by comparing a present-climate classification from ESM outputs with an actual classification using measurements (ground observations or satellite data). This is a necessary (but not sufficient) condition to gauge ESM capabilities, and also a means to identify those areas where most models agree and where models strongly disagree. But perhaps more importantly, such validation helps identify potential shortcomings in the modeling, thus informing on the limitations and uncertainties in the predicted changes in biomes and ecotones. Indeed, guidance on the error source benefits not only climate modelers but also life scientists, who make use of model output for their own research interests. Thus, a comprehensive understanding of errors helps to develop a more meaningful analysis of the consequences of climate change23.

Results

Biome distribution in present climate

We first assess how well Coupled Model Intercomparison Project phase 6 (CMIP6) models characterize current biomes. Thus we compared the model-derived classification with the observations from the Climate Research Unit (CRU) Time Series (TS) dataset (CRU, Fig. 1b). For the present climate, an overall moderate consensus appears (\(0.4\le {\rm{\kappa }}\le 0.7\)). Regarding precipitation, we found that a good performance in the representation of the field (R2 ≥ 0.7) is essential to obtain acceptable kappa coefficients, although there are exceptions, namely CESM2, MIROC6 and CIESM. In these, good scores for precipitation estimates are not translated into better representation of HLZs. Only three models—HadGEM-GC31-MM, EC-Earth-Veg and UKESM1-0-LL—achieved the best scores in both R2 and kappa, but the six ensembles do agree. Models reproduce the spatial distribution of major biomes (Fig. 2a–c) but discrepancies are indeed found around the limits of life zones where large inter-model variability is observed, i.e., the ecotones (see supplementary material for a panel plot of individual models, Supplementary Fig. 1).
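For readers unfamiliar with the agreement metric, the following is a minimal sketch of Cohen's kappa between a reference class map and a model class map. It is unweighted (every grid cell counts equally), which is a simplifying assumption; whether the scores in Fig. 1b are area-weighted is not stated here. The function name is illustrative.

```python
import numpy as np

def cohen_kappa(reference, model):
    """Unweighted Cohen's kappa between two categorical maps of equal shape."""
    ref = np.asarray(reference).ravel()
    mod = np.asarray(model).ravel()
    classes = np.union1d(ref, mod)
    # Observed agreement: fraction of cells assigned the same class.
    p_observed = np.mean(ref == mod)
    # Agreement expected by chance from the marginal class frequencies.
    p_chance = sum(np.mean(ref == c) * np.mean(mod == c) for c in classes)
    if p_chance == 1.0:
        return 1.0  # both maps contain one and the same class everywhere
    return (p_observed - p_chance) / (1.0 - p_chance)
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, matching the scale used in Fig. 1b.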

Fig. 2: Global distribution of life zones, ecotones, and consensus map for present climate (1980–2014).

Global maps of life zones for CRU dataset (a), multi-model ensemble mean of fifty-two members (b), and ensemble mean of top ten rated climate models (c). d Consensus map of life zones based on top ten CMIP6 models (T10). Blues are regions where most models agree with the reference dataset (≥80%), yellows are regions with moderate consensus (21–79%), and reds are regions with low consensus (≤20%). Stripes are regions with large inter-model variability (half of the fifty-two models disagree on the type of life zone). The treemap shows the percentage of the area covered by each category. The bar plot shows the percentage of the total area with large inter-model variability. e, f Deconstruction of errors in the multi-model ensemble mean and T10 ensemble mean. Yellow, light yellow and dark yellow are drier, colder and drier, and warmer and drier conditions than reference data. Blue, dark blue and light blue are wetter, warmer and wetter, and colder and wetter conditions. The agreement between modeled and observed life zones is shown in gray (100% agreement). Double donut plots show the proportion of global agreement/disagreement. g, h Global distribution maps of ecotones for multi-model ensemble mean (g) and T10 ensemble mean (h). The stippling shows the distribution of ecotones as depicted by CRU. Inset stem plots compare the observed (black) and modeled (red) area covered by each of the twenty-seven ecotones (TLZs). Bar plots show the percentage of the total area covered by ecotones.

Dissent is found in areas with complex orography such as the Rocky Mountains, the Andes, and the Himalayas. There, models have difficulties in reproducing the heterogeneous distribution of biomes. The ‘Cold Parklands’ category is misclassified in most models, confirming the limitations of ESMs in classifying climate types when extreme values are involved24. Almost no model can properly characterize the extremely humid tropical rainforest in the Amazonia or the extremely hot desert in central Australia.

Most of the model-observation biases are found in regions with moderate-to-large inter-model variability, as depicted by the stripes in Fig. 2d. The lack of consensus over these areas means that some key processes are not well modeled yet. We point below to those related to water cycle modeling (Fig. 2e, f; see Table 1 for a description of the methods).

Table 1 Criteria for deconstructing model errors in Holdridge’s Life Zones scheme.

In spite of biases, the top-10 models and the reference data reach consensus (≥80% of models agree) over more than 51% of the total area. The consensus area covers a wide range of habitats, including hot deserts such as the Sahara and the Arabian deserts, rainforests of South-east Asia, and cool and temperate forests of western Europe and the eastern US.
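The consensus categories of Fig. 2d can be sketched as follows, using the thresholds given in the caption (≥80% high, 21–79% moderate, ≤20% low). The array layout and the function name are assumptions made for illustration, and the sketch ignores area weighting.

```python
import numpy as np

def consensus_categories(model_classes, reference_classes):
    """Classify each cell by the fraction of models matching the reference.

    model_classes: array (n_models, n_cells) of life-zone labels.
    reference_classes: array (n_cells,) of reference labels (e.g. from CRU).
    Returns the agreement fraction and a label per cell.
    """
    models = np.asarray(model_classes)
    ref = np.asarray(reference_classes)
    # Fraction of models that assign the same class as the reference.
    agreement = (models == ref).mean(axis=0)
    labels = np.full(agreement.shape, "moderate", dtype=object)
    labels[agreement >= 0.80] = "high"
    labels[agreement <= 0.20] = "low"
    return agreement, labels
```

Summing the (area-weighted) cells labeled "high" would give the share of the land surface where the ensemble and the reference agree.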

Ecotones in present climate

As transition areas, ecotones are more challenging than biomes. A precise characterization of ecotones for the present climate is however essential to use model outputs as a guide for future climate shifts, especially when the projected changes are expected to be more pronounced in transitional areas25,26,27.

We compared the models with CRU data and found that the individual models identify the major ecotones. However, they cannot capture the transition zones of some of them, such as Cold Parklands (TLZs 3, 4, 5, 7 and 8). The problem is evident in the CP-FT-BF ecotone (TLZ 5), which is absent in most models (see Supplementary Fig. 3 for individual models). This is a consequence of the wet bias, which is amplified by the log-scaling, where minor changes in the lower bounds produce large impacts. The southward displacement of TLZs 21 and 23 is another important error source, as illustrated in Fig. 2g, h. This pattern is observed in southern Africa, where the wet bias causes the misrepresentation of the transition between tropical dry forest, savanna, and the Afromontane forest.

Regarding the dry bias over land, this is a known shortcoming of models28,29,30. The causes are a matter of discussion31,32,33 but the feature is not controversial and is accounted for when model data are used in the biological realm34,35. Here, we found that the dry bias features in the HLZ classification in several places, such as an increase in the eastward expansion of tropical semiarid-tropical dry ecotone in India (TLZ 22).

We have found that models overestimate the global extension of ecotones by 2%. The main cause is the artificial expansion of the transitional zone between tropical rainforest and tropical dry forest (TLZ 24). An example of this process can be seen in the Amazonia. In that region, the ensemble mean underestimates annual precipitation, causing a dry bias that expands the tropical rain-dry ecotone. The effect is mitigated in the top-10 ensemble (a subset of the ensemble) thanks to its better estimation of precipitation, as shown in Supplementary Fig. 5.

Indeed, models with better precipitation estimates produce more credible biomes and ecotones. Model ensembles benefit greatly from these good models but cannot completely compensate for those with unrealistic precipitation estimates. Overall, the multi-model ensemble (MME, 52 members) correctly features the continuous bands of forest tundra-taiga transition (TLZ 6) as well as steppe-cool forest-boreal forest ecotones (TLZs 16 and 11) but barely captures the extension and the precise location of these TLZs. The MME estimates colder temperatures in northern Eurasia and shifts TLZ 6 and TLZ 11 to the south, while the wet bias increases the forest-steppe ecotone in the central US (TLZ 16). The top-10 ensemble reduces some of those biases but location problems remain, meaning that the problem is shared by most models. This is more clearly seen in the accuracy of rainfall estimates in the Maritime Continent, which yields a better characterization of the temperate and rain forests ecotone (TLZ 25). This contrasts with the wet bias observed in central Australia, which harms the representation of transitional areas between hot desert, tropical semiarid and the Kwongan, a chaparral-like ecoregion (TLZs 19 and 21).

Deconstructing biases

Because Holdridge’s classes are calculated from monthly climatologies of temperature and precipitation, we can decompose the specific biases that are responsible for errors in the representation of biomes and ecotones (Fig. 2e, f). We found that both ensembles have similar results, but the top-10 ensemble obtains slightly better scores thanks to bias reduction in the Amazonia and central Australia. The advantage is only apparent because an ensemble with few members minimizes the inter-model variability, a crucial problem in those regions36,37,38. Moreover, the top-10 ensemble intensifies the bias in key transitional zones for future climate, such as steppe-cool forest (TLZ 10) and temperate forest-tropical dry forest ecotones (TLZ 23).
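The kind of decomposition described above can be illustrated with a simple sign-based labeling of the temperature and precipitation biases in a grid cell. This is only a sketch: the actual criteria are those of Table 1, which is not reproduced here, and the tolerances, the extra "warmer"/"colder" labels and the function name below are hypothetical placeholders.

```python
def label_bias(d_temp, d_precip_rel, temp_tol=0.5, precip_tol=0.05):
    """Label one grid cell's bias relative to the reference data.

    d_temp: model minus reference biotemperature (degC).
    d_precip_rel: (model - reference) / reference annual precipitation.
    temp_tol, precip_tol: hypothetical tolerances, NOT the Table 1 criteria.
    """
    if d_precip_rel > precip_tol:
        moisture = "wetter"
    elif d_precip_rel < -precip_tol:
        moisture = "drier"
    else:
        moisture = ""
    if d_temp > temp_tol:
        thermal = "warmer"
    elif d_temp < -temp_tol:
        thermal = "colder"
    else:
        thermal = ""
    # Combine the thermal and moisture qualifiers, e.g. "warmer and drier".
    parts = [p for p in (thermal, moisture) if p]
    return " and ".join(parts) if parts else "within tolerance"
```

Applied only to cells where the modeled and observed life zones differ, such labels reproduce the legend categories of Fig. 2e, f (drier, colder and drier, warmer and drier, and their wet counterparts).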

As argued above, most of the error can be explained by deficiencies in modeled rainfall. One exception is the southward propagation of boreal forest over Eurasia, where models predict colder conditions than reference data (see Supplementary Fig. 2 for individual maps of model bias and Supplementary Fig. 4 for MAB). Precipitation plays the main role in the dry bias observed in the Caribbean, which is a common feature in all models39,40. The wet bias in the western US is also a shared problem caused by a misrepresentation of the precipitation field41. Modeling the seasonal displacement of the Intertropical Convergence Zone (ITCZ) is yet another problem. It results in a wet bias in northeastern Brazil and southern Africa in boreal winter and in the East Asian summer monsoon42,43.

Sources of error in modeled precipitation can be attributed to microphysics, convection, boundary layer and radiation parameterizations. The coarse resolution of CMIP6 models is also important for small-scale precipitation, which is usually parameterized. Models also have known problems simulating orographic effects over major mountain chains because of the complex interactions between dynamics, thermodynamics, and microphysics44. The problem is most visible in areas near arid regions, where minor changes in precipitation produce large differences between life zones.

Better precipitation estimates (rather than better temperature estimates or higher spatial resolution) translate into a better classification of biomes and ecotones in the Holdridge system. However, it is worth noting that a good precipitation score does not directly translate into a more precise class in the biome or ecotone classification, so the previous observation is not as obvious as it may seem. Classes in the Holdridge system stem from a series of thresholds, cut-offs and intervals, and there is no one-to-one relationship between the two rankings.

Future climate

Uncertainties are inherently greater in climate projections as these include scenarios that model economic and social behavior. The future unknowns involved are addressed in the community by introducing Shared Socio-Economic Pathways (SSPs)45 or scenarios.

We used the seven standard SSPs in our calculations. Regarding ecotones, both the so-called optimistic and worst-case scenarios (see Supplementary Fig. 29 for the other SSPs) show the same pattern for the direction of the changes (Fig. 3a, b). Flow diagrams for these extreme cases depict an important reduction of tundra and forest tundra life zones, a global decrease in the extent of the ecotones, and an increase of warm and tropical life zones. The major difference between scenarios is found in the intensity of these changes. For example, the MME predicts a much more intense decline of TLZs in SSP5-8.5 (3.8% of total area) than in SSP1-2.6 (1.3%).

Fig. 3: Life zone area changes from present climate (1980–2014) to future climate (2015–2100).

Flow diagrams for the low-emission scenario SSP1-2.6 (a) and high-emission scenario SSP5-8.5 (b). The twenty-seven ecotones were aggregated into one category (TLZ*) for a better visualization.
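The flows summarized in such diagrams amount to an area-weighted cross-tabulation of present and future classes. A minimal sketch follows, assuming integer class maps on a regular latitude-longitude grid, where the cosine of latitude approximates relative cell area; the grid layout and the function name are assumptions, not details taken from the article.

```python
import numpy as np

def transition_matrix(present, future, lat_deg, n_classes):
    """Percent of total area moving from present class i to future class j.

    present, future: integer class maps (n_lat, n_lon), values 0..n_classes-1.
    lat_deg: latitudes (n_lat,) of the cell centres. cos(lat) is used as a
    proxy for cell area on a regular lat-lon grid (an assumption).
    """
    present = np.asarray(present)
    future = np.asarray(future)
    weights = np.cos(np.deg2rad(np.asarray(lat_deg, dtype=float)))[:, None]
    weights = np.broadcast_to(weights, present.shape)
    flows = np.zeros((n_classes, n_classes))
    # Accumulate the area of every (present, future) class pair.
    np.add.at(flows, (present.ravel(), future.ravel()), weights.ravel())
    return 100.0 * flows / flows.sum()
```

The diagonal holds the area that keeps its class, and the row and column sums give the present and future extent of each class as a percentage of total area.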

The spatial representation of the flows (Fig. 4a, b) provides further insight into what could be expected in future climates. Thus, the severe reduction of subtropical forests (TF) in Africa clearly features in maps but is hidden in the flow diagrams due to the small change over total area and non-directional character of the flux diagrams. Maps also highlight the depletion of forest tundra in the Tibetan Plateau as well as the poleward expansion of dry climates. The northward propagation of the chaparral in Europe is another process that features in all the SSPs.

Fig. 4: Changes in the distribution of life zones and ecotones under two future climate scenarios.

Global distribution of future life zones for SSP1-2.6 (a) and SSP5-8.5 (b) scenarios. c, d Maps showing the direction of the changes in HLZ for the low-emission (c) and high-emission (d) scenarios. Yellows are shifts to drier zones (yellow: drier; light yellow: colder and drier; dark yellow: warmer and drier), blues are shifts to wetter zones (blue: wetter; light blue: colder and wetter; dark blue: warmer and wetter), and grays indicate no change. Stripes are regions with large inter-model variability. Double donut plots illustrate the proportion of each change type. e, f Maps of the distribution of ecotones under SSP1-2.6 (e) and SSP5-8.5 (f) scenarios. The stippling shows the distribution of ecotones for present climate. Inset stem plots show the change in the area covered by each TLZ. Bar plots are the percentage of the total area covered by TLZs for present (black) and future scenarios (red). The size of the ensemble is n = 36 for SSP1-2.6 and n = 37 for SSP5-8.5.

The spatial consistency of the estimated changes builds confidence in the modeling. Moreover, the maps of differences show consensus between scenarios in the identification and the location of future shifts. These changes, however, differ in their extent (Fig. 4c, d). Shifts toward warmer and drier life zones were the most frequent type of change (14% to 22%) while unchanged life zones range from 81% to 69%, depending on the SSP. In these unchanged areas, the expected variations in temperature and precipitation do not exceed the critical threshold for biome shift.

The location and extent of changes in the ecotones depend on the scenario (Fig. 4e, f). In some cases, the projections are disparate. SSP1-2.6 and SSP5-8.5 give opposite results in the transition zone between temperate and tropical forests (TLZs 24 and 25). The disagreement between the simulated changes in both scenarios is most apparent in the Amazonia and in the Maritime Continent where SSP5-8.5 reduces the extension of TLZs 24-25 in favor of tropical dry and tropical rain life zones.

Like biome shifts, most differences between scenarios appear in terms of signal strength, not in the direction of the change. Thus, the contraction of tundra-forest tundra-boreal forest ecotones (TLZs 2 and 6) is observed in all SSPs but with different magnitudes. Another shared pattern is the conversion of transitional areas between chaparral, temperate forest and tropical dry forest (TLZs 17 and 23) into new Holdridge life zones. Past TLZ 17 is now identified as chaparral while the ecotone lies further north (e.g., in the eastern US and southern Europe). The case of TLZ 23 in central and eastern Africa is paradigmatic: the transition from subtropical temperate forest to tropical dry forest in the Angolan Highlands and the Afromontane forests will be completed by 2100 and, consequently, the total area covered by the ecotone will be reduced by more than 33%. TLZ 10 will also decrease, but the impact will be higher in low-emission scenarios (SSP1-1.9 and SSP1-2.6). Future projections include the expansion of a few transitional zones, such as the thorn woodland and desert scrub ecotone (TLZ 19). The increase is more clearly observed in the ‘optimistic’ scenarios, as with TLZ 10.

Discussion

The extent and intensity of biome and ecotone shifts, defined here by changes in climatic variables, have a wide impact on several of Earth’s cycles. For example, the expected reduction of forest-based biomes (−4.5%, excluding tropical dry forest) and the expansion of grasslands-shrublands (+9.3%) are transformations that will certainly affect biogeophysical processes (through albedo and evapotranspiration)46, biogeochemical cycles (through nutrient cycling)47, and biogeographical processes48. Changes in albedo are a well-known example of how biome disruptions can have a global impact.

The reduction of forest tundra and its ecotone (TLZ 2) also modifies the albedo and increases the cooling effect of the surface, altering the global energy flux49. Similarly, the conversion of transitional zones between cool and temperate forests (TLZ 17) into chaparral modifies soil properties, diminishing its carbon storage capacity and nutrients50,51. Deficiencies in nutrients such as nitrogen and phosphorus minimize the chances of the system recovering its initial state, creating a feedback loop.

Another interesting example of how biome disruptions are widespread is found in Africa. The increase of temperature accelerates water stress that reduces transient areas between subtropical moist forest and tropical dry forest (TLZ 23) and promotes the replacement of tall, multi-stratal closed canopies by open canopies and woodlands of drought-tolerant species52. The consequences in terms of the carbon and the nitrogen cycles are well known53,54,55 (Fig. 5) and include lower carbon and nitrogen use efficiency, as well as less storage capacity.

Fig. 5: Soil Carbon and Nitrogen by Holdridge Life Zones.

Carbon Stock (a) and Total Nitrogen (b) of the topsoil (30 cm) in the Holdridge Life Zone scheme. Gridded data of carbon and nitrogen was obtained from the SoilGrids 2.0 dataset80. Light (dark) colors indicate less (more) content of carbon/nitrogen in the topsoil.

In general, the transition to drier and warmer environments that we found reduces the ability of certain biomes to act as carbon sinks due to limited photosynthetic activity. As shown in Fig. 5, there is an inverse relationship between nutrient availability and the PET ratio (PET/precipitation), but the behavior of nitrogen is complex and more sensitive to small changes in temperature and precipitation, especially in colder environments (e.g. TLZ 6).

The cool deserts category is also affected by global warming. We expect a reduction that ranges from 40% to 80% of the total area, depending on the scenario. This biome shifts toward transitional zones between hot and cool deserts (TLZs 12 and 13) and, to a lesser degree, toward chaparral and hot desert life zones. It is obvious that increased temperatures enhance evaporation, accelerate soil water consumption, and reduce subsurface water storage, which intensifies dryness. But it is less obvious, without making the actual calculations, to what extent the new warmer and drier conditions increase the exposure to wildfires and favor biological invasions, compromising the survival of native species56. The changes featured in the maps are consistent with a process in which the expansion of the desert decimates the biological communities at their fringes. Ecotones then act as a transient refuge for many species in harsh conditions. The system can enter a non-recovery state if it is severely impacted.

Future changes also affect steppe (−26.6%) and cool forest (+23.6%) areas. For steppe, shifts are towards thorn steppe and dry forest ecotones (TLZs 14-16) and its neighboring life zones (chaparral and cool forest). Cool forest expands at the expense of boreal forest and the ecotones between deciduous forest, taiga, and steppe (TLZs 10 and 11). The consequences of these shifts are different although the major driver of change is the same: rising temperature. In the first case, shifts toward chaparral reinforce the main problems of moisture-limited biomes: soil degradation, water scarcity, and more pressure on resources. Those limitations are crucial for wildlife there, as species compete for pastures and freshwater. By contrast, in temperature-limited biomes such as boreal forest, an increase in temperature may translate into higher net primary production, more nutrients, and more biodiversity. The impact of climate change on these biomes is complex because the benefits may be offset by the loss of moisture through increased evapotranspiration. These findings are important by themselves, but it should be stressed that confidence in the previous results is contingent on the models' ability to correctly depict boundaries between climates. Climate models are powerful tools for environmental applications but have limitations and known uncertainties that should be taken into account in any discussion of those potential biome shifts.

Counterintuitively, we found that uncertainties derived from socio-economic scenarios are not critical for changes in biome distribution. Trends are similar for all SSPs but they differ in the intensity and the extent of the climate shifts. On the other hand, model uncertainties play a key role in future estimates, especially over regions with large inter-model variability. The lack of consensus for present climate affects the reliability of future projections of biomes and, consequently, the projected shifts should be carefully scrutinized, always bearing in mind that they are contingent upon how well the model represents precipitation in the first place. For ecotones, scenario and model uncertainties are equally important. Many climate shifts appear over regions with large model uncertainties. The western US, the Andes, the African Great Lakes, the Gobi Desert and the Tibetan Plateau are but a few examples. Increasing the spatial resolution can partially solve some problems (e.g. the wet bias in the Tibetan Plateau and Tropical Andes) but many errors inherent to the modeling are still present57,58. Model limitations are often hidden from the environmental sciences community. Thus, for example, future biome maps show a severe reduction of the Afromontane Forest. If true, an adaptation plan would be urgent to minimize the potential loss of biodiversity59. However, we know that those expected changes are subject to large uncertainties due to limitations in the modeling of precipitation. Discrepancies are clear in the consensus map but are buried in the MME mean. Similarly, inter-model variability impacts the future distribution of prairies in North America60. The differences in precipitation percolate to the biome classification and preclude a clear answer on the expected changes. Forest tundra in the Tibetan Plateau is another climate hotspot that could lead to biome extinction if we trust the MME projections. We already know that the region is currently under high ecological risk due to rising temperatures61 but there is a lack of model consensus about how much territory will be affected because of uncertainties not only in precipitation but also in temperature. Practitioners should pay special attention to these hidden errors in order to avoid potential pitfalls that could lead to inappropriate policy decisions.

Some biases are intrinsically linked to the classification system. In Holdridge’s scheme, life zone delineations have a logarithmic scale, which means that lower climate extremes (e.g. cold deserts) are more sensitive to small deviations in climate parameters. The incorrect classification of the Patagonian Desert by most models serves as a perfect example. The logarithmic scale can also affect higher climate extremes in a different way. Annual precipitation in tropical rainforest in Africa is between 1600 and 2000 mm yr−1, slightly below the lower limit of this biome according to Holdridge’s classification scheme. For that reason, CRU and models fail to correctly represent rainforests in Africa but they do so in other rainier regions such as South America and the Maritime Continent (above 2000 mm yr−1, on average). Although relevant, these types of errors are the easiest to control through sensitivity analysis.
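The sensitivity of the logarithmic scale can be illustrated with a short sketch. It assumes the standard Holdridge precipitation boundaries, which double from 62.5 mm yr−1 upward (62.5, 125, 250, ..., 2000, 4000, 8000); the adapted scheme with ecotones used in this work may place additional boundaries, so the function names and the band indexing are illustrative only.

```python
import math

def precip_band(annual_pr_mm, base=62.5):
    """Index of the doubling precipitation band.

    Band 0 is below 62.5 mm, band 1 is 62.5-125 mm, band 2 is 125-250 mm, ...
    Assumes the standard Holdridge boundaries at 62.5 * 2**k mm per year.
    """
    if annual_pr_mm < base:
        return 0
    return int(math.floor(math.log2(annual_pr_mm / base))) + 1

def bands_crossed(annual_pr_mm, bias_mm):
    """Number of band boundaries crossed when a fixed bias is added."""
    return abs(precip_band(annual_pr_mm + bias_mm) - precip_band(annual_pr_mm))

# The same +50 mm bias changes the class in a dry climate but not in a wet one.
dry_shift = bands_crossed(100.0, 50.0)    # 100 mm -> 150 mm
wet_shift = bands_crossed(3000.0, 50.0)   # 3000 mm -> 3050 mm
```

Under these assumptions, a cell receiving 1900 mm yr−1 also falls one band below a cell receiving 2000 mm yr−1, which is the situation described above for the African rainforest.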

A potential source of uncertainty in biome and ecotone shifts is the way classification systems tackle potential evapotranspiration (PET). Holdridge’s scheme is based on biotemperature, but other classifications include different variables such as wind speed, radiation and relative humidity62. Resulting differences in PET estimates are propagated to moisture conditions that define the boundaries of each life zone (PET ratio), introducing more uncertainty into predicted biome and ecotone shifts. A simple way to ensure predictions using PET are consistent with those not using it is to compare the outputs. Thus, if the models suffer from the same biases under different classifications, we can conclude that the uncertainties are unlikely to be due to PET misrepresentation. Similarly, if a region experiences a biome change under one type of classification, but not under the other, we can conclude that PET modeling may be playing a role63.

In order to ascertain such potential uncertainty, we performed a comparison of the Holdridge classification with Whittaker’s biomes and Köppen’s climate types. We found no significant difference. The areas affected by the shifts and their drivers (changes in precipitation) were similar for the three classification schemes, as are the regions affected by future biome shifts. This lends confidence to the ability of Holdridge’s system to gauge biome and ecotone shifts in spite of not directly considering PET.

Future projections of terrestrial biomes are also affected by mismatches in source data. Fields with clear-cut gradients, such as precipitation, are difficult to measure even at highly aggregated levels. A major problem of gauge-based observations is that they have low spatial coverage and are generally undersampled in areas with complex orography. The montane forest between the Mexican Sierras (Madre Oriental and Occidental) is the canonical example. We observed that CRU underestimates total annual precipitation and thus leads Holdridge’s scheme to define the area as chaparral while most models classify it as a temperate forest, which is in agreement with in-situ observations of the green cover. This type of problem becomes critical when models are ‘tuned’ towards biased observations during the development stage, masking and propagating errors through the climate projections. Indeed, over-tuning may impact variables other than those tuned, given the high non-linearity of the system64,65.

Scale is another aspect that deserves attention. Model outputs are often combined with ecological and vegetation models for vulnerability assessments at a small spatial scale66. However, precipitation estimates can only be adequately used as aggregated quantities over large domains, given the patchy nature and the large spatial variability of the field. Even at large temporal and spatial aggregations, errors exceeding 100% are common in precipitation estimates from state-of-the-art climate models67. The inherent uncertainty in the estimation of the input data is an important limitation for vulnerability models because downscaling amplifies the cascade of uncertainties in downstream models68,69. Some biases in GCMs and ESMs are due to known limitations in the parameterizations but many others are related to the still imperfect knowledge of the interactions between components of the climate system70,71, hence the ‘inherent’ label above. A standard way to cope with errors is to use bias correction methods, which are developed under the assumption that biases are constant over time. However, such statistical stationarity is a strong assumption not supported by empirical analyses. The suitability of that approach is even more unclear for future projections, and several authors have argued that its performance is highly dependent on the variable of interest, the area of study and the methodology72,73,74.
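To make the stationarity assumption concrete, the sketch below shows linear scaling, one simple and widely used bias correction for precipitation. It is chosen here purely for illustration and is not necessarily the method used in the cited studies; the function and argument names are hypothetical.

```python
import numpy as np

def linear_scaling(model_hist, obs_hist, model_future):
    """Multiplicative bias correction of precipitation (linear scaling).

    The correction factor is estimated over the historical period and then
    applied unchanged to the future period, i.e. the model bias is assumed
    to be stationary in time.
    """
    factor = np.mean(obs_hist) / np.mean(model_hist)
    return factor * np.asarray(model_future, dtype=float)
```

If the true bias changes under a warmer climate, the factor estimated from the historical period no longer applies, which is precisely the limitation discussed above.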

Consensus maps of indicators such as biome and ecotone classes (e.g. Fig. 2d) can be used to qualitatively pinpoint systematic errors in GCMs and ESMs. Another application of these maps is monitoring biodiversity hotspots and their probability of risk, which is derived from the model consensus. An important assumption of our approach is that processes will operate in the same way in future climate given that the physics will be the same (the rosy scenario), but that is a strong assumption in the case of biogeographical variables. So far, and from the results shown here, we can only conclude that areas of high confidence include: eastern North America and the Brazilian Cerrado; most regions in Europe, including Fennoscandia; the Congo basin, and hot deserts in Africa and the Middle East; northern Russia and eastern China in Asia; the Maritime Continent and central northern regions of Australia.

In some other areas, however, a purely quantitative analysis (against CRU, the ground truth data for precipitation) gives less confidence in the use of GCMs and ESMs to gauge biodiversity shifts. Those include western North America, Mesoamerica and the Caribbean South America, the Tropical and Chilean Andes, and the Gran Chaco in the Americas; the eastern Afromontane Forest in Africa; the Irano-Anatolian region, the mountains of central Asia, and the mountains of southwest China in Asia; and most transitional zones. A major problem facing the community is that most of those areas are terrestrial hotspots of biodiversity.

In conclusion, despite the advances in recent years, climate models are far from perfect and modeling the water cycle remains the Achilles’ heel of ESMs. The precise measurement of precipitation in the present climate is still challenging and affects biodiversity studies. For the future, estimates of precipitation are even more uncertain. An inadequate use of precipitation data in environmental models, one that goes beyond the known limitations of precipitation measurement and modeling, may affect the conclusions of vulnerability assessment studies75. Even the sign and amplitude of the error are uncertain: it could be either an overestimation or an underestimation of the impacts on biota and human life.

It is worth noting that Holdridge’s approach, like any other classification method, is just an indirect way of defining biome distribution, one based on the mean state of key climate variables. We assume that the boundaries defining the classes remain constant over time and we use them to predict future biome shifts. The real world, however, is much more complex and field studies are the ultimate standard to inform policy decisions76.

Here we have focused on the model side in order to identify which areas are beyond confidence given the state of the art in precipitation science and those where current knowledge supports robust conclusions. That does not mean that impacts in areas lacking consensus are to be dismissed or model results questioned there. On the contrary, notwithstanding the many caveats that may apply, global climate models are essential for a better understanding of how the Earth system works. Their outputs provide an accurate estimation of the changes in future climate, but a more solid conceptual and process understanding of climate model biases is required before they are used in climate change vulnerability assessments. The point of evaluating climate models is not to criticize them or imply that they are unsuitable for environmental applications, but rather to identify areas that need improvement and direct resources to fill those gaps.

Methods

We used data from the climate models participating in phase six of the Coupled Model Intercomparison Project (CMIP6). We applied a modified version of Holdridge’s life zones system77 (HLZ) to evaluate individual models and multi-model ensembles for present climate and future scenarios. Models were ranked using Cohen’s kappa coefficient (a qualitative evaluation of classes) and the agreement with observed precipitation (a quantitative estimate). Specifically, we used the following materials and methods.

Data

We used fifty-two Global Climate Models (GCMs) from the Coupled Model Intercomparison Project Phase 6 (CMIP6, see Supplementary Table 1 for a list)78 to generate Holdridge’s life zones classification system for present (1980–2014) and future (2015–2100) climates. Additionally, we included a multi-model ensemble mean of fifty-two members (MME) and five ensembles of top-ranked models (T05, T10, T15, T25 and T40, see Supplementary Table 1 for a complete list of models). The reference dataset was the Climatic Research Unit Time Series version 4.04 (CRU)79. Future life zones and ecotones were evaluated under seven scenarios (SSP1-1.9, SSP1-2.6, SSP4-3.4, SSP2-4.5, SSP3-7.0, SSP4-6.0, SSP5-8.5). Both the GCMs and the observational dataset were interpolated to a horizontal resolution of 1.0° × 1.0° using a bilinear remapping method. The analysis was land-only, excluding Antarctica. Data from CMIP6 models were downloaded from the ESGF. Reference data were obtained from the CEDA website. Soil carbon and nitrogen stocks were computed for each HLZ using the SoilGrids 2.0 dataset80.

Holdridge life zones

HLZs are defined by three climatic measurements: annual precipitation (mm year−1), biotemperature (°C) and PET ratio. Annual precipitation (APR) was calculated from monthly precipitation data. Mean annual biotemperature (MAB) was derived from monthly average temperature. Months with a mean temperature above 30.00 °C or below 0.00 °C were omitted, as in the original method. PET ratio (PER) was defined as the mean annual biotemperature multiplied by a constant value (58.93) and divided by annual precipitation81. We assigned a class to each grid cell by computing the minimum Euclidean distance between each pixel and the geometric centroids of life zones as defined in Sisneros82 (see Supplementary Table 2 for details). The resulting 33 classes were aggregated into 13 major biomes following Monserud & Leemans83. Maps of HLZs for individual models can be found in Supplementary Figures (Present: 1; Future: 8–14). Maps of MAB (Present: 4; Future: 30–36), APR (Present: 5; Future: 37–43), and PER (Present: 6; Future: 44–50) for present and future climates are also included in Supplementary Figures.
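The calculation of the three Holdridge variables and the nearest-centroid assignment can be illustrated with a minimal Python sketch. It is not the code used in the study. The function names, the two example centroids and the choice to measure distance in log2 space (Holdridge's axes are logarithmic) are assumptions made for illustration; the actual centroids are those listed in Supplementary Table 2.

```python
import math

PET_CONSTANT = 58.93  # constant quoted in the text


def holdridge_variables(monthly_temp_c, monthly_precip_mm):
    """Return (APR, MAB, PER) for one grid cell from 12 monthly values."""
    apr = sum(monthly_precip_mm)  # annual precipitation, mm per year
    # Omit months with mean temperature below 0 degC or above 30 degC.
    kept = [t for t in monthly_temp_c if 0.0 <= t <= 30.0]
    mab = sum(kept) / len(kept) if kept else float("nan")
    per = PET_CONSTANT * mab / apr if apr > 0 else float("inf")
    return apr, mab, per


def nearest_life_zone(apr, mab, per, centroids):
    """Return the zone with the closest centroid.

    Assumption: distances are Euclidean in log2 space, so all
    inputs and centroid coordinates must be strictly positive.
    """
    point = [math.log2(v) for v in (apr, mab, per)]

    def distance(name):
        center = [math.log2(v) for v in centroids[name]]
        return math.dist(point, center)

    return min(centroids, key=distance)


# Hypothetical centroids given as (APR, MAB, PER); not the published values.
example_centroids = {
    "zone_a": (125.0, 2.0, 1.0),
    "zone_b": (2000.0, 24.0, 0.7),
}
temps = [-5, -2, 3, 8, 13, 17, 20, 19, 15, 9, 3, -3]
precip = [50.0] * 12
apr, mab, per = holdridge_variables(temps, precip)
zone = nearest_life_zone(apr, mab, per, example_centroids)
```

In this example the three sub-zero months are dropped before averaging, so MAB is the mean of the remaining nine months.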

Transitional life zones (Ecotones)

Holdridge’s classification system is a set of 36 hexagons in a triangular frame. Biotemperature, precipitation and PET ratio mark out six separate triangles in each hexagon, which represent the ecotones (Fig. 1a). Each triangle connects three adjacent core zones (inner hexagons). For example, the three lines of precipitation (250 mm), biotemperature (3 °C) and PET ratio (1.0) intersect to form triangle 5 (CP-FT-BF ecotone). As for HLZs, we aggregated the initial 216 transitional life zones into 27 different classes. Supplementary Table 3 includes a complete list of ecotones and their defining criteria. Maps and stem plots for individual models can be found in Supplementary Figures (Present: 3, 51; Future: 22–28, 52–58).

Whittaker’s biomes and Köppen’s climate types

We complemented the analysis with two other classification schemes to ensure that the results were not linked to the chosen classification system. We used the modified version of Whittaker’s biomes described in Ricklefs84, which divides the Earth into nine biomes. Classification was performed using the R package BIOMEplot. For the Köppen scheme, we employed the standard algorithm as used in Navarro et al.85. It classifies ecoregions into five distinct climate types, including one hydrologic type (B) and four thermal types (A, C, D, E). The algorithm also includes three subtypes (f, s, w) to capture the annual precipitation cycle.

Model rank

We used Cohen’s kappa coefficient86 to quantify the ability of individual models to reproduce Holdridge’s life zones as depicted by CRU for the historical period (1980–2014). The kappa coefficient (κ) is defined as \(\kappa =\frac{P_{0}-P_{e}}{1-P_{e}}\), where \(P_{0}\) is the proportion of units in agreement and \(P_{e}\) is the hypothetical probability of chance agreement. Grid boxes were weighted by area. The kappa statistic ranges from 0 (no agreement beyond chance) to 1 (perfect agreement). Models were also ranked in terms of agreement with precipitation observations. The metric used was the coefficient of determination \(R^{2}\). We used the Python packages sklearn v0.24.1 and scipy v1.6.1 to perform the statistical analysis. Individual scatter plots of annual precipitation can be found in Supplementary Fig. 7.
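As an illustration of the area-weighted kappa defined above, the following standard-library Python sketch computes \(P_{0}\) and \(P_{e}\) from weighted class shares. It is a didactic re-implementation, not the sklearn call used in the study. The function names are hypothetical, and the cosine-of-latitude weight is an assumption that holds for a regular latitude-longitude grid.

```python
import math
from collections import defaultdict


def area_weight(latitude_deg):
    """Relative grid-box area on a regular lat-lon grid (assumed weighting)."""
    return math.cos(math.radians(latitude_deg))


def weighted_kappa(reference, model, weights):
    """Cohen's kappa with each grid box contributing its weight."""
    total = float(sum(weights))
    p0 = 0.0                         # weighted proportion of agreeing boxes
    ref_share = defaultdict(float)   # weighted class shares in the reference
    mod_share = defaultdict(float)   # weighted class shares in the model
    for ref_class, mod_class, w in zip(reference, model, weights):
        share = w / total
        if ref_class == mod_class:
            p0 += share
        ref_share[ref_class] += share
        mod_share[mod_class] += share
    # Chance agreement: product of marginal shares, summed over classes.
    pe = sum(ref_share[c] * mod_share.get(c, 0.0) for c in ref_share)
    # Undefined (division by zero) if both maps contain a single shared class.
    return (p0 - pe) / (1.0 - pe)


reference = ["a", "a", "b", "b"]
model = ["a", "b", "b", "b"]
kappa = weighted_kappa(reference, model, [1.0, 1.0, 1.0, 1.0])
```

With equal weights, three of four boxes agree (\(P_{0}=0.75\)) and the chance agreement is \(P_{e}=0.5\), which gives \(\kappa = 0.5\).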

Qualitative methods for deconstructing errors and future changes in HLZs

We made a pixel-by-pixel comparison of modeled and observed life zones. If they were coincident, we codified those areas as “agreement”. For mismatching areas, we computed the difference between the nearest geometric centroids of the reference and modeled datasets for PET ratio, biotemperature and annual precipitation. Positive values indicate that the reference data has higher values than the model, and negative values indicate the opposite. Zero means both are equal. Then, we applied the following algorithm:

The same procedure was applied to future changes, comparing modeled life zones for present and future climate scenarios. Maps and donut plots of differences (changes) for individual models can be found in Supplementary Figures (Present: 2, 59; Future: 15–21, 60–66).
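The first step of this comparison, flagging agreement and otherwise taking the reference-minus-model difference between centroids, can be sketched in Python as follows. It covers only that step and not the subsequent classification algorithm. The function name and the centroid values are hypothetical.

```python
def compare_pixel(reference_zone, model_zone, centroids):
    """Return "agreement" or the (APR, MAB, PER) centroid differences.

    Differences are reference minus model, so a positive value means
    the reference centroid is higher than the modeled one.
    """
    if reference_zone == model_zone:
        return "agreement"
    ref_centroid = centroids[reference_zone]
    mod_centroid = centroids[model_zone]
    return tuple(r - m for r, m in zip(ref_centroid, mod_centroid))


# Hypothetical centroids given as (APR, MAB, PER); not the published values.
example_centroids = {
    "zone_a": (125.0, 2.0, 1.0),
    "zone_b": (2000.0, 24.0, 0.5),
}
same = compare_pixel("zone_a", "zone_a", example_centroids)
diff = compare_pixel("zone_b", "zone_a", example_centroids)
```

For future changes, the same function would be called with the present-day modeled zone in place of the reference zone.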

Mapping consensus

We compared the distribution of HLZs from the top ten models (see Fig. 1b for the model ranking). Consensus was obtained when at least eight models agreed with the reference data on the type of life zone. Dissent was defined as fewer than three models agreeing with the reference data. We classified as N/C those cases where models have moderate consensus (3–7 models). All computations are based on pixel-by-pixel comparison. The same procedure was performed for Whittaker’s and Köppen’s climate schemes.
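The thresholds above translate directly into a per-pixel rule. The Python sketch below is an illustration with a hypothetical function name. It assumes exactly ten model classifications per pixel, as in the top-ten ensemble.

```python
def consensus_class(reference_zone, model_zones):
    """Label one pixel from the number of models matching the reference."""
    agreeing = sum(1 for zone in model_zones if zone == reference_zone)
    if agreeing >= 8:
        return "consensus"
    if agreeing < 3:
        return "dissent"
    return "N/C"  # moderate consensus: 3 to 7 models agree


# Ten hypothetical model classifications for a single pixel.
pixel_models = ["a"] * 8 + ["b"] * 2
label = consensus_class("a", pixel_models)
```

Applying the function to every land pixel yields the consensus map; the same rule works for Whittaker or Köppen classes because it only compares labels.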

Carbon and nitrogen stocks in the topsoil

We used the SoilGrids 2.0 dataset to compute carbon and nitrogen in the topsoil (30 cm) for each pixel. We then grouped all pixels that fall into each hexagon of Holdridge’s scheme and computed the average to obtain the value for each hexagon.
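The grouping step can be sketched in Python as a per-class mean. The function name is hypothetical. The sketch assumes a simple unweighted average over pixels and that pixels without soil data are marked as None; neither detail is stated in the text.

```python
from collections import defaultdict


def mean_by_life_zone(zones, values):
    """Average a per-pixel quantity over the pixels of each life zone."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for zone, value in zip(zones, values):
        if value is None:  # skip pixels without soil data (assumption)
            continue
        totals[zone] += value
        counts[zone] += 1
    return {zone: totals[zone] / counts[zone] for zone in totals}


# Hypothetical pixel labels and topsoil carbon values.
pixel_zones = ["a", "a", "b", "b"]
pixel_carbon = [10.0, 20.0, 5.0, None]
zone_means = mean_by_life_zone(pixel_zones, pixel_carbon)
```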