Global climate models (GCMs) are critical for understanding climate change. Still, their coarseness restricts their usefulness for creating climate change adaptation and mitigation policies, especially at regional scales where climate change impacts are more pronounced1. Over the past two decades, many world regions have experienced increased intensity and frequency of extreme climate events. These extreme events have been attributed to climate change2,3,4,5. Nevertheless, past generations of GCMs predicted a rise in such severe occurrences6,7,8.

Coupled Model Intercomparison Project Phase 6 (CMIP6)9 provides significant advantages that combine representative concentration pathways (RCPs) with shared socioeconomic pathways (SSPs), model advancement, and better modelling of synoptic processes8,10,11,12,13,14. However, as these GCMs become more widely available, it is critical to continue evaluating their ability to represent extreme climatic events worldwide.

Seneviratne et al.7 reported more heavy rainfall episodes, decreased cold extremes, and increased daily temperature extremes globally. Adeyeri et al.2 observed that warm spell duration and the frequency of warm days and nights exhibited statistically significant positive trends over the transboundary Lake Chad Basin. Ge et al.13 investigated future changes in precipitation extremes over Southeast Asia. Chen et al.15 evaluated and compared CMIP6 and Coupled Model Intercomparison Project Phase 5 (CMIP5) model performance in simulating extreme seasonal precipitation in the western North Pacific and East Asia. Ridder et al.16 reported future changes in return periods for the co-occurrence of heatwaves, drought, extreme winds, and precipitation based on a CMIP6 multimodel ensemble.

Dike et al.17 evaluated seasonal precipitation extremes using CMIP6 and reported an increase in the severity of projected dry spells over Central Asia. Collazo et al.18 considered CMIP6 models' ability to represent observed extreme temperatures and concluded that the chosen models could not simulate cold days trends. Das et al.19 assessed the impact of climate change on temperature extremes and projected a substantial rise in temperature extremes. Wei et al.20 evaluated 23 CMIP6 models' ability to simulate extreme climate events over China and stated that considerable uncertainties remain in future climate projections. Although there are indications that GCMs have significant biases in climate simulations compared to observations21,22,23, there have been positive improvements in the CMIP models24,25,26. Hence, selecting an appropriate collection of GCMs is critical for the improved assessment of climate change impacts.

Even though previous researchers focused on the ability of CMIP models to simulate extreme climate events or a particular meteorological variable, none of these studies provides an informed decision on the climate models' skill in representing different sector-specific indices. Neither offers a holistic approach that could be useful for selecting appropriate GCMs when aiming to represent different climate extremes better. It is worth noting that a proper depiction of a global variable's spatial or temporal climatology by a GCM is not the same as the model's ability to represent climate extremes, especially at regional scales, correctly. Additionally, they do not evaluate the multi-index causal effects due to the mutuality of the extreme events. Understanding the causal influence of the candidate's extreme event on the responses of the other extreme occurrences is important.

Therefore, this study focuses on CMIP6 skill in representing multi-index climate extremes for different sectors using eleven CMIP6 models. The sector-specific indices are classified as Health, Agriculture, and Water Resources. We subsequently evaluated the multi-index causal effects for fourteen indices as well as the trends and climate change signals.

Data and methods

The CMIP6 dataset incorporates different shared socioeconomic pathways (SSPs). This study focuses on eleven CMIP6 models (Supplementary Table S1) under two SSP scenarios (SSP 370 and 585) during historical (1979–2014), near future (2025–2060), and far future (2065–2100) periods for Africa, Asia, North America, South America, Europe, and Oceania. SSP370 is the medium-to-high end of future emissions and temperature scenarios27, while SSP585 is the only SSP with emissions sufficient to deliver the 8.5 W/m2 level of forcing in 2100. Table 1 presents the sector-based indices28.

Table 1 List of indices.

The reference dataset, W5E529,30, is a daily-resolution observed climate data on a global 0.5° × 0.5° lat-lon grid. W5E5 was selected as the reference dataset because it has been used to bias-correct climate models for impact studies31. All precipitation outputs were re-gridded using second-order conservative mapping due to the significant variability and unequal distribution of precipitation. In contrast, all temperature outputs were re-gridded using bilinear interpolation32 to prevent erroneous scale gap effects31. All outputs were re-grided to a single 1° × 1° grid.

GCM ranking

The performance of each GCM was ranked for each climate index during the historical period using five different metrics23,33,34,35; namely, percentage bias, mean absolute error, index of agreement, correlation, and root mean square error. These performance metrics were combined for a universal rank Eqs. (1) and (2) in which all metrics were assigned an equal weight35,36.


where \({T}_{i,j}\) is the error of metric j and particular model i in space and time. All metrics m are added together to get the total relative error \({T}_{i,tot}^{*}\) per GCM per index. It should be noted that the index of agreement and correlation were reversed for the procedures min() and max() because the best-performing models should have a total relative score closer to zero. Rank 1 is the best-performing model for each index.

Empirical cumulative distribution function (ECDF)

The ECDF is calculated by ordering all the unique observations in the data sample and calculating the cumulative probability for each as the number of observations less than or equal to a given observation divided by the total number of observations (n).

This is given as37;

$$\widehat{{F}_{n}}(t)=\frac{{\sum }_{i=1}^{n}1{X}_{i}\le t}{n}$$

where 1X is the indicator of event X. When t is fixed, indicator \(1{X}_{i}\le t\) is a Bernoulli random variable with parameter p = F(t).

\(\widehat{{F}_{n}}(t)\) is the unbiased estimator of F(t). The independent, identically distributed real random variables with the shared cumulative distribution function F(t) are denoted by Xi(t).

We examined the ability of the GCMs to simulate the monthly climatology of the extreme indices by visually inspecting and subjecting the ECDF distributions to Kolmogorov–Smirnov test to understand the distribution differences between the reference and the outputs from the GCMs.

Correlation, partial correlation, index of importance and causal effects

The degree of association is a metric that shows how strong the relation between two variables is without considering that a third variable may influence both variables. When several factors impact the phenomenon under examination, partial correlation becomes extremely important.

However, due to the changing variance and linearity of the extreme events, we used the spearman correlation to benchmark the monotonic relationship between the extreme events.

Partial correlation is the correlation of two variables while controlling for a third or more other variables. The relationship is said to be partial when two variables are correlated while conditioning the third or several other variables38.

The partial correlation is given as:

$$\rho UVW= \frac{\rho UV- \rho UW\rho VW}{\sqrt{\left(1-{\rho }^{2}UW\right)*(1-{\rho }^{2}VW)}}$$

where \(\rho UVW\) is the partial correlation of variables U and V conditioned on W, \(\rho UV\) is the correlation between varaibles U and V, \(\rho UW\) is the correlation between variables U and W while \(\rho VW\) is the correlation between variables V and W.

For a set of t controlled variables, W is

$$W= \left\{{W}_{1},{W}_{2}, .., {W}_{t}\right\}$$

Statistically significant correlation and partial correlation were examined using the Student's t-test at a 95%, 99% and 99.9% confidence level.

Due to the mutuality of these extreme events, we also quantified the causal influence of the candidate's extreme event on the responses of the other extreme occurrences.

Several methods have been proposed to assess this, but the most popular is the permutation of importance, based on the random forest algorithm39,40. However, this approach is unsuitable for highly correlated events, as it cannot distinguish between an event's conditional and marginal influence41.

As a result, the Conditional Variable Relevance—also known as the Conditional Permutation Importance (CPI)—was used to assess more partial importance in random forests. The CPI is presented as:

  • For the regression trees in the random forest

    $${R}^{(t)}=\sum_{{i\in \beta }^{(t)}}\frac{{\left({\widehat{y}}_{i}^{(t)}-{y}_{i}\right)}^{2}}{\left|{\beta }^{(t)}\right|}$$
    $${R}_{(k)}^{(t)}=\sum_{{i\in \beta }^{(t)}}\frac{{\left({\widehat{y}}_{i(k)}^{(t)}-{y}_{i}\right)}^{2}}{\left|{\beta }^{(t)}\right|}$$
  • For the classification trees in the random forest

    $${R}^{(t)}=\sum_{{i\in \beta }^{(t)}}\frac{I\left({\widehat{y}}_{i}^{(t)}\ne {y}_{i}\right)}{\left|{\beta }^{(t)}\right|}$$
    $${R}_{(k)}^{(t)}=\sum_{{i\in \beta }^{(t)}}\frac{I\left({\widehat{y}}_{i(k)}^{(t)}\ne {y}_{i}\right)}{\left|{\beta }^{(t)}\right|}$$
    $${x}_{i(k)}={(x}_{i}1, \dots ,{x}_{ik-1, }{x}_{p\left(i\right)k, }{x}_{ik+1}, \dots , {x}_{ip})$$

where \({R}^{(t)}\) and \({R}_{(k)}^{(t)}\) are the prediction error of tree t in a random forest with predictors p and ntrees of trees based on the out-of-bag sample \({\beta }^{(t)}\), respectively, before the permutation of the out-of-bag values of \({X}_{k}\). \(\left|{\beta }^{(t)}\right|\) is the out-of-bag cardinal number sample for tree t, and I() is the indicator function.

\({\widehat{y}}_{i}^{(t)}\) is the function \({f}^{(t)}{(x}_{i})\), which is the random forest prediction of the out-of-bag observation i before permutation. \({x}_{p\left(i\right)k}\) is the ith \({X}_{k}\) observation after permutation.

The forest permutation of importance \({PI}_{(k)}\) is the average overall tree-wise permutation \({PI}_{(k)}^{(t)}\).


where \({PI}_{(k)}^{(t)}\)= \({R}_{(k)}^{(t)}= {R}^{(t)}\)

In the CPI, out-of-bag values of one predictor \({X}_{(k)}\) are conditionally permuted on other predictors \({Z}_{(-k)}\).

The extreme events are subjected to CPI following Debeer and Strobl40, while the causality assumptions are presented by van der Laan42.

Climate change signals in extreme events

For risk management purposes, climate projections should include a wide range of feasible spectrums of anticipated future climate change36. The climate change signals (CCS) in extreme events were calculated as the difference between (i) the far future and the historical period, (ii) the near future and the historical period, and (iii) the far future and the near future.

Trends and significance

Spatial plots based on robustness and significance criteria are presented for all indices' historical and future assessments. Following Haensler et al.43, this study considers a signal robust if 66% of the models agree. Statistically significant trends and climate change signals were examined using the Student's t-test at a 95% confidence level35. The trend in the climate extreme events was calculated based on the Mann–Kendall trend test, while the trend's magnitude was estimated using Sen's slope35,44,45,46.


Spatial spread climatology

The climatology was computed for all indices for the historical period. For instance, the climatology of extreme temperature range (ETR) ranged from 2 to 47 °C (Supplementary Fig. S1), with the tropics having the smallest range. BCC replicated the spatial spread, with slight overestimation over Russia and underestimation around the Brazilian and West African coasts. Additionally, other models had varying degrees of overestimation or underestimation. Generally, the range increased away from the equator in the Northern and Southern Hemispheres. This pattern is visible in all models. The departure of the models from the reference is presented in Fig. 1. Most models had evident cold biases in the Southern Hemisphere. However, some models, e.g., BCC, performed well in North America, Greenland, West Africa, and some parts of Russia. The hot bias was highest in CANESM5, with ETR values as high as 23 °C in Greenland and China. MPI-ESM, MPI-LR, and NORESM2 all recorded cold tendencies over entire continents. The spatial spread for other indices is presented in the supplementary file (Supplementary Fig. S2 to S9).

Figure 1
figure 1

Bias in the monthly climatology of ETR for different CMIP6 models.

Regional ECDF

Taking the ETR as an example, Supplementary Fig. S10 shows the distribution of the ECDF on the continental scale. Over Africa, four models (CANESM5, BCC, UKESM, and INM) reasonably replicated the reference ETR (Supplementary Fig. 10a, Table 2). However, based on the Kolmogorov–Smirnov statistics (Table 2), INM showed the lowest distribution distance (0.14), although with a P-value lesser than 0.05. This means INM performed best in capturing the ECDF distribution. However, the P-value suggested that INM distribution was not the same as the reference distribution. MIROC6 overestimated this by up to 5 °C. At the same time, IPSL, MPI-LR, MPI-ESM, and NORESM2 underestimated the distribution by up to 7 °C. Over Asia, there was no outright distinction in the distribution by the models. Nonetheless, MIROC6 was close to the reference with a Kolmogorov–Smirnov statistic of 0.15. Over South America, no model could replicate the reference distribution well, although BCC performed better than the other models. The Kolmogorov–Smirnov statistics is 0.48. All models behaved differently in capturing the ECDF of different indices and for separate continents.

Table 2 Kolmogorov–Smirnov statistics for ECDF comparison.

GCM performance and ranking

Figure 2 shows the multi-index continental GCM ranks. Over Africa (Fig. 2a; Supplementary Table S2), MPI-ESM performed best in replicating the daily precipitation intensity (DPI), while BCC performed worst (ranked 11th). For ETR, four models (CANESM5, BCC, UKESM, and INM), as observed in the ECDF earlier, ranked best in Africa. Over Asia, CANESM5 ranked first in heatwave index (HWI), maximum consecutive dry days (MCCD), daily temperature range (DTR), and dry spell total length (DSTL). In contrast, BCC ranked first in heatwave maximum length (HWML), hot spell maximum length (HSML), and DTRV. As observed by the ECDF of ETR over Asia, MIROC6 was the best-performing model in ETR. Over Europe and North America, IPSL ranked first in replicating the observed warm and dry days (WDD). In contrast, MPI-LR ranked first in reproducing the reference DPI over South America and Oceania, respectively. Notably, the hot spell frequency (HSF) and heat wave frequency (HWF) were well replicated by all CMIP6 models, as the error margin was between 2 to − 3 days/decade for all models.

Figure 2
figure 2

Multi-index continental GCM ranks.

Correlation and partial correlation

Figure 3 shows the multi-index correlation and partial correlation plot for the reference on the continental scale. For instance, in Africa, the HSML was strongly positively correlated with heatwave frequency (HWF) (88%), HWI (87%), HWML (90%), and heatwave total length (HWTL) (90%), all at the 95% confidence level (upper diagonal). However, this is different when considering partial correlation. The partial correlation plot (lower diagonal) shows that the HSML was positively correlated with heatwave frequency (HWF) (21%), HWI (59%), and HWML (27%), while negatively correlated with HWTL (− 18%), all at the 95% confidence level. The same pattern was observed in Asia and Oceania. The partial correlation values and directions differed in North America, South America, and Europe. The analysis was repeated for the ensemble-mean of the CMIP6 (Supplementary Fig. S11). Generally, the reference and model ensemble-mean have similar correlation and partial correlation patterns, albeit with differing magnitudes.

Figure 3
figure 3

Multi-index correlation and partial correlation plot for the reference over (a) Africa (b) Asia (c) North America (d) South America (e) Oceania (f) Europe. The upper diagonal to the indices is the correlation values, lower diagonal to the indices is the partial correlation values. *, ** and *** represents significant correlation, and partial correlations at P-value = 0.001, 0.01 and 0.05 respectively.

Causality and importance

Figure 4 presents the CPI results for the reference data. Over the entire continent, HWTL was greatly influenced by the hot spell maximum length (HSML) and HWF. There was no universal pattern of causality for ETR, as the importance-modulating index varied with the continent. In Africa, HWTL, DTRV, and HSML were the most influential indices for ETR, while in Asia, HWI, DTR, and DTRV were the most significant. In contrast, DTRV and DTR were the most influential indices in Europe. HWTL and HWF greatly influenced changes in the pattern of HSML on all continents, while DSTL was the most crucial index modulating the MCCD for all continents. Supplementary Fig. S12 to S18 show the CPI of other extreme indices over the entire continent. Additionally, there was general agreement between the index of importance observed in the reference and the multimodel ensemble-mean.

Figure 4
figure 4

Conditional variable of importance of selected extreme indices using the reference.

Trends and CMIP6 trend consistency

As a case study, Fig. 5 shows the trends in ETR for reference and the CMIP6 models during the historical period. For reference, there were some statistically significant negative trends in ETR in some parts of North America and Asia, with values between − 4.1 and − 0.3 °C/decade. However, there was a typical positive trend of ETR in Africa and South America, though some parts were not statistically significant. Moreover, the CMIP6 models could not capture the significant trends in the reference. They also exhibited some underestimations, particularly in South America, and overestimations in Greenland. Except for IPSL and CNRM, most models showed a general warm bias over North America and northern Asia (Supplementary Fig. S19). Furthermore, there was a characteristic cold bias in South America, Africa, and Oceania. The range of the model trend biases was between − 3.4 and 4.7 °C/decade.

Figure 5
figure 5

ETR trend for CMIP6 models during the historical period. Hatchings represent area with significant trend at 95% confidence level.

There was a consistent negative trend of DTR over most parts of West and North Africa, southern Europe, and Central Asia, with agreement by all eleven models (Fig. 6). The positive DTRV trend consistency was prominent mainly between latitudes 15° N and 35° N; and longitudes 25° W and 60° E, while an apparent negative trend was observed over Europe, northern Australia, and western Canada, with agreement by at least nine models. The consistent positive trend of HTWL was particularly noticeable in West and North Africa, most parts of Australia, and South America.

Figure 6
figure 6

Trend consistency for CMIP6 models during the historical period.

Projection and change signal

Taking ETR as a case study, Fig. 7 shows the CCS in trends between the historical, near future (NF), and far future (FF) for both SSP 370 and 585 using the CMIP6 ensemble-mean. Although there were some significant negative trends of ETR in some parts of North America and some significant positive trends in some parts of South America, the ensemble-mean generally replicated the spatial trend in other parts of the world during the historical period, although with different magnitudes and significance. In most cases, most parts of the tropics witnessed a positive trend of ETR during the historical period. The FF under SSP 370 witnessed a significant widespread negative trend in Europe and Alaska. Similarly, other parts of the world mainly had positive trends. The trend values varied between − 4.9 and 3.5 °C/decade.

Figure 7
figure 7

Trends of ETR during the reference (a), historical and future periods (bf), climate change signals (CCS) in trends between the historical, near future (NF) and far future (FF) (gl) for both SSP 370 and 585 using CMIP6 ensemble mean. Hatchings represent area with significant trend at 95% confidence level.

We further explored the climate change signal in the trends for the different periods.

The CCS between the historical and SSP 370 NF showed a rise in the trend of ETR from the historical to SSP 370 NF periods in Europe, with a magnitude between 1.0 and 4.0 °C/decade. However, there was a general drop in this trend for most parts of Canada and some parts of Russia. Moreover, most parts of the world experienced an increasing trend from the historical to SSP 370 NF. In reverse, the CCS between the historical and SSP 585 NF revealed that most areas of Canada that were previously accustomed to negative trends were reversed to positive. In contrast, the once positive trend in the USA under SSP 370 NF became negative. China also had a more intense positive trend during this period. A further look into the CCS between SSP 370 NF and SSP 370 FF shows a significant deepening in the trend of ETR over most parts of Europe, with magnitudes of between − 2.1 and − 5.0 °C/decade. At the same time, some other areas like north-eastern Canada had a significant rise in the trend by up to 4 °C. Supplementary Fig. S20 shows the result of the same experiment considering DTR.

Figure 8 shows the yearly and total cumulative heatwave index (HWI). Africa had the highest days of HWI. The reference series values ranged from 149 (in 1983) to 171 days/year (in 2012). However, the highest days in the historical CMIP6 ensemble-mean varied from 120 to 130 days/year. The highest days of HWI for the entire period of study manifested during the far future under SSP 585, with 181 days/year from 2098 to 2100. Nevertheless, comparing the reference with the historical simulations showed an underestimation of HWI days for all continents. The total cumulative HWI index showed Africa having more than 6000 days of HWI during the far future under SSP 370 and 585, while Asia maintained a value of 2000 days under the same SSPs.

Figure 8
figure 8

Yearly cumulative heatwave index for the reference (ref), historical (sim), near future (NF) and far future (FF) (b) Total cumulative heatwave index. AFR, AS, EU, NA, SA and OC mean Africa, Asia, Europe, North America, South America, and Oceania, respectively. 37 and 58 denote SSP 370 and 585, respectively. On the year axis, the first index represents the starting year of each period, i.e., 1979, 2025 and 2065, for historical, Near future and far future periods, respectively. The box's border in Fig. 8b indicates the circumference axis, with the smallest circumference starting with 2000.

Zonal and meridional cross-section ETR

Zonal and meridional cross-sections were used to understand further the influence of land, sea, and latitude differences on ETR (Fig. 9). Over Africa, there was an apparent increase of ETR away from the equator and the coast. However, high ETR was visible in desert regions, specifically between 20 and 30°N (Sahara Desert) and between 25 and 32°S (Kalahari Desert). More importantly, the ETR was highest in the Kalahari Desert for all years due to cooler (freezing) nights than in the Sahara Desert. Furthermore, the ETR decreased toward the Mediterranean due to the ocean's high specific heat capacity. ETR was highest in regions with a dense landmass and reduced toward the ocean due to the land's low specific heat capacity compared to the ocean.

Figure 9
figure 9

Zonal and meridional cross-section of reference ETR.

ETR was lowest over scattered landscapes (− 10° N to 10° N) in Asia and increased with denser landmass (Fig. 9c). Meridionally, ETR was highest in mountainous regions due to the ability of these regions to warm and cool faster than the surrounding regions. ETR also decreased toward the Pacific Ocean. Generally, ETR ranged between 3.0 and 72.0 °C for the displayed continents.

Additionally, there were cold biases in most CMIP6 models along the tropics (Fig. 10). However, NORESM2 reasonably captured the low ETR along the equator and a few degrees away from the equator. IPSL and INM underestimated the low ETR along the equator by 5 to 20 °C. In contrast, MPI-LR, MPI-ESM, CANESM5, and BCC captured the high ETR north of 50°N reasonably well. The models, overall, mirrored the reference patterns. The ETR reduced as it approached the Arctic due to ice, increased as it advanced toward the Mediterranean, and declined as it approached the equator. The broad range was between 1.0 and 73.0 °C.

Figure 10
figure 10

Zonal cross-section of the reference and CMIP6 ETR during the historical period.

Discussion and conclusion

In this study, we explored the potentials of eleven GCMs participating in CMIP6 project in representing fourteen climate extreme indices that are useful for different sectors, including health, agriculture and water resources. Understanding of climate model skills, multi-index causal effects and global climate change signals of extremes is limited, but is key in reinforcing well-informed decisions in the various sectors.

For the selected models, individual CMIP6 GCM results often varied significantly between regions and models. For example, the top-performing GCM in simulating the reference DPI tended to exhibit consistently strong performance across continents, whereas the best-performing model in simulating DTSL differed among continents.

Additionally, there were observable biases in the climatology of various indices. The distribution of the ECDF supported these inconsistencies. Notably, all models behaved differently in capturing the ECDF of different indices and for different continents. These inconsistencies could be attributed to the various land surface schemes and simulation of features such as vegetation36 and orography47, unrealistic large-scale variability35,48, and contrasting internal variability between climate models and observations23,49.

Nevertheless, previous studies50,51,52 showed that CMIP GCMs struggled to capture daily temperature minima and maxima. Yet, understanding extreme temperature events is critical since many earth system processes are influenced by the maximum and lowest temperatures as well as the mean temperature36,52.

The performance of GCM in simulating precipitation extremes also varied widely across the globe. However, it is essential to note that the highest biases were recorded immediately north and south of the equator, particularly for DSTL. The signal of these biases varied among the different models, suggesting the inability of the models to correctly resolve important biophysical features such as solar radiation, vegetation, and cloud features, especially at daily scales53,54 or other synoptic-scale system interactions55. In contrast to Srivastava et al.56, who reported that UKESM was one of the best-performing models in the US for precipitation simulation, we found that IPSL performed best in simulating MCCD for both Africa and North America, while MPI-ESM and MPI-LR performed best in simulating DPI in Africa and North America, respectively. Across all continents, IPSL performed best for MCCD, while MPI-LR performed best for DPI. Remarkably, the MPI model family performed best in capturing the DPI across the six continents. This may have been produced from model inter-dependence36,56,57.

Despite the different biases associated with the various models, the correlation and partial correlation patterns were similar for the reference and model ensemble-mean, although with different magnitudes. Also, the causal impact for different indices varied. More precisely, the causal impact for ETR did not follow a universal pattern since the important modulating index differed depending on the continent. Furthermore, the multimodel ensemble-mean and the significant index agreed with the reference.

There was a mix of trends for the various specific indices for each location. Despite a rise in both maximum and minimum temperatures, the significant negative trends of DTR suggested quicker warming of the minimum temperature2.

Nonetheless, these results have some practical implications for decision-makers in different sectors. Due to changes in land cover, land use, and the resources employed in urban centres, cities have greater surface temperatures than rural areas (urban heat island phenomenon53,58,59,60). Additionally, urban heat island impacts lengthen the duration of heat events during heatwaves61; however, this behaviour is different for the various atmospheric circulation regions. For example, we could not observe increased heatwave total length in most urban areas in the mid-latitudes. Yet, the reverse is for urban areas in the tropics. This demonstrates that rather than urban heat islands, large-scale meteorological conditions62, significantly impact the length of the duration of heatwaves in midlatitude.

The consequences of drought on water demand and supply by natural systems and people, on the other hand, will be amplified as the climate warms63,64. Furthermore, rising temperatures worsen heavy precipitation by increasing atmospheric moisture, promoting the precipitation event through moisture convergence at low altitudes65,66 and increasing evapotranspiration rates arid areas46. This will lead to more intense hydrometeorological situations, such as floods and droughts5 and significantly impact the amount and quality of available water as well as river discharge timing and amplitude67. As a result, humans, society and natural systems are at risk67,68.

Therefore, it is crucial to design appropriate strategies to strengthen resilience to extreme climatic events while also mitigating further GHG emissions. However, optimal adaptation/mitigation strategies for climate change could be hampered by incorrect information from the poor representation of climatic events in data sets.

Beyond the specifics of this research, future work could investigate extreme historical events with additional reference datasets, bias-corrected or downscaled CMIP6 datasets, especially for extreme precipitation events. Additionally, more CMIP6 models could be utilized.