Introduction

Adopted under the United Nations Framework Convention on Climate Change (UNFCCC), the Paris Agreement set a challenging target of holding global warming to no more than 2.0 °C and pursuing efforts to limit it to 1.5 °C above preindustrial levels1. However, the current fossil-fuel-dominated energy structure and existing mitigation strategies are insufficient to achieve this target. Although some studies claimed that the COVID-19 pandemic led to an evident decrease in greenhouse gas and aerosol emissions2,3, the global cooling response to the pandemic is likely to be brief and negligible in the long run, because even large emission reductions applied for a short duration cannot substantially affect future climate change4.

Addressing climate change and pursuing low-carbon development is a major challenge worldwide. Every country is required to develop climate action plans in the form of “nationally determined contributions” (NDCs), adopt zero-emission policies to reach a carbon peak, and reduce greenhouse gas emissions to achieve carbon neutrality, so as to meet the ambitious global warming goals5. As the largest developing country and currently the largest carbon emitter, China has proposed a plan to peak carbon dioxide emissions by 2030 and achieve carbon neutrality by 2060 (the “30·60 Dual-Carbon Target”). China is also the first developing country to announce a carbon reduction target, which is a landmark in the fight against climate change, though it still faces a series of challenges6.

Previous studies based on observations indicate that the mean temperature in China has risen at a rate of 1.3–1.7 °C (100 yr)−1 since 19007. It is generally recognized that even a marginal change in global mean temperature can greatly affect temperature and precipitation extremes. In recent years, extreme events have occurred more frequently as the rate of warming has accelerated. China is one of the countries suffering the most severe climatic disasters. According to statistics8, China incurs direct economic losses of more than 34 billion every year owing to droughts, and from 1984 to 2018 the annual affected crop area exceeded 200,000 km2. Rising temperature extremes, along with heat-related hazards, have drawn significant societal concern over the past two decades. In the context of global warming, temperature extremes tend to become more frequent and intense, raising the risk of heat-related morbidity and mortality, especially in regions of China with complex and fragile climate systems9. Hence, in the face of both global warming and the “Dual-Carbon” targets, it is essential to evaluate extreme temperature changes in China in the context of climate change.

Global climate models (GCMs) are useful tools for evaluating the historical climate and anticipating future climate change. Although numerous studies of climate extremes have been conducted using GCMs from previous phases of the Coupled Model Intercomparison Project (CMIP), the reliability of future climate projections remains debatable because of these models’ coarse resolution and associated uncertainties10. The latest phase, CMIP6, offers multi-model climate projections based on more realistic future scenarios (SSP-RCP), which directly reflect societal concerns related to the mitigation of, adaptation to, and impacts of climate change11. A range of studies have assessed changes in extreme temperature around the world using the outputs of CMIP69,12,13,14,15,16,17.

Both national climate scenarios and international climate assessments depend heavily on the outcome of multiple climate model simulations, so the reliability of these models has an important impact on science and, ultimately, on policy-targeted science communication18. Compared with CMIP5, CMIP6 models can better represent physical processes at smaller scales19, and they have been shown to improve the historical representation of climate extremes indices globally relative to previous models20,21. However, owing to inherent uncertainties in model resolution, physical processes and forcing conditions, some studies show that CMIP6 models perform relatively unsatisfactorily, or have no clear superiority, in representing the characteristics of climate indices during the historical period22,23. In addition, not all models are equally plausible, since some models have larger biases than others. In other words, the uncertainties may be too large when unrealistic models are included, while the range of models may be too narrow when uncertainties from processes that are missing or poorly represented are underestimated18. Therefore, bias correction of the raw model ensemble is necessary for projections at regional and local scales. Bias-corrected information on future changes in extreme temperature is also important for governments making adaptation and mitigation policies. However, to the best of our knowledge, few studies have focused on bias-corrected projections of temperature extremes across China under the CMIP6 scenarios and global warming levels.

In this study, we use the Equidistant Cumulative Distribution Functions (EDCDF) method to correct the outputs of 12 CMIP6 GCMs for the historical period and different SSP-RCP scenarios over the whole of China. Our main goals are: 1) to compare the performance of the original and bias-corrected CMIP6 models; 2) to project the changes in extreme temperature under different global warming scenarios; and 3) to assess the benefit for China if global warming is held at 1.5 °C rather than 2.0 °C.

Results

Model evaluation

Temporal evolution

Supplementary Figure 1 compares the temporal evolution of extreme temperature indices between the bias-corrected CMIP6 multi-model ensemble and the CN05 observations in the baseline period. The general trend of the temporal variation is well simulated by the bias-corrected models for most extreme temperature indices over China. Overall, the spread (gray region) of the ensemble covers the temporal evolution of CN05, and the indices simulated by the multi-model ensemble mean compare well with CN05. However, there are a few exceptions. For instance, the numbers of cold nights (TN10p) and frost days (FD) simulated by the ensemble are overestimated relative to CN05, which falls below the lower boundary (5th percentile) of the CMIP6 models in some years. In addition, because averaging cancels out most of the internal variability among ensemble members, the ensemble-mean curve is smoother than the CN05 observations, which show larger inter-annual variation.

Figure 1 shows the IVS scores, which are used to evaluate how well the interannual variability of the CMIP6 models (before and after bias correction) matches the observations in China. Except for the uncorrected TR index and the CNRM-ESM2-1 model, the IVS scores of most extreme temperature indices are below 1.0, indicating that the CMIP6 models show a reasonable ability to reproduce the observed interannual variability in temperature extremes in China. The bias-corrected models also have clearly smaller IVS scores than the original ones. For example, for the bias-corrected ensemble mean, the IVS scores of most extreme temperature indices are below 0.4, especially for TNx and FD. The percentile indices are relatively well simulated by both the raw and corrected GCMs, so the improvement from correction is less pronounced for them. The improvement in simulating inter-annual variation is apparent for some models. For example, the IVS values of the raw CNRM-ESM2-1, HadGEM3-GC31-LL and MIROC6 are larger than those of the other GCMs and are reduced after bias correction, suggesting that the corrected simulations clearly outperform the original simulations in capturing the interannual variability in China.

Fig. 1: The skill of uncorrected and bias-corrected CMIP6 models in simulating the interannual variability of extreme temperature indices during 1995–2014.

a shows the IVS of the ensemble mean for each extreme temperature index; b shows the IVS averaged over all extreme temperature indices for each GCM.

Spatial distribution

We estimated the spatial distribution of the multi-model ensemble mean bias of the 12 extreme temperature indices from the 12 CMIP6 GCMs against the CN05 observations (Fig. 2). For the four extremal indices, the raw CMIP6 models show a warm bias in simulating the warmest day (TXx) and night (TNx), whereas an obvious cold bias is found for the coldest day (TXn) and night (TNn) over the majority of China, especially in the west (exceeding 5 °C in magnitude). However, these biases are greatly reduced after bias correction for all four indices during the historical period, to a range of −1 to 1 °C.

Fig. 2: Multimodel ensemble mean bias in 12 ETCCDI temperature indices for the historical period (1995–2014).

“BEF” represents the bias before correction, and “AFT” is the bias after correction.

As with the extremal indices, the biases of the four absolute indices (SU, ID, TR and FD) are estimated from the original and corrected CMIP6 GCMs in the baseline period. Compared with CN05, the uncorrected models exhibit substantial positive biases in simulating icing days (ID), tropical nights (TR) and frost days (FD), while the number of summer days (SU) is underestimated over most regions of China except the Qinghai-Tibet Plateau. Again, these considerable biases are largely removed by the EDCDF bias correction. Although sporadic bias spots remain in the northwest, the biases are limited to ~2 days over the majority of China. However, frost days still appear to be overestimated in the west.

The improvement from bias correction is less apparent for the four percentile indices. Both the raw and corrected multi-model ensemble means compare well against the CN05 observations. The biases of warm and cold days (TX90p and TX10p) are smaller than those of the other two indices. Warm nights (TN90p) are underestimated across the west but slightly overestimated in the southeast of China. For cold nights (TN10p), obvious positive biases are found both before and after bias correction.

Supplementary Figure S2 presents the performance (RMSE’ and RMSEstd) of the individual bias-corrected CMIP6 models in simulating the 12 extreme temperature indices in the baseline period over China. For RMSE’, the blue-shaded columns with negative values signify better simulation skill. The grey-shaded columns on the right show the index-averaged standardized RMSE for each CMIP6 model. Smaller RMSE’ and RMSEstd values indicate better model performance. Overall, most bias-corrected models show relatively small biases, with RMSEstd values of about 0.5, while CanESM5, INM-CM4-8 and MIROC6 show larger overall errors in simulating the temperature extremes. Nevertheless, individual models differ in their skill across indices. For example, although the overall performance of CanESM2-1 is the worst among the 12 models, this model has the best skill in simulating TN10p. The results also show that the ensemble mean is clearly better than any individual model overall, which suggests that the uncertainties from internal variability and systematic errors in individual models can cancel each other out in the statistical mean13.

A comprehensive assessment of the models, combining IVS and RMSE, is shown in Fig. 3. Models located in the bottom-left quadrant perform well on both criteria, while those in the top-right quadrant perform worst. Before bias correction, the models are scattered across the top-left and top-right quadrants, especially HadGEM3-GC31-LL and MIROC6, while most models move into the best quadrant after bias correction. In other words, the improvement from bias correction is clearly visible.

Fig. 3: Scatter diagrams showing model performance based on IVS (x axis) and RMSEstd (y axis).

a and b are the raw and bias-corrected simulations, respectively. Each dot represents a model, identified by its number on the right, and the values are the mean of all indices. Models in the bottom-left quadrant perform well on both criteria.

Changes at specified warming levels

Determination of global warming periods

Supplementary Figure S3 shows the changes in global average surface temperature simulated by the 12 CMIP6 models relative to the pre-industrial period. Under the SSP245 scenario, the global mean temperature will increase by about 3.0 °C by the end of the twenty-first century. Warming is faster under SSP585, reaching approximately 5.1 °C by 2100. Both warming levels are reached earlier under SSP585 than under SSP245, especially the 2.0 °C level.

Based on this, we calculate the arrival time of the 1.5 °C and 2.0 °C warming levels for each GCM (Supplementary Table 3). For the CMIP6 ensemble mean under SSP245, the arrival year of 1.5 °C is 2030 (2021–2040) and that of 2.0 °C is 2046 (2037–2056). Under SSP585, the 1.5 °C and 2.0 °C levels are reached in 2026 and 2039, respectively. The arrival year of 1.5 °C under SSP585 is slightly earlier than under SSP245, while that of 2.0 °C is nearly 7 years earlier, indicating that the faster the warming rate, the earlier the 1.5 °C and 2.0 °C levels are reached. Notably, MIROC6 shows the latest arrival times among the 12 CMIP6 models, especially for the 2.0 °C level under SSP245. In contrast, the fastest-warming model is CanESM5, whose arrival year of 1.5 °C (2.0 °C) is 2013 (2024) under SSP245 and 2012 (2022) under SSP585. The global warming target periods of CanESM5 are thus much earlier than those projected by MIROC6.

Changes in the spatial distribution

We mapped the spatial distribution of changes in extreme temperature indices (Fig. 4, Supplementary Figures 4 and 5) and calculated the regional mean results (Table 1) across China at the 1.5 °C and 2.0 °C warming levels. For the extremal indices (TXx, TXn, TNx and TNn), a notable feature is that all indices are expected to increase over China, with larger changes at the higher warming level and under the higher emission scenario. The changes differ spatially. Specifically, for TXx, the change is larger in the north than in the south of China, particularly at the 2.0 °C warming level under the SSP585 scenario. The increase over the Tibetan Plateau is pronounced relative to other regions of China, exceeding 4 °C, suggesting the number of warm nights will increase in this area under climate change. Compared with the baseline period, the region-averaged TXx is likely to increase more than the other three extremal indices, by about 1.35 °C and 1.93 °C (1.50 °C and 2.11 °C) at the two global warming levels under SSP245 (SSP585), respectively. In contrast, the changes in TXn are relatively small, with increases of 0.81 °C and 1.56 °C (1.10 °C and 1.52 °C) under SSP245 (SSP585). In addition, the extreme night indices (TNx and TNn) show a large range of change, and some models (e.g., CNRM2) even project these indices to decrease at future warming levels, especially TNn, for which the sign of change differs from that of most models. This indicates that there are uncertainties among the CMIP6 ensemble models in projecting changes in the cold night indices.

Fig. 4: Changes in spatial distribution for four extremal indices compared to the conditions of the period 1995–2014.

The columns from left to right represent TXx, TXn, TNx and TNn. a–d Global warming level of 1.5 °C under SSP245. e–h Global warming level of 2.0 °C under SSP245. i–l Global warming level of 1.5 °C under SSP585. m–p Global warming level of 2.0 °C under SSP585. The light gray dots indicate areas where the change is statistically significant at the 5% level (Student's t-test).

Table 1 Changes in extreme temperature indices of averaged region under 1.5 °C and 2.0 °C global warming levels and SSP scenarios.

Overall, the warm indices, summer days (SU) and tropical nights (TR), will increase over China at the 1.5 °C and 2.0 °C warming levels relative to the baseline period of 1995–2014. Except in the west, these indices will increase markedly in most regions of China, by more than 18 days in places at the 2.0 °C warming level. The increase in SU is larger than that in TR. However, in some high-cold regions, such as the Tibetan Plateau, the changes in SU and TR are smaller owing to the cold climate: although temperature increases under global warming, the thresholds that define SU and TR may still be hard to reach13. In contrast, the cold indices (icing days (ID) and frost days (FD)) will decrease in the future. A large decrease (exceeding 20 days) in ID and FD will occur over the west of China. Moreover, the decrease is larger at 2.0 °C than at 1.5 °C global warming, and it appears to spread from the west into central and eastern China. In terms of the area mean, a higher emission scenario and warming level correspond to a larger magnitude of change in these frequency indices. For example, relative to the historical period, the regional mean SU over China will increase by 17.97 days at the 2.0 °C warming level under SSP585, ranging from 11.03 to 25.52 days. For the same index, the increase is 11.40 days at 1.5 °C under SSP245. However, the change in TR under SSP585 is smaller than that under SSP245. This exception arises because one model (IPSL-CM6A-LR) projects a sharp reduction in TR under SSP585, of −28.01 and −25.1 days relative to the baseline period at the 1.5 °C and 2.0 °C warming levels, respectively.

As shown in Supplementary Figure S5, warm days (TX90p) and warm nights (TN90p) will increase over the whole of China with global warming. Spatially, the increase is larger in the west than in the east of China, particularly at the higher warming level. In contrast, the two cold indices (TX10p and TN10p) will decrease slightly relative to the baseline period. The decrease is more pronounced under SSP585 than under SSP245. This response is especially strong in the central-western regions of China, where there will be fewer cold days and nights under global warming. We also calculated the regional mean changes of the four percentile indices at the different warming levels. Relative to the historical period, area-averaged TX90p and TN90p are expected to rise by ~5% and ~8% at the 1.5 °C and 2.0 °C levels, respectively. TX10p and TN10p change in the opposite direction to the warm indices, with decreases exceeding 2% and 3%, respectively. Notably, the decrease in TN10p is largest (~4.0%) at the 2.0 °C warming level under the SSP585 scenario.

Impact of additional 0.5 °C

In this section, we describe the incremental changes over China from the 1.5 °C to the 2.0 °C warming level. Figure 5 and Supplementary Figure 6 show the spatial distribution of the impact of the additional 0.5 °C on extreme temperature indices under the SSP245 and SSP585 scenarios. Note that we only plot regions where the change exceeds 25%, to highlight where the incremental impact on temperature extremes is larger than that on the global mean temperature. Under the additional 0.5 °C of global warming, that is, as the global mean temperature increases to 1.5 °C and then 2.0 °C above pre-industrial levels, most extreme indices are expected to increase proportionately more (exceeding 25%) during the final 0.5° than during the first 1.5° across most regions of China. However, the spatial distribution of the incremental changes differs among indices. For some warm indices, such as TXx, SU and TX90p, the largest incremental changes (from 1.5° to 2.0°) tend to occur in the southwest. This means that if the increase in global mean temperature is confined to 1.5 °C, the changes in these indices would be reduced by ~60% or more over the southwest. Likewise, for TXn and TNx, the northwest of China has the largest incremental changes, indicating that if global mean surface temperature is held at 1.5 °C rather than 2.0 °C, the changes in TXx and TNx would decrease by far more than 25%. In contrast, the incremental changes of TXx, SU and TX90p in the northeast, and of TNn, TR and TN10p over the high-cold Tibetan Plateau, are smaller under SSP245, suggesting that the additional 0.5 °C of warming would have little impact on these indices in these regions. Under SSP585, the incremental changes are similar to those under SSP245, but smaller in magnitude and spatial extent. In general, the potential risk of temperature extremes over China may be lower if the global mean temperature is held at 1.5 °C rather than 2.0 °C under both the SSP245 and SSP585 scenarios.
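
The exact formula for the incremental impact follows the cited references and is not reproduced in this section. As a hedged illustration only, the Python sketch below assumes that the incremental impact is the share of the change at the 2.0 °C level that arises during the final 0.5 °C, i.e. (change at 2.0 °C − change at 1.5 °C) / change at 2.0 °C. This assumption is consistent with the 25% reference value, since 0.5 °C is 25% of 2.0 °C, but the function name `incremental_impact` and the formula itself are illustrative rather than the authors' stated definition.

```python
def incremental_impact(change_15, change_20):
    """Percentage of the 2.0 degC change that occurs between 1.5 and 2.0 degC.

    Assumed definition (not stated explicitly in the text):
        100 * (change_20 - change_15) / change_20
    Both inputs are changes relative to the 1995-2014 baseline.
    """
    if change_20 == 0:
        raise ValueError("change at the 2.0 degC level must be non-zero")
    return 100.0 * (change_20 - change_15) / change_20


# Global mean temperature itself: 1.5 -> 2.0 degC gives the 25% reference value.
reference = incremental_impact(1.5, 2.0)

# Region-averaged TXx changes under SSP245 quoted in the text (1.35 and 1.93 degC).
txx_ssp245 = incremental_impact(1.35, 1.93)
```

Under this assumed definition, an index whose value exceeds the 25% reference responds proportionately more to the final 0.5 °C than the global mean temperature does.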

Fig. 5: The spatial distribution of incremental changes for 12 extreme temperature indices between the 1.5 °C and 2.0 °C global warming levels under the SSP245 scenario.

a–d The extremal indices: TXx, TXn, TNx and TNn. e–h The absolute indices: SU, ID, TR and FD. i–l The relative indices: TX90p, TX10p, TN90p and TN10p.

The region-averaged incremental changes of the 12 extreme temperature indices for the CMIP6 ensemble in China are further evaluated in Fig. 6. Despite uncertainties among models, the incremental changes of the multi-model means exceed 25% for all indices. In other words, China will consistently benefit from avoided impacts if global warming is limited to 1.5 °C rather than 2.0 °C. Consistent with the spatial results above, the incremental changes under SSP245 appear larger than those under SSP585. For example, the incremental changes of the multi-model mean TXn are approximately 49.46% under SSP245 and 30.11% under SSP585. For most indices, over half of the models show incremental changes exceeding 25%, indicating that the result is robust across models. It should also be noted that under SSP585 the spread among models is larger and some indices (e.g., TXx, TNx, TR and TX90p) have outliers in the bottom panel, implying that uncertainties remain in projecting changes in extreme temperature under the high emission scenario.

Fig. 6: The incremental changes in extreme temperature indices from 0.5 °C warming over China under SSP245 and SSP585 scenarios.

Box plots show the inter-model spread and the cross depicts the multi-model mean. The bottom dotted line indicates the 25% change from 2.0 °C to 1.5 °C in global mean temperature. The top panel shows the percentage of models whose incremental changes exceed 25%. The vertical lines extending beyond the boxes mark the minimum and maximum values for each temperature index, and the red points represent the outliers of the ensemble.

Discussion

In this study, the outputs of 12 CMIP6 GCMs, bias-corrected with the EDCDF method, are used to assess the skill of simulating 12 extreme temperature indices across China. Overall, bias correction clearly improves the CMIP6 simulations of extreme temperature indices over China, in terms of both temporal evolution and spatial distribution. The bias-corrected models have clearly smaller IVS scores than the raw ones, indicating that they have reasonable skill in reproducing the observed interannual variability in temperature extremes in China. Spatially, the positive biases in TXx, TNx, ID, TR and FD and the negative biases in TXn, TNn and SU from the original models are substantially reduced after bias correction for most regions of China. However, the corrected models still show some biases in simulating FD and TN10p in the west of China, especially over the Tibetan Plateau, where it is difficult to simulate the climatology of the high-cold region. Despite bias correction, these biases may stem from the coarse resolution of the original GCMs, because it is hard to simulate atmospheric processes appropriately over highly heterogeneous and complex terrain24,25.

Based on projections from the 12 bias-corrected CMIP6 GCMs, extreme temperatures will increase over most regions of China under the 1.5 °C and 2.0 °C warming levels compared with the historical period. A notable feature is that TXx, TXn, TNx and TNn are all expected to increase over China, meaning that both the maximum and minimum extreme temperatures will rise under climate change. Accordingly, summer days (SU), tropical nights (TR), warm days (TX90p) and warm nights (TN90p) will increase, whereas the four cold indices (ID, FD, TX10p and TN10p) are anticipated to decrease at the two future warming levels. Moreover, a higher emission scenario and warming level are likely to correspond to a larger magnitude of change in these frequency indices. These conclusions are in line with other studies8,26,27. In addition, our results show that, for a given warming level, the spatial distributions of the changes are similar under SSP245 and SSP585 for most temperature indices (Fig. 4, Supplementary Figures 4 and 5), though the region-averaged changes differ slightly. The pathway to 1.5° or 2.0° warming therefore appears to matter little.

Our results suggest that as the global mean temperature increases to 1.5 °C and then 2.0 °C above pre-industrial levels, most extreme indices are expected to increase proportionately more during the final 0.5° than during the first 1.5° across most regions of China. In other words, limiting the increase in global mean temperature to 1.5 °C rather than 2.0 °C relative to the pre-industrial level would reduce the risk of extreme temperatures for China. For some warm indices, such as TXx, SU and TX90p, the largest incremental changes (from 1.5° to 2.0°) tend to occur in the southwest (~60% or more). Under SSP585, the incremental changes are similar to those under SSP245, but smaller in magnitude and spatial extent.

Ensemble analysis over a period defined by a prescribed global warming level, rather than a fixed time period, can robustly identify the regional patterns of temperature change, because it removes some of the uncertainty related to the global models’ climate sensitivity28. However, the bias-corrected CMIP6 models still carry uncertainties, both in the baseline simulation and in the projections at the two global warming levels, and some models even show a change of opposite sign to most models (e.g., TR under SSP585). These uncertainties may result from the forcings, the magnitude of internal variability, the climate sensitivity and resolution of individual models, as well as the definition of warming timing24. In the historical simulation, part of the bias may also be due to uncertainties in the observations, especially over complex topography where meteorological stations are sparse29. For near-future projections, uncertainties may come principally from within the climate models, while uncertainties from climate scenarios (i.e., SSPs) mainly affect projections for the more distant future8. In addition, some uncertainties could be inherited from the biases in the baseline period30. Therefore, despite bias correction, the outputs from the CMIP6 models should be used with caution. Moreover, the warming scenarios in this study are based on transient simulations31,32, i.e., on the outputs of CMIP6 models, rather than on a near-equilibrium 1.5 °C or 2.0 °C warmer world as produced by coupled earth system models (e.g., the HAPPI model intercomparison project). Thus, more models and climate scenarios should be considered in projections of temperature extremes to reduce the uncertainties in further study.

Methods

Definition of extreme temperature indices

In this paper, 12 extreme temperature indices defined by the Expert Team on Climate Change Detection and Indices (ETCCDI)33 are used, as listed in Supplementary Table 1. These indices characterize extreme events in terms of intensity and frequency, and have been widely applied in the estimation and projection of temperature and precipitation extremes across different regions of the world33,34,35,36. The 12 indices comprise four annual extremal indices: the warmest day (TXx), the coldest day (TXn), the warmest night (TNx) and the coldest night (TNn); four absolute indices: summer days (SU), icing days (ID), tropical nights (TR) and frost days (FD); and four relative indices: warm days (TX90p), cold days (TX10p), warm nights (TN90p) and cold nights (TN10p). The four annual extremal indices indicate the intensity of extreme temperature, and the absolute and relative indices represent the frequency of extreme events.
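
To make the index definitions concrete, the following Python sketch computes the extremal and absolute indices for one year of daily data at a single grid cell, using the standard ETCCDI thresholds (SU: TX > 25 °C; ID: TX < 0 °C; TR: TN > 20 °C; FD: TN < 0 °C). It is an illustration rather than the code used in this study, and the function names are hypothetical. The percentile helper is deliberately simplified: it uses one threshold from a pooled base-period sample, whereas the ETCCDI definition uses calendar-day percentiles from a 5-day moving window.

```python
import numpy as np


def extremal_and_absolute_indices(tmax, tmin):
    """Eight annual indices from one year of daily TX and TN (degC)."""
    tmax = np.asarray(tmax, dtype=float)
    tmin = np.asarray(tmin, dtype=float)
    return {
        "TXx": float(tmax.max()),         # warmest day
        "TXn": float(tmax.min()),         # coldest day
        "TNx": float(tmin.max()),         # warmest night
        "TNn": float(tmin.min()),         # coldest night
        "SU": int((tmax > 25.0).sum()),   # summer days
        "ID": int((tmax < 0.0).sum()),    # icing days
        "TR": int((tmin > 20.0).sum()),   # tropical nights
        "FD": int((tmin < 0.0).sum()),    # frost days
    }


def percentile_index(daily, base_daily, q, above):
    """Percentage of days beyond the q-th percentile of a base-period sample.

    Simplification: a single pooled threshold instead of calendar-day thresholds.
    Use q=90, above=True for TX90p/TN90p and q=10, above=False for TX10p/TN10p.
    """
    threshold = np.percentile(np.asarray(base_daily, dtype=float), q)
    daily = np.asarray(daily, dtype=float)
    beyond = daily > threshold if above else daily < threshold
    return 100.0 * float(beyond.mean())


# Toy four-day "year" to show the counting logic.
indices = extremal_and_absolute_indices(
    tmax=[-2.0, 10.0, 26.0, 30.0],
    tmin=[-5.0, 1.0, 21.0, 19.0],
)
```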

Bias correction method

Before bias correction, the raw CMIP6 model outputs are regridded. First, the monthly mean observed meteorological variables (i.e., daily maximum and minimum temperature) in the historical period are interpolated to the resolution of each CMIP6 model using bilinear interpolation. Second, the differences between the interpolated observations and each GCM are calculated. Third, these anomaly fields are interpolated back to the original resolution and then added to the observed variables to generate the new output for each GCM.
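
The three regridding steps can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names (`bilinear`, `regrid_by_anomaly`), the toy grids, and the sign convention for the anomaly (model minus interpolated observation) are assumptions made for illustration. Bilinear interpolation on a rectilinear grid is written here as two passes of NumPy's one-dimensional `np.interp`, one along each axis.

```python
import numpy as np


def bilinear(src_lat, src_lon, field, dst_lat, dst_lon):
    """Bilinear interpolation on a rectilinear grid (coordinates increasing)."""
    # Pass 1: interpolate along longitude for every source latitude row.
    rows = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # Pass 2: interpolate along latitude for every destination longitude column.
    cols = np.array([np.interp(dst_lat, src_lat, col) for col in rows.T])
    return cols.T  # shape: (len(dst_lat), len(dst_lon))


def regrid_by_anomaly(obs_lat, obs_lon, obs, gcm_lat, gcm_lon, gcm):
    """Bring a coarse GCM field onto the observation grid via anomalies."""
    # Step 1: observation interpolated to the GCM resolution.
    obs_on_gcm = bilinear(obs_lat, obs_lon, obs, gcm_lat, gcm_lon)
    # Step 2: anomaly on the GCM grid (assumed sign: model minus observation).
    anomaly = gcm - obs_on_gcm
    # Step 3: anomaly interpolated to the observation grid, added to the observation.
    return obs + bilinear(gcm_lat, gcm_lon, anomaly, obs_lat, obs_lon)


# Toy example: a 1-degree "observation" grid and a 2-degree "GCM" grid.
obs_lat = np.arange(0.0, 5.0, 1.0)
obs_lon = np.arange(100.0, 105.0, 1.0)
gcm_lat = np.arange(0.0, 5.0, 2.0)
gcm_lon = np.arange(100.0, 105.0, 2.0)
obs = 20.0 - 0.5 * obs_lat[:, None] + 0.1 * (obs_lon[None, :] - 100.0)
# The toy GCM has the same spatial pattern plus a uniform 1.5 degC warm bias.
gcm = 21.5 - 0.5 * gcm_lat[:, None] + 0.1 * (gcm_lon[None, :] - 100.0)
gcm_on_obs_grid = regrid_by_anomaly(obs_lat, obs_lon, obs, gcm_lat, gcm_lon, gcm)
```

In this toy case the regridded model field keeps the fine-scale observed pattern and carries the uniform 1.5 °C model bias, which is the quantity the subsequent bias correction targets.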

The EDCDF method is employed in this study to correct the biases in the raw CMIP6 outputs. This method is based on the quantile mapping technique and assumes that, for a given percentile, the difference between the simulated and observed climate variables in the reference (historical) period is maintained during the correction (future) period29,37,38. Compared with conventional cumulative distribution function (CDF) matching, EDCDF compares the CDFs of simulation and observation over a given reference period and adjusts the full CDF of the model, rather than merely adjusting the mean and variance of the model output39,40,41. The approach is defined as follows:

$${\tilde{x}}_{f}={x}_{f}+{F}_{o}^{-1}({F}_{f}({x}_{f}))-{F}_{s}^{-1}({F}_{f}({x}_{f}))$$
(1)

where \({\tilde{x}}_{f}\) is the corrected daily minimum or maximum temperature from CMIP6 in the future period; \({x}_{f}\) is the raw projected variable; \({F}_{f}\) is the cumulative distribution function (CDF) of the model in the future period; and \({F}_{o}^{-1}\) and \({F}_{s}^{-1}\) are the quantile functions (inverse CDFs) of the observation and the simulation in the reference period, respectively.
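
Equation (1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the CDFs are taken to be empirical, the plotting position (rank − 0.5)/n for \({F}_{f}\) is one common choice, and the function name `edcdf_correct` is hypothetical.

```python
import numpy as np


def edcdf_correct(obs_ref, sim_ref, sim_fut):
    """Apply Eq. (1): x_f + F_o^{-1}(F_f(x_f)) - F_s^{-1}(F_f(x_f)).

    obs_ref : observations in the reference period
    sim_ref : model simulation in the reference period
    sim_fut : raw model projection to be corrected
    """
    obs_ref = np.asarray(obs_ref, dtype=float)
    sim_ref = np.asarray(sim_ref, dtype=float)
    sim_fut = np.asarray(sim_fut, dtype=float)

    # F_f(x_f): empirical non-exceedance probability of each projected value,
    # kept strictly inside (0, 1) by the (rank - 0.5) / n plotting position.
    ranks = np.searchsorted(np.sort(sim_fut), sim_fut, side="right")
    prob = (ranks - 0.5) / sim_fut.size

    # F_o^{-1} and F_s^{-1}: empirical quantile functions in the reference period.
    q_obs = np.quantile(obs_ref, prob)
    q_sim = np.quantile(sim_ref, prob)

    return sim_fut + q_obs - q_sim


# Toy example: the model is 2 degC too cold in the reference period
# and projects a uniform 3 degC warming.
obs_ref = np.linspace(0.0, 30.0, 301)
sim_ref = obs_ref - 2.0
sim_fut = sim_ref + 3.0
corrected = edcdf_correct(obs_ref, sim_ref, sim_fut)
```

In this toy case the correction removes the 2 °C cold bias at every quantile while leaving the projected 3 °C warming signal intact, which is the equidistance assumption in its simplest form.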

Timing of global warming levels

The 1.5 °C and 2.0 °C warming levels proposed in the 2015 Paris Agreement are based on the increase in global annual mean temperature relative to the pre-industrial level (1850–1900). To reduce the uncertainties related to the large interannual variability in defining the warming levels, the time series of global annual mean temperature for each CMIP6 model is first smoothed using a triangular moving average with a 20-year window42. For each GCM, the arrival year is defined as the first year in which the increase in smoothed global mean temperature reaches 1.5 °C or 2.0 °C above the pre-industrial level. A 20-year period extending from 9 years before to 10 years after the arrival year is then taken as the future projection period for each warming level. Likewise, a period of the same length (1995–2014) is taken as the historical, or baseline, period for validating the performance of the CMIP6 models. On this basis, we followed Tang et al.43 and Kim and Bae44 to define the incremental impact on temperature extremes from the 1.5 °C to the 2.0 °C warming level.
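
The timing procedure can be sketched as follows. This is an illustration under assumptions, not the code used in this study: the exact triangular weights are not given in the text, so the sketch uses symmetric weights 1, 2, …, 10, 10, …, 2, 1 over the window from 9 years before to 10 years after each year, and the function names and the synthetic temperature series are hypothetical.

```python
import numpy as np


def anomaly_from_preindustrial(years, temps, start=1850, end=1900):
    """Global annual mean temperature relative to the pre-industrial mean."""
    years = np.asarray(years)
    temps = np.asarray(temps, dtype=float)
    base = temps[(years >= start) & (years <= end)].mean()
    return temps - base


def first_crossing_year(years, anomaly, level, window=20):
    """First year whose triangularly smoothed anomaly reaches `level`."""
    years = np.asarray(years)
    anomaly = np.asarray(anomaly, dtype=float)
    half = window // 2
    # Assumed triangular weights: 1, 2, ..., half, half, ..., 2, 1 (normalized).
    weights = np.minimum(np.arange(1, window + 1), np.arange(window, 0, -1))
    weights = weights / weights.sum()
    smooth = np.convolve(anomaly, weights, mode="valid")
    # Each window spans (half - 1) years before to `half` years after its year.
    centre_years = years[half - 1 : half - 1 + smooth.size]
    hits = np.nonzero(smooth >= level)[0]
    return int(centre_years[hits[0]]) if hits.size else None


def projection_window(arrival_year):
    """20-year period from 9 years before to 10 years after the arrival year."""
    return arrival_year - 9, arrival_year + 10


# Synthetic series: flat until 1970, then warming by 0.025 degC per year.
years = np.arange(1850, 2101)
temps = 13.5 + 0.025 * np.maximum(years - 1970, 0)
anom = anomaly_from_preindustrial(years, temps)
year_15 = first_crossing_year(years, anom, 1.5)
year_20 = first_crossing_year(years, anom, 2.0)
window_15 = projection_window(year_15)
```

Applied to each GCM's own series, the arrival year differs from model to model, so the 20-year projection windows are model-specific rather than a fixed calendar period.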

Model performance metrics

To quantify the spatial agreement between the GCM simulations and the observations, the root-mean-square error (RMSE) and relative RMSE are computed for the baseline period (1995–2014), defined as:

$$RMSE=\sqrt{\frac{\sum {({X}_{m}-{X}_{o})}^{2}}{n}}$$
(2)
$$RMSE{\prime} =\frac{RMSE-RMS{E}_{m}}{RMS{E}_{m}}$$
(3)

where \({X}_{m}\) and \({X}_{o}\) are the simulated and observed values; n is the number of grid cells; \(RMS{E}_{m}\) is the median RMSE across all CMIP6 models; and \(RMSE{\prime}\) is the relative performance of each GCM. A negative \(RMSE{\prime}\) indicates that the model performs better than half of all models, and vice versa13. However, \(RMSE{\prime}\) does not reflect the magnitude of the errors between a GCM and the observations. Thus, the standardized RMSE (\(RMS{E}_{std}\)) is also used in this study. It is obtained by dividing the RMSE by the standard deviation of the observed extreme temperature index, as follows:

$$RMS{E}_{{\rm{std}}}=\frac{RMSE}{\sqrt{\frac{1}{n}\sum {({X}_{o}-\overline{{X}_{o}})}^{2}}}$$
(4)
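
Equations (2) to (4) translate directly into NumPy. The sketch below is illustrative only; the function names and the three toy "models" are hypothetical, and `np.std` is used with its default 1/n normalization to match the denominator of Eq. (4).

```python
import numpy as np


def rmse(sim, obs):
    """Eq. (2): root-mean-square error over all grid cells."""
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)))


def relative_rmse(rmse_by_model):
    """Eq. (3): RMSE of each model relative to the multi-model median."""
    median = float(np.median(list(rmse_by_model.values())))
    return {name: (value - median) / median for name, value in rmse_by_model.items()}


def standardized_rmse(sim, obs):
    """Eq. (4): RMSE divided by the standard deviation of the observation."""
    obs = np.asarray(obs, dtype=float)
    return rmse(sim, obs) / float(np.std(obs))  # np.std uses 1/n by default


# Toy example with three hypothetical models that differ only by a constant bias.
obs = np.array([0.0, 2.0, 4.0, 6.0])
sims = {"model_a": obs + 1.0, "model_b": obs + 2.0, "model_c": obs + 3.0}
rmse_by_model = {name: rmse(sim, obs) for name, sim in sims.items()}
rel = relative_rmse(rmse_by_model)
std_a = standardized_rmse(sims["model_a"], obs)
```

Here `model_a` has a negative relative RMSE because its error is below the ensemble median, while `model_c` has a positive one.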

In addition, we applied an inter-annual variability skill score (IVS) to evaluate the ability of CMIP6 in simulating the inter-annual variation45, defined as follows:

$$IVS={\left(\frac{{\sigma }_{m}}{{\sigma }_{o}}-\frac{{\sigma }_{o}}{{\sigma }_{m}}\right)}^{2}$$
(5)

where \({\sigma }_{m}\) and \({\sigma }_{o}\) are the interannual standard deviations of the model simulation and the observation, respectively. A smaller IVS value means that the simulated inter-annual variation is closer to the observed one; IVS equals zero when the two standard deviations are identical.
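
As a brief illustration of Eq. (5), the following Python sketch computes the IVS from two annual series. The function name `ivs` and the toy series are hypothetical, and no detrending is applied because none is described here.

```python
import numpy as np


def ivs(sim_annual, obs_annual):
    """Eq. (5): interannual variability skill score of a model series."""
    sigma_m = float(np.std(sim_annual))
    sigma_o = float(np.std(obs_annual))
    return (sigma_m / sigma_o - sigma_o / sigma_m) ** 2


# Toy annual index series: the model has twice the observed variability.
obs_annual = np.array([0.0, 1.0, 0.0, 1.0])
sim_annual = 2.0 * obs_annual
score = ivs(sim_annual, obs_annual)       # (2 - 0.5)^2
perfect = ivs(obs_annual, obs_annual)     # identical variability
```

Because the score is symmetric in the ratio of the two standard deviations, a model with half the observed variability is penalized exactly as much as one with twice the observed variability.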

The observed daily gridded dataset CN05.1 (hereinafter referred to as CN05), at 0.25° resolution over China for the period 1961–2014, was obtained from the China Meteorological Administration. CN05 is based on meteorological observations from more than 2000 stations in China46 and has been widely used in hydroclimatic applications47,48. Here, we use it to validate the performance of the models before and after bias correction. We also obtained daily maximum and minimum temperatures for the historical, SSP2-RCP4.5 and SSP5-RCP8.5 scenarios (hereinafter referred to as SSP245 and SSP585) from 12 CMIP6 GCMs (Supplementary Table 2) through the Earth System Grid Federation (ESGF) Peer-to-Peer (P2P) distributed data portal (https://esgf-node.llnl.gov/search/cmip6/). Because the spatial resolutions of these models differ from that of the observations, the model outputs were bilinearly interpolated to match the reference data.