Development of global monthly dataset of CMIP6 climate variables for estimating evapotranspiration

Reliable projection of evapotranspiration (ET) is important for planning sustainable agricultural water management under climate change. A global dataset of monthly climate variables was generated to estimate potential ET (PET) using 14 General Circulation Models (GCMs) for the four main shared socioeconomic pathways (SSPs). The dataset has a spatial resolution of 0.5° × 0.5°, covers the period from 1950 to 2100, and can be used to estimate historical and future PET with the Penman-Monteith method. It can also be applied to other PET estimation methods based on climate variables. This paper shows that the dataset generated to estimate future PET reflects the greenhouse gas concentration levels of the SSP scenarios across latitude bands. The dataset can therefore help users select appropriate GCMs for estimating reasonable PET and determine bias correction methods to reduce the bias between observations and models based on the scale of the climate variables in each GCM.


Background & Summary
Evapotranspiration (ET) is an important component of the water cycle and plays a major role in agriculture and water management 1. Earth systems have shown robust change signals in climate variables since the 20th century 2. The climate crisis now facing humanity is mainly due to changes in the water cycle caused by changing patterns of precipitation, temperature, and surface runoff 3. Furthermore, the changes in the hydrological cycle due to an increase in atmospheric water vapor content are related to extreme changes in the various factors involved in the general circulation 4. Numerous studies have characterized regional and continental scale surface water losses to the atmosphere by estimating potential ET (PET). The Food and Agriculture Organization (FAO) of the United Nations has recommended the Penman-Monteith (PM) method for estimating potential evapotranspiration 5, and some studies have used it as a reference model for other methods that require less input data [6][7][8][9]. However, PM requires many climate variables to estimate PET. Therefore, only climate models that simulate sufficient climate variables can be used to estimate PET.
General circulation models (GCMs) have been widely used to estimate PET because many models simulate several climate variables for historical and future periods [10][11][12][13][14][15][16][17]. GCMs are continuously being updated with new physical processes and biochemical cycles and with simulations at higher spatial resolutions. However, CMIP6 GCMs have issues with Equilibrium Climate Sensitivity (ECS) 18. Some studies have provided scientific evidence for the high ECS of CMIP6, claiming that CMIP6 GCMs are more sensitive to greenhouse gases and exhibit stronger temperature increases during the 21st century than previous models because of enhanced cloud feedback 19. Many studies have compared how well CMIP6 GCMs reproduce the past climate relative to their previous versions using various evaluation metrics and have shown that they perform better than the earlier versions [20][21][22][23][24][25][26]. These advances in model performance have improved the scenarios available for future climate projection. Furthermore, the Shared Socioeconomic Pathways (SSPs), which represent the future greenhouse gas concentrations of CMIP6, incorporate future mitigation and adaptation efforts on climate change as well as future social and economic changes, based on the radiative forcing levels of the Representative Concentration Pathways (RCPs) 27. Therefore, SSP scenarios are an excellent source for exploring climate change under realistic future climates.
Climate impact assessment using the newly presented SSPs is documented in the IPCC's Sixth Assessment Report. These concepts were provided in ScenarioMIP, an improved version of the integrated assessment models (IAMs) based on changes from anthropogenic causes, such as land use, greenhouse gases, and aerosols 27,28. Furthermore, these scenario concepts are also expected to reduce the knowledge gap regarding radiative forcing and temperature overshoots in the future.
This study generates a raw dataset of monthly climate variables (average temperature, minimum temperature, maximum temperature, wind speed, relative humidity, and solar radiation) to estimate the global PET of 14 CMIP6 GCMs for the four main SSPs (Fig. 1 presents the six climate variables of the ACCESS-CM2 model as an example). Global climate data were re-gridded to 0.5° spatial resolution using linear interpolation. The re-gridded CMIP6 GCM dataset is freely available online in NetCDF format 29. The generated climate variables can be used to estimate global PET using Python code available in the 'pyeto' package 30. The PET code is also provided as a 'py' file 29. The Penman-Monteith (PM) method used in this study requires location (latitude and longitude), temperature, humidity, radiation, and wind speed to estimate evapotranspiration. In particular, location information is needed to adjust some weather parameters for the local average value of atmospheric pressure and to compute extraterrestrial radiation and daylight hours. Latitude is directly involved in the calculation of extraterrestrial radiation and daylight hours. A detailed description of the PM computation can be found under 'pyeto' in the Python archive. The projection period was divided into the near (2031-2065) and far (2066-2100) futures, and PET changes for both futures relative to the base period (1985-2014) were calculated. The future evapotranspiration projected with the Python code and the dataset increased in most scenarios. These results indicate that the thermal energy available to the future water cycle increases as the greenhouse gas concentration increases.
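The re-gridding step described above can be illustrated with a minimal sketch of bilinear (two-dimensional linear) interpolation onto a 0.5° grid. This is not the authors' processing code; the function names `bilinear` and `regrid` are hypothetical, the source grid is assumed to be regular with ascending coordinates, and longitude wrap-around and spherical geometry are ignored for brevity.

```python
from bisect import bisect_right


def bilinear(lats, lons, field, lat, lon):
    """Interpolate field[i][j], defined on ascending lats[i] and lons[j], to (lat, lon)."""
    # Index of the lower-left corner of the source cell that encloses the target point.
    i = min(max(bisect_right(lats, lat) - 1, 0), len(lats) - 2)
    j = min(max(bisect_right(lons, lon) - 1, 0), len(lons) - 2)
    # Fractional position inside the cell, clamped so that edge points are not extrapolated.
    ty = min(max((lat - lats[i]) / (lats[i + 1] - lats[i]), 0.0), 1.0)
    tx = min(max((lon - lons[j]) / (lons[j + 1] - lons[j]), 0.0), 1.0)
    # Interpolate along longitude on both cell edges, then along latitude.
    south = field[i][j] * (1 - tx) + field[i][j + 1] * tx
    north = field[i + 1][j] * (1 - tx) + field[i + 1][j + 1] * tx
    return south * (1 - ty) + north * ty


def regrid(lats, lons, field, step=0.5):
    """Resample a field to a regular grid with the given spacing in degrees."""
    def axis(values):
        # Number of target points that fit between the first and last source coordinate.
        count = int((values[-1] - values[0]) / step + 1e-9) + 1
        return [values[0] + k * step for k in range(count)]

    new_lats, new_lons = axis(lats), axis(lons)
    grid = [[bilinear(lats, lons, field, la, lo) for lo in new_lons] for la in new_lats]
    return new_lats, new_lons, grid
```

For real NetCDF files a gridded-data library would normally handle this step; the sketch only shows the arithmetic behind linear interpolation between neighboring grid cells.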

Methods
Data. The dataset generated in this study provides six climate variables (relative humidity: Hurs; maximum temperature: tasmax; minimum temperature: tasmin; average temperature: tas; solar radiation: rsds; wind speed: sfcWind) of 14 CMIP6 GCMs to estimate PET over the globe, as shown in Table 1. The raw data for the CMIP6 GCMs used in this study were collected from the CMIP6 archive 31. The future climate variables for four SSPs (SSP1-2.6, SSP2-4.5, SSP3-7.0, and SSP5-8.5) were used. The GCM simulations for the historical and future periods were re-gridded to 0.5° × 0.5° resolution using linear interpolation. Users can consult a metadata summary provided in xlsx file format 27. The variant label of the dataset was r1i1p1f1.
Potential evapotranspiration estimation method. Historical and future ET can be estimated using Python code that we developed based on the Penman-Monteith (PM) method. The PM method proposed by Allen et al. 5 was used to estimate monthly global PET from the six climate variables of the CMIP6 GCMs. PM represents a standard value of PET. It can be calculated using Eq. (1):

$$\mathrm{PET} = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T_{av} + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \tag{1}$$

where PET is the monthly PM, $R_n$ is the net radiation at the crop surface, $T_{av}$ is the monthly mean temperature at a 2-m height (°C), $u_2$ is the average monthly wind speed at a 2-m height (m/s), $e_s$ is the saturation vapor pressure (kPa), $e_a$ is the actual vapor pressure (kPa), $\Delta$ is the slope of the saturation vapor pressure versus temperature curve, $G$ is the soil heat flux density (MJ m⁻² month⁻¹), and $\gamma$ is the psychrometric constant.
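The Penman-Monteith calculation described above can be sketched in plain Python. This is a minimal illustration of the FAO-56 reference form of the equation, not the code distributed with the dataset and not the 'pyeto' API; the function names and the input values are assumptions chosen for illustration only. The FAO-56 form returns a daily rate (mm/day), so the conversion to a monthly total by multiplying by the number of days in the month is likewise an assumption about how a monthly value could be obtained.

```python
import math


def saturation_vapor_pressure(t_c):
    """Saturation vapor pressure (kPa) at air temperature t_c (deg C), FAO-56 form."""
    return 0.6108 * math.exp(17.27 * t_c / (t_c + 237.3))


def slope_of_svp_curve(t_c):
    """Slope of the saturation vapor pressure curve, Delta (kPa per deg C)."""
    return 4098.0 * saturation_vapor_pressure(t_c) / (t_c + 237.3) ** 2


def penman_monteith(rn, g, t_av, u2, es, ea, gamma):
    """FAO-56 Penman-Monteith reference evapotranspiration (mm/day).

    rn, g  : net radiation and soil heat flux density (MJ m-2 day-1)
    t_av   : mean air temperature at 2 m (deg C)
    u2     : wind speed at 2 m (m/s)
    es, ea : saturation and actual vapor pressure (kPa)
    gamma  : psychrometric constant (kPa per deg C)
    """
    delta = slope_of_svp_curve(t_av)
    radiation_term = 0.408 * delta * (rn - g)
    aerodynamic_term = gamma * 900.0 / (t_av + 273.0) * u2 * (es - ea)
    return (radiation_term + aerodynamic_term) / (delta + gamma * (1.0 + 0.34 * u2))


# Illustrative inputs only (not taken from the dataset).
t_max, t_min = 21.5, 12.3
t_av = (t_max + t_min) / 2.0
# Mean saturation vapor pressure from the daily temperature extremes.
es = (saturation_vapor_pressure(t_max) + saturation_vapor_pressure(t_min)) / 2.0
pet_daily = penman_monteith(rn=13.28, g=0.0, t_av=t_av, u2=2.078,
                            es=es, ea=1.409, gamma=0.0666)
# Assumed conversion to a monthly total for a 31-day month.
pet_monthly = pet_daily * 31
```

In practice, the actual vapor pressure would be derived from relative humidity and the net radiation from solar radiation and latitude; those steps are omitted here to keep the sketch focused on Eq. (1).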

Data Records
The six climate variables of the CMIP6 GCM dataset for estimating potential evapotranspiration are available as NC-formatted files and can be freely downloaded from the repository 29. Furthermore, a metadata summary of the available CMIP6 GCM dataset is provided as an xlsx-formatted file in the repository 29. Table 2 presents the information for each latitude band in the CMIP6 GCMs. The climate variables in the data archive were re-gridded to a resolution of 0.5° × 0.5°. The global latitudes were separated into six bands (e.g., L1: 0° to 29.5° and 0° to −29.5°; L2: 30° to 59.5° and −30° to −59.5°; L3: 60° to 90°). The six climate variables for the historical period span from 1950 to 2014. The projected climate variables were divided into the near (2031-2065) and the far (2066-2100) futures. The PM ETP for the historical and future periods was estimated using the six climate variables.
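The latitude bands described above can be expressed as a small lookup function. This is a minimal sketch: the function name `latitude_band` is hypothetical, the labels follow the NL/SL naming used in the Technical Validation section, and the assignment of the equator to the northern band is an assumption because the source does not state it.

```python
def latitude_band(lat):
    """Return the band label (NL1-NL3 or SL1-SL3) for a latitude in degrees."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    # Assumption: the equator (0 degrees) is counted in the Northern Hemisphere.
    hemisphere = "N" if lat >= 0 else "S"
    magnitude = abs(lat)
    if magnitude < 30.0:
        level = 1  # low latitudes: 0 to 29.5 degrees on the 0.5-degree grid
    elif magnitude < 60.0:
        level = 2  # mid-latitudes: 30 to 59.5 degrees
    else:
        level = 3  # high latitudes: 60 to 90 degrees
    return f"{hemisphere}L{level}"
```

For example, `latitude_band(-45.0)` gives `"SL2"`, so grid cells can be grouped by band before computing band-level statistics.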

Technical Validation
The PM ETP dataset generated in this study was validated for similarity with Earth's climate variables in latitude bands depending on the radiative forcing levels of the SSPs. First, this study confirms the similarity in climate variables between the 14 CMIP6 GCMs and Earth using statistical metrics, and validates the historical evapotranspiration estimated using the PM method based on these climate variables. Second, the projected climate variables for the future period were compared to confirm the performance of the projection models, and we verified that the projected future climate variables adequately reflected the greenhouse gas concentration levels of the SSP scenarios. Finally, we estimated the future evapotranspiration using PM based on the climate variables of the SSPs, and calculated the changes in PM ETP for the near and far futures relative to the historical period to confirm the relative changes based on the greenhouse gas concentrations of the SSPs.

Validation of global climate variables and PM in the historical period.
This study used five statistical metrics (maximum, minimum, median, standard deviation, and interquartile range) to compare the range of climate variables in the GCMs across latitude bands. Figure 2 presents the statistical performance of the climate variables of the 14 CMIP6 GCMs in simulating the historical climate for the five latitude bands. The relative humidity in SL1 and NL1 (low latitudes in both hemispheres) was low compared to the other latitudes, whereas the relative humidity in NL3 was the highest. The maximum, minimum, and median relative humidity were the lowest at NL1, whereas NL2 and NL3 showed the opposite results. The standard deviation and interquartile range (IQR) were the highest at SL1 and lower at NL3 than at the other latitudes. The maximum, minimum, and median wind speed were the lowest at SL1, whereas the wind speed in SL2 was higher than at the other latitudes. The variability of wind speed was also the highest in SL2. The maximum, minimum, and median solar radiation were the highest at the low latitudes of both hemispheres (SL1 and NL1) and the lowest in NL3. Furthermore, the standard deviation and IQR of solar radiation were larger at low latitudes than at mid-latitudes in both hemispheres. The maximum, minimum, and average temperatures were similar across the 14 CMIP6 GCMs, with no significant differences between the GCMs. Temperatures were the highest at the low latitudes of both hemispheres (SL1 and NL1) and the lowest in NL3. The standard deviation and IQR of temperatures at NH were also the largest compared to the other latitudes.
Table 3 shows the range of climate variables for the 14 CMIP6 GCMs in the historical period. Overall, the variability of historical maximum temperature was the largest for FGOALS-g3 and the smallest for IPSL-CM6A-LR. In contrast, the variability of minimum temperature was the smallest in MRI-ESM2-0 and the largest in FGOALS-g3. For average temperature, ACCESS-CM2 had the highest variability, while MRI-ESM2-0 had the lowest. Notably, the relative humidity of CAS-ESM2-0 was anomalously larger than that of the other GCMs, with an upper bound of 2628.1%, suggesting that it is unreasonable to use the climate variables of CAS-ESM2-0 to estimate historical PM ETP. Therefore, bias correction must be performed before estimating PM ETP using CAS-ESM2-0.
This study developed Python code to estimate a sample of historical monthly PM ETP using the climate variables of ACCESS-CM2. Figure 3 presents the spatial and temporal variation of annual PM ETP for ACCESS-CM2 in the historical period. Furthermore, Table 4 presents the historical PM ETP ranges (upper, lower, and median) for the 14 CMIP6 GCMs. The PM ETP was generally high in almost all areas of NL1 and SL1 and in some areas of SL2. In contrast, the PM ETPs at NL2 and NL3 were estimated at 3.9-67.0 mm and 2.1-24.2 mm for the historical period, respectively. The PM ETPs at SL2, NL2, and NL3 showed a gradual increase, while there was no significant change at SL1 and NL1. Consequently, the dataset of climate variables generated in this study provides a reasonable estimate of ET over the globe.
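The five metrics used in this subsection to compare climate variables across latitude bands (maximum, minimum, median, standard deviation, and IQR) can be computed with the Python standard library. This is a minimal sketch with a hypothetical function name; the source does not state whether the population or sample standard deviation was used or which quartile convention was applied, so the population form and the inclusive quartile method are assumptions.

```python
import statistics


def summary_metrics(values):
    """Return the five summary metrics for a sequence of numbers."""
    # Quartiles with the inclusive method (assumed convention).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "maximum": max(values),
        "minimum": min(values),
        "median": statistics.median(values),
        # Population standard deviation (assumed; the sample form is statistics.stdev).
        "std": statistics.pstdev(values),
        "iqr": q3 - q1,
    }
```

Applying `summary_metrics` to the values of one variable within one latitude band for each GCM yields the kind of per-band, per-model comparison summarized in Fig. 2.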
For NL1, the historical PM ETP estimated from CanESM5 had the largest difference between the lower and upper bounds, while that estimated from INM-CM5-0 had the smallest. In NL2, the historical PM ETP estimated from INM-CM4-8 had the largest difference between the lower and upper bounds, followed by INM-CM5-0, while that estimated from GFDL-ESM4 had the smallest. For NL3, the difference between the lower and upper bounds was 2 to 3 mm for most GCMs; the exception was CMCC-ESM2, whose difference of 4.9 mm was the largest. For SL1, the difference in historical PM ETP between the lower and upper bounds was the largest for CanESM5 at 12.1 mm and the smallest for INM-CM5-0 at 4.7 mm. For SL2, the difference was the largest for MPI-ESM1-2-HR at 5.5 mm and the smallest for ACCESS-ESM1-5. Notably, the PM ETPs estimated from CAS-ESM2-0 did not follow the latitudinal pattern observed on Earth and were poorly estimated at all latitudes. Therefore, bias correction is recommended when estimating PM ETPs from the climate variables of CAS-ESM2-0.
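A simple plausibility screen can identify models, such as CAS-ESM2-0 above, whose relative humidity needs bias correction before PET is estimated. This is a minimal sketch, not part of the published workflow: the function name is hypothetical, the 100% threshold is an assumption (model output can exceed it slightly), and the value for the second model in the usage line is a placeholder rather than a value from Table 3.

```python
def flag_implausible_humidity(upper_bounds, limit=100.0):
    """Return the sorted names of models whose relative humidity upper bound exceeds limit (%)."""
    return sorted(name for name, rh in upper_bounds.items() if rh > limit)


# 2628.1% is the CAS-ESM2-0 upper bound reported in the text;
# "MODEL-A" and its value are placeholders for illustration only.
flagged = flag_implausible_humidity({"CAS-ESM2-0": 2628.1, "MODEL-A": 99.0})
```

Models returned by the screen would then be candidates for bias correction or exclusion, depending on the application.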

Validation of projected global climate variables and PET for the future period.
This study compared the six projected climate variables of the four SSPs in the future period (2015-2100). Figure 4 presents the climate variables for each SSP scenario generated by equally weighting the 14 CMIP6 GCMs (a weight of 0.071 each). The projected solar radiation increased at most latitudes of the SH. In contrast, at most latitudes in the NH, the projected solar radiation decreased more for SSP scenarios with higher greenhouse gas concentrations than for those with lower concentrations. The projected wind speed for the four SSPs at the mid-latitudes of both hemispheres (NL2 and SL2) was lower in the far future than in the near future. However, the projected wind speed at low and high latitudes of both hemispheres was higher for the high-emission scenarios than for the low-emission scenarios. The relative humidity was projected to decrease at all latitudes for all scenarios. In particular, the wind speed in the NH decreased the most. Likewise, the decreasing relative humidity signals were lower for the high-emission scenarios than for the low-emission scenarios. All temperatures increased in the future at all latitudes for all SSPs. The increase was the largest for SSP5-8.5 at all latitudes. The trends in climate variables for different SSPs reflect the greenhouse gas emission levels considered in developing the SSPs.
Supplementary Table S1 presents the upper and lower bounds for the future climate variables of the SSP scenarios. For relative humidity, most GCMs show a gradual decrease with higher greenhouse gas concentrations. In particular, the difference between the upper and lower bounds of relative humidity was the largest in CAS-ESM2-0. Furthermore, the maximum relative humidity of CAS-ESM2-0 was above 200%, which differed markedly from the relative humidity projected by the other GCMs. MIROC6 projected higher maximum, minimum, and average temperatures than the other GCMs. Moreover, the variability of projected future average temperature was the largest in CanESM5, while the variability of ACCESS-CM2 and ACCESS-ESM1-5 was the smallest. Meanwhile, the variability of maximum temperature for all scenarios was the smallest in FGOALS-g3 and MIROC6. The projected minimum temperature for the future was the highest in MIROC6. The variability of minimum temperature was the smallest in ACCESS-ESM1-5 and the largest in CanESM5. The variability of solar radiation was the smallest for ACCESS-CM2 and the largest for MPI-ESM1-2-LR. Notably, MPI-ESM1-2-LR had the largest insolation across all scenarios. For wind speed, FGOALS-g3 had the highest variability, while ACCESS-ESM1-5 had the lowest.
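The equally weighted multi-model ensemble mentioned above (each of the 14 GCMs receiving a weight of 1/14, about 0.071) can be sketched as follows. The function name and the input layout, a dictionary mapping each model name to a time series of equal length, are assumptions made for illustration.

```python
def equal_weight_ensemble(model_series):
    """Combine per-model time series into one ensemble series using equal weights."""
    names = list(model_series)
    if not names:
        raise ValueError("at least one model is required")
    weight = 1.0 / len(names)  # 1/14 (about 0.071) when 14 GCMs are supplied
    length = len(model_series[names[0]])
    if any(len(model_series[name]) != length for name in names):
        raise ValueError("all model series must have the same length")
    # Weighted sum across models at each time step.
    return [sum(weight * model_series[name][t] for name in names) for t in range(length)]
```

With equal weights this is simply the arithmetic mean across models at each time step; writing the weight explicitly makes it straightforward to substitute performance-based weights later.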
This study used the SSP scenario dataset to project future PM ETP, as shown in Fig. 5. The projected PM ETP showed a gradual increase in all SSPs. The projected PM ETP increased more steeply under high greenhouse gas concentrations than under low greenhouse gas concentrations. Furthermore, the upper and lower bounds of projected PM ETP in NL2 and NL3 were the highest in SSP5-8.5 and the lowest in SSP1-2.6. The upper bound of projected PM ETP at all latitudes of the SH was the highest in SSP5-8.5. However, the lower bound of projected PM ETP was similar across most scenarios in the far future.
Table 5 presents the projected future annual PM ETP ranges for each scenario. Overall, the upper bound of the projected PM ETP was higher in the high greenhouse gas concentration scenarios than in the low greenhouse gas concentration scenarios. In contrast, the lower bound of PM ETP showed the opposite results. For PM ETP estimated from the low-emission scenarios, the difference between the upper and lower bounds was the greatest in ACCESS-CM2 and the smallest in GFDL-ESM4. For the high-emission scenarios, the difference in PM ETP between the upper and lower bounds was the greatest in ACCESS-CM2 and the lowest in CMCC-ESM2. Notably, the PM ETP estimated from CAS-ESM2-0 was unusually low compared to the other GCMs, which suggests that the relative humidity projected by CAS-ESM2-0 is unusually high compared to the other GCMs.
Projected changes in annual and seasonal potential evapotranspiration. The spatially interpolated changes in annual PM ETP for the different SSP scenarios are shown in Fig. 6. Overall, PM ETP at NL2 increased in the near future relative to the historical period for all scenarios. Furthermore, the increase was larger in the far future than in the near future. In particular, the change in PM ETP at all latitudes was higher for the high-emission scenarios than for the low-emission scenarios. Therefore, the PM ETP changes at all latitudes were aligned with the emission levels of the SSPs. Supplementary Table S2 shows the range of change in annual and seasonal PM ETP for the four SSP scenarios. For SSP1-2.6, the annual and seasonal PM ETP of CMCC-ESM2 had the largest variability, while the PM ETP of CAS-ESM2-0 decreased relative to the historical period. The annual PM ETP of INM-CM4-8 had the smallest variability, and the variability of seasonal PM ETP was the smallest in INM-CM4-8 (winter), FGOALS-g3 (spring), and MPI-ESM1-2-LR (fall), respectively. For SSP2-4.5, the variability of annual and seasonal PM ETP was the smallest for FGOALS-g3 and the largest for CMCC-ESM2. Meanwhile, the variability of annual and seasonal PM ETP estimated for SSP3-7.0 was the smallest for INM-CM4-8, and the variability of annual PM ETP for SSP5-8.5 was also the lowest for INM-CM4-8. On the other hand, the GCM with the lowest seasonal variability for SSP5-8.5 differed from season to season.
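The change calculation discussed above, the future-period mean compared with a base-period mean, can be sketched as follows. The near (2031-2065) and far (2066-2100) futures and the 1985-2014 base period are taken from the Background & Summary section; the function names, the input layout (a dictionary of annual values keyed by year), and the choice to express the change as a percentage are assumptions.

```python
def period_mean(annual, start, end):
    """Mean of the annual values whose year falls in [start, end]."""
    values = [value for year, value in annual.items() if start <= year <= end]
    if not values:
        raise ValueError(f"no data between {start} and {end}")
    return sum(values) / len(values)


def percent_change(annual, future, base=(1985, 2014)):
    """Percent change of the future-period mean relative to the base-period mean."""
    base_mean = period_mean(annual, *base)
    future_mean = period_mean(annual, *future)
    return 100.0 * (future_mean - base_mean) / base_mean


NEAR_FUTURE = (2031, 2065)
FAR_FUTURE = (2066, 2100)
```

Calling `percent_change(series, NEAR_FUTURE)` and `percent_change(series, FAR_FUTURE)` for each GCM and scenario gives the pair of changes compared in this subsection; seasonal changes follow the same pattern with seasonal rather than annual totals.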

Usage Notes
This global dataset can support more robust projections of the future climate under the SSPs using various GCMs. It can be used to analyze climate change impacts and quantify the effectiveness of adaptation and mitigation policies. Its applicability can be extended in the future by adding simulations for more GCMs and SSPs.

Fig. 3 Spatial patterns and temporal changes in annual PM ETP of ACCESS-CM2 in the historical period (1950-2014).

Fig. 4 Trends in climate variables of a multi-model ensemble of 14 equally weighted CMIP6 GCMs for the four main SSPs in the future period (2015-2100).

Fig. 5 Trends in PM ETPs of 14 CMIP6 GCMs for the four main SSPs in the future period (2015-2100).

Table 2. Information on each latitude band in the CMIP6 GCM dataset.
Fig. 2 Statistical performance of six climate variables of 14 CMIP6 GCMs for the historical period (1950-2014) based on five metrics.

Table 5. Projected annual PM ETP (mm) ranges (upper, lower, and median values) of the four main SSPs in the future period (2015-2100).