Identifying and ranking of CMIP6-global climate models for projected changes in temperature over Indian subcontinent

Selecting the best region-specific climate models is a prerequisite for quantifying climate change impacts on hydraulic/hydrological projects and extreme heat events. Identifying skilled GCMs through ranking is a crucial step in lowering GCM simulation-related uncertainty. This research critically assessed 30 general circulation models (GCMs) from CMIP6 (IPCC’s sixth assessment report) for maximum and minimum temperature over the Indian subcontinent. Daily temperature data from 1965 to 2014, gridded at a spatial resolution of 1°, were used to evaluate maximum and minimum temperatures. The Nash–Sutcliffe efficiency (NSE), correlation coefficient (CC), Perkins skill score (PSS), normalized root mean square error (NRMSE), and absolute normalized mean bias error (ANMBE) were employed as performance indicators under two scenarios, S1 and S2. The entropy approach was used to allocate weights to each performance indicator for relative ranking. Individual ranking at each grid was achieved using a multicriteria decision-making technique, VIKOR. The combined ranking was accomplished by integrating group decision-making, average ranking perspective, and cumulative percentage coverage of India. The outcome reveals that NRMSE and NSE are the most significant indicators for S1 and S2, respectively, whereas CC is the least significant indicator in both cases. This study identifies an ensemble of KIOST-ESM, MRI-ESM2-0, MIROC6, NESM3, and CanESM5 for maximum temperature and E3SM-1-0, NESM3, CanESM5, GFDL-CM4, INM-CM5-0, and CMCC-ESM2 for minimum temperature.


Study area and data collection
The study area is the Indian subcontinent, which lies in the northern hemisphere between longitudes 67.5° and 97.5° E and latitudes 7.5° and 37.5° N and is covered by 335 grids of one-degree spatial resolution. Model-simulated data from the World Climate Research Programme (WCRP) were used to acquire the GCMs under CMIP6 (https://esgf-node.llnl.gov/search/cmip6/) as a part of the IPCC's sixth assessment report 1. Outputs from 30 GCMs for maximum and minimum temperature (designated as TMAX and TMIN) at daily temporal resolution were used as historical simulated data. The details of the 30 CMIP6 GCMs used in this study are tabulated in Supplementary Table S1. Gridded daily TMAX and TMIN data (https://www.imdpune.gov.in/lrfindex.php) for 50 years (1965-2014) at 1° spatial resolution were collected from the India Meteorological Department (IMD). These data serve as the historical observations against which the performance of the climate models is evaluated. The base period 1965-2014 was selected because CMIP6 historical simulations are available only until 2014. CMIP6 is an updated version of CMIP5 that provides relatively higher resolution data, an increased number of distinct climate models, and eight future scenarios representing shared socioeconomic pathways. All gridded GCM data, available at different spatial resolutions, were brought to a common grid resolution of 1° × 1° using bilinear interpolation 3,43,44.
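The regridding step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: the function name `regrid_bilinear`, the 2.5° source grid, and the synthetic field are all hypothetical, and the sketch assumes a rectilinear grid with strictly increasing coordinates.

```python
import numpy as np

def regrid_bilinear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Bilinearly interpolate a 2-D (lat, lon) field onto a target grid.

    Coordinates must be strictly increasing. Bilinear interpolation is
    separable, so it is done as two passes of 1-D linear interpolation.
    Target points outside the source range are clamped to the edge values
    (np.interp does not extrapolate).
    """
    field = np.asarray(field, dtype=float)
    # Pass 1: interpolate along longitude for every source latitude row.
    along_lon = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # Pass 2: interpolate along latitude for every target longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in along_lon.T]).T

# Hypothetical coarse model grid (2.5 degrees) and a 1-degree target grid
# spanning the study domain (67.5-97.5 E, 7.5-37.5 N).
src_lat = np.arange(5.0, 42.5, 2.5)
src_lon = np.arange(65.0, 102.5, 2.5)
dst_lat = np.arange(7.5, 38.0, 1.0)
dst_lon = np.arange(67.5, 98.0, 1.0)

# Synthetic field that is linear in both coordinates, so bilinear
# interpolation should reproduce it on the target grid.
coarse = 2.0 * src_lat[:, None] + 3.0 * src_lon[None, :]
fine = regrid_bilinear(coarse, src_lat, src_lon, dst_lat, dst_lon)
```

For daily data the same function would be applied to each time step; restricting the rectangular target grid to the 335 grids inside India is a separate masking step not shown here.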
Previous studies have observed that model selection for climate change impact studies should be done rationally 4,10,11,20,45. CMIP6 models differ from one another, and within each model, different ensemble members yield different GCM outputs. Considering these facts, the 30 GCMs were selected such that all belong to the same modeling structure (i.e., Atmospheric General Circulation Model) and share the same ensemble realization, r1i1p1f1 (realization, initialization, physics, and forcing indices all equal to 1).

Normalization and weight computation of performance indicators
Normalization methods were adopted to measure the various non-proportional performance indicators on a common scale 21,22,31,46-48. In this study, normalization is carried out using the Max-Min method to obtain the decision matrix. After normalization, the equity contribution of each indicator is calculated using the Sum method. From these contributions, the weights are computed using the Entropy method. A lower entropy value means that the indicator carries more useful information and therefore receives a larger entropy-based weight. Finally, the weighted normalized decision matrix is calculated, which subsequently serves as the input to the MCDM method. The mathematical representation of these normalization steps is given in Supplementary Equations S6 to S10.
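The normalization and weighting chain described above can be sketched as follows. Supplementary Equations S6 to S10 are not reproduced in this text, so the sketch uses the commonly published form of Max-Min normalization, Sum-method proportions, and Shannon-entropy weights as an assumption; the function name `entropy_weights` and the `benefit` flag (True where larger indicator values are better, such as NSE, CC, and PSS) are hypothetical.

```python
import numpy as np

def entropy_weights(X, benefit):
    """Max-Min normalise, apply the Sum method, and derive entropy weights.

    X       : (models x indicators) matrix of raw indicator values
    benefit : one boolean per indicator, True if larger values are better
    Returns the normalised matrix (1 = best, 0 = worst) and the weights.
    Assumes every indicator takes at least two distinct values.
    """
    X = np.asarray(X, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    lo, hi = X.min(axis=0), X.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    # Max-Min normalisation, flipped for cost-type indicators (e.g. NRMSE).
    R = np.where(benefit, (X - lo) / span, (hi - X) / span)
    # Sum method: share of each model in the column total.
    P = R / R.sum(axis=0)
    # Shannon entropy per indicator, with 0 * log(0) taken as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(P > 0, P * np.log(P), 0.0)
    E = -plogp.sum(axis=0) / np.log(X.shape[0])
    # Lower entropy means more information, hence a larger weight.
    d = 1.0 - E
    return R, d / d.sum()

# Hypothetical values for four models and two indicators (CC, NRMSE).
raw = [[0.94, 0.11], [0.90, 0.20], [0.70, 0.35], [0.20, 0.52]]
normalised, weights = entropy_weights(raw, benefit=[True, False])
weighted_matrix = normalised * weights  # input to the MCDM step
```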

Multicriteria Decision-Making using VIKOR
VIKOR (VIseKriterijumska Optimizacija I Kompromisno Resenje), primarily developed by Opricovic in 1979, is a well-known multicriteria decision-making (MCDM) method 49. As a compromise ranking method, VIKOR yields the feasible solution nearest to the ideal, which helps decision makers arrive at a final solution 39,47. The methodology for ranking GCMs to obtain a compromise solution using this method is described in Fig. 1. The utility measure ($S_i$), regret measure ($R_i$), and index value ($Q_i$) are computed using Eqs. (1-3). In these equations, ϑ is a balancing factor between the utility measure (overall benefit) and the regret measure (maximum individual deviation). The value of ϑ ranges between 0 and 1, corresponding to "voting by majority rule" (ϑ > 0.5), "by consensus" (ϑ = 0.5), or "with a veto" (ϑ < 0.5) 29,30,47,48.
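As an illustration of the three measures, the sketch below implements the textbook VIKOR formulation: $S_i$ is the weighted sum of scaled distances from the ideal value, $R_i$ is the largest single weighted distance, and $Q_i$ blends the two using the balancing factor. Eqs. (1-3) are not reproduced in this text, so this form is an assumption and may not reproduce the values in Table 1; the function name `vikor` and the example matrix are hypothetical.

```python
import numpy as np

def vikor(norm, weights, v=0.5):
    """Return utility S, regret R and index Q for each alternative (row).

    norm    : (alternatives x criteria) matrix where larger is better
    weights : criterion weights summing to 1
    v       : balancing factor between group utility and individual regret
    Assumes every criterion, and S and R themselves, vary across alternatives.
    """
    norm = np.asarray(norm, dtype=float)
    weights = np.asarray(weights, dtype=float)
    best, worst = norm.max(axis=0), norm.min(axis=0)
    # Weighted, scaled distance of each alternative from the ideal value.
    gap = weights * (best - norm) / (best - worst)
    S = gap.sum(axis=1)   # group utility: total weighted distance
    R = gap.max(axis=1)   # individual regret: worst single criterion
    Q = (v * (S - S.min()) / (S.max() - S.min())
         + (1.0 - v) * (R - R.min()) / (R.max() - R.min()))
    return S, R, Q

# Hypothetical example: three models scored on two equally weighted criteria.
S, R, Q = vikor([[1.0, 1.0], [0.5, 0.5], [0.0, 0.0]], [0.5, 0.5], v=0.5)
ranking = np.argsort(Q)  # model indices ordered from best (lowest Q) to worst
```

Smaller values of all three measures are better, so the alternative with the lowest Q heads the ranking.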

Group decision-making method
The study area consists of 335 grids, each with its own ranking of the GCMs. To create a combined rank for the study area, a group decision-making process 25 was adopted. The steps involved in this method are as follows. At each grid, the rankings were first separated into two halves and organized in descending order. GCMs ranked 1 to x make up the first half of the sample (x = I/2, where I is the total number of GCMs). The strength $S_i$ of GCM i is computed from the indicator $q_{iz}^{k}$, where $q_{iz}^{k} = 1$ if GCM i holds rank z at grid point k and zero otherwise; here i corresponds to a GCM in the first half, z ranges from the 1st to the xth rank, and k represents the grid points.
The weakness $W_i$ of GCM i is computed analogously, where $q_{iz}^{k} = 1$ if GCM i holds rank z at grid point k and zero otherwise; here i corresponds to a GCM in the second half, z ranges from the 1st to the yth rank (the last ranking in that half), and k represents the grid points.
Net strength is calculated as $N_i = S_i - W_i$. The GCM with the highest net strength was regarded as the most appropriate, and the others were ordered according to their net strength values.
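The strength, weakness, and net strength described above can be sketched as follows. The strength and weakness equations are not reproduced in this text, so the position weights used here (x - z + 1 in the first half, z - x in the second half) are an assumption borrowed from a common formulation of this group decision-making method; passing `weighted=False` falls back to plain counts of the indicator q. The function name `net_strength` and the example ranks are hypothetical.

```python
import numpy as np

def net_strength(ranks, weighted=True):
    """Combine per-grid ranks into one net strength per GCM.

    ranks    : (grids x models) integer array, rank 1 = best at that grid
    weighted : if True, weight each occurrence by its position inside
               its half (an assumption); if False, count occurrences only
    """
    ranks = np.asarray(ranks)
    x = ranks.shape[1] // 2           # size of the first (better) half
    top = ranks <= x
    if weighted:
        s_terms = np.where(top, x - ranks + 1, 0)   # rank 1 scores highest
        w_terms = np.where(~top, ranks - x, 0)      # last rank penalised most
    else:
        s_terms = top.astype(int)
        w_terms = (~top).astype(int)
    strength = s_terms.sum(axis=0)    # summed over grids
    weakness = w_terms.sum(axis=0)
    return strength - weakness

# Hypothetical ranks of four models at two grid points.
example_ranks = [[1, 2, 3, 4],
                 [2, 1, 4, 3]]
net = net_strength(example_ranks)
combined_order = np.argsort(-net, kind="stable")  # highest net strength first
```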

Ethical approval
No ethics violations occurred in this study.

Results and discussions
Owing to complex atmospheric processes, differences in model structure, and variability in the parameterization of land surface processes (vegetation dynamics, soil moisture, and land-atmosphere interactions), temperature may be over- or underestimated by different CMIP models. Differences in numerical schemes, grid configurations, the spatial resolution at which small-scale features are captured, and the datasets and methods used to represent climate forcings (aerosols, land-use changes, greenhouse gas concentrations, ocean circulations, and ice and snow albedo) contribute to differences among GCMs, even for the same realization, initialization, physics, and forcing 20,50-52. Therefore, the uncertainties associated with climate change data need to be appraised before the data are incorporated into hydro-climatological studies. Hence, the analysis was carried out for the entire Indian subcontinent, consisting of 335 grids. To demonstrate the performance evaluation of the GCMs for TMAX and TMIN, however, a grid at longitude 94.5° E and latitude 26.5° N, located in North-East India, was selected. The behavior of the different GCMs under the different performance indicators is described in the following subsection.
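For reference, the five performance indicators used in this evaluation can be sketched as below. Their exact definitions are given in the supplementary material and are not reproduced in this text, so these are the usual textbook forms; in particular, normalizing RMSE and mean bias by the observed mean and using 50 histogram bins for PSS are assumptions, and the sample values are hypothetical.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect match, values can be negative."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def cc(obs, sim):
    """Pearson correlation coefficient between observed and simulated series."""
    return float(np.corrcoef(obs, sim)[0, 1])

def pss(obs, sim, bins=50):
    """Perkins skill score: overlap of the two empirical PDFs (0 to 1)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    edges = np.linspace(min(obs.min(), sim.min()),
                        max(obs.max(), sim.max()), bins + 1)
    f_obs = np.histogram(obs, bins=edges)[0] / obs.size
    f_sim = np.histogram(sim, bins=edges)[0] / sim.size
    return float(np.minimum(f_obs, f_sim).sum())

def nrmse(obs, sim):
    """RMSE divided by |mean(obs)| (the choice of normaliser is an assumption)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(np.sqrt(np.mean((sim - obs) ** 2)) / abs(obs.mean()))

def anmbe(obs, sim):
    """Absolute mean bias divided by |mean(obs)| (normaliser is an assumption)."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return float(abs(np.mean(sim - obs)) / abs(obs.mean()))

# Hypothetical series: a simulation with a constant warm bias of 1.4 degrees.
obs_t = np.array([10.0, 12.0, 14.0, 16.0, 18.0])
sim_t = obs_t + 1.4
scores = {f.__name__: f(obs_t, sim_t) for f in (nse, cc, pss, nrmse, anmbe)}
```

NSE, CC, and PSS are better when larger, whereas NRMSE and ANMBE are better when smaller, which is why the normalization step has to treat the two groups differently.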

Analysis of performance indicators, entropy and VIKOR method at a grid (94.5° E, 26.5° N)
To demonstrate the entropy and VIKOR methods, minimum temperature at the grid at longitude 94.5° E and latitude 26.5° N, located in North-East India, was selected. The analysis used the performance indicators under two scenarios, S1: ANMBE-CC-NRMSE-PSS and S2: ANMBE-CC-NRMSE-NSE-PSS, and the results are listed in Table 1. Table 1 shows that NESM3 has the maximum similarity with the observed PDF (PSS of 97.02%) and is the most preferred GCM by this indicator. Other performance indicators also favor the same GCM, with values of ANMBE (0.0218), NRMSE (0.1118), and NSE (0.8697).
Similarly, INM-CM4-8 showed the least similarity (51.79%) with the observed PDF and was the least preferred. BCC-ESM1 was the least preferred according to ANMBE (0.4965), NRMSE (0.5183), and NSE (−1.8007). NorESM2-MM was the best correlated with the observed data, with a value of 0.9379, while KACE-1-0-G was the worst correlated, with a value of −0.1273. This analysis reveals that the indicators behave differently across GCMs. Further, all the indicators were normalized using the Max-Min method, and the equity contribution of each indicator was then calculated using the Sum method. The normalization procedures make the indicator values consistent with the requirements of the entropy approach and also ensure that large-range indicators do not overpower small-range indicators. For S1, among the four indicators, PSS has the highest importance (33.73%), meaning that its effect on GCM ranking is the highest, followed by NRMSE (32.90%), ANMBE (28.44%), and CC (4.93%). The contributions of PSS, NRMSE, and ANMBE do not differ significantly and together amount to 95.07%, making them nearly equally important indicators for GCM ranking for S1 at the grid (94.5° E, 26.5° N). For S2, among the five indicators, NSE has the highest importance (35.93%), followed by PSS (21.61%), NRMSE (21.08%), ANMBE (18.22%), and CC (3.16%). By providing differential rather than equal weights, the entropy method makes it easier to rank the 30 GCMs. The weights computed by the entropy method were used to obtain a weighted normalized decision matrix, which was subsequently used as the input to the VIKOR method. The compromise solution is obtained by computing the utility measure ($S_i$), regret measure ($R_i$), and index value ($Q_i$) using Eqs. (1-3). In this study, the balancing factor ϑ is taken as 0.5. GFDL-CM4 is identified as the compromise solution for both scenarios S1 and S2, as it satisfies the conditions C1 and C2. From Table 1, it can be observed that GFDL-CM4 ranks best by the measure Q (minimum value of 0.1656 for S1 and 0.2385 for S2) and by the measure R (minimum value of 0.5054 for S1 and 0.6137 for S2). The compromise solution was accepted as obtained by the minimum individual regret (minimum R value) of the "opponent".
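Conditions C1 and C2 are not defined in this text. Assuming they are the standard VIKOR acceptance tests, namely C1 "acceptable advantage" (the gap in Q between the first- and second-ranked alternatives is at least DQ = 1/(J - 1) for J alternatives) and C2 "acceptable stability" (the first-ranked alternative is also best by S or by R), the check can be sketched as follows. The function name `compromise_set` and the example values are hypothetical.

```python
import numpy as np

def compromise_set(S, R, Q):
    """Return the indices of the compromise solution(s) under C1 and C2.

    S, R, Q : utility, regret and index values per alternative
              (smaller is better for all three)
    """
    S, R, Q = (np.asarray(a, dtype=float) for a in (S, R, Q))
    order = np.argsort(Q, kind="stable")
    first, second = order[0], order[1]
    dq = 1.0 / (len(Q) - 1)
    # C1: acceptable advantage of the best alternative over the runner-up.
    c1 = (Q[second] - Q[first]) >= dq
    # C2: acceptable stability, i.e. also best by S or by R.
    c2 = (first == np.argmin(S)) or (first == np.argmin(R))
    if c1 and c2:
        return [int(first)]
    if c1:
        # Only stability fails: the top two form the compromise set.
        return [int(first), int(second)]
    # Advantage fails: keep every alternative within dq of the best.
    return [int(i) for i in order if Q[i] - Q[first] < dq]

# Hypothetical values for four models where the top two are too close in Q.
close_call = compromise_set(S=[0.1, 0.2, 0.8, 1.0],
                            R=[0.1, 0.2, 0.4, 0.5],
                            Q=[0.0, 0.1, 0.9, 1.0])
```

When either condition fails, more than one GCM is returned for a grid, which is one way a grid can end up with several compromise solutions.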

Analysis of performance indicators, entropy and VIKOR method for India
The procedure described in the previous section was repeated for the whole of India, comprising 335 grids, for minimum and maximum temperature using in-house code developed in MATLAB and Python. For each grid, all 30 GCMs were considered, and four (S1) or five (S2) indicators were used to obtain compromise solutions. The weights vary with indicators and with grids. The ranges of the indicators reveal significant variation in the performance of the GCMs: a GCM may perform well according to one indicator and, at the same time, poorly according to another. Supplementary Figs. S1-S10 give the individual indicator values for all 335 grids and all GCMs over India, for TMAX and TMIN. For scenario S1, NRMSE is the most crucial indicator, with a mean weightage of 41.18% and 45.88% for maximum and minimum temperature, respectively. For scenario S2, NSE dominates NRMSE, with a mean weightage of 35.30% and 42.95% for maximum and minimum temperature, respectively (Supplementary Fig. S11). Therefore, instead of assigning equal weights to the indicators, differential weights were adopted using the entropy method. The distribution of weights across weight ranges (in %) obtained by the entropy method for the whole of India, for scenarios S1 and S2, is listed in Table 2. The number of grids with a weight of less than 10% is the highest for CC (323 for TMAX (S1), 318 for TMIN (S1), 333 for TMAX (S2), and 329 for TMIN (S2)), indicating that it is the least prominent indicator for ranking GCMs. Similarly, the greater number of grids in the higher weight ranges indicates that NRMSE and NSE are the most prominent indicators for S1 and S2, respectively.
The compromise solutions were computed for the whole of India and were accepted as obtained by the maximum group utility (minimum S value) of the "majority" and the minimum individual regret (minimum R value) of the "opponent". Figs. 3 and 4 show that 301 grids (89.85%) for maximum temperature and 318 grids (94.92%) for minimum temperature yield at least one common best-ranked GCM under scenarios S1 and S2. Comparing maximum and minimum temperature in Figs. 3 and 4, only 36 grids (10.75%) for S1 and 27 grids (8.06%) for S2 produced at least one common best-ranked GCM. The ranking pattern is therefore uniform across the two scenarios, as indicated by the high similarity in compromise solutions (89.85% for maximum temperature and 94.92% for minimum temperature), but nonuniform between maximum and minimum temperature under both scenarios (similarity of 10.75% under S1 and 8.06% under S2).
Ensembles of CNRM-CM5, FGOALS-s2, and MIROC5 for maximum temperature and of MIROC4h, NorESM1-M, MIROC5, and CESM1-CAM5 for minimum temperature were previously proposed in the literature 11 using compromise programming and a group decision approach. The ensemble for maximum temperature includes MIROC6 in this study and MIROC5 in the literature 11. Both GCMs are from the same modeling institution, the Atmosphere and Ocean Research Institute, University of Tokyo, Japan. Otherwise, the two studies produce distinct GCM ensemble suggestions for maximum and minimum temperature, which might result from differences in the chosen performance indicators, decision-making approaches, spatial resolutions, and model selections.

Conclusions
This study identifies the best ensemble of GCMs for the Indian subcontinent for studying future climate change impacts. The identification was based on five performance indicators under two scenarios, S1 (ANMBE-CC-NRMSE-PSS) and S2 (ANMBE-CC-NRMSE-NSE-PSS), for ranking 30 CMIP6 GCMs. Grid-wise performance was evaluated using these indicators at 335 grids for maximum and minimum temperature. The entropy method was used to assign weights to the indicators after normalizing their values using the Max-Min and Sum methods. Based on the indicators and their assigned weights, the multicriteria decision-making method VIKOR was used to rank GCMs and obtain compromise solutions at all grids. Group decision-making, average ranking perspective, and cumulative percentage coverage of India were used collectively to suggest an ensemble of GCMs. Seasonal changes and precipitation are understood to influence surface temperature. However, this study has not accounted for seasonal influences in identifying the best GCMs. A detailed study is

Figure 1. Methodology for ranking GCMs to obtain compromise solution using VIKOR method.
Net strength: $N_i = S_i - W_i$
Scientific Reports | (2024) 14:3076 | https://doi.org/10.1038/s41598-024-52275-1
KACE-1-0-G and KIOST-ESM are the worst correlated with the observed data, with CC values of −0.0548 and 0.2774, respectively. Most of the GCMs for TMAX (25 in number) had CC values between 0.6 and 0.8, exhibiting moderate agreement with the observed data. Similarly, Fig. 2b shows that the CC values of 25 models fall between 0.90 and 0.95 for TMIN. Hence, for the demonstration grid, most of the models simulated the observed minimum temperature better than the observed maximum temperature. KACE-1-0-G was the only GCM with a small negative correlation for TMIN and TMAX and is hence not shown in Fig. 2. It is not prudent to ascertain the best GCMs for TMAX and TMIN based on CC alone. Therefore, the following section further evaluates the GCMs based on other performance indicators before ascertaining the best GCMs for TMAX and TMIN.

Figure 3. Spatial distributions of compromise solutions of maximum temperature for scenario S1 (a-c) and S2 (d-f) (Note: each figure is a complete compromise solution) (Maps created using ArcGIS Desktop 10.6.1, url: https://www.arcgis.com/index.html).

Figure 4. Spatial distributions of compromise solutions of minimum temperature for scenario S1 (a-c) and S2 (d-f) (Note: each figure is a complete compromise solution) (Maps created using ArcGIS Desktop 10.6.1, url: https://www.arcgis.com/index.html).

Figure 5. Net strength of GCMs under scenarios S1 and S2, for maximum and minimum temperatures.

Table 1. Performance indicator (PI) values, utility measure (S), regret measure (R), and index values (Q) for minimum temperature, for each GCM under both scenarios S1 and S2, for the grid (94.5° E, 26.5° N) in North-East India.

Table 2. Distribution of weights to performance indicators in various ranges under scenarios S1 and S2, over 335 grids of India.
1. For scenario S1, NRMSE is the most crucial indicator, with a mean weightage of 41.18% and 45.88% for maximum and minimum temperature, respectively. For scenario S2, NSE dominates NRMSE, with a mean weightage of 35.30% and 42.95% for maximum and minimum temperature, respectively.
2. The number of grids with a weight of less than 10% is the highest for CC, indicating that it is the least prominent indicator. The larger number of grids in the higher weight ranges indicates that NRMSE and NSE are the most prominent indicators for S1 and S2, respectively.
3. Weights vary with indicators and grids, and a particular GCM may perform well on one indicator while performing poorly on others. GCM assessment therefore requires multiple criteria.
4. A uniform ranking pattern was seen across the two scenarios, with 89.85% similarity in the compromise solutions of maximum temperature for S1 and S2 and 94.92% for minimum temperature. A nonuniform ranking pattern was observed between maximum and minimum temperature under both scenarios (similarity of 10.75% under S1 and 8.06% under S2).

Table 3. Combined ranks of GCMs for maximum and minimum temperatures, under both scenarios S1 and S2, using group decision-making and average perspective methods. Bold values represent the significant ranks of GCMs.

Table 4. Number of grids over which respective GCMs are compromise solutions (compromise solutions less than seven grids are not tabulated).