RETRACTED ARTICLE: Simulation and spatiotemporal pattern of air temperature and precipitation in Eastern Central Asia using RegCM

Central Asia covers a large land mass, yet meteorological stations in the region are relatively scarce. To address this data scarcity, we selected two reanalysis datasets (the ERA40 and NCEP/NCAR) and downscaled them to 40 × 40 km using RegCM. Three gridded datasets (the CRU, APHRO, and WM) that were extrapolated from the observations of Central Asian meteorological stations were then used to evaluate the performance of RegCM and to analyze the spatiotemporal distribution of precipitation and air temperature. We found that since the 1960s, the air temperature in Xinjiang has shown an increasing trend, and that the distribution of precipitation in the Tianshan area is quite complex. Precipitation is increasing to the south of the Tianshan Mountains (Southern Xinjiang, SX) and decreasing in the mountainous areas. The CRU and WM data indicate that precipitation to the north of the Tianshan Mountains (Northern Xinjiang, NX) is increasing, while the APHRO data show the opposite trend. The downscaled results from RegCM are generally consistent with the extrapolated gridded datasets in terms of the spatiotemporal patterns. We believe that our results can provide useful information for developing a regional climate model in Central Asia, where meteorological stations are scarce.

A regional climate model with a high spatial resolution can accurately simulate the spatial distribution of temperature and precipitation at a regional scale, thereby satisfying the demand for raster meteorological data in areas with few stations. Compared with interpolated and extrapolated data, regional climate model data have a higher resolution and rest on a more accurate description of the physical processes and their parameterization. They also capture regional characteristics such as the terrain and underlying surface more precisely, which leads to a more accurate description of the regional and local atmospheric circulation system 15,16. In this study, the regional climate model RegCM4.3 was selected to downscale the reanalysis data of the ERA40 17 and NCEP/NCAR 18,19 to 40 km × 40 km in the Central Asian region (Fig. 1) and to simulate the variation of temperature and precipitation between 1958 and 2001. The results were then compared with the three site-based extrapolated datasets (the CRU, WM, and APHRO). The spatial pattern of climate change over the past 50 years in Central Asia was subsequently analyzed by studying the trend of anomalies in the two types of data on an interannual scale.

Results
Evaluation of air temperature simulation. The spatial distribution of the mean air temperature is consistent between the ERA40-RegCM (Fig. 2a) and NCEP-RegCM (Fig. 2b). The low-value center (less than −15 °C) appears in the Karakoram mountain area, with the area being larger in Fig. 2b. The spatial distribution is also generally consistent between the RegCM output and extrapolation datasets, with correlation coefficients of 0.58-0.77, which are significant at the 0.05 level, and the NX has the highest correlation coefficients among the three regions (Table 1).

Evaluation of precipitation simulation. Table 2 indicates that the RegCM output is significantly correlated with the data extrapolated from the stations (p < 0.05) in the three studied areas. Among them, the spatial distribution of precipitation in the RegCM data has a relatively high consistency with that in the APHRO and WM data, with average correlation coefficients of 0.58 and 0.57, while its correlation coefficient with the CRU is only 0.46. However, the RegCM precipitation data can reflect more accurately the spatial pattern of the precipitation in the mountain-basin type landforms in the NX between the Altai Mountains and the Hangai Mountains, and between the Altun Mountains and the Qilian Mountains (Fig. 3a,b). Apart from the APHRO data (Fig. 3e), which reflects the characteristics of the Alaja Mountain Precipitation Center, the CRU and WM did not reflect the impacts of the mountain-basin structure in Xinjiang and its surrounding areas on the regional precipitation pattern (Fig. 3c,d). The annual precipitation from the RegCM data in the Tianshan mountainous area is 1.34 times that of the CRU data, 1.30 times the WM data, and 1.26 times the APHRO data.
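The spatial correlation coefficients reported in Tables 1 and 2 compare gridded fields cell by cell. A minimal sketch of such a pattern correlation is given below, assuming that both fields have already been brought onto a common grid and flattened into lists. The function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def pattern_correlation(field_a, field_b):
    """Pearson correlation between two co-located gridded fields (flattened)."""
    if len(field_a) != len(field_b) or len(field_a) < 2:
        raise ValueError("fields must have the same length (at least 2 cells)")
    n = len(field_a)
    mean_a = sum(field_a) / n
    mean_b = sum(field_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(field_a, field_b))
    var_a = sum((a - mean_a) ** 2 for a in field_a)
    var_b = sum((b - mean_b) ** 2 for b in field_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant field")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical annual precipitation (mm) on six shared grid cells.
model = [120.0, 85.0, 310.0, 400.0, 60.0, 150.0]
gridded_obs = [100.0, 70.0, 250.0, 290.0, 65.0, 140.0]
r = pattern_correlation(model, gridded_obs)
```

A high coefficient only says that the two fields share a spatial pattern; it does not reveal a systematic offset such as the higher mountain precipitation in the RegCM output, which has to be checked separately from the regional means.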
A trend analysis of temperature and precipitation anomalies. Figure 4a, c, and e show that the variations of temperature anomalies are consistent between the RegCM dataset and the two datasets extrapolated from the stations. From the late 1950s onwards, the temperature anomalies in the Xinjiang area showed a downward trend. The temperature rose slightly in the mid-1960s and then declined significantly; an increasing trend and a decreasing trend were again observed in the 1970s and the mid-1980s, respectively. In the 1990s and the first 10 years of the 21st century, the temperature anomalies remained at a high level. The trend of precipitation anomalies in the NCEP-RegCM data is similar to that in the data extrapolated from the stations, with a linear trend of −1.9 mm/10a. The linear trend of RegCM's annual precipitation anomaly is 0.85 mm/10a from 1958 to 2001. Table 3 shows that the mean temperature obtained from the RegCM data (the average of the NCEP-RegCM and ERA40-RegCM) is lower than that from the data extrapolated from the stations (the CRU and WM mean) in all three study areas. The annual average temperature from the RegCM data is 3.11 °C, 6.60 °C, and 0.75 °C in the NX, SX, and Tianshan mountainous area respectively, while that from the data extrapolated from the stations is 5.16 °C, 10.14 °C, and 3.84 °C correspondingly. Based on these means, the maximum difference between the two datasets is found in the SX (3.54 °C) and the minimum difference is found in the NX (2.05 °C). During the period 1958-2001, the NCEP-RegCM, CRU, and WM all show a warming trend: the temperature in the NCEP-RegCM increased significantly in the NX (p < 0.05), and the data extrapolated from the stations also reveal significant warming in the three study areas. No significant change is observed in the ERA40-RegCM data, and the temperature trend was even negative in the SX and Tianshan mountainous areas, with rates of −0.02 °C/10a and −0.01 °C/10a respectively.
The MK mutation (abrupt change) test shows that the temperature change from the four datasets was consistent with the significant temperature increase obtained by linear regression. The average temperature after the NCEP-RegCM mutation was 1.43 times that before the mutation. In the data extrapolated from the stations, a significant mutation in the Tianshan mountainous area occurred in the late 1980s.
Table 4 shows the variation of precipitation between 1958 and 2001. The annual mean precipitation of the RegCM (the mean of the NCEP-RegCM and ERA40-RegCM) was higher than that of the extrapolated data (the mean of the CRU, WM, and APHRO) in the three study areas. The annual average precipitation from the RegCM data is 184.62 mm, 80.71 mm, and 377.79 mm in the NX, SX, and Tianshan mountainous area respectively, while that from the data extrapolated from the stations is 157.10 mm, 69.07 mm, and 291.01 mm correspondingly. Taking the 426.7 mm of precipitation recorded by the station in the Tianshan mountainous area 20 as the reference, the RegCM data better reflect the actual precipitation, whereas the data extrapolated from the stations underestimate it significantly (by about 136 mm), which does not correspond with the fact that the Tianshan mountainous area is regarded as the "Water Tower in Central Asia" 1,21. The maximum difference between the two sets of data also appears in the Tianshan mountainous area (86.78 mm), while the minimum difference appears in the SX (11.64 mm). Compared with the other datasets, the ERA40-RegCM has the highest precipitation value in the three study areas, with an average annual precipitation of 191.72 mm, 90.07 mm, and 399.88 mm in the NX, SX, and Tianshan mountainous area respectively.
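The differences quoted for Table 4 follow directly from the regional means. The short sketch below reproduces that arithmetic; the input values are the ones stated in the text, while the variable names are illustrative only.

```python
# Annual mean precipitation (mm), 1958-2001, as quoted in the text.
regcm = {"NX": 184.62, "SX": 80.71, "Tianshan": 377.79}
extrapolated = {"NX": 157.10, "SX": 69.07, "Tianshan": 291.01}
station_reference_tianshan = 426.7  # station value cited from ref. 20

# RegCM minus extrapolated mean, per region.
differences = {
    region: round(regcm[region] - extrapolated[region], 2) for region in regcm
}
largest = max(differences, key=differences.get)
smallest = min(differences, key=differences.get)

# Shortfall of the extrapolated Tianshan mean relative to the station value.
station_gap = round(station_reference_tianshan - extrapolated["Tianshan"], 2)

for region, diff in differences.items():
    print(f"{region}: RegCM exceeds extrapolated data by {diff} mm")
print(f"largest: {largest}, smallest: {smallest}, station gap: {station_gap} mm")
```

The same subtraction applied to the station value shows that the extrapolated Tianshan mean falls short of 426.7 mm by roughly 136 mm, while the RegCM mean falls short by about 49 mm.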

Regarding the trend of precipitation, the WM in the NX and the ERA40-RegCM, WM, and APHRO in the SX all show a significant increase in precipitation (p < 0.05), with rates of 1.24 mm/a, 0.49 mm/a, 0.78 mm/a, and 0.46 mm/a respectively. In the NX, apart from the APHRO (−0.23 mm/a), the other four datasets show an increasing trend of precipitation. In the SX, all the datasets show an increasing trend of precipitation, with the largest value of 0.78 mm/a in the WM. In the Tianshan mountainous area, all the datasets except the ERA40-RegCM show a decreasing trend in precipitation.
The results of the precipitation mutation test show that mutations occurred in the WM data for the NX and SX, and in the ERA40-RegCM data for the SX, with the corresponding years of mutation being 1989, 1988, and 1973 respectively. In the WM, the precipitation before and after the mutation is very different in the SX, with the average precipitation after the mutation being 1.4 times that before the mutation. The precipitation data from the other datasets do not show a significant mutation, which is in line with the previous trend analysis. The analysis reveals that the increase in precipitation in Xinjiang led to the frequent occurrence of floods in the 1980s 20,22.
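The linear trends quoted in this section are expressed per year (mm/a) or per decade (°C/10a, mm/10a). A minimal sketch of how such a rate can be obtained by ordinary least squares is shown below. It assumes that anomalies are taken relative to the mean of the whole period, which the text does not specify, and the temperature series is synthetic.

```python
def anomalies(values):
    """Departures from the mean of the full series (assumed baseline)."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def linear_trend_per_decade(years, values):
    """Least-squares slope of values against years, scaled to units per 10 years."""
    n = len(years)
    if n != len(values) or n < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return 10.0 * sxy / sxx


years = list(range(1958, 2002))  # 1958-2001, 44 years
# Synthetic series that warms by exactly 0.025 degrees C per year.
temps = [3.0 + 0.025 * (y - 1958) for y in years]
trend = linear_trend_per_decade(years, anomalies(temps))
```

Because subtracting a constant baseline does not change the slope, the trend of the anomalies equals the trend of the raw series; a rate in mm/a is converted to mm/10a by multiplying by ten.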

Discussion
The temperature and precipitation in the mountainous area in Central Asia change drastically with elevation. The temperature above the mid-mountain area is low, and the water vapor in the westerly circulation is concentrated in the mountainous area due to the influence of the terrain of the mountains and basins. However, most of the stations used in the CRU, WM, and APHRO datasets are located in relatively dry and hot low mountains and plain oases (an explanation for this is provided in the following analysis), and even the stations in the mountainous areas are located in valleys (Fig. 5). Due to the sparse and inhomogeneous distribution of the meteorological stations, the extrapolation of data from meteorological stations in Central Asia can be problematic 11. In order to further analyze the influence of the spatial distribution of meteorological stations on the accuracy of the extrapolated data, 47 stations that provide precipitation and temperature data in Xinjiang were selected. According to the criteria proposed in ref. 23, the stations are categorized into three types: three High Mountain Stations (HMS) with an elevation of 2650-5445 meters; ten Mid-Mountain Stations (MMS) with an elevation of 1700-2650 meters; and 34 Low Mountain Stations (LMS) with an elevation of less than 1700 meters (Fig. 5). Here, the HMS and MMS are also collectively denoted as Mountain Stations (MS). The mean temperature (1960-2011) recorded by the meteorological stations in the HMS, MMS, and LMS is 1.4 °C, 5.3 °C, and 8.8 °C respectively, and the errors are 62%, 34%, and 35% respectively. For precipitation, the means are 123.6 mm, 268.5 mm, and 106.9 mm respectively, and the errors are 88%, 38%, and 68% respectively. The temperature from the LMS is 4.6-7.4 °C higher than that from the MS, and the precipitation in the MMS is 2.5 times that in the LMS.
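The station grouping described above can be written as a simple elevation rule. The sketch below uses the elevation bands quoted in the text; assigning a station that sits exactly on a band limit to the higher class is an assumption, and the function name is hypothetical.

```python
def classify_station(elevation_m):
    """Assign a station to LMS, MMS, or HMS using the quoted elevation bands."""
    if elevation_m < 1700:
        return "LMS"  # Low Mountain Station
    if elevation_m < 2650:
        return "MMS"  # Mid-Mountain Station
    return "HMS"      # High Mountain Station


# Station counts and mean annual precipitation (mm) as quoted in the text.
station_counts = {"HMS": 3, "MMS": 10, "LMS": 34}
mean_precip_mm = {"HMS": 123.6, "MMS": 268.5, "LMS": 106.9}

total_stations = sum(station_counts.values())
mms_to_lms_ratio = mean_precip_mm["MMS"] / mean_precip_mm["LMS"]
lms_share = station_counts["LMS"] / total_stations
```

With these figures, the mid-mountain stations receive about 2.5 times the precipitation of the low mountain stations, yet the low mountain stations make up roughly 72% of the network.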
Six pairs of LMS and MS located close to each other (less than 100 km apart) were selected for a pair analysis, and the results (Table 5) show significant differences between the data from the LMS and the data from the MS with regard to both temperature and precipitation (p < 0.01).
The above analysis shows that the climate of the study area is significantly affected by the terrain. Without any correction for the relationship between precipitation, temperature, and elevation, the data extrapolated from the observations of stations are likely to underestimate the precipitation and overestimate the temperature in the mountainous areas. For example, the CRU, WM, and APHRO data show that the average annual precipitation in the Tianshan mountainous area is less than 300 mm, which is significantly lower than the actual annual precipitation of 426.7 mm 20. The dynamic downscaling of the ERA40 by the RegCM shows an annual precipitation of 400 mm, which is closer to the true value. The above results show that the dynamical downscaling by the RegCM can produce high-resolution gridded data with a relatively high accuracy, which take into account the influence of terrain on the local climate and avoid the deviation caused by the inhomogeneous distribution of observation stations. Meanwhile, a gridded dataset that is extrapolated directly from sparse and inhomogeneously distributed stations, without considering the close relationship between meteorological elements and elevation, might contain errors in the spatial distribution of meteorological elements. Some previous studies have used extrapolated data, such as that from the CRU, as a reference for evaluating the accuracy of regional climate simulations and concluded that regional climate models overestimated precipitation in arid and semi-arid areas in Central Asia 24,25. However, if the extrapolated data themselves underestimate precipitation, the veracity of these conclusions can be doubted.
For example, Small (1999) 26 used the 0.5° Legates & Willmott Climatology (LWC) data to verify the simulation results of the RegCM, and found that these results were significantly higher than those of the LWC in the high-elevation regions of the western Tianshan Mountains. According to Small (1999) 26, this is because the lack of ground observations in the LWC data leads to errors. Other studies (such as Jiang and Wu (2005) 27) ignored the effects of the scale and terrain, and comparisons were made by linearly interpolating the model simulation directly to the stations' locations. Because the stations in Xinjiang are mostly located in arid valleys and foothills, the values of the observations are usually lower than the simulated values (which often represent a larger area). In addition, the interference from low temperatures and strong winds in the Tianshan mountainous area may lead to a systematic underestimation of precipitation in the in-situ data 28,29. Since the CRU, WM, and APHRO are generated using extrapolations, the exact locations of the stations are not in the public domain, and the stations are sparsely distributed (especially in the mountains), it is difficult to use other in-situ observations to evaluate these extrapolation datasets whilst excluding the sites that are used to generate them. In the future, with the maturity of global precipitation remote sensing products, such as TRMM 30 (which is currently only used in the tropics), it will be possible to evaluate the accuracy of these extrapolation datasets and of the climate models objectively and independently. At the current stage of development, the annual precipitation of nearly 400 mm simulated by the RegCM in the Tianshan mountainous area corresponds better with the fact that this area is covered by a large number of forests (a forest ecosystem generally needs an annual precipitation of more than 400 mm to survive) than the results from the extrapolated data (less than 290 mm).
The spatial distributions simulated with the two driving datasets (NCEP and ERA40) are generally consistent. In terms of the spatial distribution of air temperature, the area with low values for ERA40 is smaller than that for NCEP. The comparison with the CRU and WM shows that the model results driven by ERA40 are closer to the extrapolated observation datasets, whereas those driven by NCEP lead to a larger underestimation. This is because the summer air pressure in NCEP is substantially underestimated over most of China. In fact, this cold bias of air temperature also originates from the RegCM itself 31. In terms of precipitation, the model results driven by NCEP are closer to the extrapolated observation datasets, whereas those driven by ERA40 lead to higher precipitation. As with the air temperature bias described above, the overestimation of precipitation also originates from the RegCM itself 32.
From the above analysis and discussion, we have summarized a few primary findings as follows:
(1) The average annual precipitation of the Tianshan Mountains as simulated by the RegCM is 356-400 mm, which is close to the 426.7 mm reported in previous studies 20. The number obtained from data extrapolated from stations, such as the CRU, WM, and APHRO, is only 290 mm, which is not only significantly lower than the previous estimates but also unable to support the fact that there is a dense spruce forest in the Tianshan Mountains. Although there are some large areas of uncertainty regarding the RegCM simulation in the plateau and mountainous areas 24, it can better represent the precipitation pattern in the mountainous areas and arid basins in Central Asia 27 when compared to the data extrapolated from the stations, because the model considers the impact of the terrain on the climate system 33,34.
(2) Using linear regression and the MK test, both the RegCM and extrapolated data indicate that the NX has the highest warming rate (0.25 °C/10a) of the three study areas. The mutation of the temperature occurred in 1977, and the mean temperature after the mutation (1978-2001) is 43% higher than that before the mutation (1958-1976). The two types of datasets also both indicate an increasing trend of precipitation in the SX, with a rate of 0.37 mm/a. The WM has the largest increase in precipitation among all the datasets between 1958 and 2001, with a trend of 1.24 mm/a in the NX and 0.78 mm/a in the SX. The mutation years for these two regions are 1989 and 1988 respectively, and the average precipitation increased by 24% and 40% after the mutation.
(3) Compared with the extrapolated data, the dynamical downscaling simulation based on the physical mechanism can avoid the adverse effect of the inhomogeneous distribution of sites on the compilation of gridded datasets. This simulation is valuable for regions like Central Asia, where meteorological stations are scarce, the terrain is complicated, and the spatial differences of meteorological elements are pronounced.
However, the improper setting of regional climate simulations can also cause errors. For example, Small's (1999) 26 simulation of the precipitation in five countries in Central Asia resulted in a low value near the northwest boundary, because the study area was too close to the boundary of the simulation domain. Because the RegCM sets the amount of cloud and precipitation to zero at the domain boundary during the simulation, and precipitation occurs only when the amount of cloud and precipitation exceeds a threshold, a large buffer zone is needed in the inflow direction of the water vapor (the direction of the westerly circulation) so that the balance can converge over a sufficiently long period and the amount of cloud and precipitation can rise above the threshold before the flow reaches the study area. Small (1999) 26 failed to increase the amount of cloud and precipitation to a reasonable value during the simulation, as the buffer zone between the boundary of the simulation domain and the study area was too narrow in the northwest direction. This study drew lessons from previous studies and left a large space between the boundary of the simulation area and the study area (Fig. 1). In spite of this, considering the strong influence of the complicated terrain in Central Asia on the mesoscale climate systems 24, the 40 km resolution used in the study is insufficient for accurately representing the temperature and precipitation patterns in the Tianshan Mountain area. In order to accurately simulate the climatic characteristics of Central Asia, especially for the mountainous areas, it is necessary to further improve the spatial resolution of the regional climate model and employ a better representation of the underlying surface.

Data and Methods
The ERA40 and NCEP reanalysis datasets, which are reconstructed by the assimilation of multi-source data, were used to drive the RegCM, and the two sets of results are denoted as the ERA40-RegCM and NCEP-RegCM respectively. The gridded datasets extrapolated from the irregularly distributed stations, the CRU 8, WM 35, and APHRO 10, were also employed. More detailed information regarding these data is shown in Table 6.
The configuration of the RegCM4.3 is outlined as follows. The horizontal resolution is 40 km, with 112 and 84 grid points in the meridional and zonal directions respectively, in a Lambert projection centered at 74.21°E, 44.76°N. The top level is 50 hPa, with 23 layers in the vertical direction. The initial and boundary fields use the ERA40 and NCEP/NCAR reanalysis data with a horizontal resolution of 2.5°, and the time series is selected as the temporal intersection between the reanalysis data and the extrapolation data. The sea surface temperature is taken from GISST (1948-2002); the planetary boundary layer uses the Holtslag scheme; the cumulus convection uses the Emanuel MIT scheme; the lateral boundary condition is an exponential relaxation; and the land surface process is described by the Biosphere-Atmosphere Transfer Scheme (BATS). The terrain data selected in the model is the USGS GTOPO30 digital elevation model (DEM), and the surface vegetation data selected is the USGS Global Land Cover Characterization (GLCC) at a 30″ resolution.
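As a rough check on the domain size implied by these settings, the sketch below collects the stated grid parameters in a small data class and multiplies the grid-point counts by the grid spacing. The class and field names are hypothetical and do not correspond to RegCM namelist entries, and counting one 40 km cell per grid point is an approximation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainConfig:
    """Grid settings quoted in the text (field names are illustrative)."""
    grid_spacing_km: float
    points_meridional: int
    points_zonal: int
    center_lon_deg_e: float
    center_lat_deg_n: float
    vertical_layers: int
    model_top_hpa: float

    def extent_km(self):
        """Approximate (meridional, zonal) size, one cell per grid point."""
        return (
            self.points_meridional * self.grid_spacing_km,
            self.points_zonal * self.grid_spacing_km,
        )


config = DomainConfig(
    grid_spacing_km=40.0,
    points_meridional=112,
    points_zonal=84,
    center_lon_deg_e=74.21,
    center_lat_deg_n=44.76,
    vertical_layers=23,
    model_top_hpa=50.0,
)
meridional_km, zonal_km = config.extent_km()
```

Under these assumptions the domain spans roughly 4480 km by 3360 km, which indicates how much room is available for a buffer zone around the study area.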
The region of interest in this study is selected as 52°E-96°E, 30°N-51°N, and is further divided into three sub-regions (the NX, the SX, and the Tianshan mountainous area) based on the landscape features and climate. The annual mean temperature/precipitation of a given sub-region is calculated by averaging all the temperature/precipitation grid cells in that sub-region. A correlation analysis is conducted along with a significance test between the RegCM data and the data extrapolated from the stations. For time series trend analysis, the Mann-Kendall test is a nonparametric method that does not require the sample to follow a particular distribution and is not affected by a small number of outliers; thus it is suitable for studying the trend of a time series such as precipitation, runoff, temperature, and water quality 36. Therefore, the Mann-Kendall (MK) method is employed here to study the trend characteristics and the mutation of temperature and precipitation.
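For readers unfamiliar with the Mann-Kendall method, a minimal sketch of the standard trend test is given below. It computes the S statistic, its variance with the usual correction for tied values, the continuity-corrected Z score, and a two-sided p-value from the normal approximation. This is the textbook trend test only; it is not the authors' code, and it does not implement the sequential variant that is typically used to date a mutation. The function name and sample series are hypothetical.

```python
import math
from collections import Counter


def mann_kendall(series):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test."""
    n = len(series)
    if n < 3:
        raise ValueError("need at least 3 observations")
    # S counts later-minus-earlier pairs that increase minus those that decrease.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    # Variance of S under the null hypothesis, corrected for tied groups.
    tie_term = sum(
        t * (t - 1) * (2 * t + 5) for t in Counter(series).values() if t > 1
    )
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18.0
    # Continuity-corrected standard normal score.
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return s, z, p


# Hypothetical, strictly increasing series of ten annual values.
s_up, z_up, p_up = mann_kendall([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0])
```

A trend is reported as significant at the 0.05 level when the p-value falls below 0.05, which corresponds to |Z| greater than about 1.96; because the test uses only the signs of pairwise differences, a few extreme values have little influence on the result.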