Introduction

Natural protected areas are essential for maintaining species diversity and ecosystem integrity. Studies have consistently shown that protected land areas worldwide support higher levels of biodiversity1,2. The establishment of natural protected areas, including World Natural Heritage Sites, is a key strategy in combating biodiversity loss3 and can effectively reduce environmental threats such as deforestation and fires4,5. Policies related to nature reserves can effectively prevent natural landscapes from being indiscriminately converted into impervious surfaces, promoting sustainable and healthy regional development6. World Natural Heritage Sites are special natural protected areas designated for the conservation of natural features, wildlife habitats, and natural landmarks of exceptional global significance. However, establishing protected areas such as World Natural Heritage Sites is not a permanent solution. Because climatic conditions and human activities keep changing, many protected areas still face persistent threats such as ecosystem destruction, habitat fragmentation, and species loss7. Continuous monitoring after establishment is therefore critical for identifying and assessing the impacts of climate change and human activities on these areas8.

Land cover change is a reliable indicator of ecological and ecosystem shifts7,9. Habitat fragmentation resulting from land cover conversion and increased land use is a significant factor contributing to the decline in biodiversity9,10,11,12. Understanding land cover change processes in natural protected areas and quantifying these changes are crucial for promoting the sustainable development of protected areas13,14.

Many scholars have conducted quantitative analyses of potential drivers of land cover change in protected areas, aiming to identify the key factors influencing these changes and to explore the underlying impacts of climate change and human activities, thereby laying the groundwork for future management policy planning. They have employed a variety of methods, such as OLS regression, GWR regression15,16, and logistic regression17,18, to assess the impact of each driving factor on land cover change. These approaches assume a linear relationship between the drivers and land cover change. However, the actual influence of each factor is non-linear and characterized by intricate relationships, so linear regression methods are inherently limited in quantifying the importance of these drivers. Moreover, many real-world drivers are affected by multicollinearity, even though these variables often play a pivotal role in shaping land cover change. The Random Forest algorithm and the Geodetector model do not need to address issues of multicollinearity. The Random Forest algorithm automatically captures nonlinear relationships and assesses the importance of features by constructing multiple decision trees. In contrast, the Geodetector model is a spatial analysis method based on the second law of geography; it is employed to detect spatial dissimilarities and reveal the driving forces behind them19,20. Both have been widely employed in quantitative evaluations of driving factors21,22,23,24,25,26,27,28. In addition, some scholars combine the Random Forest algorithm with the Geodetector model to enrich the information on factor importance and improve the robustness and reliability of the driving force analysis29,30.

This paper takes Kalajun-Kuerdening as a representative case of alpine World Natural Heritage Sites to study long-term land cover changes in protected areas and to identify the most important driving factors of land cover change in the context of protection and development. The main research contents are as follows: (1) Conducting supervised classification of remote sensing imagery in cloud-prone alpine regions; (2) Investigating the process and spatial patterns of land cover change in the Kalajun-Kuerdening area from 1994 to 2023; (3) Employing both the Random Forest algorithm and the Geodetector model to rank the driving factors of land cover change; (4) Performing overlay analysis on land cover change and factors with high explanatory power to identify optimal intervals for land cover transition.

Materials and methods

Study area

The Tianshan mountain system (Fig. 1a,b), one of the largest mountain systems in the world, is located in Central Asia and spans four countries: China, Kazakhstan, Kyrgyzstan, and Uzbekistan. As one of the four components of Xinjiang Tianshan, Kalajun-Kuerdening was inscribed on the UNESCO World Heritage List in 201331. The mountains in the study area run nearly east–west, with an elevation range of 1336 to 4334 m and an average elevation of 3000 m (Fig. 1c). The area represents the most typical distribution of the second- and third-level planation surfaces of the Tianshan mountain system, characterized by flat and rounded high terrace landforms at the mountain tops. Kalajun-Kuerdening is located in the Ili Valley and experiences a temperate continental semi-humid climate, characterized by warm and humid conditions. Kuerdening is the region with the highest precipitation in the Tianshan Mountains. The multi-year average precipitation, temperature, and evapotranspiration in the study area are 366.2 mm, −2.2 °C, and 548.4 mm, respectively. The study area boasts the richest biodiversity in the Xinjiang Tianshan World Natural Heritage Site32. Kalajun is a typical representative of the Tianshan mountain meadow-steppe biotic zone, while Kuerdening exemplifies the Tianshan coniferous forest within the "Central Asian Mountain Steppe and Woodlands Ecoregion"33. The primary ecosystem in this region is the evergreen coniferous forest, represented by Schrenk's Spruce (Picea schrenkiana). This area is the center of distribution and origin of Schrenk's Spruce, a species endemic to the Tianshan region that plays a crucial role in water conservation, soil protection, and climate regulation. The primary management goal for the region is to ensure the sustainable development of the forest-grassland ecosystem and the protection of diverse species habitats. There are no permanent residents within the heritage site area. Every year, from June to September, herders enter the periphery of the heritage site for seasonal grazing and grassland management. In recent years, limited tourism activities have commenced, primarily in the northwestern and northeastern areas of the heritage site's buffer zone. All maps in this research were created using ArcGIS 10.8 and are original works by the authors.

Figure 1

(a) the location of the Tianshan Mountains34 in China; (b) the location of the study area in the Tianshan Mountains; (c) the elevation of the study area and the scope of the heritage site nomination and buffer zone. China map source: GS (2019)1822. The base map has not been modified.

Research design

The framework of this research encompasses three phases (Fig. 2): (1) Land cover classification. Leveraging the Google Earth Engine (GEE) platform, we conducted a supervised classification of Landsat remote sensing images from the years 1994, 1997, 2007, 2017, and 2023 for the Kalajun-Kuerdening region. Using the Random Forest algorithm, we classified the area into four land cover types: Grassland, Forest, Ice/snow, and Bareland. (2) Spatial–temporal patterns of land cover changes. This phase comprised a comprehensive spatio-temporal assessment of land cover transition types across the four time intervals. Additionally, we used intensity analysis to analyze the temporal changes in land cover. (3) Driving factors of land cover changes. Using the Random Forest algorithm and the Geodetector model to analyze the driving factors behind land cover changes, we identified the factors with strong explanatory power. These factors were then overlaid with land cover changes to determine the main distribution areas and changing trends under each influencing factor.

Figure 2

Research framework for spatiotemporal patterns and driving factors of land cover changes.

Dataset and data process

Land cover data

After screening for available scenes and cloud cover, this study identified the years 1994, 1997, 2007, 2017, and 2023 as key temporal nodes to analyze changes in the study area. The remote sensing images for 1994, 1997, and 2007 were obtained from the "LANDSAT/LT05/C02/T1_L2" dataset, while the images for 2017 and 2023 came from the "LANDSAT/LC08/C02/T1_L2" dataset. These datasets correspond to the Landsat 5 TM sensor and the Landsat 8 OLI/TIRS sensor, respectively, and both were accessed via the Google Earth Engine platform. To ensure consistency and accuracy in the analysis, the images from June to August of each year were combined using a median filtering technique. Details regarding the specific image names and the determination of years are documented in Supplementary Table S1.
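The median step described above can be illustrated with a minimal pure-Python sketch. It assumes that each June to August scene is a small 2D grid for a single band and that cloud-masked pixels are stored as `None`; the function name `median_composite` and this data layout are hypothetical and are not taken from the study, which ran the equivalent operation on the Google Earth Engine platform.

```python
from statistics import median


def median_composite(scenes, nodata=None):
    """Per-pixel median across scenes, ignoring masked (nodata) pixels.

    `scenes` is a list of equally sized 2D lists (one band per scene).
    A pixel that is masked in every scene stays `nodata` in the output.
    """
    rows, cols = len(scenes[0]), len(scenes[0][0])
    composite = [[nodata] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the valid observations of this pixel over the season.
            valid = [s[r][c] for s in scenes if s[r][c] is not nodata]
            if valid:
                composite[r][c] = median(valid)
    return composite
```

Pixels that remain masked after compositing are the gaps that the later interpolation step has to handle.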

Driving factors data

The driving factors can be primarily categorized into four parts (Fig. 3). The first category comprises climate factors, including Precipitation (PRE), Temperature (TEM), and Evapotranspiration (EVA). The second category relates to topographical factors, primarily involving Elevation (DEM), Aspect (ASP), and Slope (SLOP). The third category encompasses other natural factors, with a primary focus on the Distance from rivers (DFRI) and Soil type (SOIL). The fourth category deals with human activity factors, primarily considering Distance from major tourist roads and grazing roads (DFRO) and Distance from grazing sites (DFGS).

  1. (1)

    Climate factors. We selected precipitation (PRE), temperature (TEM), and evapotranspiration (EVA) as the three factors to represent the climate conditions of the study area. Climate data from a single year may be influenced by extreme weather or anomalous events, which can cause deviations from the long-term climate trends of the region. By using more than 20 years of data, we can smooth out these short-term fluctuations, reduce the impact of interannual variability, and more accurately reflect the climate averages of the area. Precipitation (PRE), temperature (TEM), and evapotranspiration (EVA) data were sourced from the National Earth System Science Data Center, National Science & Technology Infrastructure of China35 (http://www.geodata.cn). Monthly precipitation, temperature, and evaporation data spanning from 1994 to 2022 were acquired and subsequently converted into TIFF format using ArcGIS. Precipitation and evapotranspiration values were obtained by summing the monthly data of each year and then averaging the annual totals over the years. Temperature values were obtained by summing the monthly data of each year, dividing by 12, and then averaging the annual means over the years. Precipitation and evapotranspiration are expressed in millimeters (mm), and temperature in degrees Celsius (°C).

  2. (2)

    Topography factors. The elevation data (DEM), measured in meters (m), is sourced from NASA DEM (https://lpdaac.usgs.gov/) with a resolution of 30 m. The aspect (ASP) and slope (SLOP), both measured in degrees (°), were generated using ArcGIS software's Aspect and Slope tools, also with a resolution of 30 m. ASP is divided into 9 categories based on the angles of azimuth.

  3. (3)

    Other ecological factors. We created a river network using hydrological analysis in ArcGIS and refined it with OpenStreetMap data (downloaded on December 15, 2023) and field survey data collected from July 15–20, 2021, and June 30–July 7, 2022. The distance from rivers (DFRI) was then generated using the Euclidean Distance tool in ArcGIS, with a cumulative distance threshold of 1000 m and a resolution of 30 m. SOIL, including types such as Mollic Gleysols (GLm), Haplic Chernozems (CHh), Luvic Chernozems (CHl), Mollic Leptosols (LPm), Gelic Leptosols (LPi), and Glaciers (GG), is sourced from the Harmonized World Soil Database (https://www.fao.org/soils-portal/), using the attribute field SU_SYM90 for soil type, with a resolution of 1000 m. SOIL is classified into 6 categories based on the original data.

  4. (4)

    Human activity factors. The road data, derived from OpenStreetMap, were refined based on field survey data. Given that the roads in the study area are primarily unpaved, and considering the walking distances of tourists and the daily travel of residents, a cumulative distance threshold of 1000 m was established. The DFRO data were generated using the Euclidean Distance tool in ArcGIS, with a resolution of 30 m. The distance from grazing points (DFGS) was created based on the latitude and longitude recorded during the survey, the point data provided by local management personnel, and the distribution areas of yurts in the remote sensing images. We employed the Euclidean Distance tool in ArcGIS, setting the cumulative distance threshold at 5000 m in view of the daily activity range of cattle and sheep. The distances to grazing points were generated with a resolution of 30 m based on the grazing point data. As the study area is a nature reserve and World Heritage site, the construction of permanent roads is strictly restricted, and there are no permanent residents, only a few summer grazing points that have remained in nearly the same locations for an extended period.
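The multi-year averaging described for the climate factors can be sketched as follows for a single pixel. This is a minimal illustration under stated assumptions: the function name `multi_year_mean`, the `annual` switch, and the dictionary layout (year mapped to 12 monthly values) are hypothetical; the study itself performed this aggregation on rasters in ArcGIS.

```python
def multi_year_mean(monthly_by_year, annual="sum"):
    """Aggregate monthly values to one long-term value for one pixel.

    annual="sum":  annual total, then mean over years
                   (as described for PRE and EVA).
    annual="mean": annual total divided by 12, then mean over years
                   (as described for TEM).
    """
    if annual not in ("sum", "mean"):
        raise ValueError("annual must be 'sum' or 'mean'")
    annual_values = []
    for year in sorted(monthly_by_year):
        months = monthly_by_year[year]
        if len(months) != 12:
            raise ValueError(f"year {year} does not have 12 monthly values")
        total = sum(months)
        annual_values.append(total if annual == "sum" else total / 12)
    # Averaging over many years smooths out single-year anomalies.
    return sum(annual_values) / len(annual_values)
```

Calling the function once per pixel with `annual="sum"` yields a multi-year mean annual total in mm, and with `annual="mean"` a multi-year mean temperature in °C.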

Figure 3

Driving factors after reclassification. PRE represents the multi-year average precipitation; TEM represents the multi-year average temperature; EVA represents the multi-year average evaporation; DEM represents the elevation; ASP represents the aspect; SLOP represents the slope; DFRI represents the distance to rivers; SOIL represents the soil type; DFRO represents the distance to roads; and DFGS represents the distance to grazing and settlement points.

Except for SOIL and ASP, all driving factors are divided into 9 categories using the Jenks natural breaks method in ArcGIS. All data use the WGS_1984_UTM_Zone_43N coordinate system and a uniform resolution of 30 m.

Methodology

Supervised land cover classification

(1) Cloud masking was performed on the remote sensing images to remove clouds and shadows, followed by applying median filtering to synthesize a single image from the June to August images for each year. Due to excessive missing pixels after cloud masking in 2017, the median-filtered image from 2016 was used for interpolation; (2) The NDVI, NDBI, and MNDWI bands were calculated and added to the dataset; (3) Sample points were selected based on Google Earth and field survey data. Given that the river widths in the study area are much smaller than the image resolution and difficult to identify, the land cover in Kalajun-Kuerdening was broadly classified into four categories: Grassland, Forest, Ice/Snow, and Bareland; (4) 70% of the sample points were used as training samples and 30% as validation samples, with supervised classification conducted using the Random Forest algorithm; (5) Accuracy validation was performed on the classified data. (6) To address missing pixels after cloud removal and in cases where adjacent years' remote sensing images also lack data, the mode interpolation method was used to complete the images. The specific steps are as follows: traverse the raster data matrix and check the position of each pixel for missing data values. For pixels with missing data, extract the surrounding 3 × 3 neighborhood matrix and flatten it into a one-dimensional array. Identify the positions of missing data values within the one-dimensional array, remove these missing values, and retain the valid values. If the array is empty after removing the missing data values, no further action is taken; if not empty, calculate the frequency of each value in the array, identify the most frequently occurring value (the mode), and replace the current missing data value with this mode. This method enhances data integrity by addressing data gaps; (7) Accuracy validation. 
First, 1000 randomly generated points were used for overall consistency checks of the supervised land cover classification to ensure comprehensive coverage and preliminary validation of the data. The validation data include the widely recognized Huang Xin’s36 30 m CLCD land cover dataset and Esri's 10 m Sentinel-2 land cover dataset (https://livingatlas.arcgis.com/landcoverexplorer/). Given that the CLCD dataset is updated through 2022, while the Sentinel-2 dataset has been updated since 2017, we conducted consistency checks by comparing land cover classification data from 1994, 1997, and 2007 with the CLCD dataset, and compared data from 2017 and 2023 with the Sentinel-2 dataset. Additionally, to further verify the accuracy of land cover classification, verification points were selected in collaboration with various authors involved in the supervised classification. These points were chosen based on the area proportion and ecological significance of each land cover type. Specifically, 200 points were chosen for Grassland, 150 for Forest, 50 for Ice/Snow, and 100 for Bareland, using Landsat raw remote sensing images and Google Earth imagery as references. The steps for supervised classification of the study area are shown in Fig. 4.
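The mode interpolation in step (6) can be sketched in a few lines of pure Python. The sketch assumes the classified raster is a 2D list of integer class codes and that missing pixels carry the code `-1`; the names `mode_fill` and `NODATA` are hypothetical. Two further choices are assumptions rather than details given in the text: neighbors are read from the original grid, so the result does not depend on traversal order, and ties between equally frequent classes are resolved in favor of the class encountered first.

```python
from collections import Counter

NODATA = -1  # hypothetical code for missing pixels


def mode_fill(grid, nodata=NODATA):
    """Fill missing pixels with the mode of their valid 3 x 3 neighbors."""
    rows, cols = len(grid), len(grid[0])
    filled = [row[:] for row in grid]  # leave the input untouched
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != nodata:
                continue
            # Flatten the 3 x 3 window (clipped at the raster edge)
            # and keep only the valid class codes.
            neighbors = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if grid[rr][cc] != nodata
            ]
            if neighbors:  # an empty window is left as missing
                filled[r][c] = Counter(neighbors).most_common(1)[0][0]
    return filled
```

Pixels whose whole neighborhood is missing stay unfilled, matching the "no further action is taken" case in the description.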

Figure 4

Steps for supervised classification based on Random Forest algorithm in GEE platform.

Intensity analysis

Intensity analysis is based on the land cover change matrix to analyze the magnitude of land cover changes, simultaneously comparing the intensity of each level of land cover change with uniform intensity37,38. In the context of inconsistent intervals between years of land cover change, it is possible to conduct an in-depth quantification of land cover changes and elucidate the underlying processes.

  1. (1)

    Interval level: The interval level assesses whether land cover changes relatively quickly or slowly during specific time intervals within the study period. It serves as the foundation for the category and transition analyses. Equation (1) calculates the intensity \({S}_{t}\) of land cover change within the time interval \(\left[{Y}_{t},{Y}_{t+1}\right]\) in the study area, while Eq. (2) provides the uniform intensity \(U\) for the entire study period \(\left[{Y}_{1},{Y}_{T}\right]\). When \({S}_{t}>U\), land cover changes faster during that time interval than the uniform rate of change across the entire study period; otherwise, it changes more slowly than the uniform rate.

$$\begin{array}{c}{S}_{t}=\frac{change\, area \,during\,\left[{Y}_{t},{Y}_{t+1}\right]}{area \,of \,region\,}\times \frac{1}{{Y}_{t+1}-{Y}_{t}}\times 100\%\end{array}$$
(1)
$$\begin{array}{c}U=\frac{change\, area \,during\,\left[{Y}_{1},{Y}_{T}\right]}{area \,of \,region\,}\times \frac{1}{{Y}_{T}-{Y}_{1}}\times 100\%\end{array}$$
(2)
  2. (2)

    Category level: Calculating the size and intensity of the total gains and total losses of each category in each time interval shows whether the changes in a category during that interval are stable. Equations (3) and (4) give the annual gain intensity \({G}_{ti}\) of category i and the annual loss intensity \({L}_{tj}\) of category j, which are compared with the uniform intensity \({S}_{t}\) of the interval calculated by Eq. (1). If \({G}_{ti}>{S}_{t}\), category i gains relatively actively in this time interval. If \({G}_{ti}<{S}_{t}\), category i is in a relatively dormant state. The same applies to losses.

$$\begin{array}{c}{G}_{ti}=\frac{gain\, area \,of\,category \,i \,during\,\left[{Y}_{t},{Y}_{t+1}\right]}{area \,of \,category \,i \,at\, {Y}_{t+1}}\times \frac{1}{{Y}_{t+1}-{Y}_{t}}\times 100\%\end{array}$$
(3)
$$\begin{array}{c}{L}_{tj}=\frac{loss\, area \,of \,category \,j \,during\,\left[{Y}_{t},{Y}_{t+1}\right]}{area\, of \,category \,j \,at \,{Y}_{t}}\times \frac{1}{{Y}_{t+1}-{Y}_{t}}\times 100\%\end{array}$$
(4)
  3. (3)

    Transition level: This level involves assessing the magnitude and intensity of transitions from other categories to a specific category. Equation (5) is used to represent the transition intensity from category i to category n within a specific time interval. Equation (6) represents the uniform transition intensity from other categories to category n. Using the new transition matrix model39 allows for a clear visualization of transition intensities, magnitudes, and the stability of transitions between different land categories. In Eq. (7), the ID (Intensity Deviation) represents the stability of transitions between land categories. The size of the bubble represents the scale of the transition, while the color of the bubble (ID, Intensity Deviation) represents the stability of the transition intensity.

$$\begin{array}{c}{R}_{tin}=\frac{transition\, area \,from \,i \,to \,n \,during\,\left[{Y}_{t},{Y}_{t+1}\right]}{area\, of \,category \,i \,at \,{Y}_{t}}\times \frac{1}{{Y}_{t+1}-{Y}_{t}}\times 100\%\end{array}$$
(5)
$$\begin{array}{c}{W}_{tn}=\frac{gain\, area \,of \,category \,n \,during\,\left[{Y}_{t},{Y}_{t+1}\right]}{all \,areas \,except \,category \,n \,at\, {Y}_{t}}\times \frac{1}{{Y}_{t+1}-{Y}_{t}}\times 100\%\end{array}$$
(6)
$$\begin{array}{c}ID={R}_{tin}-{W}_{tn}\end{array}$$
(7)
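The interval, category, and transition intensities of Eqs. (1) and (3)–(7) can all be derived from one transition matrix per interval. The following is a minimal sketch, assuming `matrix[i][j]` holds the area that was category i at \({Y}_{t}\) and category j at \({Y}_{t+1}\), and that every category has a non-zero area at both dates; the function names and the matrix layout are hypothetical and are not taken from the study.

```python
def _totals(matrix):
    """Return category areas at Y_t, at Y_t+1, and the region area."""
    n = len(matrix)
    start = [sum(matrix[i]) for i in range(n)]                      # rows
    end = [sum(matrix[i][j] for i in range(n)) for j in range(n)]   # columns
    return start, end, sum(start)


def interval_intensity(matrix, years):
    """Eq. (1): annual share of the region that changed category (%)."""
    _, _, region = _totals(matrix)
    changed = region - sum(matrix[k][k] for k in range(len(matrix)))
    return changed / region / years * 100.0


def category_intensity(matrix, years):
    """Eqs. (3) and (4): annual gain and loss intensity per category (%)."""
    start, end, _ = _totals(matrix)
    n = len(matrix)
    gain = [(end[k] - matrix[k][k]) / end[k] / years * 100.0 for k in range(n)]
    loss = [(start[k] - matrix[k][k]) / start[k] / years * 100.0 for k in range(n)]
    return gain, loss


def transition_intensity(matrix, years, i, n):
    """Eqs. (5)-(7): R_tin, W_tn and ID = R_tin - W_tn for i -> n."""
    start, end, region = _totals(matrix)
    r_tin = matrix[i][n] / start[i] / years * 100.0
    w_tn = (end[n] - matrix[n][n]) / (region - start[n]) / years * 100.0
    return r_tin, w_tn, r_tin - w_tn
```

A positive ID means that category n gains from category i more intensively than it would under a uniform distribution of its gains across the other categories, and a negative ID indicates the opposite.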

Driving factors importance ranking

The importance of land cover change drivers is closely tied to land use transition types and development stages. In this study, we employed the Random Forest algorithm and the Geodetector model to investigate the importance of the factors driving land cover change over four time intervals, spanning from 1994 to 2023, within the study area. We first assessed the importance ranking of each individual factor separately. We then used the Geodetector model's interaction detector to explore how any two driving factors interact with each other.

Random forest algorithm

The Random Forest algorithm is an ensemble learning method that enhances prediction accuracy and mitigates overfitting by combining multiple decision trees. A notable advantage of the algorithm is that it does not require cross-validation or an independent test set to obtain an unbiased error estimate, because the Out-of-Bag (OOB) samples serve this purpose. The algorithm relies on two vital parameters, mtry and ntree: mtry specifies the number of variables considered for splitting at each node, while ntree defines the number of decision trees in the model. In our study, we fitted Random Forest models iteratively, starting from mtry = 1 and evaluating the Mean Squared Error (MSE) of each model; the lowest MSE was obtained with mtry = 3. With mtry set to 3, we selected 300 decision trees, at which the OOB error was stable (Fig. 5). The increase in node purity (IncNodePurity) is measured by the reduction in the residual sum of squares (RSS) and represents the impact of each variable on the heterogeneity of observations at each node of a tree. This metric is used to compare the importance of different variables. Specifically, when a variable is used to split a node, it reduces the heterogeneity of observations within the node (i.e., increases node purity), and the corresponding reduction in RSS is the IncNodePurity contribution of that variable. A larger value indicates that the variable contributes more to the model's predictive accuracy and therefore ranks higher in variable importance. We use IncNodePurity as the basis for the importance ranking in the Random Forest algorithm.
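The RSS reduction behind IncNodePurity can be made concrete with a single split. The sketch below is only an illustration of the quantity for one node and one threshold, with hypothetical function names; a full Random Forest accumulates such reductions over every node in which a variable is used and over all trees, which this sketch does not attempt.

```python
def rss(values):
    """Residual sum of squares around the mean of `values`."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def rss_reduction(x, y, threshold):
    """Drop in RSS when a node is split on x <= threshold.

    The larger the drop, the purer the two child nodes are
    compared with the parent node.
    """
    left = [yi for xi, yi in zip(x, y) if xi <= threshold]
    right = [yi for xi, yi in zip(x, y) if xi > threshold]
    return rss(y) - rss(left) - rss(right)
```

For example, with `x = [1, 2, 3, 4]` and `y = [1.0, 1.0, 3.0, 3.0]`, splitting at 2 separates the two response levels completely and removes all of the parent RSS, whereas splitting at 1 removes only part of it.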

Figure 5

Determination of the parameters mtry and ntree of the Random Forest algorithm.

Geodetector model
  1. (1)

    Single factor effect

Factor detection can quantitatively detect the extent to which a certain driving factor can explain the spatial differentiation of the dependent variable. The expression of the q value (Wang et al., 2010) is:

$$\begin{array}{c}q=1-\frac{\sum_{h=1}^{L}{N}_{h}{\sigma }_{h}^{2}}{N{\sigma }^{2}}=1-\frac{SSW}{SST}\end{array}$$
(8)

In Eq. (8), q takes values in [0,1]. The larger the q value, the greater the explanatory power of the driving factor \({X}_{i}\) for the dependent variable Y; h = 1, …, L indexes the strata of the dependent variable Y or the driving factor \({X}_{i}\); N and \({\sigma }^{2}\) are the number of units and the variance of Y in the whole area; \({N}_{h}\) and \({\sigma }_{h}^{2}\) are the number of units and the variance of Y in stratum h; SSW is the within-strata sum of squares; and SST is the total sum of squares.
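Equation (8) can be computed directly from paired observations of Y and a stratified factor. The following minimal sketch assumes `y` is a list of numeric values and `strata` is a list of stratum labels of the same length; the function name `q_statistic` is hypothetical, and the sketch covers only the q value, not the significance test of the factor detector.

```python
def q_statistic(y, strata):
    """Geodetector factor-detector q = 1 - SSW / SST (Eq. 8)."""
    n = len(y)
    mean_all = sum(y) / n
    # SST equals N * sigma^2 with the population variance of Y.
    sst = sum((v - mean_all) ** 2 for v in y)

    groups = {}
    for value, label in zip(y, strata):
        groups.setdefault(label, []).append(value)

    # SSW is the sum over strata of N_h * sigma_h^2.
    ssw = 0.0
    for values in groups.values():
        mean_h = sum(values) / len(values)
        ssw += sum((v - mean_h) ** 2 for v in values)

    return 1.0 - ssw / sst  # undefined (division by zero) if Y is constant
```

When the strata separate the values of Y perfectly, SSW is zero and q equals 1; when every stratum reproduces the overall spread of Y, q equals 0.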

  2. (2)

    Two-factor interaction

Interaction detection evaluates whether two driving factors \({X}_{1}\) and \({X}_{2}\), when acting together, increase or decrease the explanatory power for the dependent variable Y. The first step is to calculate q(\({X}_{1}\)), q(\({X}_{2}\)), and q(\({X}_{1}\cap {X}_{2}\)), where q(\({X}_{1}\cap {X}_{2}\)) is the q value when \({X}_{1}\) and \({X}_{2}\) interact; the second step is to compare these three values. The interaction types are shown in Table 1.

Table 1 Type of interaction between drivers \({X}_{1}\) and \({X}_{2}\).
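The comparison in the second step can be sketched as a small classifier. The body of Table 1 is not reproduced in this text, so the thresholds below are an assumption based on the criteria commonly given in the Geodetector literature (comparison of the joint q value with the minimum, the maximum, and the sum of the two single-factor q values); the function name, the label strings, the tolerance, and the handling of exact boundary values are likewise hypothetical.

```python
import math


def classify_interaction(q1, q2, q12, tol=1e-9):
    """Classify the interaction of two factors from their q values.

    Assumed criteria (commonly used Geodetector interaction types):
      q12 < min(q1, q2)               -> nonlinear weakening
      min(q1, q2) < q12 < max(q1, q2) -> single-factor nonlinear weakening
      q12 > max(q1, q2)               -> bi-factor enhancement
      q12 = q1 + q2                   -> independent
      q12 > q1 + q2                   -> nonlinear enhancement
    """
    if math.isclose(q12, q1 + q2, abs_tol=tol):
        return "independent"
    if q12 > q1 + q2:
        return "nonlinear enhancement"
    if q12 > max(q1, q2):
        return "bi-factor enhancement"
    if q12 > min(q1, q2):
        return "single-factor nonlinear weakening"
    return "nonlinear weakening"
```

The independence check comes first so that floating-point noise around q(\({X}_{1}\)) + q(\({X}_{2}\)) is not misread as enhancement.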

Results

Land cover patterns and spatial–temporal variations of changes

Land cover classification results and patterns

Using the Google Earth Engine platform, land cover classification was performed for Kalajun-Kuerdening with the Random Forest algorithm. The overall accuracy and Kappa coefficient for all five years were both higher than 0.94. The supervised classification results were checked for consistency with Huang Xin’s36 land cover product and Esri’s Sentinel-2 land cover dataset, and the overall accuracy and Kappa coefficient were both over 0.7. These consistency values were relatively low because of differences in the locations of Bareland and Ice/snow. In addition, 500 sample points were used to further verify the accuracy of the supervised classification data in this study. Details of the accuracy validations can be found in Supplementary Tables S2, S3 and S5.

Kalajun-Kuerdening features a south-to-north slope, primarily composed of Grassland and Forest, followed by areas of Bareland and Ice/snow. The densest forest distribution in the study area is concentrated in the northwest of Kalajun and the northeast of Kuerdening. In the central part of the study area, a transition occurs from Forest to Grassland, culminating in areas of Bareland. The southern part of the study area is characterized by glaciers and permanent snow cover. From 1994 to 2023, significant changes occurred in the land cover of the study area (Fig. 6a–e), with bar graphs of land cover categories from 1994 to 2023 (Fig. 6f) revealing clear trends. Forest showed a significant growth trend, with an increase of 55.96 km2, corresponding to a growth rate of 16.61% between 1994 and 2023; Grassland exhibited a slight increase, expanding by 18.16 km2 (a growth rate of 1.64%).

Figure 6

Maps of land cover types and histogram of land cover changes.

Spatial–temporal patterns of land cover changes

According to Fig. 7, Forest decreased by 8.27 km2 from 1994 to 1997 and then showed a marked growth trend after 1997, with the fastest expansion occurring between 2007 and 2017, when it grew by 12.87%. From 1994 to 1997, Grassland decreased by 63.5 km2; more specifically, 34.59 km2 converted to Forest, 21.13 km2 to Ice/Snow, and another 43.13 km2 turned into Bareland. Between 1997 and 2007, 30.79 km2 of Forest converted to Grassland, marking the highest conversion from Forest to Grassland during the study period. Figure 8 (1)–(3) indicate that areas where Forest converted to Grassland were concentrated on the periphery of the study area, particularly in Kalajun and central Qiaxi, where some "Forest to Grassland" areas reverted to Forest during 2007–2017, a phenomenon possibly linked to timber production in state-owned forest farms and subsequent natural forest conservation policies. Between 2007 and 2017, 63.86 km2 of Grassland converted to Forest, marking the phase of most rapid Forest increase. Throughout the study period, the area of Bareland continually decreased, with Fig. 7 showing that the largest proportion of Bareland converted to Grassland, mainly in the central and southern parts of the study area. From 2017 to 2023, land cover transitions were relatively stable.

Figure 7

Overall and detailed changes in land cover at different intervals.

Figure 8

Overall and detailed changes in land cover at different intervals.

Intensity analysis of land cover changes

Interval level

The study intervals are of unequal length. Consequently, considering only the overall change in land cover area within each interval does not fully reflect the pace of transitions between land cover types. From the right side of Fig. 9, it can be seen that the average annual change in area during the entire study period accounted for 1.78% of the total study area. Although the change in area was smallest from 1994 to 1997, the annual average intensity of land change reached a peak of 3.2% during this period. Between 2017 and 2023, the annual average change intensity was 2.30%, ranking second. The annual average change intensities for the periods 1997–2007 and 2007–2017 were 1.18% and 1.64%, respectively.

Figure 9

Interval change area and intensity of total area at different intervals.

Category level

We can quantitatively analyze the activity of different categories during the research time intervals at the category level. Figure 10 illustrates the extent and intensity of gains and losses for four land categories across different time intervals. Between 1994 and 2023, there has been a notable instability in changes across various land categories. Specifically, from 1994 to 1997, various categories saw relatively intense gains and losses, and this intensity slightly decreased during the periods of 1997–2007 and 2007–2017.

Figure 10

Annual change area and intensity of land cover categories’ gains and losses at different intervals.

Between 1994 and 1997, the loss of Grassland and Forest exceeded their respective gains. The intensity of Forest change during this period was significantly higher than the average intensity, indicating substantial fluctuations. Grassland changes during this period also exhibited the greatest magnitude among the four time intervals studied. In the three subsequent intervals after 1997, the gains in Grassland and Forest consistently surpassed the losses. Notably, from 2007 to 2017, the Forest growth intensity reached 1.74%, marking the highest change intensity for forests among the four intervals. From 2017 to 2023, both Grassland and Forest increases exceeded the decreases, with the annual average change intensity falling below the overall average for the study period.

Overall, only during the period from 1994 to 1997 did Forest and Grassland experience a net loss. In the other three intervals, both land cover types exhibited net gains. The annual change intensity for Forest and Grassland showed a decreasing trend across the four time intervals, indicating a move towards stability.

Transition level

Figure 11 shows the scale and intensity of mutual conversion between the four land types in the four time intervals. The transformation between different land types in different intervals is analyzed. Various patterns of transition emerge among these land categories.

Figure 11

Transition pattern at four time intervals.

The transition intensities from Ice/Snow to Grassland, from Grassland to Bareland, and from Bareland to Forest are lower than the average intensity during the study period, showing a relatively stable avoidance tendency. Because most of the southern part of the study area lies at high altitude, the precipitation and snow conditions in a given year strongly influence the transition between Bareland and Ice/Snow. Across the four research intervals, transitions between Bareland and Ice/Snow are large in scale and high in intensity. In 1997, Ice/Snow not only extensively covered the southern mountainous areas but also affected the northern forests and grasslands. Consequently, from 1994 to 1997, the scale and relative intensity of transitions from the other three land categories to Ice/Snow were considerable. In the three intervals from 1994 to 2017, the transitions from Grassland to Forest were notable in terms of scale and relative intensity, with the highest relative intensity observed from 2007 to 2017. The “Forest” line in “Transition from I” also shows that Forest is converted mostly into Grassland, while the transition between Forest and Bareland is stationary. The scale of conversion from Grassland to Bareland gradually decreases, while the intensity of conversion from Bareland to Grassland increases. Most Grassland gains come from Bareland.

In summary, the most intense transformations occur between Bareland and Ice/Snow. Forest gains originate primarily from Grassland, and Grassland gains stem mostly from Bareland.
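The "targeting" and "avoidance" language used at the transition level can be made concrete with a small sketch. It assumes the usual Intensity Analysis convention: for a gaining category n, the observed annual intensity of the transition from category i to n (relative to the start-year area of i) is compared with the uniform intensity that would result if the gain of n were spread evenly over all other categories. The matrix and the function name `transition_intensity` are hypothetical.

```python
# Hypothetical transition matrix (km^2); rows = start year, columns = end year.
CATEGORIES = ["Forest", "Grassland", "Bareland", "Ice/Snow"]
EXAMPLE_MATRIX = [
    [90.0, 8.0, 2.0, 0.0],
    [12.0, 180.0, 6.0, 2.0],
    [1.0, 10.0, 80.0, 9.0],
    [0.0, 2.0, 8.0, 90.0],
]


def transition_intensity(matrix, years, gaining):
    """Return (uniform intensity, {source index: observed intensity}) in % per year
    for the gain of the category with index `gaining`."""
    size = len(matrix)
    sources = [i for i in range(size) if i != gaining]
    gain_area = sum(matrix[i][gaining] for i in sources)
    start_area_of_sources = sum(sum(matrix[i]) for i in sources)
    # Uniform intensity: the gain spread evenly across every non-gaining category.
    uniform = gain_area / years / start_area_of_sources * 100.0
    observed = {
        i: matrix[i][gaining] / years / sum(matrix[i]) * 100.0
        for i in sources
    }
    return uniform, observed


if __name__ == "__main__":
    forest = CATEGORIES.index("Forest")
    uniform, observed = transition_intensity(EXAMPLE_MATRIX, years=10, gaining=forest)
    for i, value in observed.items():
        tendency = "targets" if value > uniform else "avoids"
        print(f"Forest gain {tendency} {CATEGORIES[i]}: "
              f"{value:.3f} vs uniform {uniform:.3f} % per year")
```

In this invented example the Forest gain targets Grassland and avoids Bareland, which mirrors the pattern described in the text without reproducing its values.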

Driving factors of land cover changes

Single factor effects

Figure 12 shows the importance ranking of the driving forces behind land cover changes in the study area, as estimated by the Random Forest algorithm and the Geodetector model. The R² values of the Random Forest algorithm for the four time intervals are 0.799, 0.813, 0.801 and 0.758, respectively, indicating that the algorithm explained the land cover changes in the study area well and consistently across intervals. All variables passed the significance test at p < 0.01 in the Geodetector model’s factor detector. Spearman correlation was used to test the consistency of the importance rankings produced by the two methods. Table 2 shows that the rankings from the Random Forest algorithm and the Geodetector model are significantly consistent in all four time intervals. Driving factors that receive the same ranking from both methods can therefore be regarded as having stronger explanatory power and as providing a more reliable reference for explaining the causes of land cover changes.

Figure 12

The factor importance ranking of Random Forest algorithm (left) and Geodetector model (right).

Table 2 Spearman correlation results of Random Forest algorithm and Geodetector model drivers importance ranking.
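The consistency test between the two rankings can be sketched as follows. This is a minimal illustration of Spearman's rank correlation, computed as the Pearson correlation of average ranks. The importance scores and q values below are invented for the eight factor abbreviations used in this section, and no significance test is included.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


# Hypothetical scores, NOT the study's results.
FACTORS = ["DEM", "PRE", "TEM", "EVA", "ASP", "DFRI", "DFRO", "DFGS"]
RF_IMPORTANCE = [0.30, 0.20, 0.15, 0.12, 0.10, 0.06, 0.04, 0.03]
GEODETECTOR_Q = [0.40, 0.25, 0.22, 0.18, 0.20, 0.05, 0.02, 0.03]

if __name__ == "__main__":
    rho = spearman(RF_IMPORTANCE, GEODETECTOR_Q)
    print(f"Spearman rho between the two rankings: {rho:.3f}")
```

A coefficient close to 1 indicates that the two methods order the factors in nearly the same way; the significance levels reported in Table 2 would require an additional test that this sketch omits.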

Overall, combining the two methods strengthens the evidence for the key driving factors of land cover change that they share. As depicted in Fig. 13, both the Random Forest algorithm and the Geodetector model indicate that, for 1994–1997, elevation, precipitation, and temperature play an important role in driving land cover change in the study area. For 1997–2007, 2007–2017 and 2017–2023, the top four driving factors remain consistent and keep the same rankings: DEM, PRE, TEM, and EVA. Taken together, the results from both methods and all four time intervals indicate that topographical factors (DEM and ASP) and climatic factors (PRE, TEM, and EVA) have relatively strong explanatory power for land cover change in the study area from 1994 to 2023. Although their rankings fluctuate somewhat, the influence of DFRO and DFGS remains relatively minor, which could be attributed to the low density of tourist roads and primary livestock routes within the study area.

Figure 13

The intersection of the top four driving factors quantified by the Random Forest algorithm and the Geodetector model. Note: The green circles represent the top four driving factors affecting land cover change calculated by the Random Forest algorithm, the blue circles represent the top four driving factors affecting land cover change calculated by the Geodetector model, and the middle represents the common top four factors calculated by the two methods.

Two-factor interaction

The complexity of geographical processes often arises from the interplay of multiple factors rather than the isolated influence of individual factors. The Geodetector model’s interaction detector can reveal patterns of interaction among these driving factors. Figure 14a–d illustrates that the interactions between pairs of driving factors within the study area exceed the impact of single factors, thereby enhancing the explanatory capacity for land cover changes. In Fig. 14a–c, the majority of interaction types are classified as double-factor enhancement, while the rest fall into the category of nonlinear enhancement. In Fig. 14d, most interaction types are categorized as nonlinear enhancement, with the remaining few falling under double-factor enhancement. This suggests that land cover changes in the study area are driven not by a single factor or a single category of factors but by the interactions of multiple driving factors over time and space.
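The distinction between double-factor enhancement and nonlinear enhancement can be illustrated with a minimal sketch. It assumes the standard Geodetector definitions: the q statistic is one minus the ratio of within-stratum to total sum of squares, and the interaction q is computed on the strata formed by overlaying two factors. The data, the stratum labels, and the function names are hypothetical, and continuous factors are assumed to have been discretized beforehand.

```python
import math
from collections import defaultdict


def q_statistic(y, strata):
    """Geodetector q: share of the variance of y explained by the stratification."""
    mean = sum(y) / len(y)
    total_ss = sum((v - mean) ** 2 for v in y)
    if total_ss == 0:
        raise ValueError("y has no variance, so q is undefined")
    groups = defaultdict(list)
    for value, stratum in zip(y, strata):
        groups[stratum].append(value)
    within_ss = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        within_ss += sum((v - group_mean) ** 2 for v in values)
    return 1.0 - within_ss / total_ss


def interaction_type(q1, q2, q12):
    """Classify the interaction of two factors from their single and joint q values."""
    low, high = min(q1, q2), max(q1, q2)
    if q12 < low:
        return "nonlinear weakening"
    if q12 < high:
        return "single-factor nonlinear weakening"
    if math.isclose(q12, q1 + q2, abs_tol=1e-12):
        return "independent"
    if q12 > q1 + q2:
        return "nonlinear enhancement"
    # Joint q at or above the larger single q but below their sum.
    return "double-factor enhancement"


if __name__ == "__main__":
    # Invented example: neither factor explains y alone, but together they do.
    y = [1, 1, 9, 9, 9, 9, 1, 1]
    x1 = list("aaaabbbb")
    x2 = list("uuvvuuvv")
    q1 = q_statistic(y, x1)
    q2 = q_statistic(y, x2)
    q12 = q_statistic(y, list(zip(x1, x2)))  # overlay of the two stratifications
    print(q1, q2, q12, interaction_type(q1, q2, q12))
```

The invented data are deliberately extreme so that the joint stratification explains far more than the two factors separately, which is the defining feature of nonlinear enhancement.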

Figure 14

Interaction factor of Geodetector model at four time intervals.

DEM and ASP exhibit the strongest interaction across the first three time intervals, while the strongest interaction between 2017 and 2023 is observed between DEM and DFRI. When climatic factors, such as precipitation, temperature, and evapotranspiration, interact with aspect, their impact on land cover changes becomes more pronounced. Distance from rivers, soil type, proximity to tourist roads, distance from main livestock routes, and proximity to primary grazing points show relatively weak explanatory power in factor detection but stronger explanatory power when interacting with elevation. Overall, taken together with the single-factor results, this indicates that elevation is the most crucial driving factor influencing land cover changes.

Overlay analysis of land cover changes and driving factors

Through a comprehensive analysis of both single-factor and two-factor influences, elevation, precipitation, temperature, and evapotranspiration have been identified as the four dominant factors driving land cover changes. Because the interaction between elevation and aspect further enhances the explanation of land cover dynamics, precipitation, temperature, evapotranspiration, aspect, and elevation were selected as the key influencing factors. Given the focus on Grassland and Forest conservation in Kalajun-Kuerdening, the analysis concentrates on four types of transitions: from Bareland to Grassland, Grassland to Bareland, Grassland to Forest, and Forest to Grassland. Tallying the area of each of these four transitions at the same values of each influencing factor makes it possible to identify their main distribution ranges and trends.
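The tallying step described above amounts to summing the transition area within bins of a factor. The following sketch assumes 30 m pixels (0.0009 km² each) and uses invented elevation samples; the function name `area_by_bin`, the 100 m bin width, and the transition labels are illustrative choices rather than the settings used in this study.

```python
from collections import defaultdict

PIXEL_AREA_KM2 = 0.0009  # one 30 m x 30 m pixel


def area_by_bin(pixels, bin_width, pixel_area_km2=PIXEL_AREA_KM2):
    """Sum the area of each transition type within bins of a factor.

    pixels: iterable of (factor_value, transition_label) pairs, one per pixel.
    Returns {transition_label: {bin_lower_edge: area_km2}}.
    """
    table = defaultdict(lambda: defaultdict(float))
    for value, label in pixels:
        # Floor division also handles negative values such as sub-zero temperatures.
        lower_edge = (value // bin_width) * bin_width
        table[label][lower_edge] += pixel_area_km2
    return {label: dict(bins) for label, bins in table.items()}


if __name__ == "__main__":
    # Hypothetical (elevation in m, transition) samples.
    samples = [
        (1850, "Grassland->Forest"),
        (1890, "Grassland->Forest"),
        (2150, "Grassland->Forest"),
        (2900, "Grassland->Bareland"),
        (2950, "Bareland->Grassland"),
        (2980, "Bareland->Grassland"),
    ]
    for label, bins in area_by_bin(samples, bin_width=100).items():
        for lower_edge in sorted(bins):
            print(f"{label}: {lower_edge}-{lower_edge + 100} m, {bins[lower_edge]:.4f} km2")
```

Plotting the binned areas against the bin edges for each factor would give curves of the kind summarized in the overlay figure.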

Figure 15 reveals that the majority of Grassland-to-Forest transitions occur under conditions of average yearly precipitation between 275 and 375 mm, average yearly temperatures of −2 to 3 °C, annual evapotranspiration rates of 450–600 mm, and at elevations ranging from 1800 to 2600 m. The transition area follows a sinusoidal curve with respect to aspect, indicating that forests in Kalajun-Kuerdening are primarily distributed on shaded and semi-shaded slopes. Conversions from Grassland to Forest are more extensive at aspects of 0°–110° and 220°–359.9°. Areas where Forest transitions to Grassland are predominantly found at elevations of 1600–2500 m in the northern fringe zones of the study area.

Figure 15

Overlay analysis of land cover change and influencing factors.

The distribution of Grassland converting to Bareland reaches a turning point at an elevation of 2800 m, above which the vegetation is predominantly alpine meadow. Figure 15 clearly shows that in these regions, the area of Grassland transitioning to Bareland is significantly less than that of Bareland transitioning to Grassland. This indicates that from 1994 to 2023, the alpine meadow ecosystems in the Kalajun-Kuerdening area have demonstrated a positive development trend.

Discussion

Land cover changes and the impact of related policies

The land cover in the study area exhibits significant spatial variations. Figures 7 and 8 show that during 1994–1997 and 1997–2007, small, concentrated patches of Forest were converted to Grassland in the northwest and northern regions (Kalajun and Qiaxi) of the study area. Between 2007 and 2017, some of these "Forest-to-Grassland" areas reverted to Forest (Fig. 8c). This phenomenon is not incidental.

Firstly, it is noteworthy that Kalajun-Kuerdening falls under the jurisdiction of the Western Tianshan Forestry Bureau in Xinjiang, one of the 135 key state-owned forestry enterprises established after the founding of the People's Republic of China. Since the 1950s, large-scale forest resource development in this region, primarily driven by timber harvesting, led to significant consumption of natural forests. Some of the forests were cleared and gradually converted to grassland (Fig. 8a). However, the implementation of the Natural Forest Protection Project (NFPP) in 1998 marked a critical turning point. With this policy, China began to restrict and eventually halt commercial logging of natural forests, especially in key ecological protection zones40. For the Kalajun-Kuerdening, the Xinjiang Uygur Autonomous Region decided in 2004 to completely stop logging natural forests, meaning that areas previously used for timber production ceased logging activities. Under this policy context, areas previously converted to grassland due to logging had the opportunity to recover. The NFPP not only halted further forest depletion but also actively promoted forest resource restoration41. Measures such as forest nurturing, afforestation, and natural regeneration gradually took effect between 2007 and 2017, leading to the reforestation of some areas initially converted to grassland42,43. This recovery process reflects the effectiveness of the protection policy and demonstrates that ecosystems in these areas can naturally recover once disturbances cease.

Additionally, the establishment of the West Tianshan National Nature Reserve in 2000 and the inclusion of the area in the World Natural Heritage List in 2013 further strengthened the protection of the regional ecosystem44. These protective measures not only limited human impacts on the environment but also created favorable conditions for forest restoration.

From 1994 to 2023, policies evolved from prioritizing the timber economy to balancing protection and the timber economy, and eventually to focusing primarily on protection. The forest area in the study region initially decreased and then increased, with notable reforestation of "Forest-to-Grassland" areas in Kalajun and Qiaxi during 2007–2017. The trends in land cover changes align closely with the policy evolution, reflecting the ongoing recovery from the ecological impacts of timber harvesting and validating the effectiveness of natural forest protection policies. This process indicates that well-designed policy interventions can facilitate ecosystem recovery within a relatively short period and underscores the importance of ecological protection policies. The changes observed in the Kalajun-Kuerdening region after the implementation of the NFPP highlight that the old model of relying on timber harvesting for livelihood is no longer viable and that comprehensive protection of natural forest resources must continue to be strictly enforced.

Driving mechanism of land cover changes

Using the Random Forest algorithm and Geodetector model, analysis from 1994 to 2023 shows that natural geographical factors like elevation, aspect, precipitation, temperature, and evapotranspiration play a more significant role in land cover changes than other factors. This finding aligns with studies in other alpine regions that emphasize the impact of climate and topography in less human-impacted areas on land cover dynamics45,46,47.

Kalajun-Kuerdening is located in a high, arid alpine region, and its land cover is strongly shaped by altitude and hydrothermal conditions. The area's elevation varies from 1300 to 4300 m, transitioning from alpine forests at lower altitudes to meadows and eventually to snow- and ice-covered terrain as altitude increases. Moving from north to south, the elevation rises, and with it, temperatures fall and precipitation increases, affecting vegetation patterns. The area's rugged, high alpine terrain, alongside restrictions from conservation policies, limits human activities like tourism and grazing, making natural factors the primary drivers of land cover changes in this region.

Although the area of alpine meadows above 2800 m converting to Bareland is significantly less than that of Bareland converting to meadows, the degradation of meadows above this altitude warrants attention from conservation and management personnel. While this study demonstrates that human activities play a relatively small role in explaining land cover changes within the study area, close vigilance must still be maintained for land cover changes occurring at and outside the study area. In addition, long-term ecological assessment of tourist areas is crucial for effective management of the area and regulating the intensity of tourism development.

Limitations and prospects

Due to the lack of imagery data in the study area before 2006, it has been impossible to obtain long-term continuous or consistent interval land cover change information since 1994, representing the greatest limitation of this study. Additionally, the rivers in the study area are primarily fed by snowmelt, and their channels typically measure only one to two meters in width, which is far less than 30 m. As a result, 30-m resolution remote sensing imagery is insufficient for accurately identifying the locations of these rivers. With the ongoing development and widespread availability of high-resolution satellite remote sensing data, future research can utilize continuous high-resolution remote sensing data for long-term monitoring of land cover changes in the study area, enabling timely adjustments to conservation and management policies.

Furthermore, given the significant explanatory power of climate on land cover changes, our future research can utilize long-term, high-resolution data to conduct more detailed analyses of changes in the upper limits of forests. Additionally, the effects of climate change on glacier retreat can also be explored. This will help provide a more comprehensive understanding of the role of climatic factors in alpine ecosystems.

Conclusion

Taking the alpine heritage site of Kalajun-Kuerdening as a case study, this research provides an in-depth investigation of the patterns of land cover change and the ranking of driving factors in a high mountain heritage landscape. Compared with previous studies, we overlay land cover changes on the factors with high explanatory power to identify the factor ranges in which each land cover transition is concentrated, thereby laying the foundation for further optimization of conservation policies.

(1) The use of Mode Interpolation effectively resolves issues of land cover classification completeness in high-altitude areas, which are caused by cloudy conditions and missing image data in preceding and subsequent years.

(2) Between 1994 and 2023, Forest and Grassland increased by 55.96 km² and 18.16 km², respectively. During the period from 1994 to 1997, Forest and Grassland exhibited a net loss, with the highest annual conversion intensity and the most significant changes. The primary increase in forest area occurred from 2007 to 2017, with a growth rate reaching 12.87%. The trends and spatiotemporal distribution of forest changes align with the evolution of forest protection policies. A substantial amount of bare land has been converted to grassland, indicating a positive shift in land cover within the study area.

(3) Consistent validation through the Random Forest algorithm and the Geodetector model indicates that elevation, precipitation, temperature, and evapotranspiration have high explanatory power for land cover changes.

(4) According to the overlay analysis, the transition from Grassland to Forest is most favorable under specific environmental conditions: annual precipitation between 275 and 375 mm, annual temperature between −2 and 3 degrees Celsius, annual evapotranspiration between 580 and 750 mm, elevation between 1800 and 2600 m, and slope directions between 0 to 110 degrees and 220 to 359.9 degrees.

Overall, through continuous monitoring of land cover changes and their driving factors in the typical alpine World Natural Heritage site of Kalajun-Kuerdening, valuable insights have been provided for the scientific and precise implementation of future land management and conservation policies. In the future, we need to adhere to the principle of prioritizing conservation, focusing on areas prone to negative land cover transitions, and continuously monitoring the impacts of climate change on land cover. Additionally, it is essential to pay special attention to the land cover status in areas with increasing human activities, such as tourism development.