Introduction

Roads are an archetypal example of the tension between human development and environmental degradation, especially in mountainous areas, as they are necessary for the former and can lead directly to the latter1. Landslides are among the world’s worst natural disasters and lead to loss of life and property, blocking of roads and rivers, disruption of communication and triggering of floods2. Among major disasters between 1980 and 2002, slides (including avalanches and landslides) killed 640 people in Colombia in 1987, 472 people in Nepal in 2002 and 400 people in India in 1995, while the average number of deaths per landslide was 116.25 in Nepal and 71.41 in China3. Many landslides are derived from or related to road networks4. In particular, mountain roads are the most prodigious source of landslide sediments of all widespread land uses5. During construction, roads are cut into or through steep hillsides by blasting and excavating, and cut-and-fill construction can create large areas of instability. Cutting into hillsides and removing the toe of slopes, or filling slopes to widen and reinforce roads, both effectively reduce slope cohesion and strength and contribute to slope failures (Fig. 1). On the downslope side of built roads, piles of discarded construction material are common. Moreover, road construction interrupts surface drainage, ditches and culverts, alters subsurface water movement, changes the distribution of mass and increases erosion because of road-related deforestation and construction activities2,5. All of these factors can facilitate landslides during and after road construction. For example, the volume of slide material removed from road rights-of-way in the western Cascade Range, Oregon, has been 65470 m³/km², which is 30 times the rate of slide activity in undisturbed forested areas6.
Unprecedented rates of landslides and surface erosion were noted after the construction of the Weixi-Shangri road (23.5 km) in Yunnan province, China. These rates averaged up to 9600 t ha⁻¹ (ref. 5).

Figure 1

Typical road-induced landslides: fillslope failure (FSF) and cutslope failure (CSF), with the main keys for interpreting them in remote sensing imagery. The main scarps of both FSF and CSF have relatively distinct borders and are brighter than their surroundings. The image texture shows the traces and piles left by movement of the displaced material and by surface erosion. Vegetation coverage is relatively low. CSFs, which show signs of excavation, are always located above and adjacent to roads on steep slopes, and FSFs are always located below and adjacent to roads on relatively steep slopes. The image was provided by DigitalGlobe and obtained from Google Earth 7.1.

However, as society and the economy develop, people in mountainous areas, who are often poor, have a strong desire to become wealthier. Constructing transportation networks is usually the first and key step to support tourism, trade, agricultural development and local travel. Considerable scientific literature has therefore been published on the development of landslide inventories, including susceptibility7,8,9,10 and hazard zoning for land-use planning, avoiding landslide-prone areas, engineering design11 and developing efficient ways to reduce future damage, but only a few studies have focused directly on road-induced landslides4,12,13,14,15.

With the development of computer and satellite technologies, remote sensing (RS) techniques have been widely used in landslide studies and are well suited to acquiring and analysing spatial data related to landslides16. Satellite imagery offers an economical and fast way to monitor and map landslides over large and inaccessible areas. The last few decades have witnessed the increasing use of RS techniques, such as interpretation of aerial photography, stereoscopic image analysis, interferometry, and Light Detection and Ranging (LiDAR), for identifying, detecting, monitoring, cataloguing, risk assessment and mapping of landslides17,18,19,20,21,22, but few studies have reported automated methods for extracting road-related landslide inventories16,23. Borghuis24 showed that unsupervised classification could detect 63% of all landslides mapped manually. Automated and semi-automated methods can improve efficiency and lower the workload compared with time-consuming visual interpretation, which is subject to the subjectivity of the interpreter19,20.

Therefore, in view of the success of the Normalized Difference Vegetation Index (NDVI) and the free availability of Landsat Operational Land Imager (OLI) data, our study aimed to design a new remote sensing index, the Normalized Difference Road Landslide Index (NDRLI), and to use it in conjunction with object-oriented classification of Landsat OLI satellite images and slopes derived from a digital elevation model (DEM) to automatically classify the locations and areas of road-induced landslides (i.e., cutslope and fillslope failures resulting from cut-construction and fill-construction, respectively). The NDRLI-based method was tested in Yunnan province, southwest China, the most landslide-prone area in the country, which has experienced or is currently experiencing extensive road construction. This method offers a new solution for mapping landslides for road management and other relevant applications.

Results

Following the NDRLI-based road-induced landslide classification method described in the Materials and Proposed NDRLI-based Method section, we calculated the NDRLI to classify potential road-induced landslides and the Shape Index of Spectral Curve (SISC) to reduce bare soil and farmland noise. We then removed shadow areas using object-oriented classification and further reduced farmland using the slope-angle rule. The entire process was performed using ENVI, ArcGIS and Google Earth. The final classification of road-induced landslides is shown in Fig. 2. The total area of road-induced landslides in the study area is 4.38 km², which accounts for 4.47% of the total study area.

Figure 2

The distribution of road-induced landslides from the Normalized Difference Road Landslide Index (NDRLI)-based method applied to Landsat 8 OLI data26. The map was generated using ArcGIS 10.1 (http://www.esrichina.com.cn/softwareproduct/ArcGIS/).

Classification accuracy is the primary issue for applying the NDRLI-based method in fields such as road planning and risk assessment, so it is necessary to evaluate the performance of the method. We sampled validated areas from Google Earth as ground-truth regions of interest and used them to calculate a confusion matrix for the classified images in image processing software. The overall accuracy is 98.49%, and the Kappa coefficient is 0.51, which indicates moderate agreement. Misclassification increases with distance from roads. Previous research has shown that the share of landslides declines as distance from roads increases and that landslides usually occur within 0–50 m of roads (ref. 25). In addition, Fig. 2 clearly shows that most landslides are near the S233 road, and accuracy decreases as distance from roads increases. The Kappa coefficient increased to 0.74 when we reduced the study area to a 100 m buffer along the road.
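The gap between a very high overall accuracy and a moderate Kappa coefficient is typical when one class (here, non-landslide background) dominates the validation sample. The following minimal sketch shows how both statistics are derived from a confusion matrix. The function name and the pixel counts are hypothetical and are not the validation data of this study.

```python
def accuracy_and_kappa(matrix):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] is the number of validation pixels of reference class i
    that the classifier assigned to class j.
    """
    n = sum(sum(row) for row in matrix)
    # Observed agreement: share of pixels on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    # Chance agreement: expected diagonal share given the marginal totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical counts: rows = reference (landslide, background),
# columns = classified (landslide, background).
example = [[40, 60],
           [30, 9870]]
oa, kappa = accuracy_and_kappa(example)
print(f"overall accuracy = {oa:.4f}, kappa = {kappa:.2f}")
```

Because the background class supplies almost all of the agreement, the chance-agreement term is close to the observed agreement, which pulls Kappa well below the overall accuracy.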

Conclusion and Discussion

The NDRLI-based method is a new method developed primarily to extract road-induced landslides and to enhance their presence in remotely sensed digital imagery while, in combination with other information, removing bare soil, farmland, water and vegetation features. The method can quickly and efficiently discriminate road-induced landslides from the background. Noise from vegetation, water, bare soil and farmland can be reduced and even removed. The method is automatic and would be very useful for large regions, especially mountainous areas with low accessibility. Away from roads, landslides caused by other human engineering activity may also be extracted by the NDRLI-based method because of their similarity to road-induced landslides, but the accuracy is probably lower because misclassification increases with distance from roads.

The Sentinel-2 sensor is very similar to Landsat 8, at least in bands 1–7 used in this study, and has higher spatial resolution. We first assumed that Sentinel-2 images would perform well in the NDRLI-based method and then adapted the method step by step. Sentinel-2A data of the study area (20 m and 10 m spatial resolution) were downloaded from the European Space Agency (https://scihub.copernicus.eu/) and pre-processed. The NDRLI was adjusted for the band differences (Equation 1). The SISC threshold was set to 1.15, obtained through the trial-and-error procedure described in the Optimizing results section. The road-induced landslides were then extracted from the Sentinel-2A data. Moreover, we built a 7.5 m buffer, based on the actual road width, along a centreline of provincial road S233 digitized in Google Earth to remove road noise, which arises from the coarse resolution of the Shuttle Radar Topography Mission (SRTM) DEMs. Finally, the area of road-induced landslides was found to be 5.52 km², which accounts for 5.63% of the total study area (Fig. 3). The overall accuracy is 94.63%, and the Kappa coefficient is 0.81. Compared with the Landsat results, the landslide area and Kappa coefficient increased by 1.14 km² and 0.07, respectively, mainly because of the higher spatial resolution of Sentinel-2A. The overlapping area of the two results accounts for 70.3% of the total landslide area derived from Landsat 8, while the non-overlapping area mostly comes from small landslides derived from Sentinel-2.

$${\rm{NDRLI}}=\frac{{\rm{SWIR}}-{\rm{BLUE}}}{{\rm{SWIR}}+{\rm{BLUE}}}$$
(1)
Figure 3

The distribution of road-induced landslides from NDRLI-based method based on Sentinel-2 data.

Several aspects remain for further study. First, terrain shadow strongly affects classification accuracy. Extracting landslides from shadowed areas of OLI images is very difficult, so our method did not consider this situation. Second, misclassification of bare soil and farmland still exists. With an appropriate NDRLI threshold this noise could be removed, so thresholds should be tested to find suitable values for different areas. Third, each pixel of a Landsat 8 OLI image covers 900 m² (30 m × 30 m), so pixels at the edges of landslides may contain other features, and such mixed pixels reduce classification accuracy. Fourth, we used Google Earth images as the ground truth for landslides without field investigation21 or LiDAR22, which might make the validation slightly unreliable.

Although the Kappa coefficient is not high, we believe this study is meaningful given the mixed characteristics of landslides. In the future, the shadow effect could be overcome by combining shadow enhancement and detection with high-resolution images. Using other features (e.g., texture) is another direction to try. This method could provide insights for further studies.

Materials and Proposed NDRLI-based Method

Study area and data

The Lancang River runs through the Hengduan Mountains with its complex landforms. This area is known as a remote mountainous region and is the most landslide-prone area in the country, especially in the midstream region. Approximately 1184 km of roads, including highways, national roads and provincial roads, were built in the midstream region with its steep mountains and deep valleys. The road density is extremely high. Thus, we selected a section of S233 (a provincial road) and G214 (a national road) along the middle of the Lancang River as targets to build a 2 km wide road buffer to establish our study areas (Fig. 4).

Figure 4

Locations and basic information of the study area, generated by ArcGIS 10.1 (http://www.esrichina.com.cn/softwareproduct/ArcGIS/). The length of the S233 in the validation area is 49 km.

Landsat 8 OLI images (24-12-2014, Path 132/Row 041) were downloaded from the United States Geological Survey (http://glovis.usgs.gov/)26. SRTM DEMs with 30 m spatial resolution were generated by the National Aeronautics and Space Administration and provided by the Data Center for Resources and Environmental Sciences, Chinese Academy of Sciences (RESDC) (http://www.resdc.cn). Google Earth was chosen as the major data source to extract and validate road-related landslides because it allows users to obtain free high-resolution satellite images from around the world and to measure the length and height of target objects. It has become a very popular and reliable data source, or an additional option, in many studies27,28,29. For example, Goudie27 used Google Earth imagery to quantify pan and creek characteristics of salt marshes at a 100 × 100 m scale.

Technical framework

The NDRLI-based methodology includes three major stages: sampling road-induced landslides, designing RS indexes, and optimizing results to eliminate other mixed types (Fig. 5). In the first stage, the locations and areas of cutslope and fillslope failures along roads were delineated and randomly sampled in Google Earth by visual interpretation according to the interpretation keys (Fig. 1). Because landslide surfaces are often a mixture of soil, rock and detrital grain, images acquired during dry periods are preferable, as they reduce the impact of moisture on landslide interpretation. Landsat OLI satellite images were pre-processed, including radiometric calibration, atmospheric correction and image clipping. In the second stage, we identified the locations of known road-induced landslides and sampled spectral signatures at the same locations in the OLI images. Plotting spectrum curves and analysing the correlation among the 7 OLI bands revealed spectral differences between landslides and other surface features, which helped to develop a new RS index (NDRLI). In the final stage, other mixed surface features, such as bare land, were eliminated from the potential road-induced landslide regions to optimize the final landslide area.

Figure 5

Workflow for NDRLI-based road-induced landslides classification.

Designed NDRLI

As outlined in Fig. 5, 194 samples, including 155 fillslope and 39 cutslope failures, were delineated along the S233 and G214 roads in Google Earth. The spectrum curves of these points are plotted in Fig. 6. Spectrum curves are mainly determined by material composition, such as soil, sand and rock, and are affected by moisture.

Figure 6

Cumulative spectrum curves of road-induced landslides (n = 194).

Figure 6 clearly shows that the reflectance of most road-induced landslides reaches its maximum at band 6 and its minimum at band 2. Ground objects with high temperatures typically have a high shortwave infrared (SWIR) reflectance. Thus, landslides primarily covered by bare soil, gravel and detrital grain have a higher reflectance at SWIR than other objects at ambient temperatures (e.g., vegetation, water, soil) because the landslide surface has a higher temperature. The landslide spectrum curves peak at SWIR (1.6 μm) and reach their minimum at the blue band (0.45 μm). While band 6 is better at identifying bare soil and areas of low water content, band 2 is better at identifying soil and vegetation.

To find further differences among the 7 bands, principal component analysis of the bands was conducted after atmospheric correction of the images. According to the covariance and correlation matrices (Tables 1 and 2), band 6 has the largest covariance with the other bands, which means that band 6 could represent more information. Table 2 shows that the near-infrared (NIR) band is generally more independent of the other bands, followed by band 2 and band 6. The correlation coefficient between band 2 and band 6 is 0.84, which is the second smallest. Band 6 therefore carries ample information.

Table 1 The covariance matrix of each band of Landsat 8 (× 10³).
Table 2 The correlation matrix of each band of Landsat 8.
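As an illustration of the band statistics summarized in Tables 1 and 2, the sketch below computes a band-by-band covariance matrix and correlation matrix with NumPy. It is not the ENVI workflow used in the study: the function names are hypothetical, and the input is random synthetic data standing in for a 7-band OLI stack.

```python
import numpy as np


def band_statistics(stack):
    """Return (covariance, correlation) matrices for a (bands, rows, cols) stack."""
    stack = np.asarray(stack, dtype=np.float64)
    # One row per band, one column per pixel.
    flat = stack.reshape(stack.shape[0], -1)
    return np.cov(flat), np.corrcoef(flat)


def least_correlated_pair(corr):
    """Return the indices (i, j) of the two bands with the smallest |correlation|."""
    # Add 2 on the diagonal so a band is never paired with itself.
    masked = np.abs(corr) + 2.0 * np.eye(corr.shape[0])
    i, j = np.unravel_index(np.argmin(masked), masked.shape)
    return int(i), int(j)


# Synthetic stand-in for 7 bands of 20 x 20 pixels (assumption, not real imagery).
rng = np.random.default_rng(0)
stack = rng.random((7, 20, 20))
cov, corr = band_statistics(stack)
i, j = least_correlated_pair(corr)
print(f"least correlated bands (0-based indices): {i} and {j}")
```

On real imagery, low off-diagonal correlation between two bands suggests that a ratio of those bands carries information that neither band provides alone.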

The band-ratio method takes advantage of the differences in the reflectance of different wavelengths of light from any given surface30. For example, NDVI uses the condition where the features that have higher NIR and lower red light reflectance will be enhanced, while those with low red light reflectance and very low NIR reflectance will be suppressed or even eliminated.

Accordingly, the NDRLI was designed using principles similar to those of NDVI. The NDRLI is calculated as follows:

$${\rm{NDRLI}}=\frac{{\rm{SWIR}}1-{\rm{BLUE}}}{{\rm{SWIR}}1+{\rm{BLUE}}}$$
(2)

where SWIR1 is the first shortwave infrared band (band 6), BLUE is the blue light band (band 2), and NDRLI ranges from −1 to +1. The index is designed to (1) maximize the reflectance of road-induced landslides and probable bare soil by using SWIR1, (2) minimize the low reflectance of blue light by water and (3) take advantage of the high reflectance of SWIR by road-induced landslide and bare soil features. As a result, road-induced landslides and bare soils are enhanced, while water usually has negative values and is therefore suppressed. In addition, vegetation and farmland are enhanced even more than road-induced landslides. After many trial-and-error attempts, we found that NDRLI values of 0 to 0.5 probably indicate road-induced landslides, values of about 0.4–0.7 probably indicate farmland and bare soil, and values of about 0.7–1 probably indicate vegetation. Thus, the NDRLI ranges of road-induced landslides, farmland and bare soil overlap. This is consistent with the mixed characteristics of road-induced landslides.
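The index and the 0 to 0.5 candidate range described above can be sketched with NumPy as follows. This is a minimal illustration rather than the ENVI implementation used in the study: the function names and the reflectance values are hypothetical, and pixels whose two bands sum to zero are assumed to be no-data and set to NaN.

```python
import numpy as np


def ndrli(swir1, blue):
    """Normalized difference of SWIR1 and blue reflectance, in [-1, 1]."""
    swir1 = np.asarray(swir1, dtype=np.float64)
    blue = np.asarray(blue, dtype=np.float64)
    total = swir1 + blue
    # Assumption: a zero denominator marks no-data, so leave NaN there.
    out = np.full(total.shape, np.nan)
    np.divide(swir1 - blue, total, out=out, where=total != 0)
    return out


def landslide_candidates(index, low=0.0, high=0.5):
    """Boolean mask of pixels inside the reported landslide range."""
    # Comparisons with NaN are False, so no-data pixels are excluded.
    return (index >= low) & (index <= high)


# Hypothetical reflectances for four pixels.
swir1 = np.array([[0.30, 0.05], [0.40, 0.00]])
blue = np.array([[0.15, 0.10], [0.05, 0.00]])
index = ndrli(swir1, blue)
mask = landslide_candidates(index)
print(index)
print(mask)
```

Because the reported ranges overlap between 0.4 and 0.5, a mask built this way still contains farmland and bare soil, which is why further filtering is needed.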

Optimizing results

The delineated road-induced landslide information was often mixed with noise from farmland, bare soil and a small amount of built-up land. This happens because most road-induced landslides are mixtures, and the surfaces of bare soil with low vegetation coverage and of pre-planting farmland are mostly bare. For example, some unstable landslides become fully or partly covered with vegetation several years after road construction. The reflectance pattern of farmland and bare soil in the blue band and SWIR1 is similar to that of road-induced landslides, i.e., they both reflect shortwave infrared light more than blue light. As a result, the NDRLI also produces a positive value for farmland and bare soil.

To remove the farmland and bare soil noise from the potential landslide area identified by the NDRLI, we plotted the spectral reflectance patterns of common land cover types and landslides (cutslope and fillslope failures) from the test area of this study. A detailed examination of the signatures in Fig. 7 reveals that the average reflectance of farmland and bare soil at bands 3 (green), 4 (red) and 5 (NIR) forms concave curves, whereas that of landslides approximates convex curves. Therefore, if the mean of band 3 and band 5 is divided by band 4, the result should be greater than 1 for farmland and bare soil but less than or equal to 1 for landslides. Based on this assumption, we designed the SISC to remove noise. The SISC is expressed as follows:

$${\rm{SISC}}=\frac{({\rm{GREEN}}+{\rm{NIR}})/2}{{\rm{RED}}}$$
(3)
Figure 7

Spectral reflectance patterns of road-induced landslides (155 FSF and 39 CSF), farmland (n = 133), and bare soil (n = 99), in raw Landsat OLI satellite images23.

To further assess the SISC, we drew scatter plots comparing band 4 with the mean of band 3 and band 5 for the test areas of this study (Fig. 8). The SISC values for farmland and bare soil are clearly greater than 1, while those for landslides are close to 1. Because most road-induced landslides contain soil, rock and vegetation, we tested the SISC threshold and, after many trial-and-error attempts (Table 3), found 1.100 to be a suitable value for removing farmland and bare soil noise, i.e., ground objects with higher SISC values (>1.100) are removed as noise (Fig. 9).
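The SISC filter can be illustrated with a short NumPy sketch that applies Equation 3 and the 1.100 threshold to a candidate mask. The function names and the reflectance values are hypothetical, and red values of zero are assumed to be no-data.

```python
import numpy as np


def sisc(green, red, nir):
    """Shape Index of Spectral Curve: mean of green and NIR divided by red."""
    green, red, nir = (np.asarray(b, dtype=np.float64) for b in (green, red, nir))
    mean_green_nir = (green + nir) / 2.0
    # Assumption: a zero red value marks no-data, so leave NaN there.
    out = np.full(mean_green_nir.shape, np.nan)
    np.divide(mean_green_nir, red, out=out, where=red != 0)
    return out


def remove_sisc_noise(candidates, sisc_values, threshold=1.100):
    """Keep candidate pixels whose SISC does not exceed the threshold."""
    return np.asarray(candidates, dtype=bool) & (sisc_values <= threshold)


# Two hypothetical pixels: the first has a convex green-red-NIR shape
# (landslide-like), the second a strongly concave shape (farmland-like).
values = sisc(green=[0.10, 0.08], red=[0.14, 0.07], nir=[0.16, 0.30])
kept = remove_sisc_noise([True, True], values)
print(values, kept)
```

The threshold is a tunable parameter: the text reports 1.100 for Landsat 8 OLI and 1.15 for Sentinel-2A, so other sensors or regions would likely need their own trial-and-error calibration.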

Figure 8

Scatter plots of the band 4 digital number (DN) against the average DN of band 3 and band 5: (a) FSF (n = 155), (b) CSF (n = 39), (c) farmland (n = 133) and (d) bare soil (n = 99).

Table 3 Three test scenarios with different SISC values.
Figure 9

(a) FSF (n = 155), (b) CSF (n = 39), (c) farmland (n = 133) and (d) bare soil (n = 99).

The geometry and radiometry of satellite imagery are significantly affected by the topography of mountainous areas because of shadow effects31. Many classification methods fail in mountainous areas, where terrain shadow effects are widespread and difficult to eliminate. The same features have different spectral reflectance under terrain shadow. Therefore, we did not consider landslides in shadow areas. We randomly selected 20 shadowed and non-shadowed samples from the atmospherically corrected image for region-of-interest statistical analysis in ENVI 5.3. Among all bands, the overlap in spectral values between shadowed and non-shadowed areas was smallest in the NIR band, within the range 0–380. Spectral values of 0–380 in the NIR band were then used to create a classification rule for extracting shadow areas with the Feature Extraction tool of ENVI 5.3. Finally, the shadow areas were eliminated from the study area with an object-oriented classification method.
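The NIR rule can be approximated per pixel as shown below. This is a simplification: the study applied the rule to image objects with the ENVI Feature Extraction tool, whereas this hypothetical sketch thresholds individual pixels, and the NIR values are invented and assumed to use the same scaling as the atmospherically corrected image.

```python
import numpy as np


def shadow_mask(nir, low=0, high=380):
    """Boolean mask of pixels whose NIR value falls in the shadow range."""
    nir = np.asarray(nir)
    return (nir >= low) & (nir <= high)


def drop_shadow(candidates, nir):
    """Remove shadowed pixels from a boolean landslide-candidate mask."""
    return np.asarray(candidates, dtype=bool) & ~shadow_mask(nir)


# Hypothetical NIR values for four candidate pixels.
nir = np.array([[120, 380], [381, 2500]])
candidates = np.ones((2, 2), dtype=bool)
kept = drop_shadow(candidates, nir)
print(kept)
```

A pixel-level threshold tends to leave isolated speckle along shadow edges, which is one reason an object-based rule may be preferable in practice.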

Slope can also be used to remove farmland from potential road-induced landslides. The relationship between landslides and slope is well understood: slope is a key factor in landslides and one of the internal conditions that trigger them. Many statistics indicate that landslide-prone slopes are 20°–50° (ref. 32). On the other hand, most farmland is located on slopes of less than 15°, and in China farmland on slopes greater than 25° should be returned to forestland. Therefore, areas with a slope of less than 20° were removed to further reduce farmland and improve classification accuracy.
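The slope rule can be sketched as follows. This is an illustration under stated assumptions, not the ArcGIS procedure used in the study: slope is estimated with simple finite differences (`np.gradient`), the DEM is assumed to be in a projected coordinate system with square 30 m cells and elevations in metres, and the function names and the synthetic DEM are hypothetical.

```python
import numpy as np


def slope_degrees(dem, cell_size=30.0):
    """Slope angle in degrees from a 2-D elevation grid (same units as cell_size)."""
    dem = np.asarray(dem, dtype=np.float64)
    # Elevation change per unit distance along rows (y) and columns (x).
    dz_dy, dz_dx = np.gradient(dem, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))


def remove_gentle_slopes(candidates, slope, min_slope=20.0):
    """Keep candidate pixels whose slope is at least min_slope degrees."""
    return np.asarray(candidates, dtype=bool) & (slope >= min_slope)


# Synthetic ramp rising 30 m per 30 m cell along x, next to a flat surface.
ramp = np.tile(np.arange(5) * 30.0, (5, 1))
flat = np.zeros((5, 5))
candidates = np.ones((5, 5), dtype=bool)
kept_on_ramp = remove_gentle_slopes(candidates, slope_degrees(ramp))
kept_on_flat = remove_gentle_slopes(candidates, slope_degrees(flat))
print(kept_on_ramp.all(), kept_on_flat.any())
```

GIS packages typically use a 3 × 3 neighbourhood estimator for slope, so values from this sketch may differ slightly from those produced by ArcGIS on the same DEM.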