# Assessing and mapping multi-hazard risk susceptibility using a machine learning technique

## Abstract

The aim of the current study was to propose a multi-hazard probability assessment for Fars Province, Shiraz City, and its four strategic watersheds. First, we constructed maps depicting the most effective factors for floods (12 factors), forest fires (10 factors), and landslides (10 factors), and used the Boruta algorithm to prioritize the impact of each factor on the occurrence of each hazard. Subsequently, flood, landslide, and forest fire susceptibility maps were prepared using a Random Forest (RF) model in the R statistical software. Results indicate that 42.83% of the study area is not susceptible to any hazard, while 2.67% of the area is at risk of all three hazards. The multi-hazard map for Shiraz City indicates that 25% of the city is very susceptible to flooding, while 16% is very susceptible to landslides. Among the four strategic watersheds, landslides and floods are the most important hazards in the Dorodzan Watershed, whereas floods cover the largest area of the Maharlou Watershed. The Tashk-Bakhtegan Watershed is most susceptible to floods and landslides, in that order. Finally, in the Ghareaghaj Watershed, forest fire ranks as the strongest hazard, followed by floods. The validation results indicate AUC values of 0.834, 0.939, and 0.943 for the flood, landslide, and forest fire susceptibility maps, respectively. Other accuracy measures, including specificity, sensitivity, TSS, CCI, and the Gini coefficient, confirmed the AUC results. These results allow us to forecast the spatial behavior of such multi-hazard events, and researchers and stakeholders alike can apply them to evaluate hazards under various mitigation scenarios.

## Introduction

The Sendai Framework, with its comprehensive vision, recommends greater efforts to decrease disaster risk and increase sustainable development. In particular, communities that are increasingly susceptible to natural hazards should adhere to these guidelines and plan accordingly. In this regard, the multi-hazard approach is often used in risk reduction projects and studies addressing risks associated with human activities or climate change on a regional and local scale1. Introducing a universal set of multi-hazard assessment techniques is of fundamental importance for reducing disaster risk, and constitutes a valuable asset to share with other stakeholders, including the private sector and local governments.

In the current research, the term multi-hazard refers to the objective of reducing risk from several natural hazards, namely floods, landslides, and forest fires, within a specified spatial extent2,3. Recently, susceptibility modeling approaches for single processes have advanced considerably for river floods4 and landslides5,6,7. However, there is still neither a common terminology nor a uniform conceptual approach for analyzing multiple hazards in conjunction. This is not unexpected, because a multi-hazard analysis is not simply the sum of single-hazard examinations. The hazard characteristics and the methods used to analyze them differ completely8. A variety of quantification measures and susceptibility descriptions exist, which need to be adapted to enable the comparison of multiple hazards9. Also, natural processes have various effects on different elements at risk, and the techniques used to determine vulnerability diverge between hazards3. These topics constitute the main challenges for multi-hazard analyses.

The possibility of predicting which areas are susceptible to a specific type of disaster, including landslides or forest fires, is undisputed. The prediction techniques have proven valuable for predicting various characteristics of a natural disaster that has occurred10. Many researchers recognized that the occurrence of landslides and forest fires is influenced by various aspects that involve human activities and climate conditions11,12. Several methods for spatially modelling landslides and forest fires have been developed13,14.

### Studying the susceptible watersheds of natural hazards

The Dorodzan Watershed is one of the strategic areas of Fars Province and plays a very important role in the agricultural production and self-sufficiency of Iran. It supplies water resources to the Tashk and Bakhtegan lakes and is affected by wind erosion. Figure 6 illustrates that 36.35% and 68.64% of the Dorodzan Watershed are covered by the low class of susceptibility to flood and forest fire, respectively. However, regarding landslides, the moderate susceptibility class covers the largest area (42.45%). The Maharlou Watershed, a second-grade watershed of the Ministry of Energy, is the main source of the Kor River. In this watershed (Fig. 6), the moderate class of flood susceptibility covers the largest area (27.76%), whereas for forest fires (88.81%) and landslides (36.48%) the low susceptibility class covers the greatest area. The Ghareaghaj River, which is currently used for drinking and agricultural purposes, is one of the most important rivers in Fars Province. The construction of the Salman Farsi Dam in Qir and Karzin and studies on the construction of the Kavar Dam on this river indicate its importance for the province. In the Ghareaghaj Watershed (Fig. 6), the low susceptibility class covers the largest area for all three hazards (floods, forest fires, and landslides: 37.68%, 64.71%, and 42.80%, respectively). Moreover, the Bakhtegan and Tashk lakes are the most important source of water supply. Based on Fig. 6 (Tashk-Bakhtegan Watershed), the low susceptibility class covers the largest area for flood (33.85%) and forest fire (77.59%), while, based on the landslide susceptibility map produced by the RF model, 38.19% of the total area is covered by the moderate class.

### Studying the validation of natural hazard susceptibility maps

To produce the natural hazard susceptibility maps, the inventory of each hazard was divided into two data sets: one for modeling and one for validation. The accuracy of the three maps produced by the RF model was verified using ROC curves (Table S3). The AUC values for the flood, forest fire, and landslide maps were 0.834, 0.943, and 0.939, respectively. Regarding the standard error, floods had the highest value (0.028), followed by landslides (0.023) and forest fires (0.016). Further, the forest fire map had excellent accuracy (0.958), while the landslide and flood maps were rated as very good. The other measures (Table 4) also confirmed the accuracy of the three hazard maps: according to Table 4, the F-measure, specificity, and sensitivity of each hazard are above 0.77. Furthermore, the TSS index is 0.541 for floods, which indicates fair accuracy, whereas its values for landslides and forest fires were 0.889 and 0.850, respectively, indicating an excellent model based on the findings of Allouche et al. (2006)44. According to published reports, a Gini coefficient above 0.6 (60%) indicates a good model in terms of accuracy. Likewise, a CCI (overall accuracy) between 0.6 and 0.8 indicates good model accuracy. The RF model can therefore be regarded as an accurate classifier for the three depicted hazards.
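The measures named above can be illustrated with a small sketch. It is written in Python purely for illustration (the study itself used R), and it assumes the common definitions TSS = sensitivity + specificity - 1, CCI = share of correctly classified samples, and Gini = 2 × AUC - 1; the paper does not state its formulas in this section. The labels, scores, and the 0.5 cutoff are made-up values, not data from the study.

```python
from typing import Dict, Sequence


def auc_from_scores(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def threshold_metrics(labels: Sequence[int], scores: Sequence[float],
                      cutoff: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, TSS and CCI at a given probability cutoff (assumed 0.5)."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= cutoff else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "TSS": sensitivity + specificity - 1.0,
        "CCI": (tp + tn) / (tp + fp + tn + fn),
    }


def gini_from_auc(auc: float) -> float:
    """Gini coefficient under the common convention Gini = 2 * AUC - 1."""
    return 2.0 * auc - 1.0


# Hypothetical validation sample: 1 = hazard location, 0 = non-hazard location.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

auc = auc_from_scores(labels, scores)
print("AUC:", auc)
print("Gini:", gini_from_auc(auc))
print(threshold_metrics(labels, scores))
```

Under that convention, the reported flood AUC of 0.834 would correspond to a Gini coefficient of about 0.668, which is above the 0.6 threshold cited in the text.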

## Discussion

In this study, the importance of factors controlling landslide, flood, and forest fire locations was analyzed using the Boruta algorithm. The Boruta algorithm provides quantitative results, a significant advantage that allows studies in different regions around the world to be compared. As already stated, the study area is prone to combinations of landslides, floods, and forest fires. Generally, the development and formation of these natural hazards are controlled by several factors, so their spatial distribution is not random.
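The core idea behind Boruta, comparing each factor with randomly shuffled "shadow" copies, can be sketched as follows. This is a simplified illustration and not the authors' procedure: Boruta proper ranks features by Random Forest importance scores and applies a statistical test over many runs, whereas this sketch substitutes absolute Pearson correlation as a stand-in importance score. The data are synthetic and the feature names are hypothetical labels.

```python
import random
from statistics import mean
from typing import Dict, List


def abs_corr(x: List[float], y: List[float]) -> float:
    """Absolute Pearson correlation, used here as a stand-in importance score."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_hits(features: Dict[str, List[float]], target: List[float],
                n_trials: int = 20, seed: int = 0) -> Dict[str, int]:
    """Count how often each feature beats the best shuffled (shadow) feature."""
    rng = random.Random(seed)
    hits = {name: 0 for name in features}
    for _ in range(n_trials):
        # Shuffling a column destroys its link to the target but keeps its distribution.
        shadow_best = 0.0
        for values in features.values():
            shadow = list(values)
            rng.shuffle(shadow)
            shadow_best = max(shadow_best, abs_corr(shadow, target))
        for name, values in features.items():
            if abs_corr(values, target) > shadow_best:
                hits[name] += 1
    return hits


# Synthetic example: one informative factor and one pure-noise factor.
data_rng = random.Random(1)
target = [float(i % 2) for i in range(200)]
features = {
    "land_use": [t + data_rng.gauss(0, 0.3) for t in target],  # informative (hypothetical)
    "noise": [data_rng.gauss(0, 1) for _ in target],           # uninformative
}
print(shadow_hits(features, target))
```

A factor that beats the best shadow in nearly every trial would be kept, while one that rarely does would be a candidate for rejection.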

Based on the Boruta algorithm, the most statistically significant relationship was found between flood location and land use, and land use emerged as the most important factor influencing flood hazards among all considered variables. Wheater and Evans (2009)45 implied that land use affects the hydrology that determines water resources, leading to flood hazards. It is increasingly recognized that the management of water and land are strongly linked. Generally, steeper slopes are more vulnerable to massive erosion, including landslides. Slope steepness is reported as a factor of primary importance because it promotes high runoff velocity, which results in this type of erosion. Regarding slope and aspect, forest fires predominantly occur on the steep slopes of the southern areas, where vegetation is typically dry. The effects of slope and aspect on fire behavior have been reported by Adab et al. (2013)46. According to Pourghasemi (2016)47, topographic data (i.e. slope and aspect) are the most important factors for forest fire assessment. In contrast, Bui et al. (2017)48 found that NDVI (Normalized Difference Vegetation Index) had the strongest impact on the occurrence of forest fires. However, Hong et al. (2017)49 and Gigovic et al. (2019)50 demonstrated that slope has a significant positive effect on the occurrence of forest fire events. In relation to flood effective factors, the research carried out by Liu et al. (2005)51 confirmed that the urbanization scenario strongly increases flood volume. Likewise, afforestation has a beneficial effect and deforestation an adverse effect on the occurrence of floods.

Besides determining variable importance, the RF model was used to prepare susceptibility maps for landslides, floods, and forest fires, first separately (Fig. 2), and then jointly in the form of a multi-hazard map (Fig. 4). The susceptibility maps for floods, landslides, and forest fires revealed that most of the study area is characterized by low susceptibility to each hazard when analyzed separately (Fig. 3). The multi-hazard probability map modeled by RF revealed that most parts of the study area are not susceptible to any hazard, whereas only a few areas are at risk of all three hazards together (Fig. 3). Floods are recognized as the most dangerous hazard in the study area, followed by landslides and forest fires (Fig. S4). Further, effective flood risk reduction requires more analysis of this individual hazard and its interaction with the other hazards. Additionally, the validation of the RF models showed excellent accuracy for the forest fire and landslide susceptibility maps (Table S3). Pourghasemi et al. (2019)52 produced a susceptibility map for three hazards (i.e. landslides, floods, and earthquakes) using an ensemble model named SWARA-ANFIS-GWO. They showed that 17.14% of the area is affected by no hazard, whereas the largest share (33.70%) was susceptible to landslide and flood hazards together. They also reported accuracies of 84% and 80% for the flood and landslide maps, respectively. Skilodimou et al. (2019)53 applied the analytical hierarchy process (AHP) to produce separate maps for landslide, flood, and earthquake hazards and combined them into a single multi-hazard map. They showed that 80% of the landslide occurrences and all the recorded flood events fall within the boundaries of the moderate, low, and very low susceptibility classes.
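One plausible way to turn three single-hazard maps into a multi-hazard map and area shares is sketched below. The combination rule is an assumption made for illustration: each cell is flagged for a hazard when its predicted probability reaches a cutoff (0.5 here), and all cells are assumed to have equal area. This section of the paper does not specify the rule the authors used, and the probabilities are made-up values.

```python
from collections import Counter
from typing import Dict, List, Sequence

HAZARDS = ("flood", "landslide", "forest fire")


def combine_hazards(flood: Sequence[float], landslide: Sequence[float],
                    fire: Sequence[float], cutoff: float = 0.5) -> List[str]:
    """Label each cell with the hazards whose probability reaches the cutoff."""
    labels = []
    for probs in zip(flood, landslide, fire):
        present = [name for name, p in zip(HAZARDS, probs) if p >= cutoff]
        labels.append(" + ".join(present) if present else "no hazard")
    return labels


def area_share(labels: Sequence[str]) -> Dict[str, float]:
    """Percentage of cells per multi-hazard class, assuming equal-area cells."""
    counts = Counter(labels)
    total = len(labels)
    return {name: 100.0 * n / total for name, n in counts.items()}


# Hypothetical per-cell probabilities from three single-hazard models.
flood = [0.9, 0.2, 0.7, 0.1, 0.8]
landslide = [0.1, 0.1, 0.6, 0.2, 0.9]
fire = [0.2, 0.3, 0.1, 0.1, 0.7]

classes = combine_hazards(flood, landslide, fire)
for name, share in sorted(area_share(classes).items()):
    print(f"{name}: {share:.1f}%")
```

Statements such as "42.83% of the study area is not susceptible to any hazard" are shares of this kind, computed over the cells of the full raster.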

There are several advantages that make the RF model suitable for the approach in the present study. First, it is a simple, fast algorithm that makes no statistical assumptions and is characterized by a high prediction performance54,55. It produces an internally unbiased evaluation of generalizability with an accurate classifier during the forest building processes26 and provides better consistency of results and robustness of forecasts56. The RF can precisely handle heterogeneous inputs of different nature and scalability from different sources55,57. Another important benefit of the RF model is that there are significant criteria that indicate the importance of each predictor variable55,58. However, it has some sources of uncertainty that are frequently unacknowledged or even unrecognized.

One source of uncertainty in the modeling process is related to the gathered data. It is important to consider non-linear relationships between dependent and independent variables, which machine learning (ML) techniques can capture. One advantage of ML techniques over traditional methods (bivariate and multivariate statistical methods) is that ML algorithms can deal with noise in the data and remain accurate in the presence of uncertain data and limited measurement errors. Data quality is also important. In the current study, extensive field surveys were conducted to collect suitable data for all three hazards; however, the accuracy of the flood susceptibility map (Tables S3 and 4) suggests greater uncertainty than for the landslide and forest fire hazards, because selecting flood locations is more difficult than for the other hazards. Another source of uncertainty is the accuracy of the built model. To address this, different accuracy measures were applied, and the results are presented in Table 4. According to Table 4, the AUC values confirmed the accuracy of the built model for the three examined hazards, namely floods, landslides, and forest fires. Also, dividing the entire dataset into two sets for training (70%) and validation (30%) can help decrease uncertainty in a model’s performance. A further source of uncertainty lies in the limitations of the learned model. ML techniques such as RF are not faced with this problem, since the algorithm addresses this uncertainty using error rates (Table 4) and the out-of-bag (OOB) indicator. The out-of-bag error values for forest fires, landslides, and floods were 3.55%, 15.6%, and 22.27%, respectively.
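The 70%/30% split and the sampling that underlies the out-of-bag indicator can be sketched as follows. The sketch only shows how samples are partitioned; it does not train a forest. In a Random Forest, each tree is fitted on a bootstrap sample of the training data, and the samples a tree never saw (its out-of-bag samples) are used to estimate its error. The function names, the seed, and the shuffling procedure are illustrative assumptions; the count of 365 is the flood inventory size reported in the paper.

```python
import random
from typing import List, Tuple


def split_70_30(n_samples: int, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle sample indices and split them 70% / 30% (integer arithmetic)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = (7 * n_samples) // 10
    return indices[:n_train], indices[n_train:]


def bootstrap_with_oob(train_idx: List[int],
                       rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw one bootstrap sample (with replacement) and list the out-of-bag indices."""
    in_bag = [rng.choice(train_idx) for _ in train_idx]
    drawn = set(in_bag)
    oob = [i for i in train_idx if i not in drawn]
    return in_bag, oob


# 365 mapped flood locations, as reported in the study.
train, valid = split_70_30(365)
print(len(train), "training samples,", len(valid), "validation samples")

rng = random.Random(42)
in_bag, oob = bootstrap_with_oob(train, rng)
# On average roughly 37% of training samples are left out of each bootstrap draw.
print(f"out-of-bag fraction for one tree: {len(oob) / len(train):.2f}")
```

Because every tree leaves out a different subset, the out-of-bag error gives an internal estimate of generalization that is separate from the 30% validation set.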

The necessity of using machine learning techniques is increasingly emphasized in the susceptibility modeling of geomorphological features and processes37. A universal framework describing which factors to compare is required. This general framework can be semi-quantitative, qualitative, or quantitative3. It should be suitable for both single-hazard and multi-hazard assessments, because multi-hazard evaluation plays the main role in reducing disaster risk and provides crucial information for sharing with other stakeholders, such as local governments and the private sector55. Considering multiple hazards jointly and applying the same technique to analyze them can give a comprehensive view of the changes occurring in the environment. Further, a synthesized multi-hazard probability map supports planners in sustainable development and adaptive management because it provides homogenized information about different environmental hazards for a specific area64. This means that the potential of hazard evaluation becomes obvious when all hazards are considered together, so that plans and projects can be implemented on the basis of this comprehensive view of a region59. From this point of view, a multi-hazard probability map can be used for integrated and comprehensive watershed management and land use planning and, consequently, for the sustainable development of a region.

## Conclusion

A better understanding of the factors controlling flood, forest fire, and landslide occurrence is crucial to the sustainable development of regions prone to these three hazards, such as Fars Province. In this study, 365 floods, 358 forest fires, and 179 landslides were mapped for an area of 133,400 km². The Boruta algorithm enabled us to analyze the impact of effective factors on the occurrence of three different natural hazards. According to the Boruta algorithm, the most important factor controlling flood occurrence in the study area was land use, followed by drainage density and TWI. Among the factors controlling forest fire occurrence, residential areas ranked highest, followed by slope and aspect. For landslide occurrence, the highest-ranked conditioning factor was slope, followed by distance from rivers and lithology. The RF model was also applied to prepare susceptibility maps of flood, landslide, and forest fire locations. The multi-hazard probability map produced for floods, forest fires, and landslides in Fars Province revealed that the majority of the land is not prone to any hazard. Areas of 17.26%, 5.95%, and 14.16% were found to be at risk of floods, landslides, and forest fires, respectively, when each hazard occurs alone. However, 2.67% of Fars Province was found to be at risk of all three hazards together. Based on the AUC values, the best accuracy was determined for the forest fire susceptibility map, followed by the maps produced for landslides and floods. Further, the multi-hazard probability map prepared in this study can be used for integrated and comprehensive watershed management and land use planning and, consequently, for sustainable development in the study region.

## Acknowledgements

The study was supported by the College of Agriculture, Shiraz University (Grant No. 96GRD1M271143) and by the Austrian Science Fund FWF through the GIScience Doctoral College (DK W 1237-N23) at the University of Salzburg.

## Author information

### Contributions

H.R.P., N.K., M.A., M.E., M.Z., A.C. and T.B. designed the experiments, ran the models, analyzed the results, and wrote and reviewed the manuscript. T.B. critically discussed the results and helped with the writing. All authors reviewed the final manuscript.

### Corresponding author

Correspondence to Hamid Reza Pourghasemi.

## Ethics declarations

### Competing interests

The authors declare no competing interests.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

Pourghasemi, H.R., Kariminejad, N., Amiri, M. et al. Assessing and mapping multi-hazard risk susceptibility using a machine learning technique. Sci Rep 10, 3203 (2020). https://doi.org/10.1038/s41598-020-60191-3
