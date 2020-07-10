Study area

Bandhavgarh Tiger Reserve (BTR) is located between 23° 27′ 00″ and 23° 59′ 50″ North latitude and 80° 47′ 75″ and 81° 15′ 45″ East longitude in the Umaria district of Madhya Pradesh, central India (Fig. 9). The core zone of the reserve comprises the Panpatha Wildlife Sanctuary (PWS) in the north and Bandhavgarh National Park (BNP) in the south, which together cover 716 km². The surrounding buffer zone covers 820 km², bringing the total area of the reserve to 1,536 km². The reserve is surrounded by the fragmented and human-dominated territorial forest ranges of the North Shahdol Forest Division (NSFD) in the north and northeast and the South Shahdol Forest Division (SSFD) in the south-southeast. The territorial forest division of Umaria district (UFD) surrounds the reserve in the extreme south and southwest, and the Katni Forest Division (KFD) lies to the west of the reserve. The Sanjay-Dubri Tiger Reserve (SDTR) and Guru-Ghasidas Tiger Reserve (GGTR) are located about 80–150 km from the BTR in the northeast and southeast, respectively. The whole landscape (BTR, SDTR, NSFD, SSFD, and GGTR) is regarded as an important tiger and elephant conservation unit (Fig. 10).

Figure 9 Location of the Bandhavgarh Tiger Reserve, Madhya Pradesh, India. Map of the study area was created using ArcGIS (v 10.3) software developed by ESRI. https://www.esri.com.

Figure 10 Map showing the Bandhavgarh Tiger Reserve (BTR), Sanjay-Dubri Tiger Reserve (SDTR) in the west, North Shahdol Forest Division (NSFD), South Shahdol Forest Division (SSFD), between BTR and SDTR and Umaria Forest Division (UFD) in south and south east of BTR. The map was created using ArcGIS (v 10.3) software developed by ESRI. https://www.esri.com.

BTR represents moist deciduous vegetation dominated by sal (Shorea robusta) and sal mixed forests. The overall vegetation of the BTR comprises moist peninsular low-level sal forest, northern dry mixed deciduous forest, dry deciduous scrub, dry grassland and west Gangetic moist mixed deciduous forest73. BTR supports a wide variety of faunal assemblages, from small invertebrates to the largest bovid in Asia. There are 35 mammalian species, over 250 species of birds, and a wide variety of butterflies in the reserve. The deer species include chital (Axis axis), sambar (Rusa unicolor) and barking deer (Muntiacus muntjak). Indian gazelle (Gazella bennetti), four-horned antelope (Tetracerus quadricornis) and Indian blue bull (Boselaphus tragocamelus) are the three antelope species in BTR. Northern plains gray langur (Semnopithecus entellus) and rhesus macaque (Macaca mulatta) are the two primate species, and the family Suidae is represented by the wild pig (Sus scrofa). The reserve also holds a good population of re-introduced gaur (Bos gaurus).

Major large carnivore species include tiger (Panthera tigris), leopard (Panthera pardus), sloth bear (Melursus ursinus), Indian wolf (Canis lupus), Asiatic wild dog (Cuon alpinus) and striped hyena (Hyaena hyaena). Golden jackal (Canis aureus), Indian fox (Vulpes bengalensis), jungle cat (Felis chaus), Asiatic wildcat (Felis lybica ornata), rusty-spotted cat (Prionailurus rubiginosus) and fishing cat (Prionailurus viverrinus) are the medium-sized carnivores in the reserve.

Species occurrence data and spatial auto-correlation

The species occurrence data were obtained by collecting the scats of tigers and leopards in the study area. We collected 381 tiger scats and 343 leopard scats between 2017 and 2018. The identification of the scats was based on secondary evidence such as diameter range and the presence of associated ancillary signs like tracks74,75. Andheria et al.76 confirmed the accuracy of scat identification based on the same features using fecal DNA tests. Scats for which the identity of the predator was ambiguous were not collected. We obtained an additional 95 and 74 camera trap detections of tigers and leopards, respectively, in a camera trap survey in the buffer zone of the reserve. We implemented spatial filtering using the SDM toolbox77 in ArcGIS (10.3) to reduce the inherent spatial bias in the species presence records. The scats and camera trap photo-captures of tigers and leopards were spatially rarefied at a distance of 1,000 m from each other. Lacking real absence points, we randomly generated pseudo-absence points in ArcGIS (10.3) in approximately equal number to the occurrence points of the tiger and leopard to deal with the problems arising from unbalanced prevalence78. This was achieved by first generating 500 random pseudo-absence points and then discarding those within a buffer radius of 500 m of the original occurrence points of tigers and leopards to reduce the number of false negatives79. The buffer distance can be set either arbitrarily or based on species attributes80. Out of 476 and 417 occurrence records for tigers and leopards, respectively, we retained 184 and 261 spatially rarefied occurrence locations and an equal number of pseudo-absence points for the final random forest modeling.
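The rarefaction and pseudo-absence steps described above can be sketched in a few lines. This is a minimal illustration, not the SDM toolbox algorithm: it assumes projected coordinates in metres, a greedy keep-or-drop rule for thinning, and a hypothetical square study extent; all function and variable names are invented for the example.

```python
import math
import random

def thin_points(points, min_dist):
    """Greedy spatial thinning (an assumed rule): keep a point only if it
    lies at least min_dist metres from every point already kept."""
    kept = []
    for p in points:
        if all(math.dist(p, q) >= min_dist for q in kept):
            kept.append(p)
    return kept

def pseudo_absences(presences, n_candidates, buffer_m, extent, rng):
    """Draw random candidates inside extent and drop those that fall
    within buffer_m of any presence record."""
    xmin, ymin, xmax, ymax = extent
    candidates = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
                  for _ in range(n_candidates)]
    return [c for c in candidates
            if all(math.dist(c, p) > buffer_m for p in presences)]

rng = random.Random(42)
extent = (0.0, 0.0, 40_000.0, 40_000.0)   # hypothetical extent, metres
raw = [(rng.uniform(0, 40_000), rng.uniform(0, 40_000)) for _ in range(300)]

presences = thin_points(raw, min_dist=1_000.0)            # 1,000 m rarefaction
absences = pseudo_absences(presences, 500, 500.0, extent, rng)
absences = absences[:len(presences)]                      # balance prevalence
```

The greedy rule depends on the order of the input points, so a different ordering can retain a different (equally valid) subset.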

Environmental predictors

A total of 40 environmental predictor variables (Table 2) were used to model the species habitat relationships. We grouped the predictor variables into five broad categories: climatic, topographic, landscape composition, vegetation, and human-influenced.

Table 2 The set of 40 predictor variables used in the multi-scale habitat modelling of tiger and leopard. First and second columns represent the type and the name of the variables, and third and fourth columns represent scale at which each predictor best explained the occurrence of tiger and leopard respectively.

These predictor variables were selected based on similar habitat relationship studies of large carnivores. We obtained the bioclimatic variables from the WORLDCLIM database (https://www.worldclim.org). We tested the correlation among predictor variables (|r| > 0.50) and removed the highly correlated variables using the R packages “rfUtilities” and “randomForest”81 implemented in R82 to account for multicollinearity among predictor variables83, as multicollinearity may alter the model predictions to a significant extent84. The package “rfUtilities” removes redundant variables using QR matrix decomposition (0.05 threshold); thus, only the least correlated variables (|r| < 0.50) were retained for further modelling (Fig. 11). We obtained a digital elevation model of the study area from the Shuttle Radar Topography Mission (SRTM) elevation database85,86. Slope, aspect, and topographic ruggedness index were derived from the elevation layer using the surface analysis tools in the Spatial Analyst toolbox in ArcGIS (10.3). We obtained the land use land cover (LULC) layer from the Indian Institute of Remote Sensing (IIRS, https://iirs.gov.in). The LULC layer included nine land use categories: dense sal-dominated forests, sal mixed forests, moist deciduous forests, dry deciduous forests, scrub habitats, grasslands, agriculture, water bodies, and permanent human settlements. We calculated seven topographic variables including elevation, slope, aspect, topographic roughness, road density and river density. We calculated road and river density using the line density tool in ArcGIS at spatial scales of 1,000, 2,000, and 3,000 m. We used road and river density instead of Euclidean distance because of the high concentration of roads in the buffer zone of the reserve. Thus, we used the percentage of roads and rivers within radii of 1,000, 2,000, and 3,000 m of the species presence/pseudo-absence locations.
Normalized Difference Vegetation Index (NDVI) data from the MOD13Q1 product (version 6), generated every 16 days at a spatial resolution of 250 m, were obtained from the MODIS website (https://lpdaac.usgs.gov/products/mod13q1v006/). We grouped the 23 NDVI layers into three seasons: summer, wet, and winter. We resampled all the variables to a spatial resolution of 90 m using the SDM toolbox in ArcGIS (10.3).
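The collinearity screen at |r| = 0.50 can be illustrated with a simple pairwise filter. Note that this swaps in plain Pearson correlation for the QR-decomposition routine of “rfUtilities”, so it is a sketch of the idea rather than the procedure used in the study; the variable names and values are made up for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(columns, threshold=0.5):
    """Walk the variables in order and keep one only if its |r| with every
    variable already retained does not exceed the threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold
               for k in kept):
            kept.append(name)
    return kept

# Hypothetical predictor values at six locations
predictors = {
    "elevation": [310, 420, 390, 510, 450, 600],
    "bio1":      [26.1, 25.0, 25.4, 24.2, 24.7, 23.5],  # tracks elevation closely
    "rd1km":     [0.8, 0.1, 0.9, 0.3, 0.2, 0.7],
}
retained = drop_correlated(predictors)
```

Because the filter is order-dependent, which member of a correlated pair survives is decided by the order in which the variables are listed.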

Figure 11 Multicollinearity among the predictor variables used in the final random forest modelling of tiger and leopard. Multicollinearity was tested at (r > 0.50) using the R package “rfUtilities”, and the correlogram was produced using the R package “ENMTools”. Road density is represented at two spatial scales (1 km and 4 km), shown in the top right corner of the correlogram as rd1km and rd4km; moistdec and drydec represent the moist and dry deciduous forests, respectively, and hump denotes the human population density, defined as the number of persons per square km.

Future climatic data

At present, climatic models are the best tools for simulating future climatic scenarios88. However, the variation within and among climatic models can make it difficult to identify the most robust and optimal model to use88. No clear guidance exists on how to select climatic models, and researchers have little objective basis for the choice87. The final decision may sometimes be influenced by the assumption (not always correct) that a climatic model developed in a particular country will be more robust in that region87. In this study, we modeled the change in the potential distribution of tigers and leopards under two Representative Concentration Pathway scenarios (RCP 2.6 and RCP 8.5) using the Model for Interdisciplinary Research on Climate (MIROC5), developed by the Japanese research community89. These scenarios project global greenhouse gas emissions based on assumptions for a wide range of variables such as human population size, global energy consumption, and change in land-use patterns. We downloaded the Global Climate Model (GCM) projections from the WorldClim website (https://www.worldclim.org/cmip5_30s). We aimed to predict the change in distribution under the most conservative emission scenario (closely corresponding to the current rate of greenhouse gas emissions) and the most severe emission scenario. The climatic models used in this study therefore represent two extreme scenarios of greenhouse gas emissions. RCP 2.6 assumes that global CO2 emissions would peak around 2020 and then fall to values around zero by 2080, whereas RCP 8.5 is regarded as the worst-case climatic scenario, with higher predicted greenhouse gas emissions. RCP 8.5 assumes that global CO2 emissions would increase at a higher rate during the first half of the century and stabilize by 2100; the concentrations are, however, three times the year-2000 levels90,91,92,93.

Multi-scale data processing

We calculated the focal mean of each predictor variable across eight spatial scales (3,500–28,000 m) surrounding each species occurrence location (presence/pseudo-absence) using a moving window analysis with the focal statistics tool in ArcGIS (10.3). Each spatial scale from 3,500 m to 28,000 m was used as a search radius around each location for calculating the focal mean of all the predictor variables except road and river density. The output of the focal statistics was a raster layer of each predictor variable at each of the eight spatial scales and a .dbf file of the raster values extracted around each tiger and leopard location.
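A focal mean with a circular moving window can be sketched as follows. This is an illustration only, not the ArcGIS focal statistics tool: it assumes the radius is expressed in cells (for a 90 m raster, a 3,500 m radius would be roughly 39 cells), ignores cells outside the grid, and uses a tiny made-up raster.

```python
def focal_mean(grid, row, col, radius_cells):
    """Mean of all cells whose centre lies within radius_cells of
    (row, col); cells outside the grid are ignored."""
    total, count = 0.0, 0
    r = int(radius_cells)
    for i in range(row - r, row + r + 1):
        for j in range(col - r, col + r + 1):
            inside = 0 <= i < len(grid) and 0 <= j < len(grid[0])
            in_circle = (i - row) ** 2 + (j - col) ** 2 <= radius_cells ** 2
            if inside and in_circle:
                total += grid[i][j]
                count += 1
    return total / count

# Hypothetical 7 x 7 raster whose value increases linearly with row + column
grid = [[float(i + j) for j in range(7)] for i in range(7)]

# Same location summarised at three window sizes (radii in cells)
multi_scale = {s: focal_mean(grid, 3, 3, s) for s in (1, 2, 3)}
```

Repeating the call for every occurrence location and every radius yields one column per predictor and scale, which is the table that the scale-selection step consumes.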

Scale selection and univariate random forest models

The best predictive scale in multi-scale modeling approaches is usually selected by measuring potential environmental predictor variables within different buffer sizes (scales) around species locations (presence/absence) and then regressing each predictor variable against the response at each scale94,95. Following McGarigal et al.96 and Cushman et al.34, we ran a series of univariate random forest models for each predictor variable across eight spatial scales (3,500–28,000 m) to select the scale at which the predictor variable best explained the probability of species occurrence. The univariate random forest models were run with the underlying assumption that the best fit identifies the most predictive, and therefore the single most meaningful, spatial scale for each predictor variable15.

Random forest constructs regression or classification trees by successively splitting the data based on single predictors. Each split forms a branch in the decision tree, and trees are grown without pruning. Random forest uses bagging (bootstrap aggregation), which builds a large number of trees; the model output is obtained by averaging the predictions of the individual trees or by majority vote. During bagging, a bootstrap sample is randomly drawn to build each tree, and the data not included in the bootstrap sample, termed ‘out-of-bag’ (OOB), are used to estimate an unbiased error rate and to rank variable importance. We used OOB error rates to select the best predictive spatial scale. To calculate the OOB error rate, a training data set is created for each classification tree by sampling the data with replacement, which leaves roughly one-third of the observations out of the sample. Each tree is then used to predict these remaining observations (the ‘out-of-bag’ sample). Finally, the OOB error is computed as the proportion of times that the predicted class is not the same as the true class26,81. The scale with the minimum OOB error rate was selected as the best spatial scale for each predictor variable.
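The OOB-based scale selection can be illustrated with a deliberately simplified stand-in for a univariate random forest: a bagged ensemble of single-split trees (stumps) on one predictor. The data, the two scale labels, and all function names are hypothetical; the ‘randomForest’ package grows full trees and is not reproduced here.

```python
import random

def fit_stump(xs, ys):
    """Best single-threshold classifier on one predictor (a depth-1 tree)."""
    best = None
    for t in sorted(set(xs)):
        for above in (0, 1):              # class predicted when x > t
            errors = sum((above if x > t else 1 - above) != y
                         for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, t, above = best
    return lambda x: above if x > t else 1 - above

def oob_error(xs, ys, n_trees=100, seed=0):
    """Bag stumps and score each row only with trees that did not see it."""
    rng = random.Random(seed)
    n = len(xs)
    votes = [[0, 0] for _ in range(n)]
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]    # bootstrap sample
        in_bag = set(idx)
        tree = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_bag:                       # out-of-bag rows only
                votes[i][tree(xs[i])] += 1
    scored = [(v, y) for v, y in zip(votes, ys) if sum(v) > 0]
    wrong = sum((1 if v[1] > v[0] else 0) != y for v, y in scored)
    return wrong / len(scored)

rng = random.Random(1)
ys = [i % 2 for i in range(60)]                       # presence / pseudo-absence
by_scale = {
    3500: [rng.random() for _ in ys],                 # uninformative at this scale
    7000: [y + rng.gauss(0, 0.3) for y in ys],        # informative at this scale
}
errors = {scale: oob_error(x, ys) for scale, x in by_scale.items()}
best_scale = min(errors, key=errors.get)              # lowest OOB error wins
```

The same loop, run once per predictor over all eight scales, would give the scale-optimized variable set that feeds the multivariate model.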

Multi-scale random forest modeling

Multi-scale random forest models were created using the scale-optimized predictors of tigers and leopards with the R package ‘randomForest’81 implemented in R82. The Model Improvement Ratio (MIR) was used to identify the most parsimonious random forest model. In the model selection process using MIR, the variables were subset using 0.10 increments of MIR values, and all variables above each threshold were retained for the corresponding model. This subsetting was always performed on the original model’s variable importance to avoid overfitting. Comparisons were made among the subset models, and the model with the lowest OOB error rate and the lowest maximum within-class error was selected as the final model.
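The thresholding step can be sketched as below. The sketch assumes that MIR is each variable's importance divided by the largest importance in the original model, which is a common definition but is not stated explicitly in the text; the importance values and variable names are invented.

```python
def model_improvement_ratio(importance):
    """Assumed definition: importance rescaled by the largest importance."""
    top = max(importance.values())
    return {name: value / top for name, value in importance.items()}

def candidate_subsets(importance, step=0.10):
    """One candidate variable set per MIR threshold (0.1, 0.2, ..., 1.0),
    always computed from the original model's importance values."""
    mir = model_improvement_ratio(importance)
    subsets = {}
    for k in range(1, round(1 / step) + 1):
        threshold = round(k * step, 10)
        subsets[threshold] = [n for n, v in mir.items() if v >= threshold]
    return subsets

# Hypothetical importance values from a full model
importance = {"ndvi_wet": 40.0, "rd1km": 22.0, "elevation": 10.0, "bio12": 3.0}
subsets = candidate_subsets(importance)
```

Each candidate set would then be refit and compared on OOB error and maximum within-class error, as described in the paragraph above.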

Random forest variable selection

Several variable selection procedures are available for random forest97,98,99; we followed the approach developed by Genuer et al.99. This approach is based on the un-scaled permutation importance, which is calculated by permuting each predictor in turn and using the difference in prediction error (OOB error) before and after permutation as a measure of variable importance15,81,100. It is a stepwise procedure in which a sequence of RF models is estimated by iteratively eliminating or adding variables according to their importance measures (such as MIR)101. The MIR shows variable importance measured as the increase in mean square error (%IncMSE), which represents the deterioration of the predictive ability of the model when each predictor is replaced in turn by random noise. A higher %IncMSE indicates greater variable importance. In this way, we selected only those variables that improved model performance.
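Permutation importance itself is simple to demonstrate. The sketch below assumes a fixed, hypothetical classifier and a single evaluation data set, whereas random forest computes the same quantity tree by tree on the out-of-bag rows; names and data are illustrative only.

```python
import random

def error_rate(predict, rows, labels):
    return sum(predict(r) != y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean increase in error rate after shuffling each predictor in turn."""
    rng = random.Random(seed)
    base = error_rate(predict, rows, labels)
    importance = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)                       # break the link to the labels
            permuted = [r[:j] + (v,) + r[j + 1:]
                        for r, v in zip(rows, column)]
            increases.append(error_rate(predict, permuted, labels) - base)
        importance.append(sum(increases) / n_repeats)
    return importance

rng = random.Random(3)
labels = [i % 2 for i in range(100)]
# Column 0 carries the signal, column 1 is pure noise
rows = [(y + rng.gauss(0, 0.2), rng.random()) for y in labels]

def predict(row):
    """Hypothetical fitted model that only looks at column 0."""
    return 1 if row[0] > 0.5 else 0

importance = permutation_importance(predict, rows, labels)
```

A predictor the model never uses gets an importance of exactly zero here, which is the behaviour that makes the measure useful for pruning variables.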

Model assessment

We used the area under the receiver operating characteristic (ROC) curve (AUC) and the True Skill Statistic (TSS) as measures of model performance. Pontius and Millones102 reported that the Kappa statistic does not provide a meaningful statistical measure of predictive success. Thus, we avoided the use of Kappa as a measure of model performance. Pontius and Si103 also argue that transforming the continuous predicted probabilities of a predictive model into a binary response requires a threshold cut-point value, which makes the actual quality of the prediction less informative. Cushman and Wasserman28, while comparing the multi-scale habitat selection of American martens using logistic regression and random forest, also recommend the use of AUC instead of Kappa as a measure of model performance. Models with AUC values of 0.7–0.9 are considered useful, whereas values higher than 0.9 indicate excellent discrimination ability or high predictive power94,95.
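Both metrics follow directly from their definitions. The sketch below computes AUC as the probability that a randomly chosen presence receives a higher score than a randomly chosen absence, and TSS as sensitivity + specificity − 1 at a chosen cut-point; the scores, labels, and the 0.5 threshold are made-up example values.

```python
def auc(scores, labels):
    """Probability that a random presence outscores a random absence
    (ties count one half); equal to the area under the ROC curve."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def tss(scores, labels, threshold=0.5):
    """True Skill Statistic: sensitivity + specificity - 1."""
    tp = sum(s >= threshold and y == 1 for s, y in zip(scores, labels))
    fn = sum(s < threshold and y == 1 for s, y in zip(scores, labels))
    tn = sum(s < threshold and y == 0 for s, y in zip(scores, labels))
    fp = sum(s >= threshold and y == 0 for s, y in zip(scores, labels))
    return tp / (tp + fn) + tn / (tn + fp) - 1

# Hypothetical predicted probabilities and observed presence (1) / absence (0)
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]

model_auc = auc(scores, labels)   # 8 of 9 presence-absence pairs ranked correctly
model_tss = tss(scores, labels)   # 2/3 + 2/3 - 1
```

AUC needs no cut-point, whereas TSS changes with the threshold, which is the distinction the paragraph above draws on.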

Multiscale random forest distribution maps

Following the procedures of univariate random forest models (scale optimization), selection of important variables, and model assessment, we used the scale-optimized variables to predict the final distribution maps of tiger and leopard using the R package ‘randomForest’ in R81. The future distribution maps of the tiger and leopard were predicted using the same scale-optimized variables, except that the bioclimatic variables were replaced by those corresponding to the greenhouse gas emission scenarios (RCP 2.6 and RCP 8.5) for the 2050s and 2070s.

Niche identity and niche background tests

Methods to quantify and test environmental niche similarities rely either on ordination techniques104 or on environmental niche models (ENMs)105. We used the ENMs of tiger and leopard (Fig. 2) to perform the niche equivalency (identity) test and the niche background test, based on Schoener’s D and Warren’s I37, using the R package ‘Humboldt’44. Niche equivalency is a one-tailed statistical test of the null hypothesis that two species have identical environmental niches. The niche equivalency test compares the observed niche similarity between the ENMs of two species, and the niche background test assesses the power to detect differences between the ENMs of two species. The values of niche similarity (D) range from 0, indicating completely dissimilar niches, to 1, indicating completely similar niches37,44. The statistic measures how similar the occupied niches of two species are to each other, based on the original input occurrences, by calculating Schoener’s D. The observed value of Schoener’s D is then compared to the values obtained by resampling and reshuffling the species occurrence locations. At each resampling, the occurrences of species 1 and species 2 are pooled and then assigned randomly to one of two groups. At each iteration, Schoener’s D and Warren’s I are measured between the two reshuffled groups. The observed values of Schoener’s D and Warren’s I based on the original occurrences are then compared with the null distribution created from the values obtained from the reshuffled occurrences. The background test, by contrast, compares the observed niche similarity between species 1 and species 2, based on the original occurrence locations, to the overlap generated between species 1 and a random shift of the spatial distribution of species 2 in geographic space, and then measures how this shift in geography changes the occupied environmental space.
In brief, the background test determines whether the two species are more different than would be expected given the underlying environmental differences between the habitats in which they occur.
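The overlap indices and the reshuffling logic of the equivalency test can be sketched on a single environmental axis. This is a simplification made for illustration: ‘Humboldt’ works with kernel densities in a multivariate E-space, whereas the sketch bins one hypothetical variable into a histogram. The formulas used are D = 1 − ½ Σ|p1 − p2| and I = 1 − ½ Σ(√p1 − √p2)²; the sample values and function names are invented.

```python
import math
import random

def to_density(values, n_bins, lo, hi):
    """Histogram of values on [lo, hi], normalised to sum to 1."""
    counts = [0] * n_bins
    for v in values:
        k = min(int((v - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[k] += 1
    return [c / len(values) for c in counts]

def schoener_d(p, q):
    return 1 - 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def warren_i(p, q):
    return 1 - 0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2
                         for a, b in zip(p, q))

def identity_test(env1, env2, n_perm=999, n_bins=10, seed=0):
    """Pool the occurrences, reshuffle them into two groups, and compare
    the observed D with the resulting null distribution (one-tailed)."""
    rng = random.Random(seed)
    pooled = env1 + env2
    lo, hi = min(pooled), max(pooled)
    observed = schoener_d(to_density(env1, n_bins, lo, hi),
                          to_density(env2, n_bins, lo, hi))
    null = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        a, b = pooled[:len(env1)], pooled[len(env1):]
        null.append(schoener_d(to_density(a, n_bins, lo, hi),
                               to_density(b, n_bins, lo, hi)))
    p_value = (1 + sum(d <= observed for d in null)) / (n_perm + 1)
    return observed, p_value

rng = random.Random(7)
species1 = [rng.gauss(0.0, 1.0) for _ in range(80)]   # hypothetical E-space values
species2 = [rng.gauss(3.0, 1.0) for _ in range(80)]
observed_d, p_value = identity_test(species1, species2)
```

A small p-value means the observed overlap is lower than almost all reshuffled overlaps, which corresponds to rejecting the null hypothesis of identical niches.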

A non-significant equivalency statistic and a significant background statistic support the underlying null hypothesis that species environmental niches are identical. A statistically significant equivalency statistic, regardless of the significance of background statistics, results in the rejection of the null hypothesis of niche equivalency44. If both the equivalency statistic and background statistic are statistically non-significant, it implies that observed niche similarity is a result of space limitations and that there is a low power for the equivalence statistic to detect the meaningful and significant differences among the species niches44.

Realization of species fundamental niche from observed niche

Identifying a species' fundamental niche from its occupied niche remains one of the major challenges in niche analysis106. Most studies overlook how well or how poorly a species' occupied niche reflects its fundamental niche. The package ‘Humboldt’ provides a way to characterize the fundamental niche by truncating the species' occupied E-space by the available E-space in its environment. The risk that the occupied niche poorly represents the species' fundamental niche is directly proportional to the portion of the occupied niche in E-space that is truncated. Thus, the higher the proportion truncated, the greater the risk that the occupied niche poorly reflects the species' fundamental niche. Brown and Carnaval44 introduced the Potential Niche Truncation Index (PNTI), implemented in the package ‘Humboldt’, which quantifies the amount of observed E-space truncated by the available E-space. It specifically measures the overlap between the 5% kernel density isopleth of the species' E-space and the 10% density isopleth of the available or accessible E-space in the environment. This value indicates how much of the perimeter of the species' E-space abuts, overlaps or falls outside the margins of the environment’s E-space. PNTI values of 0.15–0.3 indicate a moderate risk, and values greater than 0.3 indicate a high risk, that the observed niche does not represent the fundamental niche because of niche truncation driven by limited available E-space44.