Ecological Niche Modeling Predicting the Potential Distribution of African Horse Sickness Virus from 2020 to 2060

African horse sickness is a vector-borne, non-contagious but highly infectious disease of equines caused by African horse sickness virus (AHSv), which mainly affects horses. The disease has a large economic impact because of its high fatality rate, trade bans and disease control costs. In planning the control of vectors and vector-borne diseases, ecological niche models (ENMs) contribute greatly by delineating the suitable habitats of the vector. We developed an ENM to delineate the global suitability for AHSv outbreaks retrospectively, based on outbreak records from 2005 to 2019. The model was developed in R using the Biomod2 package with an ensemble modeling technique. The predictive environmental variables were mean diurnal range, mean precipitation of the driest month (mm), precipitation seasonality (CV), mean annual maximum temperature (°C), mean annual minimum temperature (°C), mean precipitation of the warmest quarter (mm), mean precipitation of the coldest quarter (mm), mean annual precipitation (mm), solar radiation (kJ/day), elevation (m) and wind speed (m/s). Of these variables, solar radiation, mean maximum temperature, average annual precipitation, altitude and precipitation seasonality contributed 36.83%, 17.1%, 14.34%, 7.61% and 6.4%, respectively. The model depicted sub-Saharan Africa as the most suitable area for the virus. Within Africa, Senegal, Burkina Faso, Niger, Nigeria, Ethiopia, Sudan, Somalia, South Africa, Zimbabwe, Madagascar and Malawi were identified as highly suitable for the virus. In addition, countries listed by the OIE as disease-free, such as India, Australia, Brazil, Paraguay and Bolivia, were found suitable for the virus. This model can be used as an epidemiological tool in planning control and surveillance of diseases nationally or internationally.


Introduction
African horse sickness (AHS) is a non-contagious, highly infectious vector-borne disease of equines caused by the African horse sickness virus (AHSv). Horses are severely affected by the disease, while mules, donkeys, and zebras are less susceptible. The disease was described in early Arabic documents dating back to 1327, in which horses were suffering from an apparent AHS-like disease in Yemen (Zientara, Weyer, & Lecollinet, 2015).
Africa is considered the hotspot for the disease because of the outbreaks that occur there each year. The disease was first reported in the early 18th century, when a major outbreak was observed in 1719 in South Africa. Since then, the African horse sickness virus has become endemic in Africa, stretching from west to east and extending to South Africa (Zientara et al., 2015). Although AHS is endemic through most of sub-Saharan Africa, with outbreaks occurring regularly, outbreaks have recently been reported outside Africa, including in Spain, the Middle East and the Indian subcontinent (Ayelet et al., 2013).
AHS outbreaks in endemic areas have a devastating economic and social impact due to rapid spread, direct mortalities, restriction of animal movements, surveillance and vaccination costs, and the requirement of immediate notification to the World Organisation for Animal Health (OIE) (Diarra et al., 2018). An outbreak of AHS in a disease-free region would have catastrophic effects on equine welfare and industry, particularly for international events such as the Olympic Games, and would restrict the international trade of racehorses (Karamalla et al., 2018).
The disease is vector-borne, transmitted by midges belonging to the genus Culicoides. Female Culicoides acquire the virus from a diseased animal while feeding on blood and transmit it to healthy animals. Vector abundance and the presence of reservoirs in a particular area play a key role in transmitting the disease (Diarra et al., 2018). Although Culicoides species are the dominant vectors of the disease, other arthropods can transmit the virus. For example, mosquitoes of the Aedes, Culex and Anopheles genera and ticks of the Hyalomma and Rhipicephalus genera are capable of transmitting the virus (Zientara et al., 2015).
The spatial distribution of vector-borne and infectious diseases can be determined with ecological niche models. These models can be used to delineate and predict suitable territories for vector-borne diseases. This study aimed to delineate the global suitability level and distribution of AHSv outbreaks by identifying environmental risk factors from outbreak records for 2005 to 2019. In addition, we predicted the future suitability level for the periods 2020 to 2040 and 2040 to 2060.

Materials and Methods
African horse sickness outbreak data source
AHS is a notifiable disease: the authorities of OIE member countries must report any case immediately. Each outbreak is georeferenced and freely available at the Global Animal Disease Information System website.

Environmental data sources
Variables known to determine the maintenance and circulation of AHS and Culicoides species were selected based on their biological plausibility. They comprise topographic and climate variables downloaded from the WorldClim 2.1 database (Fick & Hijmans, 2017).
Climate and topographic variables, including annual maximum and minimum temperature, annual average precipitation, wind speed, solar radiation, elevation and other bioclimatic variables, were used in the model. These variables are known to determine the spread of the disease circulating in the equine population (Cao, Jin, Shen, Xu, & Li, 2018). Initially, 19 bioclimatic variables and many weather variables were proposed. However, most of them were trimmed because of multicollinearity, and only eleven variables were used. In addition, projected bioclimatic datasets for 2020 to 2040 and 2040 to 2060 were downloaded from the WorldClim 2.1 database. All the variables used are presented in Table 1.

Spatial data handling and management
The GIS data downloaded from different sources had varied projections, spatial resolutions and cell sizes. These datasets were projected to the same projection system and resampled to the same cell size (2.5 min) and extent. All these GIS operations were performed using the sdm package (Naimi & Araújo, 2016). There were initially 25 variables, which were trimmed after multicollinearity was detected. Multicollinearity was checked using the VIF procedure of the usdm package in R (Naimi, Hamm, Groen, Skidmore, & Toxopeus, 2014). Variables with correlation coefficients above 0.7 were removed from the dataset. Accordingly, 14 variables were removed, and the remaining 11 were used to develop the model.
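The correlation-based trimming described above can be illustrated with a minimal sketch. This is not the usdm VIF procedure used in the study; it is a plain pairwise Pearson filter written in Python, and the function names, the greedy removal rule and the toy variable names are assumptions made for illustration only.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def prune_correlated(variables, threshold=0.7):
    """Greedily drop variables until no pair has |r| above the threshold.

    `variables` maps a (hypothetical) variable name to its sampled values.
    At each step the variable involved in the most over-threshold pairs is
    removed; ties are broken alphabetically so the result is deterministic.
    Returns the sorted names of the variables that were kept.
    """
    kept = dict(variables)
    while True:
        counts = {name: 0 for name in kept}
        for a, b in combinations(sorted(kept), 2):
            if abs(pearson(kept[a], kept[b])) > threshold:
                counts[a] += 1
                counts[b] += 1
        worst = max(sorted(counts), key=lambda name: counts[name], default=None)
        if worst is None or counts[worst] == 0:
            return sorted(kept)
        del kept[worst]
```

Which member of a correlated pair to drop is a modelling decision. Here the choice is purely mechanical, whereas in practice biological plausibility would usually guide it.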

Model development
The modeling options in Biomod2 were left at their defaults, and each of the ten algorithms was run three times, giving a total of 30 model outputs. Data were split into training and evaluation sets: 80% of the data was used to develop the model, while 20% was used to evaluate its performance. The area under the receiver operating characteristic curve (AUC), Kappa and the true skill statistic (TSS) were used to evaluate model performance. Of the 30 outputs of the ten models, only those with a TSS value above 0.8 were included in the ensemble.
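For readers unfamiliar with these metrics, the following sketch shows how TSS and Kappa follow from a presence/absence confusion matrix, and how a TSS cut-off of 0.8 would select runs for the ensemble. It is a Python illustration under stated assumptions, not the Biomod2 implementation; the function names and the run labels in the usage example are hypothetical.

```python
def true_skill_statistic(tp, fp, tn, fn):
    """TSS = sensitivity + specificity - 1, ranging from -1 to 1."""
    sensitivity = tp / (tp + fn)  # share of presences predicted as presences
    specificity = tn / (tn + fp)  # share of absences predicted as absences
    return sensitivity + specificity - 1


def cohen_kappa(tp, fp, tn, fn):
    """Cohen's kappa: agreement beyond what is expected by chance."""
    n = tp + fp + tn + fn
    observed = (tp + tn) / n
    expected = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    return (observed - expected) / (1 - expected)


def select_runs(tss_by_run, threshold=0.8):
    """Return the names of runs whose TSS exceeds the threshold."""
    return sorted(name for name, tss in tss_by_run.items() if tss > threshold)
```

For example, `select_runs({"RF_run1": 0.98, "SRE_run1": 0.67})` keeps only the first run, since 0.67 falls below the 0.8 cut-off.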
The global suitability level was generated as mean probability, weighted mean (WM) and committee averaging (CA) values. Committee averaging has a dual purpose in ensemble modeling: it can be used to predict suitable niches, and it can also be used to evaluate the model's performance. The current suitability distribution was further projected to predict the disease's future distribution. The suitability values were divided by 1000 to rescale them to the range 0 to 1. Hence the suitability values were approximated to zero (unsuitable), 0.25 (moderately suitable), 0.5 (considerably suitable), 0.75 (suitable) and nearly 1 (highly suitable).
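The three ensemble summaries can be sketched for a single grid cell as follows. This Python illustration assumes that each model outputs suitability on a 0 to 1000 scale, that weights are directly proportional to each model's evaluation score, and that committee averaging is the share of models voting "suitable" after thresholding. Biomod2's actual weighting scheme may differ, and all names here are hypothetical.

```python
def ensemble_cell(probabilities, scores, cutoffs):
    """Combine per-model predictions for one grid cell.

    probabilities: suitability per model on a 0-1000 scale
    scores:        evaluation score per model (e.g. TSS), used as weights
    cutoffs:       per-model binarisation threshold on the 0-1000 scale
    Returns mean, weighted mean and committee average rescaled to 0-1.
    """
    n = len(probabilities)
    mean = sum(probabilities) / n
    weighted = sum(p * w for p, w in zip(probabilities, scores)) / sum(scores)
    votes = sum(1 for p, c in zip(probabilities, cutoffs) if p >= c)
    committee = 1000 * votes / n  # share of models predicting presence
    # Divide by 1000 to express every summary on the 0-1 scale.
    return {"mean": mean / 1000, "wm": weighted / 1000, "ca": committee / 1000}
```

Committee-average values close to 0 or 1 mean the models agree, while values near 0.5 mean they are split. This is why the same layer can be read both as a suitability map and as a measure of model agreement.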

Individual model evaluation and variable importance
Most of the individual models performed very well on all evaluation metrics employed. By TSS, RF, GBM, GLM and GAM performed best, with values of 0.98, 0.96, 0.96 and 0.94, respectively. With the ROC metric, the same models likewise outperformed the rest. Of the ten models, SRE and Maxent performed worst, with TSS values of 0.67 and 0.77, respectively (Table 2). Response curves for some selected models are shown in Fig. 3. As for variable contribution, in RF, altitude and solar radiation contributed the highest share. In a few models, altitude and temperature variables contributed the highest share, while precipitation variables had the highest contribution in other models. However, solar radiation alone accounted for nearly a quarter of the contribution in almost all models (Table 3). Average annual precipitation (14.34%), altitude (7.61%) and precipitation seasonality (6.41%) also contributed a significant share (Table 5).

Predicted suitable territories of the world for AHSv
The suitability level for the virus was generated in the ensemble modeling as mean suitability (Fig. 4), committee averaging (Fig. 5) and weighted mean suitability (Fig. 6).

Model uncertainty
Model uncertainty was measured using the model's clamping mask. Uncertainty was expressed mainly in North America, Russia and South America (Fig. 7).
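The idea behind a clamping mask can be sketched as follows, on the assumption that the mask flags cells whose predictor values fall outside the range seen during model calibration. The study does not spell out this definition, so the Python code below is an illustration only, with hypothetical function and variable names.

```python
def training_ranges(training_rows):
    """Minimum and maximum of each predictor over the calibration data.

    `training_rows` is a list of dicts mapping predictor name to value.
    """
    names = training_rows[0].keys()
    return {
        name: (min(row[name] for row in training_rows),
               max(row[name] for row in training_rows))
        for name in names
    }


def clamping_count(cell, ranges):
    """Number of predictors in `cell` lying outside their training range.

    A count of 0 means the prediction is an interpolation; higher counts
    mean the model is extrapolating on that many variables.
    """
    return sum(
        1 for name, (low, high) in ranges.items()
        if not low <= cell[name] <= high
    )
```

Under this reading, regions far from the African calibration data are more likely to contain out-of-range predictor values and so to show higher mask values.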

Predicted Future suitability distribution of AHSv
The distribution of suitable niches was projected to the periods 2020 to 2040 and 2040 to 2060. The results indicate that suitability will be lower between 2020 and 2040 than it is currently (Fig. 8). From 2040 to 2060, however, the suitable niches will be wider than in 2020 to 2040 but still smaller than the current suitable area (Fig. 9).

Discussion
African horse sickness is one of the most devastating health threats to the equid family. Donkeys, mules and zebras are less susceptible to the disease and are believed to be reservoirs (Zientara et al., 2015). Efforts have been made to develop effective prevention and control approaches. These approaches focus on three components: quarantine, vector control and vaccination. In support of them, efforts have been made to develop effective vaccines. In addition, vector control methods, mostly based on confining animals during the vectors' active season, have been practiced.
To the best of our knowledge, this is the first attempt to model the suitable niche for AHSv from retrospective outbreak records. If the virus reaches the territories identified as suitable and is maintained in any possible vector, it can persist for prolonged periods, affecting the equine population.
Future projections for 2020 to 2040 and 2040 to 2060 indicated that the suitable territories would be smaller than in the current distribution. Given that global warming is expected in the coming years, a decrease in infectious diseases like AHSv may seem unlikely. A possible explanation is that rising temperature will affect the dynamics of the vectors. For an outbreak to occur, moderate temperature must coincide with higher precipitation, and these conditions may not exist in the future as rainfall decreases. Scenarios of rising temperature may also affect vector dynamics negatively, directly reducing vector-borne infectious diseases.
The model performed very well on every evaluation metric employed and identified as suitable the territories where the disease was previously known to occur. However, it has limitations. The most prominent are that it could not incorporate outbreak occurrence data from outside African countries and that it predicts for a much wider area. ENMs applied to wider areas tend to underestimate or overestimate the suitability level. In addition, sampling background pseudo-absences from non-occurrence locations using SRE may yield inaccurate metrics. We therefore advise readers to consider these limitations whenever they use this model.

Conclusions
The model is the first to use ENM with bioclimatic risk factor identification for AHSv outbreaks worldwide. It classified suitable and unsuitable niches of the world with high accuracy. Bioclimatic variables such as solar radiation, maximum temperature and the precipitation variables contributed the largest share to the model. Endemic territories of sub-Saharan Africa and the Arabian countries were found highly suitable. Furthermore, areas listed by the OIE as disease-free, such as India, Australia and Brazil, were found suitable for the disease. We believe this model can be used as an epidemiological tool in planning control and surveillance of diseases nationally or internationally.

Committee averaging of the ensemble model depicting both suitability level and model uncertainty.
Predicted future global distribution gradient of AHS from 2040 to 2060. The warmer colors depict suitable areas while the cooler colors depict unsuitable localities.