Modeling the spatial distribution of Culicoides species (Diptera: Ceratopogonidae) as vectors of animal diseases in Ethiopia

Culicoides biting midges (Diptera: Ceratopogonidae) are the major vectors of bluetongue, Schmallenberg, and African horse sickness viruses. This study was conducted to survey Culicoides species in different parts of Ethiopia and to develop habitat suitability for the major Culicoides species in Ethiopia. Culicoides traps were set in different parts of the country from December 2018 to April 2021 using UV light Onderstepoort traps and the collected Culicoides were sorted to species level. To develop the species distribution model for the two predominant Culicoides species, namely Culicoides imicola and C. kingi, an ensemble modeling technique was used with the Biomod2 package of R software. KAPPA True skill statistics (TSS) and ROC curve were used to evaluate the accuracy of species distribution models. In the ensemble modeling, models which score TSS values greater than 0.8 were considered. Negative binomialregression models were used to evaluate the relationship between C. imicola and C. kingi catch and various environmental and climatic factors. During the study period, a total of 9148 Culicoides were collected from 66 trapping sites. Of the total 9148, 8576 of them belongs to seven species and the remaining 572 Culicoides were unidentified. The predominant species was C. imicola (52.8%), followed by C. kingi (23.6%). The abundance of these two species was highly influenced by the agro-ecological zone of the capture sites and the proximity of the capture sites to livestock farms. Climatic variables such as mean annual minimum and maximum temperature and mean annual rainfall were found to influence the catch of C. imicola at the different study sites. The ensemble model performed very well for both species with KAPPA (0.9), TSS (0.98), and ROC (0.999) for C. imicola and KAPPA (0.889), TSS (0.999), and ROC (0.999) for C. kingi. Culicoides imicola has a larger suitability range compared to C. kingi. The Great Rift Valley in Ethiopia, the southern and eastern parts of the country, and the areas along the Blue Nile and Lake Tana basins in northern Ethiopia were particularly suitable for C. imicola. High suitability for C. kingi was found in central Ethiopia and the Southern Nations, Nationalities and Peoples Region (SNNPR). The habitat suitability model developed here could help researchers better understand where the above vector-borne diseases are likely to occur and target surveillance to high-risk areas.


Scientific Reports
| (2022) 12:12904 | https://doi.org/10.1038/s41598-022-16911-y www.nature.com/scientificreports/ finally transported to the entomology laboratory at the National Animal Health Diagnostic and Investigation Centre (NAHDIC) for species identification. Species identification and enumeration were performed by observation of morphological features of Culicoides under a stereomicroscope. Identification was made according to previously published identification keys 31,33,34 . Most Culicoides midges have a wing pigmentation pattern and a distribution of wing macrotrichia consisting of grey and white spots; these patterns are unique to each species and can be easily observed under a dissecting microscope. The specimens were mounted on a slide and under the light microscope, morphological features such as shape, size, the number of female spermathecae, and the distance between eyes were observed 33 . Then we observed the ratio of antennae XI/X (length of segment XI divided by the length of segment X) and the shape and size of the 3rd palpal segment. Finally, we compared all the observed features with the images in the IIKC database (Interactive Identification Key for Culicoides) 33 .

Spatial distribution modeling (SDM).
The geographic distribution of the two most common Culicoides spp. was predicted using an ensemble modeling technique 35 . Species distribution models consists of three main aspects: species occurrence data (dependent variable), layers of environmental variables (independent variables), and a modeling algorithm.
Land cover data was downloaded from the European Space Agency's GlobCover Portal (http:// due. esrin. esa. int/ page_ globc over. php). Livestock distribution data was downloaded from the website of FAO livestock systems (http:// www. fao. org/ lives tock-syste ms/). Soil type was downloaded from (https:// www. iiasa. ac. at/ web/ home/ resea rch/ resea rchpr ograms/ water/ HWSD. html). All data layers were projected in the same projection system with a spatial resolution of 2.5 arc minutes using QGIS 3.4.1.

Data analysis.
Ensemble modeling technique using biomod2 35 package of R software (http:// cran.r-proje ct. org/ web/ packa ges/ biomo d2/ index. html) was used to develop the SDM. The package uses ten different methods: general linear models (GLM), general boosted models (GBM, also called boosted regression trees), general additive models (GAM), classification tree analysis (CTA), artificial neural networks (ANN), surface range envelope (SRE), flexible discriminant analysis (FDA), multiple adaptive regression splines (MARS), random forests (RF), and maximum entropy (MAXENT) 35 . Because the above ten modeling techniques require both absence and presence data to determine the suitability range of species, pseudo-absence data were generated using Surface Range Envelope (SRE). The data was split into two parts, 80% was used to train the model and 20% was used to test model performance. Models were evaluated using the results of threefold cross-validation in 30 models (10 techniques × 3 replicates).
The performance of models were evaluated using the true-skill statistic (TSS), the area under the receiveroperating characteristic curve (ROC) curve, and the Cohen's kappa statistic (Kappa). TSS is a threshold-dependent evaluation (sensitivity + 1, specificity − 1), with values closer to one indicating model accuracy 39 . For ensemble modeling, only models which score TSS values greater than 0.8 were considered. AUC scores range from 0 to 1, with models with scores above 0.5 providing better predictions than random draws 40 .
Since there is overdispersion in our multivariable Poisson model, a negative binomial regression models were used to assess the relationship between C. imicola and C. kingi catch with various environmental and climatic factors including agro-ecological zonation, habitat, livestock density, soil type, mean annual minimum and maximum temperatures, annual precipitation, solar radiation, wind speed, land cover and water vapour pressure. Univariable negative binomial regression models were first developed and only variables which are significant on univariable analysis were used to develop the multivariable negative binomial regression models. www.nature.com/scientificreports/ Multicollinearity among explanatory variables was checked using variance inflation factor (VIF) analysis, with the "vifstep" command in the "usdm" package of R. Variables which have VIF values less than or equal to 10 were considered in the analysis 41 . Mean annual maximum temperature was removed from C. imicola model due to collinearity and mean annual minimum temperature, altitude and solar radiation were removed from C. kingi model due to multicollinearity.
Ethics approval and consent to participate. Ethical approval was obtained from ethics committee of the College of Veterinary Medicine and Agriculture of Addis Ababa University. Written informed consents were obtained from all households who participated in the study.

Result
Entomological survey. During the study period, a total of 9148 Culicoides were collected from 66 trapping sites. Of the total 9148, 8576 of them belongs to seven species and the remaining 572 Culicoides were unidentified. Of the seven species identified, C. imicola was the most abundant species with 4830 (52.8%), followed by C. kingi with 2160 (23.6%), C. milnei, C. schultezi and C. Zuluensis, which accounted for 9%, 6.8%, and 1.4% of the catch, respectively ( Table 2). Most Culicoides were caught in Jimma zone (28.5%), followed by Hawassa town (19.4%) and Oromia special zone (17.4%). Most Culicoides were collected in the vicinity of the animal pen 6926 (75.7%), 1000 Culicoides were collected near rivers, which constitute 11% of the catch (Table 3).
Factors associated with Culicoides imicola and Culicoides kingi occurrence. The impact of different environmental and climatic variables on the abundance of C. imicola and C. kingi was evaluated using multivariable negative binomial regression (Tables 4, 5). Agro-ecological zonation, habitat, soil type, mean annual minimum temperature, altitude, and water vapour pressure were found to have a significant effect on the number of C. imicola catches (Table 4). Culicoides imicola catches were higher near animal pen and in nitisols and vertisols soil types. When mean annual minimum temperature and altitude increases, C. imicola catches decrease (Table 4).
Agro-ecological zonation of trapping sites, habitat, soil type and mean annual maximum temperature are significantly related to the occurrence of C. kingi. The catches of C. kingi were higher in humid areas, and in leptosols, nitisols and vertisols soil types. Mean annual maximum temperature is found to have a antagonistic effect with the C. kingi catches, as mean annual maximum temperature increases C. kingi catches decreases (Table 5).   Table 4. Factors associated with C. imicola occurrence using multivariate negative binomial regression analysis. a Water shore includes river side, near pond, and lake shore.

Factors
Number of traps   (Fig. 2). A high suitability range for C. kingi was observed in central Ethiopia and in SNNPR. The ensemble model predicted the other parts of the country as moderately suitable, with the exception of the Afar and Somali regions, for which the model predicted lower suitability (Fig. 3).

Discussion
AHS, BT bluetongue and SBV are economically important Culicoides-borne viral diseases affecting equids and ruminants. The importance of these arboviral diseases derives from their very wide geographic distribution, potential for rapid spread, and large economic impact 1,2 . A considerable number of studies reported widespread occurrence of these diseases in Ethiopia. Demissie 42 reported AHS from Gamo Gofa, Wolaita and Hadiya zones of SNNPR. Zeleke et al. 43 45 reported the presence of bluetongue antibodies from Amhara regional state in northern Ethiopia. There is only one study on the sero-prevalence of SBV in Ethiopia. The author reported a very high apparent seroprevalence of 56.6% 10 . These diseases are transmitted by females of several species of midges belonging to the large genus Culicoides (Diptera: Ceratopogonidae) (which includes more than 1300 described species worldwide 46,47 . In this study, morphological identification confirmed the presence of seven species in Ethiopia. In the current study, various Culicoides species were collected, of which C. imicola was the largest number 4830 (52.8%). Similar to these results, entomological surveys carried out in many sub-Saharan African countries show that C. imicola is the dominant species 22,23,[48][49][50] . A previous study by Mulatu and Hailu 30 , reported the presence of C. imicola, C. milnei, C. neavei, C. zuluensis, C. fulvithorax and C. isioloensis in western parts of Ethiopia and Khamala and Kettle 31 reported the presence of C. fuscicaudae. This study identified three species of Culicoides that had not been previously described in Ethiopia, including C. kingi, C. schultzei and C. pycnostictus.
Multivariable negative binomial regression models were used to model the impact of various environmental and climatic factors on C. imicola and C. kingi catch. For both species, significantly higher catch was obtained in the subhumid agro-ecological zone. This finding suggests that Culicoides species require breeding habitat with high relative humidity 51 . In the current study, higher numbers of Culicoides were caught in traps placed near animal pens. This result is consistent with Riddin et al. 52 that reported high Culicoides catches near horse barns. The abundance of Culicoides near animal pens is mainly due to the presence of suitable breeding sites represented by moist soil sites, leaking animal watering troughs, and pond edges contaminated with feces 53 .
Soil type appears to be very important in determining the distribution and abundance of Culicoides 23,54,55 . According to the studies, the largest numbers of Culicoides were found in areas with a high, moisture-retentive clay soil, whilst the lowest numbers were encountered in rapidly draining sandy soils. In this study, C. imicola is mainly collected from nitisols and vertisols soil type, these soils types are clay-rich and characterized by their high moisture retention capacity 56,57 . C. kingi was mainly collected from diverse soil types including leptosols, nitisols and vertisols soil types. Leptosols are mountain soils known by their high waterlogging capacity 58 . These soil types are believe to create suitable breeding sites for different Culicoides species.
The present study shows that climatic variables to be an important determinants of Culicoides catches, temperature being the most important determinant for both species. Studies demonstrated that, Culicoides activity generally declines or even ceases at low temperatures, and high temperatures. Temperature ≥ 40 °C could be lethal 38 . Foxi et al. 53 also reported relatively poor tolerance of Culicoides to lower temperatures. Eventhough it is not evident in our multivariable negative binomial regression models, previous study by Gusmão et al. 21 suggests that persistent or heavy rain can create conditions for biting midges to proliferate but it can also be a barrier to the activity of adult (winged) biting midges and prevent them from flying. Rain can prevent adults from leaving their shelters and could affect catch.
Various climatic and environmental factors were used to model the species distribution of C. imicola and C. kingi. Soil type, altitude/elevation, livestock distribution, solar radiation, and mean minimum annual temperature were the most important variables for the C. imicola model. Wind speed, soil type, altitude/elevation, and vapor pressure were the variables that contributed most to the model for C. kingi. Although there is no previous information on the climatic requirements of C. kingi, numerous studies have examined the role of various climatic and environmental factors on the distribution and abundance of C. imicola. Global ensemble modeling of C. imicola by Leta et al. 29 reported temperature covariates contributing 64% to their model. This is supported by Veronesi et al. 59 which showed that temperature can affect fecundity, hatching, and survival of C. imicola. When reared at a higher temperature (28 °C), C. imicola exhibited higher variability in fecundity and lower hatch rates, and the mean emergence rate from pupae was highest at 20 °C. The distribution of C. imicola is probably directly limited by its relatively low tolerance to lower temperatures 60 . As temperatures rise, adults hatch and populations gradually increase to reach a peak in abundance in spring or summer, depending on the site, which is a function of www.nature.com/scientificreports/ spring temperatures and summer drought. Because temperature shortens larval development time and the time between two blood meals, thus increasing laying frequency, which has a positive effect on population dynamics (and their growth), we expected that temperature would have a significant effect on abundance 24 .
Our study also showed that solar radiation and livestock distribution are influential variables in the spread of C. imicola. According to Conte et al. 61 , intense solar radiation on C. imicola larval habitat combined with high nighttime temperatures accelerates larval development, resulting in multiple generations/season. The importance of livestock as a source of blood meals for C. imicola is well established 62 . Culicoides imicola is a bloodsucking insect that tends to feed on blood and breed near livestock and humans. The frequency of contact between Culicoides and vertebrate hosts is closely related to the multiplication of the pathogen and the risk of transmission 62,63 .
In this study, C. imicola was found to have a larger suitable range compared to C. kingi. Globally, C. imicola is widespread in tropical and subtropical regions of Africa, the southern part of Europe, and some parts of Asia 64 . The model describes that all regions have a small to large range of suitable areas, with Oromia and SNNPR regions having a larger range of suitable areas. High suitability for C. imicola was also demonstrated in the Amhara region, particularly adjacent to the Blue Nile basin and Lake Tana, in southern Afar, and in areas of the Somali region adjacent to Oromia. The results of the current model overlap in many ways with the previously published model of Leta et al. 29 . The current model emphasizes at national level by elaborating the distribution of C. imicola and C. kingi suitability in different regions of the country.
In conclusion, the entomological study shows the occurrence of C. imicola and C. kingi in different parts of Ethiopia, with C. imicola predominating. The widespread occurrence of these species indicates a higher risk of SBV, BT and AHS in different parts of Ethiopia. The models could help to understand the risk of introduction and spread of SBV, BT, and AHS.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author up on request.