Deep learning approaches for challenging species and gender identification of mosquito vectors

Microscopic observation of mosquitoes, which is the basis of morphological identification, is a time-consuming and challenging process, particularly owing to the varying skills and experience of public health personnel. We present deep learning models based on the well-known you-only-look-once (YOLO) algorithm, which can simultaneously localize and classify images to identify the species and gender of field-caught mosquitoes. The results indicated that the concatenated (two-stage) YOLO v3 model exhibited the optimal performance in identifying the mosquitoes, as the mosquitoes were relatively small objects within a proportionally large environment image. The robustness testing of the proposed model yielded a mean average precision and sensitivity of 99% and 92.4%, respectively. The model also exhibited high specificity and accuracy, with an extremely low misclassification rate. The area under the receiver operating characteristic curve (AUC) was 0.958 ± 0.011, further demonstrating the model accuracy. Thirteen classes were detected with an accuracy of 100% based on a confusion matrix. Nevertheless, the relatively low detection rates for two species were likely a result of the limited number of wild-caught biological samples available. The proposed model can help establish the population densities of mosquito vectors in remote areas to predict disease outbreaks in advance.

The World Health Organization (WHO) has indicated that entomological surveillance is a key approach in combating the global outbreaks of arboviruses, which currently affect at least 80% of the world population and account for an estimated 390 million infections annually 1 . The microscopic observation of mosquito vector species is the basis of entomological identification. At present, insect body landmarks and morphological keys are the primary means of identifying either vector/non-vector or local/invasive species [2][3][4][5] . The identification procedure can help evaluate the implementation of prevention strategies based on the mosquito species, which need to be determined as either epidemic or endemic. In general, the aforementioned methods represent the routine procedures for identifying insects. Morphological identification is performed primarily by well-trained and highly skilled personnel (public health staff) and is considered a gold-standard practice, one that could potentially be replaced by automated tools. Unfortunately, this method is time-consuming and challenging, since the number of highly skilled public health personnel is inadequate and not all individuals possess the required level of training, skill and experience. In addition, these aspects may increase the cost of surveillance activities.
Alternative methods to enhance the accuracy of insect identification include intrinsically qualitative and quantitative approaches, such as polymerase chain reaction (PCR) 6 , quantitative real-time PCR 7 and DNA barcoding 8 . Nevertheless, these methods require expensive equipment and high-throughput technology. In addition, they can be performed only by highly skilled professionals with knowledge of molecular biology. In this regard, the use of automated apparatuses can help provide consistently accurate results and reduce the required amount of skilled labour. Such automation can enable local vector surveillance and analysis and support effective vector control programmes, particularly in remote areas with an insufficient number of expert entomologists.

Methods
Ethics statement. This study was approved by the Animal Research Ethics Committee, King Mongkut's Institute of Technology Ladkrabang with ACUC-KMITL-RES/2020/002. The field sampling data were obtained without harming any animal species.
Sample collection and preparation. A total of 1585 biological samples were obtained, which included 200 specimens of laboratory strains, 1295 field-caught specimens (caught during May-July 2019) and 90 specimens of non-mosquito strains ( Table 1). The model performance was evaluated both in preliminary work, to determine whether the models could detect and identify small objects, and in real situations, by using the aforementioned samples, which consisted of insects aged 3-5 days with intact morphological characteristics. The field-caught samples contained many dengue vectors that are extensively distributed throughout Thailand, especially during the rainy season. When training the model during the pilot study, the specimens were obtained from the insectary of the Department of Entomology, Faculty of Tropical Medicine, Mahidol University. The insects were reared under standard conditions, fed with 5% sucrose solution ad libitum and maintained under a 12/12 h light-dark cycle at a temperature and humidity of 27 ± 2 °C and 70-80%, respectively. An expert entomologist identified the mosquito species by using the standard key taxonomy 2,3 . One hundred mosquitoes (50 male and 50 female) of each of the four species were used, with Aedes aegypti and Ae. albopictus corresponding to the main and secondary dengue mosquito vectors, respectively; Culex quinquefasciatus and Anopheles dirus were the filarial and malarial vectors, respectively (Table 1).
Additionally, 90 samples of non-mosquito species were used to train the deep learning approach, consisting of Musca domestica (house fly), Trigona apicalis (stingless bee), and Oryzaephilus surinamensis (sawtoothed grain beetle). In general, images of non-vector insect species are essential in the training process of a deep learning model, as they help the trained models distinguish between vector mosquitoes and non-vector insects. In this study, we designed the experimental procedure such that the trained models could perfectly classify the non-vector species with an accuracy of 100% in the model validation process before utilizing the models to detect the objects of the vector species (Suppl. Fig. S1 and Suppl. Table S2).
To determine whether the training approach could be used to identify the samples in real settings, wild-caught samples were obtained. BG-Sentinel traps were set to capture the insects, which were later stored at − 20 °C until identification. The samples were obtained from four locations in all regions of Thailand in which dengue is endemic. Dataset preparation. The anaesthetized/dead mosquitoes collected from the traps were identified. Each sample was placed directly on an A4 paper sheet using a non-complex method; no mounting slide or medium was used to prepare the sample before capturing the images. Predefined camera views of each sample were captured from dried/anaesthetized mosquitoes placed in a certain pose. The samples were prepared individually, and the two lateral, dorsal and ventral views of each sample were photographed. The mosquito images or video clips were captured using a 600× digital microscope. The captured images were obtained by cropping only the image area fitted to the individual mosquitoes. Furthermore, an expert entomologist identified each mosquito image by using the morphological landmarks of the mosquitoes under a standard stereomicroscope. The original resolution of the captured images, 1920 × 1080 pixels, was used to prepare the ground-truth labels. A total of 10,564 captured images were used, among which 8165 (77%) were utilized for the training-integrated validations and 2399 (23%) were used as the testing image sets (Suppl. Table S1). Fifteen mosquito classes were separated into different bounding boxes and labelled by professional entomologists using the in-house deep learning software (CiRA CORE, https://www.facebook.com/groups/cira.core.comm/). The bounding boxes represented a potential region of interest (ROI) and contained the mosquitoes within the smallest area possible. The labels corresponded to the identified genus, species and gender of each mosquito and were used to train the model (Fig. 2).
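The ROI cropping step described above, trimming each captured frame to the smallest box enclosing an individual mosquito, can be sketched as follows. This is a minimal NumPy illustration; the function name and the (x_min, y_min, x_max, y_max) box format are assumptions, not the CiRA CORE interface.

```python
import numpy as np

def crop_to_roi(image: np.ndarray, box: tuple) -> np.ndarray:
    """Crop an H x W x C image to an (x_min, y_min, x_max, y_max) bounding
    box, clamping the box to the image borders."""
    h, w = image.shape[:2]
    x_min, y_min, x_max, y_max = box
    x_min, y_min = max(0, x_min), max(0, y_min)
    x_max, y_max = min(w, x_max), min(h, y_max)
    return image[y_min:y_max, x_min:x_max]

# A 1080 x 1920 frame as used in the study, with a hypothetical ROI.
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)
roi = crop_to_roi(frame, (800, 400, 1100, 650))
print(roi.shape)  # (250, 300, 3)
```

Clamping to the image borders keeps boxes that touch the frame edge valid, which matters for field images where a mosquito may lie near the border.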
Overall, 190,116 augmented images, generated using an 8-step rotation and 9 brightness/darkness steps, were used to train the model. Subsequently, the model performance was evaluated to identify the appropriate resolution ranges (in pixels) for the input images. Three different resolutions were considered: 672 × 378, 1440 × 810 and 1920 × 1080 pixels.
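The augmentation scheme above (rotation steps combined with brightness/darkness steps) can be sketched as follows. For a dependency-free illustration, only right-angle rotations are shown (the study used eight rotation steps), and the nine brightness factors are invented stand-ins for the actual levels.

```python
import numpy as np

def augment(image: np.ndarray,
            brightness_factors=(0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4),
            rotation_steps=4):
    """Generate brightness- and rotation-augmented copies of an image.

    The factors above and the right-angle-only rotations are illustrative
    stand-ins so that the sketch needs nothing beyond NumPy."""
    variants = []
    for k in range(rotation_steps):                  # 0, 90, 180, 270 degrees
        rotated = np.rot90(image, k)
        for f in brightness_factors:                 # scale and clamp to uint8
            variants.append(np.clip(rotated.astype(np.float32) * f,
                                    0, 255).astype(np.uint8))
    return variants

img = np.full((378, 672, 3), 128, dtype=np.uint8)    # low-resolution frame size
aug = augment(img)
print(len(aug))  # 36 variants: 4 rotations x 9 brightness levels
```

Each source image yields rotations × brightness-levels variants, which is how a training set of thousands of images grows to the hundreds of thousands reported above.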
Deep learning network configuration. Collectively, six YOLO network combination models were investigated, and their performance in detecting and classifying small biological objects was assessed and compared based on the relevant learning methods. The models included the one- and two-stage methods of the tiny YOLO v2, YOLO v2 and YOLO v3 algorithms (Fig. 3). Previous studies on the characteristics of YOLO algorithms have demonstrated fast detection 31 . Specifically, YOLO models are fast by design and can operate in real time, processing 45-155 frames per second, and can detect up to 9000 object classes. Additionally, the YOLO algorithm evaluates the entire area of an image and yields only a few false positives, as it encodes contextual information regarding the object classes 32 . A previous study reported that the YOLO model is suitable in cases requiring a balance of accuracy and speed, when the training time is not restrictive, because this balance is critical in practical applications 33 . Therefore, this model is suitable for entomological surveillance. In this study, we adopted different versions and configurations of the YOLO algorithm to perform the identification process. The one-stage YOLO model was used to determine whether the ground-truth input contained any single-mosquito species. If so, the image was labelled with a rectangular bounding box (the object and ROI), with the relative genus, species and gender presented above the box. In contrast, an image that did not contain any mosquitoes remained unlabelled. In the two-stage (concatenated) learning method, two steps were implemented.
The 1st-stage YOLO network was used to investigate whether the whole image area contained any mosquito; if a mosquito was detected, the cropped image of the mosquito was used for subsequent processing. The independent YOLO network in the 2nd stage was used to identify the relative genus, species and gender within the cropped image derived from the mosquito-containing image obtained in the first stage. For the model training process shown in Fig. 3a, the captured images were transferred to a dedicated deep learning server for labelling and model training. The labelling tool available in the CiRA CORE platform was employed. Two schemes were considered to train the model, namely, the one-stage and two-stage learning methods. To prepare the dataset for the one-stage learning method, the captured mosquito images were manually labelled with their corresponding genus, species and gender. Subsequently, the dataset was fed to either the YOLO v2 or YOLO v3 model for the learning process in the CiRA CORE platform. The learned parameters were utilized in the runtime testing in the following process. Figure 3. Workflow for data handling in the end-to-end neural network model, which consisted of two learning strategies, namely, the one-stage and two-stage learning methods.
(1) The one-stage learning method progressed along the dark-blue dashed line, starting from the ground-truth labelling for the "genus, species and relative gender of the insect". The ground-truth labels, indicated in red rectangles, were trained in the [model] architecture. If the trained weight reached the optimal value, the output pertaining to the correct relative genus, species and gender was displayed in the output box or red rectangle. Under the CiRA CORE platform, the red rectangular output box could be selected to display or not display the value. (2) The two-stage learning method progressed along the light-blue dashed line. The start point corresponded to the ground-truth labelling for the mosquitoes and non-mosquitoes before performing the training using the [model_1] architecture, indicated in the red rectangle. The optimal trained weight was validated by checking whether it could correctly distinguish between the non-mosquito and mosquito testing images. Later, the images in the set were cropped using one of the functions in the CiRA CORE programme, to be used as the dataset for the second learning process implemented using [Model_2] after labelling each cropped image with the relative genus, species and gender, as indicated in the yellow rectangle. The output could be displayed in two rectangular (red and yellow) boxes; the first box corresponded to the mosquito detection, and the second box corresponded to the classification of the relative genus, species and gender of the mosquito. Under the CiRA CORE platform, both the yellow and red rectangular output boxes could be selected to display or not display the values. In the two-stage learning method (Fig. 3a), the dataset was prepared in two sequential stages. The captured mosquito images were manually labelled with the label "Mosquito".
Moreover, we input non-vector insect species to the dataset, since such samples are crucial for distinguishing between vector mosquitoes and non-vector insects. The non-vector insect images were labelled "Non-mosquito". Next, the 1st-stage dataset was fed to either the YOLO v2 or YOLO v3 model to implement the learning process. Using the parameters learned from the 1st-stage dataset, the trained model was used to prepare the cropped mosquito images by operating on the captured images as input. One of the model outputs corresponded to the information regarding the bounding box of a mosquito object; this bounding box provided the position and area for cropping the mosquito images. Subsequently, the 2nd-stage dataset of cropped mosquito images was manually labelled with the corresponding genus, species and gender of the mosquitoes. This dataset was fed to either the YOLO v2 or YOLO v3 model for the learning process. The learned parameters were later used in the runtime experiment in the subsequent process.
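The two-stage (concatenated) inference flow, detect a mosquito, crop its bounding box, then classify the crop, can be sketched as below. The two detector callables are hypothetical stand-ins for the trained 1st- and 2nd-stage YOLO models and do not reflect the CiRA CORE API.

```python
import numpy as np

def two_stage_identify(image, stage1_detect, stage2_classify):
    """Run the concatenated pipeline: stage 1 finds mosquito bounding boxes,
    stage 2 classifies genus/species/gender within each cropped box."""
    results = []
    for (x1, y1, x2, y2), score in stage1_detect(image):
        crop = image[y1:y2, x1:x2]                 # ROI from the stage-1 box
        label, confidence = stage2_classify(crop)  # fine-grained identification
        results.append({"box": (x1, y1, x2, y2),
                        "mosquito_score": score,
                        "label": label,
                        "confidence": confidence})
    return results

# Stub detectors standing in for the trained YOLO weights.
def fake_stage1(img):
    return [((10, 20, 110, 140), 0.98)]

def fake_stage2(crop):
    return ("Aedes aegypti female", 0.95)

frame = np.zeros((378, 672, 3), dtype=np.uint8)
out = two_stage_identify(frame, fake_stage1, fake_stage2)
print(out[0]["label"])  # Aedes aegypti female
```

Separating detection from classification lets the second network operate on a tightly cropped image, which is why the concatenated scheme copes better with small objects in large scenes.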
This study was performed using a dedicated deep learning server with a 64-bit Ubuntu 16.04 operating system, with libraries based on the CUDA 10.1 toolkit and cuDNN 7.5.0. All the experiments were conducted on the server with the following configuration: CPU i7-8700 (3.70 GHz), RAM 16 × 2 GB, GPU NVIDIA GTX 1070Ti (8 GB), using the C++ programming language. The models were fine-tuned with a total of 16 (tiny YOLO v2), 32 (YOLO v2) and 106 (YOLO v3) layers.
The training datasets consisted of 8165 laboratory and field-caught images. The aforementioned deep learning models were implemented in the in-house deep learning software (CiRA CORE), and the same computational conditions were applied for all the models. An intersection over union (IoU) threshold of 0.5 was used to compare the average precision for each object. A state-of-the-art network algorithm based on the YOLO and YOLO9000 frameworks was modified, as the original algorithm involved certain limitations when using the regression pipelines to detect objects. Moreover, in the model construction, a smaller run time and lower memory consumption were emphasized. The models were trained with a mini-batch size of 64 and 16 subdivisions on a 1070Ti GPU, with a momentum and decay of 0.9 and 0.0005, respectively. The learning rate was set to 0.001, with a burn-in, step values and learning-rate scales of 1000, (400,000, 450,000) and (0.1, 0.1), respectively. Additionally, the two-stage learning method of the YOLO v3 was adopted and compared to the previously applied one-stage method. This two-stage method combined both object localization and classification techniques. Hence, an objective of this study was to examine whether the two-stage YOLO method exhibits notable advantages in the detection process, in terms of both speed and accuracy, in cases involving images of dense objects. Figure 3b illustrates the model testing process. In the one-stage learning method, the captured images were fed directly to the trained model. The model output corresponded to the genus, species and gender of the mosquitoes. In the two-stage learning method, the captured images were fed through the 1st- and 2nd-stage models consecutively. Classification of application input. The performance of the deep learning network models was assessed using confusion metrics.
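The hyperparameters listed above (mini-batch 64 with 16 subdivisions, momentum 0.9, decay 0.0005, learning rate 0.001 with a 1000-iteration burn-in, and step decays of 0.1 at 400,000 and 450,000 iterations) would typically be expressed in a Darknet-style configuration file's `[net]` section. The sketch below is illustrative: the values stated in the text are reproduced verbatim, while `width`, `height` and `max_batches` are assumptions not given here.

```
[net]
batch=64
subdivisions=16
width=608            # input resolution is an assumption; several were tested
height=608
momentum=0.9
decay=0.0005
learning_rate=0.001
burn_in=1000
max_batches=500200   # assumed; not stated in the text
policy=steps
steps=400000,450000
scales=.1,.1
```

With `policy=steps`, the learning rate is multiplied by each scale when training reaches the corresponding step, giving 0.001 → 0.0001 → 0.00001 over the run.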
Several statistical variables from these models were compared, including the precision, recall (sensitivity), accuracy 31 , F1 score, misclassification rate and specificity. The formulas for these parameters are as follows: precision = Tp / (Tp + Fp); recall (sensitivity) = Tp / (Tp + Fn); specificity = Tn / (Tn + Fp); accuracy = (Tp + Tn) / (Tp + Tn + Fp + Fn); F1 score = 2 × precision × recall / (precision + recall); misclassification rate = (Fp + Fn) / (Tp + Tn + Fp + Fn), where Tp, Tn, Fp and Fn denote the numbers of true positives, true negatives, false positives and false negatives, respectively. The actual positives and actual negatives represent the sum of the true positives and false negatives and the sum of the true negatives and false positives, respectively. In addition, a confusion matrix table was constructed to realize a class-wise comparison, to examine the accuracy of the identifications (generalized accuracy) and to ensure that the model could detect and classify objects in a reliable manner. The mean accuracy determined from the table was analysed and assessed for all the models.
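As a concrete check of the definitions above, the reported statistics can be computed from the four confusion-matrix counts; the counts in the example are hypothetical and for illustration only.

```python
def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard statistics derived from a binary confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)                      # sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    misclassification = (fp + fn) / (tp + tn + fp + fn)
    return {"precision": precision, "recall": recall,
            "specificity": specificity, "accuracy": accuracy,
            "f1": f1, "misclassification": misclassification}

# Hypothetical counts for illustration only.
m = confusion_metrics(tp=90, tn=95, fp=5, fn=10)
print(round(m["recall"], 3))  # 0.9
```

Note that accuracy and the misclassification rate always sum to 1, which is a quick sanity check when reading the tabulated results.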
The receiver operating characteristic (ROC) curve was constructed with a 95% confidence interval, and the area under the curve (AUC) was computed to assess the model accuracy. The ROC analysis was performed using the SPSS software.

Results
The performance of six modifications of the YOLO algorithm, implemented within the Darknet-19 and Darknet-53 backbones, was evaluated. The objective was to determine the optimal learning strategy and neural network model among the YOLO algorithm versions to be employed in the entomological field, to identify the main mosquito vectors and to overcome the limitations of the models in examining images of wild-caught mosquitoes under different conditions. Assessing the robustness of the proposed networks. The performance and robustness of the proposed model were tested using the in-house deep learning software (CiRA CORE). The performance evaluation process is illustrated in Supplementary Table S1 and in a video hosted on YouTube at https://youtu.be/7yI5C3h5yYk. In the initial and modified learning processes of the YOLO models, single-mosquito images were used in the training and testing sets, as shown in Table 1 and Fig. 2, and the different network architectures of the models were assessed considering the same assignments. The statistical results indicated that the accuracy of the YOLO v3 model was higher than that of the other models (Table 2). Moreover, for each architecture, the modified two-stage learning method outperformed the one-stage YOLO model. The mean average precision (mAP) and adjusted mAP ranged from 73.80-85.9% and 87.2-99.0%, respectively, as indicated in Table 2. Since only 15 classes of mosquito vectors were employed in this study, the results obtained through the single-class predictions of this model can be considered encouraging for the application of deep learning models in mosquito-vector identification. In addition, the recall (sensitivity) values indicated that the models exhibited a high sensitivity, ranging from 70.8 to 92.4%, with adjusted values of 88.5-88.9%. Both the specificity and accuracy were highly reliable, ranging from 98.4-99.4% and 97.0-98.9%, respectively.
The results indicated low misclassification rates, ranging from 1.1 to 3%. Moreover, high F1 scores were obtained, with the harmonic means of precision and recall ranging from 70.6-85.3% before and 81.0-88.9% after adjustment. The overall results indicated that the two-stage YOLO v3 network model outperformed all the other models in the identification of entomological images of small objects in a large environment.
The robustness of the models was evaluated considering a threshold probability, P(t), from t = 20% to t = 80%. The identifications resulting from each model corresponded to Pclass ≥ t 34 . The sweeping values of t in the ROC analysis ranged from 0 to 1 (Fig. 4a-f).
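The threshold sweep described above, counting a detection as positive when Pclass ≥ t and sweeping t from 0 to 1, produces the (FPR, TPR) points of the ROC curve, whose area can then be integrated. A dependency-free sketch with invented scores and labels:

```python
def roc_auc(scores, labels):
    """Sweep the decision threshold over every observed score, collect
    (FPR, TPR) points, and integrate with the trapezoidal rule."""
    thresholds = sorted(set(scores), reverse=True)
    pos = sum(labels)                  # actual positives
    neg = len(labels) - pos            # actual negatives
    points = [(0.0, 0.0)]
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    points.append((1.0, 1.0))
    # Trapezoidal integration over consecutive ROC points.
    return sum((x2 - x1) * (y1 + y2) / 2
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Perfectly separated toy scores yield the maximal area of 1.0.
print(roc_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))  # 1.0
```

In practice one would feed the per-image class probabilities and ground-truth labels from the testing set; the paper's SPSS-based analysis plays the same role.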
On average, the AUC was 0.809 ± 0.017 and 0.945 ± 0.007 for the one- and two-stage versions of tiny YOLO v2, respectively, 0.861 ± 0.011 and 0.917 ± 0.020 for the one- and two-stage versions of YOLO v2, respectively, and 0.873 ± 0.012 and 0.958 ± 0.011 for the one- and two-stage versions of YOLO v3, respectively. The two-stage learning method with the YOLO v3 network outperformed the other models in object detection in single-mosquito images using the general criteria of morphological taxonomic keys. Multi-class detection. The class-wise performance of the trained model was assessed considering a dataset created by organizing the images of each class in a separate folder. The prediction results in terms of the mAP and its adjusted version are summarized in Table 4. Almost all the classes of mosquitoes, such as the Aedes, Armigeres, Anopheles, Mansonia and Culex mosquitoes, exhibited high mAP values. Interestingly, the one- and two-stage YOLO v3 models could identify Cu. spp. males, Cu. gelidus and Cu. quinquefasciatus females at a rate of almost 100%. Nevertheless, the prediction results for two species exhibited low mAP values, which could be attributed to the small number of images for these species in the training set. Using a training set involving more than 300 unaugmented images could enhance the prediction values obtained using the YOLO v2 and YOLO v3 network models. A confusion matrix was constructed to facilitate the evaluation and explanation of the model performance in identifying the species. According to the tables corresponding to the various YOLO models (Supplementary Tables S3.1-3.6), the proposed models could not identify Cu. vishnui. Moreover, Ma. uniformis could be detected using YOLO v3 but not YOLO v2. The results demonstrated the high quality of both the one- and two-stage learning methods with the YOLO v3 model.
Owing to the different numbers of specimens in each class of the training set, the overall accuracy took different values below 90%: specifically, 73.00% and 83.20% for the one- and two-stage tiny YOLO v2 models, 83.36% and 87.99% for the one- and two-stage YOLO v2 models, and 86.24% and 88.55% for the one- and two-stage YOLO v3 models, respectively. For the remaining classes, the model provided more reliable results, ranging from 95 to 100% when using the YOLO v3 network models, consistent with the aforementioned mAP results.
Pixel resolutions for the detection decision. The deep neural networks were examined to determine whether the number of pixels in an image was correlated with the ability of the proposed models to identify the specimen. To examine this aspect, three different pixel densities were considered to identify a mosquito presented in images with different background intensities. The modified YOLO v3 model, which was the model with the highest performance, was used to detect the identity of a single mosquito from images with three different pixel resolutions, and the results are presented in Table 5. The higher-resolution images (1440 × 810 and 1920 × 1080 pixels) yielded better results than the lower-resolution images (672 × 378 pixels). This result differed from that of a previous study, which indicated that 608 × 608 pixel images produced a lower mAP than images with resolutions of 416 × 416 and 320 × 320 pixels 31 . This discrepancy likely arose because the previous study used a well-trained model and the objects to be identified (leukocytes) appeared in different surroundings in the images.

Discussion
According to the results of this study, the proposed state-of-the-art models achieved the highest performance compared with other reported models 9,35-38 , except for the precision reported by Minakski et al. (Suppl. Table S5) 36 . The proposed model exhibited excellent performance, with precision and recall/sensitivity levels of more than 92% in identifying the genus and species of mosquito vectors based on the morphological characteristics of the mosquitoes. Moreover, all the considered classes of mosquitoes could be detected through the various models with a high accuracy (97-98.9%) and specificity (98.4-99.4%), as indicated in Table 2. These results are comparable to those of the existing classifiers and network models 9,21,24 . Notably, the well-trained model achieved a sensitivity of 92.4%, a specificity of 99.40%, a precision of 95.56% and an accuracy of 98.9% in detecting the species and gender of mosquito vectors; specifically, the mAP of the two-stage YOLO v3 model was 99%. Furthermore, based on the class-wise detection results indicated in Table 4, the proposed models outperformed the existing models 9,39,40 . In addition, although the dataset used was unbalanced, the overall accuracy in this study exceeded 98% and reached 100% in distinguishing the species and gender of Aedes aegypti, Aedes albopictus and Culex quinquefasciatus (Suppl. Table S3.6), compared with previous studies 9,35,37 . This aspect could be attributed to the proposed models being extensively trained on a dataset that was carefully prepared to capture the greatest variation in both complete and incomplete morphological characteristics among the samples collected from four study sites in Thailand. Consequently, the proposed model can be used in real settings to realize vector identification.
In general, intact mosquitoes are distinguished based on the colour and pattern of the head (proboscis and palpi), body (terga and abdomen), wing, thorax (mesonotum) and leg (femur and tarsi) 41,42 . Nevertheless, the morphological characteristics and texture of the mosquitoes may be degraded owing to discoloration caused during capture and processing at the study site or during the freezing and drying preservation processes 35 . In this context, the YOLO model can realize the fast localization and classification of preserved mosquito samples in highly variable poses, assisting non-experts in identifying the different mosquito species.
Several rationales may support our empirical findings, including the sample size, the range of image resolutions and the potential of a hybrid object-detection model. First, a large and diverse set of samples improves the ability of a neural network model to recognize patterns 9 . Nevertheless, the classification performance of the models in identifying particular classes of mosquitoes was strongly correlated with the number of images in each class of the training set (Table 4). As the samples belonging to the same species or class may not be highly variable, the model performance can likely be enhanced by using an enlarged dataset involving more than 300 images for each class. In addition, the problem of the lack of variability in the mosquito images can be alleviated by using the augmentation function available in the in-house deep learning software (CiRA CORE), which provides 360° rotational angles, enhanced image contrast, and Gaussian noise and blur conditions. Although this process may introduce an additional step in the image annotation process before the training of the deep neural network, the quality of detection can be enhanced 34 . Furthermore, as reported, the proposed models cannot identify objects in low-quality, low-resolution input images 18,43 , and a low image resolution may degrade the identification result 44 . Suitable ranges for the image resolution should be determined to increase the detection rate of the model. In addition, the specifications of the digital cameras used to capture the images must be examined, since digital cameras produce a large range of image resolutions, and this aspect may influence the identification performance of the proposed models.
This problem can be addressed by training the models on sets of images with a variety of pixel resolutions to establish the optimal resolution for training. In this study, this procedure was implemented by testing one of the models with images at three different pixel resolutions. The outcome suggested that the YOLO v3 model can be applied across a wide range of pixel resolutions, as indicated in Table 5. Lastly, a hybrid approach combining the Darknet-53 backbone with the YOLO models may be suitable for the entomological task, because it can both localize and classify a small object within the large environment of each image 45 . Although a hybrid neural network model combining deep learning and machine learning is considerably efficient, a major challenge for the proposed detector is selecting a high-quality image dataset and a strategy for labelling the ground truth appropriately for the objective. A recent study proved that a dataset renewed for its specific purpose improved the performance of the proposed detector to an average precision of 81% compared with a former report 46 . Overall, the deep learning approach is practical and readily accessible to the broader community, facilitating prevention efforts against mosquito-borne diseases.
The comparative results of the three deep learning models using one- and two-stage learning methods demonstrate that, in general, the two-stage learning method exhibits higher identification performance than the one-stage method ( Table 2). This study represents the first attempt to directly compare the two methods in identifying insect vectors based on image training sets. Nevertheless, all the considered network-learning method combinations produced reliable descriptions of both intra- and inter-observed genera, even for incomplete field-caught mosquito samples. The application of the YOLO v3 model combined with the two-stage learning method achieved the highest detection rate, based on the morphological features of the mosquito classes included in the training set.

Conclusion
This study used one- and two-stage learning methods of the YOLO network models as a detection algorithm to identify disease-transmitting mosquito vectors. In the one-stage method, the machine was trained using a dataset to predict the insect species. In contrast, the two-stage (concatenated) method involved model training in two rounds using the same dataset, first to detect the insect of interest and then to identify the relative species. To evaluate the model performance, the models were trained using images to identify the features of various vector and non-vector species under various image resolutions. The training dataset involved 190,116 single (augmented) images grouped into 15 different classes to recognize the genus, species and gender of the mosquitoes. The state-of-the-art network model could effectively discriminate the species and gender of the insects when images from the testing dataset were input. The results demonstrate the ability of the proposed model to estimate the population density after identifying the various mosquitoes, many of which are vectors for arboviruses, which can cause disease outbreaks in remote areas 47 . The proposed model exhibited a high accuracy of 0.96, as indicated by the ROC and AUC analysis, and this value is higher than those reported in the literature [47][48][49] . Furthermore, the detection procedure was prompt, which can accelerate the species identification to instantly establish whether disease-transmitting mosquito vectors are present. The empirical results suggest that the proposed models can facilitate the detection and classification of mosquito vectors in field sites and realize rapid screening and identification to support local public health staff. The models can further be used to evaluate the population density of mosquito vectors that may transmit mosquito-borne diseases in remote areas, by enhancing the models with quantitative functions.
Finally, the identification model can be extended to an Android application for fieldwork. This framework can help public health workers in identifying and counting the mosquito species, thereby enabling the early detection of disease outbreaks in areas subject to periodic epidemics.