Introduction

Feature selection is one of the major steps in pattern recognition and classification, since it aims to eliminate the redundant and irrelevant features within a dataset. It can be challenging to decide which features are useful without prior knowledge. As a result, numerous feature selection techniques are used to select the best features, i.e., those that give superior performance1. In many applications, each dataset contains a significantly large number of features. The key objective of feature selection is to gain a greater understanding of the process that produced the data in order to identify a subset of pertinent features from the vast pool of available features2.

There are two main types of feature selection techniques. Filter techniques do not rely on learning algorithms but rather on specific data attributes. In contrast, wrapper approaches evaluate the chosen subset of features using learning algorithms. Although wrapper methods are computationally expensive, they are more accurate than filter approaches3. In general, feature selection is a multi-objective optimization problem. Its two main goals are to reduce the feature space and to give high performance. When these two objectives conflict, as they frequently do, the best trade-off must be found4.

Recently, meta-heuristic optimization algorithms have frequently been used for finding the most discriminative features. The most studied methods are Particle Swarm Optimization (PSO)5, Ant Colony Optimization (ACO)6, Genetic Algorithm (GA)7, Genetic Programming (GP)8, Simulated Annealing (SA)9, Differential Evolution (DE)10, Cuckoo Search (CS)11, Artificial Immune Systems Algorithm (AIS)12, Tabu Search (TS)13, and the Whale Optimization Algorithm (WOA)14. In addition, multi-objective and hybrid versions of these classical meta-heuristic algorithms have been published. The No-Free-Lunch (NFL) theorem explains this multiplicity of studies: since no algorithm can give the best solution for all problems, there is always a chance of finding a better solution with a new meta-heuristic algorithm, which is why there are hundreds of studies in this field15.

Xue et al.16 provided the first multi-objective method for feature selection using the PSO algorithm; experiments on 12 benchmark datasets showed better results for their method compared to the traditional one. Emary et al.17 used Ant Lion Optimization (ALO) in two approaches and compared the results with other common algorithms such as GA and the Big Bang algorithm (BBA), which proved the capability of their proposed method to find optimal features on 20 UCI datasets. They also employed a Lévy flight random walk with ALO, and the results showed its improvement over the native ALO on 21 benchmark datasets3. Genetic algorithms were among the earliest methods used in feature selection: Aalaei et al.18 developed a feature selection method based on the genetic algorithm (GA) to diagnose breast cancer using the Wisconsin breast cancer dataset, and their experiments improved accuracy, specificity, and sensitivity. Ferriyan et al.19 used a GA on the NSL-KDD Cup 99 datasets; by using one-point crossover instead of two, they obtained better results on these datasets compared to the original method.

The artificial bee colony (ABC)20 algorithm is a simple, flexible, and efficient meta-heuristic optimization algorithm. However, it can suffer from slow convergence due to its lack of a powerful local search capability. Etminaniesfahani et al.21 overcame this weakness by hybridizing the ABC algorithm with the Fibonacci indicator algorithm (FIA)22, naming the new algorithm ABFIA21. Their hybrid algorithm combines the strengths of both methods by coupling the global exploration of the FIA with the local exploitation of the ABC. They demonstrated that the hybrid algorithm outperforms the ABC and FIA algorithms and produces superior results on a variety of optimization functions commonly used in the literature, including 20 scalable basic functions and 10 complex CEC2019 test functions. Akinola et al.23 combined the binary dwarf mongoose optimization (BDMO) algorithm with the simulated annealing (SA) algorithm and compared it with 10 other algorithms. The results showed that their proposed method (BDMSAO) is better than the other algorithms.

Eluri et al.24 introduced a novel wrapper-based method called BGEO-TVFL for addressing feature selection challenges. Their method employs a Binary Golden Eagle Optimizer with Time-Varying Flight Length (TVFL) to enhance feature selection, adapting the Golden Eagle Optimizer (GEO), a swarm-based meta-heuristic algorithm, for discrete feature selection. Their work explores various transfer functions and incorporates TVFL for a balanced exploration–exploitation trade-off in GEO. Performance was evaluated on UC Irvine datasets and compared with standard feature selection approaches, namely BAT, ACO, PSO, GWO, GA, CS, IG, CFS, and GR. The obtained results reveal the superiority of BGEO-TVFL. Their method was also tested on CEC benchmark functions, demonstrating its effectiveness in addressing dimensionality reduction issues compared to existing methods.

A Chaotic Binary Pelican Optimization Algorithm was proposed by Eluri and Devarakonda25. Their algorithm leverages the principles of chaos theory in a binary context to enhance the efficiency of the Pelican Optimization Algorithm for this purpose. In this binary variant, they introduce chaos to improve exploration and exploitation capabilities. The algorithm aims to address the challenges of feature selection, particularly in handling large datasets and optimizing performance, and is presented as a promising solution for improving feature selection outcomes in data analysis tasks.

Feature selection with a hybrid Binary Flamingo Search Algorithm and Genetic Algorithm (HBFS-GA) is discussed by Eluri and Devarakonda26. They evaluate the performance of HBFS-GA on 18 different UCI datasets using various metrics. The results demonstrate that HBFS-GA outperforms existing wrapper-based and filter-based FS methods.

In the proposed feature selection technique, the DMO algorithm is combined with chaotic maps to select the most prominent features. The DMO is used to explore and find the minimal possible feature set in the datasets, and the K-Nearest Neighbor (KNN) classifier is used to evaluate the performance of the selected features. The results obtained by the proposed method prove its efficiency and show better performance than other related state-of-the-art methods. The main contributions of this paper can be summarized as follows:

  • Propose a new hybrid feature selection method called CDMO based on improving the performance of DMO using chaotic maps.

  • Evaluate the proposed CDMO method using ten UCI datasets employing the K-nearest Neighbors (KNN) as a classifier to prove its effectiveness.

  • The results obtained by the proposed CDMO show superior performance compared to the original DMO algorithm and other well-known meta-heuristic-based feature selection methods.

  • On the CEC’22 test suite, the effectiveness and solution quality of our proposed method are computed and compared across all ten chaotic maps and against state-of-the-art algorithms.

The rest of this study is organized as follows: Section "Background" presents background on the DMO algorithm and chaotic maps. Section "The proposed CDMO for feature selection" explains the proposed model. Experimental results and analysis are discussed in Section "Experimental results". Finally, the conclusion is summarized in Section "Conclusion and future work".

Background

Dwarf Mongoose Optimization Algorithm (DMO)

DMO27 is a meta-heuristic method that simulates the foraging behavior of the dwarf mongoose and its compensatory behavioral adaptations. The mongoose has two main compensatory behavioral adaptations:

  1. Prey size, group size, and space utilization.

  2. Food provisioning.

Large prey items, which could provide food for the whole group, are not amenable to capture by dwarf mongooses. Due to the lack of a killing bite and of organized pack hunting, the dwarf mongoose has evolved a social structure in which each individual can survive independently and move from one location to another. The dwarf mongoose lives a semi-nomadic lifestyle in an area big enough to accommodate the entire colony. Because no previously visited sleeping mound is returned to, this nomadic lifestyle ensures that the entire territory is explored and prevents over-exploitation of any one area27.

Population initialization

The candidate population of mongooses (X) is initialized using Eq. (1). The population is generated stochastically between the upper bound (UB) and lower bound (LB) of the given problem.

$$X=\left[\begin{array}{ccccc}{x}_{\mathrm{1,1}}& {x}_{\mathrm{1,2}}& ...& {x}_{1,d-1}& {x}_{1,d}\\ {x}_{\mathrm{2,1}}& {x}_{\mathrm{2,2}}& ...& {x}_{2,d-1}& {x}_{2,d}\\ & \vdots & {x}_{i,j}& \vdots & \\ {x}_{n,1}& {x}_{n,2}& ...& {x}_{n,d-1}& {x}_{n,d}\end{array}\right]$$
(1)

where \(X\) is the population, created at random by Eq. (2), \({x}_{i,j}\) stands for the location of the jth dimension of the ith individual, n stands for the population size, and d stands for the problem dimension.

$${x}_{i,j}=VarMin+rand\times \left(VarMax- VarMin\right)$$
(2)

where rand is a random number in [0, 1], and VarMax and VarMin are the upper and lower bounds of the problem. The best solution over the iterations is the best solution obtained so far.
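As an illustration, the initialization of Eqs. (1) and (2) can be sketched as follows. This is a minimal Python sketch; the function and variable names are ours, not from a reference implementation.

```python
import random

def init_population(n, d, var_min, var_max, seed=0):
    """Eq. (2): each of the n mongooses gets d coordinates drawn
    uniformly at random between the bounds VarMin and VarMax."""
    rng = random.Random(seed)
    return [[var_min + rng.random() * (var_max - var_min) for _ in range(d)]
            for _ in range(n)]

# A 30-individual population for a 10-dimensional problem bounded by [0, 1].
X = init_population(n=30, d=10, var_min=0.0, var_max=1.0)
```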

The fitness of each solution is calculated after the population has been initialized. Equation (3) calculates a probability value from each individual's fitness, and the alpha female (α) is chosen based on this probability.

$$\alpha =\frac{fi{t}_{i}}{{\sum }_{i=1}^{n}fi{t}_{i}}$$
(3)

The number of mongooses in the alpha group equals n − bs, where bs represents the number of babysitters (nannies). Peep is the alpha female's vocalization that directs the family's path.
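The alpha-selection step of Eq. (3) is a simple fitness-proportional probability; a minimal sketch (names are illustrative):

```python
def alpha_probabilities(fitness):
    """Eq. (3): the probability of each mongoose becoming the alpha
    female is its fitness divided by the colony's total fitness."""
    total = sum(fitness)
    return [f / total for f in fitness]

# The fittest individual gets the largest selection probability.
probs = alpha_probabilities([0.2, 0.3, 0.5])
```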

The DMO applies the formula in Eq. (4) to produce a candidate food position.

$${X}_{i+1}={X}_{i}+phi*peep$$
(4)

where phi is a uniformly distributed random number in [− 1, 1]. After each iteration, the sleeping mound is evaluated as in Eq. (5).

$$s{m}_{i}=\frac{fi{t}_{i+1}-fi{t}_{i}}{\mathit{max}\{|fi{t}_{i+1}|,|fi{t}_{i}|\}}$$
(5)

The average value of the sleeping mound found is given by Eq. (6).

$$\varphi =\frac{{\sum }_{i=1}^{n}s{m}_{i}}{n}$$
(6)

The mongooses are known to avoid returning to the previous sleeping mound, so the scouts search for the next one to ensure exploration. The scout mongoose is simulated by Eq. (7).

$${X}_{i+1}=\left\{\begin{array}{c}{X}_{i}-CF*phi*rand*\left[{X}_{i}-\overrightarrow{M}\right] if {\varphi }_{i+1}>{\varphi }_{i}\\ {X}_{i}+CF*phi*rand*\left[{X}_{i}-\overrightarrow{M}\right] elsewhere\end{array}\right.$$
(7)

where, \(CF=(1-\frac{iter}{Ma{x}_{iter}}{)}^{\left(2\frac{iter}{Ma{x}_{iter}}\right)}\) indicates the variable, which decreases linearly with each iteration, that controls the group's collective-volatile movement. \(\overrightarrow{M}={\sum }_{i=1}^{n}\frac{{x}_{i}\times s{m}_{i}}{{X}_{i}}\) is the vector that controls the mongoose's movement to its new sleeping mound.

Chaotic maps

Chaos is a phenomenon that can exhibit non-linear changes in future behavior when its initial condition is even slightly altered. It is also described as semi-random behavior generated by nonlinear deterministic systems28. One of the main chaos-based search algorithms is the Chaos Optimization Algorithm (COA), which maps variables and parameters from the chaotic space to the solution space. It relies on the stochasticity, regularity, and ergodicity of chaotic motion to determine the global optimum. Due to its simplicity and speedy convergence, COA has been widely used over the last ten years in many papers, e.g.,29,30,31,32. To obtain the chaotic sets, we have used ten well-known one-dimensional maps that have been used frequently in the literature. Figure 1 shows that the maps have different behaviors, which allows testing the behavior of DMO on the different maps.

Figure 1

Ten chaotic maps.
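As an example of how such chaotic sequences are generated, the logistic map (one of the ten maps) can be iterated as follows. The Singer map is also shown using one common formulation from the literature, so its constants should be treated as an assumption:

```python
def logistic_map(x0, steps, r=4.0):
    """Logistic map x_{k+1} = r * x_k * (1 - x_k); fully chaotic at r = 4."""
    seq, x = [], x0
    for _ in range(steps):
        x = r * x * (1 - x)
        seq.append(x)
    return seq

def singer_map(x0, steps, mu=1.07):
    """Singer map (constants assumed from a common formulation):
    x_{k+1} = mu * (7.86x - 23.31x^2 + 28.75x^3 - 13.302875x^4)."""
    seq, x = [], x0
    for _ in range(steps):
        x = mu * (7.86 * x - 23.31 * x**2 + 28.75 * x**3 - 13.302875 * x**4)
        seq.append(x)
    return seq

# A chaotic sequence usable in place of uniform random numbers in [0, 1].
chaotic_rho = logistic_map(0.7, 100)
```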

The proposed CDMO for feature selection

In this study, an alternative feature selection technique is proposed using the Chaotic Dwarf Mongoose Optimization (CDMO), as shown in Fig. 2. The random numbers used in Eq. (7) are replaced by chaotic maps to avoid returning to the same sleeping mound, giving Eq. (8).

$${X}_{i+1}=\left\{\begin{array}{ll}{X}_{i}-CF*phi*\rho *\left[{X}_{i}-\overrightarrow{M}\right] &\quad if {\varphi }_{i+1}>{\varphi }_{i}\\ {X}_{i}+CF*phi*\rho *\left[{X}_{i}-\overrightarrow{M}\right] &\quad else\end{array}\right.$$
(8)

where \(\rho\) is the value obtained from the well-known chaotic maps reported in Table 1.
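In code, the only change relative to the standard scout move of Eq. (7) is that the uniform random number is replaced by the chaotic value ρ; a minimal sketch (names are ours):

```python
def cdmo_scout_move(X_i, M, rho, CF, phi, improved):
    """Eq. (8): same form as Eq. (7), but the chaotic value rho
    replaces the uniform random number."""
    sign = -1 if improved else 1  # improved means phi_{i+1} > phi_i
    return [x + sign * CF * phi * rho * (x - m) for x, m in zip(X_i, M)]

new_pos = cdmo_scout_move([0.5, 0.5], [0.2, 0.2], rho=0.37, CF=0.8,
                          phi=0.5, improved=True)
```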

Figure 2

Flowchart of the proposed CDMO algorithm.

Table 1 Ten chaotic maps.

After that, we set the dimension of the problem, d in Eq. (1), to the number of features, and set \(VarMin\) and \(VarMax\) in Eq. (2) to 0 and 1, respectively. Each row of Eq. (1) (i.e., the position of each element in \({X}_{i}\)) is thresholded at 0.5, since the values lie between 0 and 1. Elements with positions > 0.5 are considered candidate features, while the remaining elements are not included in the solution, as in Eq. (9).

$${X}_{i,j}=\left\{\begin{array}{ll}1&\quad { x}_{i,j}>0.5\\ 0 &\quad Otherwise\end{array}\right.$$
(9)

The candidate features are then evaluated by the fitness function in Eq. (10), which combines the classification error rate of the k-nearest neighbor classifier on the candidate features with the ratio of selected features.

$$Fitness= \frac{Number\,of\,wrong\,classified }{Total\,numbers\,of\,instances}+\frac{|{X}_{i}|}{d}$$
(10)

Each time the fitness function is invoked, the dataset is divided using the holdout method into 80% training data and 20% testing data. Algorithm 1 and Fig. 2 show the pseudocode and flowchart of the proposed technique, respectively.


Algorithm 1 Steps of the developed method.
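The thresholding of Eq. (9) and the fitness of Eq. (10) can be sketched end-to-end as follows. This is a self-contained Python sketch with a hand-rolled 5-NN on synthetic data; the study itself uses a standard KNN classifier, and the equal weighting of the two fitness terms follows Eq. (10) as written.

```python
import random
from collections import Counter

def binarize(position, thr=0.5):
    # Eq. (9): feature j is selected when x_{i,j} > 0.5
    return [1 if x > thr else 0 for x in position]

def knn_predict(train_X, train_y, x, k=5):
    # Majority vote among the k nearest training points (squared L2 distance).
    nearest = sorted(range(len(train_X)),
                     key=lambda i: sum((a - b) ** 2
                                       for a, b in zip(train_X[i], x)))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

def fitness(position, X, y, k=5, seed=0):
    # Eq. (10): 5-NN error rate on a 20% holdout + selected-feature ratio.
    cols = [j for j, b in enumerate(binarize(position)) if b]
    if not cols:
        return 1.0  # no features selected: worst possible error rate
    Xs = [[row[j] for j in cols] for row in X]
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    cut = int(0.8 * len(idx))
    tr, te = idx[:cut], idx[cut:]
    wrong = sum(
        knn_predict([Xs[i] for i in tr], [y[i] for i in tr], Xs[i], k) != y[i]
        for i in te)
    return wrong / len(te) + len(cols) / len(X[0])

# Two well-separated synthetic classes in 4 dimensions.
rng = random.Random(42)
X = [[rng.gauss(c, 0.1) for _ in range(4)] for c in (0, 1) for _ in range(20)]
y = [c for c in (0, 1) for _ in range(20)]
fit = fitness([0.9, 0.9, 0.1, 0.9], X, y)  # selects features 0, 1, 3
```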

Experimental results

Dataset and parameters setting

Table 2 lists the 10 datasets used in this study, which come from the well-known UCI data repository33. They were chosen with different dimensions and patterns to evaluate the performance of the proposed method at several levels of complexity.

Table 2 Datasets used in this study.

K-nearest neighbor (KNN) is employed as the classifier in this study, as it is one of the most common and simplest learning algorithms. It is trained on the training dataset and then tested on the testing part, which ensures higher reliability. To simplify the evaluation process, we set K = 5 in KNN (5NN)34.

Performance metrics

In this study we have used two types of metrics to evaluate the performance: fitness metrics and classification metrics.

For the fitness metrics we have used four statistical measurements, namely the best, worst, and mean fitness values and the standard deviation, which are mathematically defined as follows:

$$\mathrm{Best\,Fitness}= {\mathit{Min}}_{i=1}^{{N}_{r}}{{\text{BS}}}_{{\text{i}}},$$
(11)
$$\mathrm{Worst\,Fitness}= {\mathit{Max}}_{i=1}^{{N}_{r}}{{\text{BS}}}_{{\text{i}}},$$
(12)
$$\mathrm{Mean\,Fitness}\,(\upmu )= \frac{1}{{N}_{r}}\sum_{{\text{i}}=1}^{{N}_{r}}{{\text{BS}}}_{{\text{i}}},$$
(13)
$$\mathrm{Standard\,Deviation }\left({\text{SD}}\right)= \sqrt{\frac{\sum_{{\text{i}}=1}^{{N}_{r}}{({{\text{BS}}}_{{\text{i}}}-\upmu )}^{2}}{{N}_{r}}}$$
(14)

where BSi is the best score gained in run i, Nr is the number of runs, and μ is the mean fitness35.
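For a minimization fitness these statistics reduce to a few lines (a sketch; the best run is the minimum score and the worst is the maximum):

```python
import math

def fitness_stats(best_scores):
    """Eqs. (11)-(14) over the Nr per-run best scores of a
    minimisation fitness."""
    n_r = len(best_scores)
    mu = sum(best_scores) / n_r
    sd = math.sqrt(sum((b - mu) ** 2 for b in best_scores) / n_r)
    return min(best_scores), max(best_scores), mu, sd

best, worst, mean, sd = fitness_stats([0.10, 0.12, 0.11, 0.15])
```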

The second evaluation assesses the selected features using classification measures: accuracy, precision, sensitivity, specificity, and F-score. Accuracy is a common evaluation metric, defined as the ratio of correctly classified samples to all samples. It is mathematically defined as follows:

$$Accuracy= \frac{TP+TN}{TP+TN+FP+FN} ,$$
(15)

Precision, specificity, and sensitivity are proper metrics for measuring classification performance on unbalanced datasets. Since they are not affected by differences in the data distribution, these measures are useful for evaluating classification performance in unbalanced learning scenarios36. The F-score metric combines precision and sensitivity and is given by Eq. (19); it is therefore more suitable in unbalanced scenarios than the accuracy metric. Precision, sensitivity, specificity, and F-score are defined by the following equations:

$$Precision = \frac{TP}{TP+FP}$$
(16)
$$Sensitivity= \frac{TP}{TP+FN}$$
(17)
$$Specificity = \frac{TN}{TN+FP}$$
(18)
$$F-score= \frac{2*\left(precision *Sensitivity\right)}{Precision +Sensitivity}$$
(19)

where TP is the true positive, FP is the false positive, FN is the false negative and TN represents the true negative.
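All five measures of Eqs. (15)–(19) follow directly from the four confusion-matrix counts; a sketch for the binary case:

```python
def classification_metrics(tp, tn, fp, fn):
    """Eqs. (15)-(19) computed from the confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # a.k.a. recall
    specificity = tn / (tn + fp)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, precision, sensitivity, specificity, f_score

acc, prec, sens, spec, f1 = classification_metrics(tp=8, tn=5, fp=2, fn=1)
```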

Performance of DMO based on ten chaotic maps

To evaluate the performance of the proposed CDMO, 10 different datasets from the UCI repository are used. The obtained results are compared with DMO and other well-known meta-heuristic algorithms, namely the PSO5, ACO6, ARO37, HHO38, EO39, RTHS40, RSGW41, SSAPSO42, BGA43, and WOA14 algorithms. Each algorithm was run 25 times on a PC with the same specifications. To test the convergence capability, the average over the 25 runs was computed and compared for each algorithm. Table 3 lists the parameter settings of the algorithms used in this study. The experiments are divided into two parts: the first evaluates the performance of the ten chaotic maps on the DMO algorithm, as shown in Tables 4 and 5; the second compares the best chaotic map with the six meta-heuristic algorithms DMO, ACO, PSO, ARO, HHO, and WOA, as shown in Tables 6 and 7.

Table 3 Parameter setting.
Table 4 Accuracy comparison between ten CDMO.
Table 5 Average fitness comparison between ten CDMO.
Table 6 Comparison between CDMO8 and 6 meta-heuristic algorithms in classification metrics.
Table 7 Comparison between CDMO8 and 6 meta-heuristic algorithms in fitness metrics.

Table 4 shows the accuracy averaged over the runs for the ten CDMO variants, where the number after CDMO refers to the map number in Table 1; for example, CDMO1 uses the Chebyshev map. The results in Table 4 show that the Singer map (CDMO8) achieves the highest results on three datasets (breastEW, SpectEW, and Waveform), while CDMO1 and CDMO7 achieve the best results on KrvskpEW and Ionosphere, respectively. All maps have the same accuracy on two datasets (base_exactly and base_M-of-n3). Table 5 compares the average fitness values of the ten chaotic maps. The Singer map (CDMO8) achieved the best results on 5 out of 10 datasets. CDMO4 and CDMO6 achieved the same result on base_M-of-n3, and CDMO1, CDMO3, CDMO5, CDMO7, and CDMO10 each achieved the best result on one dataset. CDMO8 was therefore chosen to be compared with the ACO, PSO, WOA, ARO, HHO, and DMO algorithms.

Figure 3 illustrates the convergence curves for the ten chaotic maps over 100 iterations. As can be observed from this figure, the Singer map obtains the best results in most cases because it converges faster than the other maps.

Figure 3

Comparison between ten chaotic maps.

Comparison with other meta-heuristic techniques

In this section, we compare the performance of the developed method based on the Singer map with well-known and widely used techniques, namely PSO, ACO, ARO, HHO, and WOA, as well as the original DMO.

From Table 6, the CDMO gives the best accuracy on seven datasets (base_BreastEW, SonarEW, SpectEW, Waveform, CongressEW, breastEW, and Ionosphere), while DMO gives superior performance on one dataset, KrvskpEW. Moreover, DMO and CDMO give equal performance on 2 datasets (base_M-of-n3 and base_exactly). Based on the precision results, CDMO8 performs better on seven datasets, whereas DMO performs better on one dataset (BreastEW), and CDMO8 and DMO have the same results on two datasets. Analyzing the sensitivity results, CDMO8 has the highest results on four datasets, while DMO and PSO have the highest results on three datasets and one dataset, respectively; CDMO8 and DMO have the same results on two datasets (base_exactly and base_M-of-n3). For specificity, CDMO8 has the highest results on seven datasets, while PSO has the best result on only one dataset (BreastEW); CDMO8 and DMO again have the same results on two datasets. Finally, the F-measure results show that CDMO8 performs better on five datasets, DMO performs better on the KrvskpEW dataset, and ARO performs better on the SpectEW and Ionosphere datasets; CDMO8 and DMO have the same results on two datasets.

Table 7 presents the results of the fitness metrics, namely the standard deviation (SD) and the best, worst, and average values of the fitness function. In the average fitness, CDMO8 achieved the best results on 9 out of 10 datasets, while ACO achieved the best result on the Ionosphere dataset only. In terms of the best measure, CDMO8 has the best results on 5 out of 10 datasets, the original DMO on 2 out of 10 datasets, and ARO on the Ionosphere and base_M-of-n3 datasets; CDMO8 and DMO have the same result on the breastEW dataset. Furthermore, for the worst measure, CDMO8 has the best results on 5 out of 10 datasets, while PSO ranks second with 3 out of 10 datasets, and WOA and DMO each have the best result on one dataset. Concerning the standard deviation, WOA has the superior results on 7 out of 10 datasets; neither CDMO nor the original DMO achieved the best standard deviation results.

Figure 4 shows the comparison of the convergence curves of CDMO8 and the other meta-heuristic algorithms (i.e., PSO, ACO, DMO, ARO, HHO, and WOA). As observed from the figure, CDMO8 converges faster on most datasets.

Figure 4

Comparison between best chaotic map and 6 meta-heuristic algorithms.

Table 8 compares the accuracy of CDMO8 against 6 state-of-the-art methods, namely BGA, RTHS, RSGW, EO, SSAPSO, and HSGW. It is clear that our proposed CDMO method outperforms these methods, producing higher accuracy on 8 out of 10 datasets.

Table 8 Comparison of CDMO8 with other 6 state-of-the-art methods based on achieved accuracy (highest classification accuracies are in bold).

Performance evaluation on CEC’22 benchmark functions

In this section, the performance of the proposed CDMO algorithm in solving optimization problems is tested. To this end, the numerical solving efficiency of CDMO is evaluated on the twelve functions of the CEC’22 test suite. Table 9 presents the outcomes of the CEC’2022 test suite over 30 runs for the proposed ten chaotic DMO variants. These benchmark functions consist of four types: unimodal, basic, hybrid, and composite functions. It is found that CDMO9 achieves the best performance.

Table 9 Comparison of simulation outcomes using DMO with 10 chaotic maps for a CEC’2022 test suite for 30 runs.

In order to verify the effectiveness of CDMO9, its results are compared, in Table 10, with six recent optimization algorithms, namely the Artificial Hummingbird Algorithm (AHA)44, African Vultures Optimization Algorithm (AVOA)45, Crow Search Algorithm (CSA)46, Harris Hawks Optimization (HHO)38, Northern Goshawk Optimization (NGO)47, and Satin Bowerbird Optimizer (SBO)48. Besides, in order to demonstrate the ability of CDMO9 to solve optimization problems, the obtained results are compared with two algorithms recently improved by scholars, namely the Sine Cosine Algorithm with an adaptive quadratic interpolation and rounding mechanism (ARSCA)49 and the Archimedes Optimization Algorithm boosted using trigonometric operators (SCAOA)50. The experimental results show that the proposed method compares favorably with these methods.

Table 10 Comparison of simulation outcomes for a CEC’2022 test suite for 30 runs (highest classification accuracies are in bold).

Conclusion and future work

The Chaotic Dwarf Mongoose Optimization Algorithm (CDMO) was proposed, which is the Dwarf Mongoose algorithm hybridized with chaos. To enhance the performance of the proposed technique, ten chaotic maps were employed, with CDMO used as a wrapper feature selector. The CDMO gives superior performance compared to the well-known meta-heuristic algorithms PSO, ACO, WOA, ARO, HHO, BGA, RTHS, RSGW, EO, SSAPSO, HSGW, and DMO. The obtained results proved the capability of CDMO to select feature subsets that give high classification results. Moreover, the experimental results proved that adjusting the random variable using the Singer map significantly enhanced the DMO algorithm in terms of both classification and fitness performance. In addition, our proposed algorithm was tested against recent optimizers on the CEC’22 test suite.

In future work, this method can be extended to solve real-world problems such as medical data. In addition, it would be interesting to investigate hybridizing the DMO algorithm with another swarm-based meta-heuristic algorithm.

Ethics approval

This research contains neither human nor animal studies.