Introduction

The twenty-first century has become the era of data, with data analysis and utilization visible in all aspects of life, and these data are frequently high-dimensional1,2,3,4,5. Inevitably, such data contain a substantial number of redundant and irrelevant features, increasing the computational overhead and the risk of overfitting when handled by traditional machine learning (ML) algorithms6,7,8. As a result, to make better use of the data, efficient procedures such as feature selection (FS) must be developed to handle the worthless features9,10,11. FS techniques are commonly categorized as wrappers, filters, or embedded methods, based on how they evaluate feature subsets12. Wrapper-based approaches rely on predefined ML algorithms to obtain higher classification accuracy but are very expensive to compute because the ML algorithms must be run numerous times13. On the contrary, filter-based approaches evaluate feature subsets without any ML algorithm, which reduces computational cost but may reduce classification accuracy14. Embedded techniques incorporate FS into model learning, accounting for the influence of the algorithmic model while lowering computational weight; however, these methods have poor generalization ability and significant computational complexity15.

Because the number of feature subsets grows exponentially with data dimensionality, it is challenging to produce adequate results using traditional methods, especially on high-dimensional data. To reduce the high computational cost caused by the curse of dimensionality, novel feature subset selection approaches can be developed based on wrapper swarm intelligence (SI) algorithms due to their robustness and adjustability16,17,18. SI algorithms have three essential characteristics: flexibility, self-organization, and resilience. These algorithms are often inspired by group behavior in nature, such as foraging, anti-predation, and migration19. Typical SI algorithms are ant colony optimization (ACO)20, particle swarm optimization (PSO)21, grey wolf optimizer (GWO)22, artificial bee colony (ABC)23, whale optimization algorithm (WOA)24, grasshopper optimization algorithm (GOA)25, Harris hawks optimization (HHO)26, and bird swarm algorithm (BSA)27. Other optimization algorithms include bat algorithm (BA)28, atom search optimization (ASO)29, and Henry gas solubility optimization (HGSO)30. In general, meta-heuristic algorithms can effectively handle FS problems, lowering computational complexity while achieving greater classification accuracy, and SI approaches have therefore been consistently applied to FS problems31,32,33,34. For instance, Hussain et al.35 integrated the sine-cosine algorithm (SCA) into HHO to balance the exploration and exploitation capabilities of HHO, and experimental results on several numerical optimization and FS problems revealed the competitive advantage of the proposed algorithm over other SI algorithms. Neggaz et al.36 first applied HGSO to solving FS problems. Experimental results on datasets with different feature sizes (from 13 to 15009) showed that HGSO is effective in minimizing feature size, especially on high-dimensional data, while preserving maximum classification accuracy.

Nevertheless, SI algorithms tend to fall into local optima due to: (i) the imbalance between exploration and exploitation; and (ii) excessive stochasticity37,38. Numerous studies have shown that chaos theory can overcome this issue owing to its semi-stochasticity, ergodicity, and sensitivity to initial conditions39,40. Khosravi et al.41 incorporated a new local search strategy and the Piecewise chaotic map to make their teaching optimization algorithm capable of tackling high-dimensional FS problems. Zhang et al.42 integrated Gaussian mutation and the Logistic chaotic map into the fruit fly algorithm (FFA) to avoid premature convergence and hence strengthen the exploration capability. Sayed et al.43 optimized the crow search algorithm (CSA) by using ten chaotic maps to improve its performance in tackling FS problems in terms of classification accuracy, number of selected features, and convergence speed. Altay et al.44 replaced the random parameters in BSA with ten chaotic maps to boost the exploration ability.

The sparrow search algorithm (SSA) is one of many recently developed SI algorithms. The sparrow is a dexterous species that forages through collective collaboration and can effectively escape natural predators; Xue et al.45 proposed SSA by emulating these behaviors. Compared to its counterparts, SSA has garnered much attention because of its fast convergence, great search efficiency, and stability46,47,48,49,50,51. However, SSA suffers from the same flaws as other SI algorithms in that swarm diversity and exploratory ability decrease as the algorithm progresses47,52. As a result, significant enhancements have been made to SSA. To make SSA more thorough in exploring the solution space, Xue et al.53 utilized a new neighbor search approach. Gao et al.52 added a chaotic map and a mutation evolution technique to SSA to improve its robustness and convergence speed. Gad et al.54 binarized SSA using S- and V-shaped functions and included a random relocation approach for transgressive sparrows as well as a new local search strategy to balance its exploration and exploitation capabilities. Lyu et al.55 used the Tent chaotic map and the Gaussian mutation technique to improve SSA and applied it to simple image segmentation challenges. Furthermore, Yang et al.56 improved SSA with the Sine chaotic map, an adaptive weighting approach, and an adaptive t-distribution mutation operator, and then applied the suggested technique to numerical optimization problems. However, no one has yet used a chaos-improved SSA to solve FS problems. SI algorithm performance can generally be improved in three ways: (i) adjusting their parameters; (ii) altering their mechanisms; and (iii) combining them with other algorithms57. In light of this, this work aims to improve SSA by redefining its random parameters and procedures through the use of chaotic maps. The following are the main contributions:

  1.

    The initial swarm, transgressive positions, and random variables in SSA are processed by using chaotic maps to simultaneously boost swarm diversity and strike a good trade-off between exploration and exploitation. Comparing twenty different chaos-improved SSA variants yields the best chaotic SSA (CSSA).

  2.

    CSSA is compared against twelve peer algorithms, including SSA, ABC, PSO, BA, WOA, GOA, HHO, BSA, ASO, HGSO, success-history based adaptive differential evolution with linear population size reduction (LSHADE)58, and evolution strategy with covariance matrix adaptation (CMAES)59, on representative functions from the Institute of Electrical and Electronics Engineers (IEEE) Congress on Evolutionary Computation (CEC) benchmark suite and on eighteen multi-scale datasets from the University of California Irvine (UCI) data repository, in order to verify its competitiveness. Furthermore, this study also selects seven recently proposed FS methods from the literature to verify that CSSA retains advantages over several state-of-the-art algorithms.

  3.

    The capability of CSSA is further tested on three high-dimensional microarray datasets with a number of features/genes (dimensions) up to 12500.

  4.

    We empirically and theoretically assess the strengths and weaknesses of CSSA against different algorithms for solving FS problems under evaluation metrics such as overall fitness, classification accuracy, selected feature size, convergence, and stability.

  5.

    A post-hoc statistical analysis, including Wilcoxon’s signed-rank test, Friedman’s rank test, and Nemenyi’s test, is conducted at a 5% significance level to verify the statistical significance of CSSA over its peers.

The remainder of this article is organized as follows. Section Preliminaries introduces the SSA principle and the ten chaotic maps tested with it, whereas Sect. Proposed chaotic sparrow search algorithm (CSSA) presents the proposed CSSA. Section Experimental results and discussion compares CSSA to twelve peer algorithms and seven popular FS approaches from the literature, and experimental results on eighteen UCI datasets and three high-dimensional microarray datasets are provided and analyzed. Section Discussion discusses CSSA’s strengths and limitations. Finally, Sect. Conclusion concludes the paper.

Preliminaries

Sparrow search algorithm (SSA)

This section presents a brief history of SSA and its mathematical formulation. SSA is a recently developed SI algorithm that mathematically mimics the foraging and anti-predatory behaviors of sparrows. In general, sparrows are classed as producers or scroungers based on their fitness values, which are regularly reassessed from individuals’ current positions. Producers are largely responsible for supplying food to the swarm, whereas scroungers obtain food by following the producers. In addition, as predators approach the swarm, some scouters modify their positions to protect themselves and the entire swarm. As a result, the sparrow swarm can continuously gather food under these strategies while ensuring the security needed for the swarm’s reproduction. Different sparrows thus play different roles, and the following are the components of SSA and its algorithmic process.

Step 1:

The swarm is initialized. SSA first randomly generates the initial positions of a group of sparrows as

$$\begin{aligned} {\textbf{X}}=\left[ \begin{array}{c} {\textbf{x}}_{1} \\ {\textbf{x}}_{2}\\ \vdots \\ {\textbf{x}}_{N} \end{array}\right] = \left[ \begin{array}{cccc}x_{1,1} & x_{1,2} & \ldots & x_{1, D} \\ x_{2,1} & x_{2,2} & \ldots & x_{2, D} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N, 1} & x_{N, 2} & \ldots & x_{N, D}\end{array}\right] , \quad x_{i,j} \sim U{(0,1)}, \end{aligned}$$
(1)

where N denotes the number of individuals in the swarm, D represents the dimensionality of a decision vector (i.e., the number of features in the dataset being processed in the case of FS problems), and \(x_{i,j}\) denotes the value taken by sparrow i in dimension j. SSA judges the quality of the obtained solutions via a fitness function

$$\begin{aligned} \mathbf {F({\textbf{X}})}=\left[ \begin{array}{c}f({\textbf{x}}_{1}) \\ f({\textbf{x}}_{2}) \\ \vdots \\ f({\textbf{x}}_{N}) \end{array}\right] , \end{aligned}$$
(2)

where a fitness function f(.) is used to evaluate the quality of a given solution \({\textbf{x}}_i\).

Step 2:

The producer is mainly responsible for finding food sources, and its position update rules are

$$\begin{aligned} {\textbf{x}}_i^{t+1}=\left\{ \begin{array}{ll}{\textbf{x}}_i^{t} \exp \left( \frac{-i}{\alpha T}\right) , & \text{ if } R_{2}<ST\\ {\textbf{x}}_i^{t}+QL, & \text{ if } R_{2} \ge ST\end{array}\right. , \end{aligned}$$
(3)

SSA improves the quality of its solutions by exchanging information across consecutive iterations. Eq. (3) describes how information is exchanged between producers as the number of iterations increases, where t denotes the current iteration number. Since SSA is not meant to find the exact global optimum but to provide a relatively good solution, the maximum number of iterations T is usually used as the termination condition of the algorithm. \(\alpha \) takes a random value in the range [0, 1]. The warning value \(R_2 \sim U{(0,1)}\) indicates the hazard level of a producer’s location, while the safety value \(ST \in [0.5,1]\) is a threshold used to determine whether a producer’s location is safe. \(R_2<ST\) indicates that the producer is in a safe environment and can search extensively; otherwise, the producer is at a risky location threatened by predation and needs to fly away. Q is a random parameter that follows a normal distribution. L denotes a \(1 \times D\) matrix with all its elements equal to 1.

Step 3:

The swarm in SSA can be divided into producers and scroungers. The scroungers renew themselves as

$$\begin{aligned} {\textbf{x}}_i^{t+1}=\left\{ \begin{array}{ll}Q \exp \left( \frac{{\mathbf{g}}_{worst}^{t}-{\textbf{x}}_i^{t}}{i^2}\right) , & \text { if } i>N / 2 \\ {\mathbf{g}}_{best}^{t+1}+|{\textbf{x}}_i^{t}-{\mathbf{g}}_{best}^{t+1}|A^{+}L, & \text { otherwise }\end{array}\right. , \end{aligned}$$
(4)

where \({\mathbf{g}}_{worst}\) and \({\mathbf{g}}_{best}\) denote the current global worst and best positions, respectively; guiding the swarm with them improves the convergence speed of the algorithm but increases the risk of falling into a local optimum. \(A^+=A^T(AA^T)^{-1}\), where A denotes a \(1 \times D\) matrix with each element randomly set to 1 or \(-1\). In Eq. (4), \(i>N/2\) indicates that a poorly ranked scrounger is starving and needs to fly elsewhere to get food; otherwise, scroungers get food from around the producers.

Step 4:

Scouters are randomly selected from the swarm, typically 10–20% of the total swarm size, and they are updated as

$$\begin{aligned} {\textbf{x}}_i^{t+1}=\left\{ \begin{array}{ll}{\mathbf{g}}_{best}^{t}+\beta |{\textbf{x}}_i^{t}-{\mathbf{g}}_{best}^{t}|, & \text{ if } f({\textbf{x}}_i^{t})>f({\mathbf{g}}_{best}^{t}) \\ {\textbf{x}}_i^{t}+K\left( \frac{|{\textbf{x}}_i^t-{\mathbf{g}}_{worst}^{t} |}{|f({\textbf{x}}_i^{t})-f({\mathbf{g}}_{worst}^{t})|+\sigma }\right) , & \text{ if } f({\textbf{x}}_i^{t})=f({\mathbf{g}}_{best}^{t})\end{array}\right. , \end{aligned}$$
(5)

where \(\beta \) is a normally distributed random value controlling the step size, K is a random parameter taking a value between \(-1\) and 1, \(\sigma \) is a small constant that avoids division by zero, and \(f({\mathbf{g}}_{best}^{t})\) and \(f({\mathbf{g}}_{worst}^{t})\) are the fitness values of the current global best and worst individuals, respectively. Scouters update their positions according to their fitness: \(f({\textbf{x}}_i^{t})>f({\mathbf{g}}_{best}^{t})\) indicates that the sparrow is at risk of predation and needs to change its location according to the current best individual, whereas \(f({\textbf{x}}_i^{t})=f({\mathbf{g}}_{best}^{t})\) indicates that a sparrow needs to strategically move closer to other safe individuals to improve its safety index.

Step 5:

Update and stopping criteria are applied. The current position of a sparrow is updated only if its corresponding fitness is better than that of its previous position. If the maximum number of iterations has not been reached, return to Step 2; otherwise, output the position and fitness of the best individual.

Thus, the basic framework of SSA is realized in Algorithm 1.

Algorithm 1: The framework of SSA.
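
To make Steps 1–5 concrete, the following is a minimal NumPy sketch of the SSA loop (lower fitness is better). The 20% producer share, the 10% scouter share, the [0, 1] search range, and the greedy replacement details are illustrative assumptions for this sketch, not the exact implementation used in the study.

```python
import numpy as np

def ssa(f, N=30, D=10, T=100, ST=0.8, seed=None):
    """Minimal sketch of SSA following Eqs. (1)-(5); smaller fitness is better."""
    rng = np.random.default_rng(seed)
    n_prod, n_scout = max(1, N // 5), max(1, N // 10)   # assumed producer/scouter shares
    X = rng.random((N, D))                              # Eq. (1): random initial swarm
    fit = np.apply_along_axis(f, 1, X)                  # Eq. (2): evaluate each sparrow
    for t in range(T):
        order = np.argsort(fit)                         # best-ranked first
        g_best, g_worst = X[order[0]].copy(), X[order[-1]].copy()
        f_best, f_worst = fit[order[0]], fit[order[-1]]
        new_X = X.copy()
        for rank, i in enumerate(order[:n_prod], start=1):            # Step 2, Eq. (3)
            if rng.random() < ST:                       # R2 < ST: safe, search widely
                alpha = rng.random() + 1e-12
                new_X[i] = X[i] * np.exp(-rank / (alpha * T))
            else:                                       # danger: take a normal step (Q * L)
                new_X[i] = X[i] + rng.normal()
        for rank, i in enumerate(order[n_prod:], start=n_prod + 1):   # Step 3, Eq. (4)
            if rank > N / 2:                            # starving scrounger: fly elsewhere
                new_X[i] = rng.normal() * np.exp((g_worst - X[i]) / rank ** 2)
            else:                                       # feed around the best position
                A = rng.choice([-1.0, 1.0], size=D)
                new_X[i] = g_best + (np.abs(X[i] - g_best) @ (A / D)) * np.ones(D)
        for i in rng.choice(N, size=n_scout, replace=False):          # Step 4, Eq. (5)
            if fit[i] > f_best:                         # at risk: move toward the best
                new_X[i] = g_best + rng.normal() * np.abs(X[i] - g_best)
            else:                                       # as good as the best: move off the worst
                K = rng.uniform(-1.0, 1.0)
                new_X[i] = X[i] + K * np.abs(X[i] - g_worst) / (abs(fit[i] - f_worst) + 1e-50)
        new_X = np.clip(new_X, 0.0, 1.0)                # plain clamping (CSSA instead uses Eq. (6))
        new_fit = np.apply_along_axis(f, 1, new_X)      # Step 5: greedy replacement
        better = new_fit < fit
        X[better], fit[better] = new_X[better], new_fit[better]
    best = np.argmin(fit)
    return X[best], fit[best]
```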

Chaotic maps

Chaos is a deterministic phenomenon generated by an evolution function and exhibits three main characteristics: i) quasi-stochasticity; ii) ergodicity; and iii) sensitivity to initial conditions60. If the initial condition is changed, the future behavior may change non-linearly. Thus, the stochastic parameters in most algorithms can be strengthened by using chaos theory, given that the ergodicity of chaos helps explore the solution space more fully. Table 1 presents the mathematical expressions for the ten chaotic maps used in this study44, where \({\tilde{x}}\) represents the random number generated from a one-dimensional chaotic map. Figure 1 visualizes them.

Table 1 Definition of the ten chaotic maps used in this study.
Figure 1

Visualizations of the ten chaotic maps used in this study and generated by using Matplotlib 3.5.261 in Python 3.9.1262.
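
As an illustration, a minimal Python sketch of three of the maps follows; the parameter values used here (a = 4 for Chebyshev, a = 0.5 and b = 0.2 for Circle, μ = 4 for Logistic) are common textbook choices assumed for the sketch, the authoritative constants being those of Table 1.

```python
import numpy as np

def chebyshev(x, a=4.0):
    # Chebyshev map: x_{k+1} = cos(a * arccos(x_k)); range [-1, 1]
    return np.cos(a * np.arccos(x))

def circle(x, a=0.5, b=0.2):
    # Circle map: x_{k+1} = (x_k + b - (a / (2*pi)) * sin(2*pi*x_k)) mod 1; range [0, 1]
    return np.mod(x + b - (a / (2.0 * np.pi)) * np.sin(2.0 * np.pi * x), 1.0)

def logistic(x, mu=4.0):
    # Logistic map: x_{k+1} = mu * x_k * (1 - x_k); range (0, 1) for mu = 4
    return mu * x * (1.0 - x)

def chaotic_sequence(step, x0=0.7, n=100):
    """Iterate a one-dimensional chaotic map from x0 and return n values."""
    seq, x = np.empty(n), x0
    for i in range(n):
        x = step(x)
        seq[i] = x
    return seq

# e.g., a [0, 1]-valued stream usable in place of a uniform random parameter:
alpha_stream = chaotic_sequence(circle, x0=0.7, n=500)
```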

Proposed chaotic sparrow search algorithm (CSSA)

In this study, CSSA is produced by mitigating the deficiencies of SSA through chaotic maps in three aspects: i) the initial swarm; ii) two random parameters; and iii) clamping the sparrows crossing the search space. The initial swarm of SSA is usually generated randomly, so swarm diversity is easily lost over time, leading to a lack of extensive exploration of the solution space. This can be regularly amended throughout the iterative process by utilizing the ergodic nature of chaos. For the two random parameters, this study considers \(\alpha \) in the producer update (Eq. (3)) and K in the scouter update (Eq. (5)). Since \(\alpha \in [0,1]\), it can be replaced by any of the ten chaotic maps, provided that the Chebyshev and Iterative maps take absolute values. Also, \(K \in [-1,1]\), so this study finally settles on replacing it with the Chebyshev map. Finally, the position of sparrows going outside the search range is also clamped with the help of chaotic maps by redefining it as

$$\begin{aligned} x_{i,j}^{t}=\left\{ \begin{array}{ll} x_{i,j}^{t}, &{} \text { if } x_{i,j}^{t} \in [0,1] \\ {\tilde{x}}_{i,j}^{t}, &{} \text { otherwise }\end{array},\right. \end{aligned}$$
(6)

where \(x_{i,j}^{t}\) and \({\tilde{x}}_{i,j}^{t}\), respectively, represent the original and chaotic positions of sparrow i at dimension j and iteration t. By analyzing the experimental results in Section Comparative analysis, the final version of CSSA is released with the following configuration: (i) the Circle map is used to generate the initial swarm, replace \(\alpha \) in Eq. (3), and relocate the sparrows crossing the search range via Eq. (6); and (ii) the Chebyshev map substitutes for K in Eq. (5).
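
A minimal sketch of the chaotic relocation in Eq. (6) might look as follows, assuming one Circle-map state is maintained per component (the constants a = 0.5 and b = 0.2 are assumed, as above):

```python
import numpy as np

def chaotic_relocate(X, chaos, lo=0.0, hi=1.0):
    """Eq. (6): keep in-range components; replace transgressive ones chaotically.

    X     : (N, D) array of sparrow positions
    chaos : (N, D) array holding the current chaotic value per component,
            advanced in place here with the Circle map
    """
    chaos[:] = np.mod(chaos + 0.2 - (0.5 / (2.0 * np.pi)) * np.sin(2.0 * np.pi * chaos), 1.0)
    out_of_range = (X < lo) | (X > hi)        # transgressive components
    return np.where(out_of_range, chaos, X)
```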

Using only the best individuals in SSA to guide the evolutionary direction of its swarm improves its convergence speed but also increases the risk of falling into a local optimum. To address this issue, SSA introduces some random numbers into the algorithm, but the underlying pseudo-random number generator exhibits sequential correlation across successive calls, so swarm diversity still decreases in the late iterations of the algorithm. The randomness and unpredictability of chaotic sequences can then be utilized in the generation of random numbers to enhance the swarm diversity of SSA, thus increasing its exploration capability to scrutinize the search space more widely63,64. Thus, this work uses chaotic maps to generate the initial swarm of SSA and to replace some random numbers in it.

Solution encoding

Binary vectors65 are commonly used to encode features in FS problems, and a facilitative scheme (e.g., transfer functions) can be used to convert the continuous search space into a binary one66, in which 0s and 1s organize the position of individuals. All features are initially selected, and during subsequent iterations, a feature is denoted as 1 if it is selected; otherwise, it is represented as 0. In this study, to construct the binary search space, CSSA is discretized by using a V-shaped transfer function67 as

$$\begin{aligned} V\left( {\textbf{x}}_i^{t+1}\right) =\left|\frac{2}{\pi } \arctan \left( \frac{\pi }{2} {\textbf{x}}_i^{t+1}\right) \right|. \end{aligned}$$
(7)

Thus, the locations of SSA’s individuals are made up of binary vectors68 as

$$\begin{aligned} x_{i,j}^{t+1}={\left\{ \begin{array}{ll}\lnot { x_{i,j}^{t} }, &{} \text { if } r < V\left( x_{i,j}^{t+1}\right) \\ x_{i,j}^{t}, &{} \text { otherwise }\end{array}\right. }, \end{aligned}$$
(8)

where \(r \sim U{(0,1)}\). \(r < V(\cdot )\) means that if a feature was previously selected, it is now discarded and vice versa; otherwise, a feature’s selection state is preserved.
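
A minimal sketch of Eqs. (7) and (8), with the per-feature flip decision as described above:

```python
import numpy as np

def v_transfer(x):
    # Eq. (7): V-shaped transfer function mapping a continuous step to [0, 1)
    return np.abs((2.0 / np.pi) * np.arctan((np.pi / 2.0) * x))

def binarize(bits, x_new, rng):
    """Eq. (8): flip each feature's selection bit with probability V(x_new)."""
    flip = rng.random(bits.shape) < v_transfer(x_new)
    return np.where(flip, 1 - bits, bits)

rng = np.random.default_rng(0)
bits = np.ones(10, dtype=int)                  # all features initially selected
bits = binarize(bits, rng.normal(size=10), rng)
```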

Flow of CSSA

CSSA first builds an initial swarm using chaotic maps. Depending on the map, the initial point can take any value between 0 and 1; for the Chebyshev and Iterative maps, it can take a value between −1 and 1. The initial value \({\tilde{x}}^{0}\) of a chaotic map may significantly influence its fluctuation pattern. So, except for the Tent chaotic map, where \({\tilde{x}}^{0}=0.6\), we utilize \({\tilde{x}}^{0}= 0.7\)43,69 for all chaotic maps. Each sparrow’s location represents a possibly viable solution, with each of its dimensions clamped inside the range [0, 1].

Second, a criterion is required to assess the quality of each binarized solution. FS problems typically involve two mutually exclusive optimization objectives, namely, maximizing classification accuracy and minimizing the selected feature size. Weighted-sum methods are extensively employed for this type of problem due to their straightforwardness and simplicity of implementation70. We employ the weighted-sum approach in the fitness function to achieve a good trade-off between the two objectives as

$$\begin{aligned} Fit_i = \gamma Err_i + (1-\gamma ) \frac{\vert S_i\vert }{D}, \end{aligned}$$
(9)

where \(Err_i\) is the classification error rate of a k-Nearest Neighbor classifier (k-NN, \(k=5\)31,54) run on the features selected in a solution i. k-NN is commonly used in combination with meta-heuristics in classification tasks for solving FS problems due to its computational efficiency54. \(\vert S_i\vert \) represents the number of features CSSA has selected in i; a smaller feature selection ratio indicates that the algorithm has more effectively isolated the useful features. \(\gamma \) is a weighting coefficient, set to 0.99 according to existing studies54,71.
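
A minimal sketch of this fitness function, assuming scikit-learn’s k-NN (the library is not specified here) and a guard against empty subsets:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def fitness(bits, X_tr, y_tr, X_te, y_te, gamma=0.99):
    """Eq. (9): Fit = gamma * Err + (1 - gamma) * |S| / D for a binary feature mask."""
    sel = bits.astype(bool)
    if not sel.any():                        # assumed guard: an empty subset scores worst
        return 1.0
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_tr[:, sel], y_tr)
    err = 1.0 - knn.score(X_te[:, sel], y_te)
    return gamma * err + (1.0 - gamma) * sel.sum() / bits.size
```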

Next, the position of sparrows is updated according to Eqs. (3), (4), and (5), provided that \(\alpha \) and K are replaced with independent random values generated by the given chaotic maps. This strongly supports the search agents of CSSA in more effectively exploring and exploiting each potential region of the search space.

Finally, CSSA terminates based on a predefined termination condition. For optimization problems, there are typically three termination conditions: (i) the maximum number of iterations is reached; (ii) a decent solution is obtained; or (iii) a predetermined time budget elapses. The first condition is used in this study. Overall, CSSA is realized in Algorithm 2. For the sake of simplicity, Fig. 2 depicts its flowchart, as well.

Algorithm 2: The framework of CSSA.

Figure 2

Flowchart of CSSA.

Computational complexity analysis

Feature selection based on wrapper methods evaluates the candidate subsets several times in the process of finding the optimal feature subset, which increases the complexity of the algorithm. Therefore, this section analyzes the overall complexity of CSSA in the worst case.

To facilitate the analysis of CSSA’s time complexity, Algorithm 2 is inspected step by step. In the initialization phase (Line 2), the position of N sparrows is initialized with \(\mathcal{O}\)(N) time complexity. In the main loop phase, the time complexity of binarization (Line 5), solution evaluation (Line 6), and updating positions and redefining variables going outside the bounds (Lines 10–21) is \(\mathcal{O}\)(N), \(\mathcal{O}(N+N \log {N}+1)\), and \(\mathcal{O}\)(2N), respectively. Finally, finding the globally best individual (Line 6) has a time complexity of \(\mathcal{O}(\log {N})\). Thus, the worst-case time complexity of CSSA is \(\mathcal{O}(N)+\mathcal{O}(T((N+N+N \log {N}+1)+2N))+\mathcal{O}(\log {N})=\mathcal{O}(N)+\mathcal{O}(T(4N+N \log {N}+1))+\mathcal{O}(\log {N})=\mathcal{O}(TN \log {N})\). On the other hand, the space complexity of CSSA is determined by the memory it occupies, i.e., \(\mathcal{O}\)(ND).

Experimental results and discussion

Dataset description

In this study, experiments are conducted on the eighteen UCI datasets listed in Table 2, covering different subject areas, including physics, chemistry, biology, and medicine72. Such interdisciplinary datasets make it possible to evaluate the applicability of CSSA across multiple disciplines.

Table 2 Characteristics of eighteen UCI datasets.

Performance metrics

We mainly use four metrics to assess the overall performance of competitors, namely, mean fitness (\(Mean_{Fit}\)), mean accuracy (\(Mean_{Acc}\)), mean number of selected features (\(Mean_{Feat}\)), and mean computational time (\(Mean_{Time}\)) defined as

$$\begin{aligned} Mean_{Fit}= & {} \frac{1}{M}\sum _{k=1}^{M} Fit_*^{k}, \end{aligned}$$
(10)
$$\begin{aligned} Mean_{Acc}= & {} \frac{1}{M} \sum _{k=1}^{M} Acc_*^{k}, \end{aligned}$$
(11)
$$\begin{aligned} Mean_{Feat}= & {} \frac{1}{M}\sum _{k=1}^{M} \frac{\vert S_*^{k}\vert }{D}, \end{aligned}$$
(12)
$$\begin{aligned} Mean_{Time}= & {} \frac{1}{M}\sum _{k=1}^{M} Time_*^{k}, \end{aligned}$$
(13)

where \(M=30\) is the number of independent runs, and \(Fit_*^{k}\), \(Acc_*^{k}\), \(\vert S_*^{k}\vert \), and \(Time_*^{k}\), respectively, denote the fitness, accuracy, selected feature size, and computational time (measured in milliseconds) of the globally best solution obtained at run k.

The smaller the values of \(Mean_{Fit}\), \(Mean_{Feat}\), and \(Mean_{Time}\), the better the performance of CSSA; in contrast, the higher the value of \(Mean_{Acc}\), the better. The optimality of the results is validated by using the hold-out strategy, in which each dataset is randomly divided into two parts, with the training set taking up 80% of the dataset and the test set taking up the remaining 20%73. Due to the stochastic nature of meta-heuristic algorithms, single runs cannot be fully replicated, so the results for each algorithm and dataset are averaged over 30 independent runs and reported as the final values for all metrics. Furthermore, we use W, T, and L to represent, respectively, the number of wins, ties, and losses for CSSA in comparison to its rivals across all datasets. Although this adequately measures the effectiveness of the proposed method, non-parametric statistical tests, such as Wilcoxon’s signed-rank test, Friedman’s rank test, and Nemenyi’s test, are also required to determine CSSA’s statistical significance over its rivals. They are more appropriate and safer than parametric tests since they assume only limited comparability and do not require normal distributions or homogeneity of variance74. The best overall performances are indicated in bold.
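
A minimal sketch of this evaluation protocol, where run_optimizer is a hypothetical wrapper that returns the best feature mask together with its fitness and test accuracy:

```python
import numpy as np
from sklearn.model_selection import train_test_split

def average_metrics(run_optimizer, X, y, M=30, test_size=0.2):
    """Eqs. (10)-(13): average fitness, accuracy, and selection ratio over M runs."""
    fits, accs, ratios = [], [], []
    for k in range(M):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=k)   # fresh 80/20 hold-out split
        best_bits, best_fit, acc = run_optimizer(X_tr, y_tr, X_te, y_te)
        fits.append(best_fit)
        accs.append(acc)
        ratios.append(best_bits.sum() / X.shape[1])
    return np.mean(fits), np.mean(accs), np.mean(ratios)
```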

Comparative analysis

In this section, the \(Mean_{Fit}\) of CSSA variants under the ten chaotic maps listed in Table 1 is first compared and examined, in order to obtain the finest CSSA version. The \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\) are then calculated, and post-hoc statistical analysis is performed on the eighteen UCI datasets and three high-dimensional microarray datasets detailed in Tables 2 and 21, respectively, to see whether CSSA has a competitive advantage over its well-known peers. CSSA is also compared to several state-of-the-art, relevant FS methods from the literature to put the acquired results into context. Furthermore, an ablation study is used to conduct convergence analysis and exploration-exploitation trade-off analysis. The experimental setting has an impact on the final results, and Table 3 summarizes the conditions for all experiments. There are frequently multiple hyper-parameters in meta-heuristic algorithms, and their values affect the performance of the final results to some extent. In this work, all competitors’ algorithm-specific parameter settings match those recommended in their respective papers, with no parameter tuning75. Table 4 only provides the parameters that are shared by all algorithms.

Table 3 General experimental settings.
Table 4 Common parameters for all experiments.

CSSA under different chaotic maps

In this section, the effectiveness of CSSA is investigated under the different chaotic maps reported in Table 1, with an initial point \({\tilde{x}}^{0}=0.7\) for all chaotic maps43,69 and, exceptionally, \({\tilde{x}}^{0}=0.6\) for the Tent map owing to its judgment condition. Thus, the best version of CSSA can be released. K in Eq. (5) takes a random value in the range \([-1,1]\), and among the ten chaotic maps only the Chebyshev and Iterative maps can give a value in this range. So, CSSA is separately experimented with the Chebyshev map substituting for K and the Iterative map substituting for K, and the results are recorded in Tables 5 and 6, respectively. Since the other three improvements, i.e., generating the initial swarm, substituting for \(\alpha \) in Eq. (3), and relocating transgressive sparrows, can all be amended using random values in the range [0, 1], they can be tested with all ten chaotic maps, provided that the Chebyshev and Iterative maps take absolute values. The \(Mean_{Fit}\) in Eq. (10) is taken as the key metric in this experiment to measure the distinction between the different versions of CSSA based on the ten chaotic maps. We further employ W*, T*, and L* to reflect the advantages and disadvantages of CSSA’s twenty variants when compared independently to SSA.

From Tables 5 and 6 combined, when using the Sinusoidal map, for instance, to substitute for \(\alpha \), CSSA with the Chebyshev and Iterative maps replacing K does not perform effectively, with better results than SSA on only 5 and 4 datasets, respectively, indicating that the Sinusoidal map cannot improve SSA’s performance. Furthermore, “W|T|L” shows that the Sinusoidal map has neither wins nor ties on the eighteen datasets when compared to the other maps. The experimental results of CSSA under the other maps are relatively better than SSA on most datasets. Overall, the best results are obtained by the configuration that performs better than SSA on a total of 17 datasets, as shown in Table 5. Thus, since we attempt to maximize the performance of SSA, this study takes the Chebyshev map as a substitute for K and the Circle map for the other three improvements, releasing the best chaotic-map-based CSSA variant.

Table 5 SSA versus CSSA under different chaotic maps in terms of \(Mean_{Fit}\), where the Chebyshev map substitutes for K in SSA.
Table 6 SSA vs. CSSA under different chaotic maps in terms of \(Mean_{Fit}\), where the Iterative map substitutes for K in SSA.

Contribution of chaos to SSA’s overall performance

Table 7 compares the proposed CSSA with SSA based on \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\). CSSA gains an outstanding \(Mean_{Fit}\) advantage on a total of 17 datasets, and only underperforms SSA on the WineEW dataset. In terms of \(Mean_{Acc}\), CSSA obtains the highest accuracy on 14 datasets and ties on the other 4. In terms of \(Mean_{Feat}\), CSSA also outperforms SSA on most datasets. As for \(Mean_{Time}\), CSSA requires relatively less computational time on the majority of datasets. On the one hand, this implies that the chosen fitness function is able to integrate the roles of accuracy and selected feature size in classification tasks. On the other hand, it shows that CSSA can balance the exploration and exploitation capabilities, shielding SSA from falling into local optima.

Table 7 Comparison of CSSA and SSA.

Comparison of CSSA and its peers

This section compares CSSA with twelve well-known algorithms, including SSA, ABC, PSO, BA, WOA, GOA, HHO, BSA, ASO, HGSO, LSHADE, and CMAES, in order to determine whether CSSA has a competitive advantage over them. A brief description of compared algorithms is given in Table 8.

Table 8 Summary information about the twelve compared optimization algorithms.

Table 9 compares the \(Mean_{Fit}\) of CSSA with that of its peers. The results show that CSSA obtains the smallest \(Mean_{Fit}\) on 13 datasets and ABC, SSA, and CMAES perform relatively better on the remaining datasets. Thus, \(Mean_{Fit}\) results show that CSSA holds its own merits for most datasets and can perform best in comparison to other rivals by adapting itself to classification tasks.

Table 9 Comparison of CSSA against its peers in terms of \(Mean_{Fit}\).

Table 10 compares CSSA with the other algorithms in terms of \(Mean_{Acc}\). The comparison results illustrate that CSSA obtains the highest \(Mean_{Acc}\) on 9 datasets and ties for the highest on 6 datasets, thus performing outstandingly on a total of 15 datasets, while ABC alone has a higher \(Mean_{Acc}\) than CSSA on only 3 datasets: CongressEW, Exactly2, and Tic-tac-toe. On the other hand, CMAES only performs better than CSSA on Tic-tac-toe. This may be attributed to the complex nature of the data in these datasets.

Table 10 Comparison of CSSA against its peers in terms of \(Mean_{Acc}\).

Table 11 compares CSSA with its peers in terms of \(Mean_{Feat}\). CSSA has the lowest number of selected features on 9 datasets, while the other 12 algorithms combined win on only 9 datasets. Notably, ABC is second to CSSA in terms of \(Mean_{Fit}\) and \(Mean_{Acc}\), but has no advantage in terms of \(Mean_{Feat}\).

Table 11 Comparison of CSSA against its peers in terms of \(Mean_{Feat}\).

Table 12 compares the \(Mean_{Time}\) of CSSA with the other algorithms. LSHADE has the lowest \(Mean_{Time}\) among all algorithms, but performs poorly in other respects, such as \(Mean_{Fit}\), \(Mean_{Acc}\), and \(Mean_{Feat}\). While ABC performs slightly better on these metrics, it has the longest run time, reaching almost three times the duration of CSSA. In addition, although the \(Mean_{Time}\) of CSSA is in the middle of the range of all the algorithms compared, it has a lower time cost than standard SSA, as shown in Table 7. This shows that CSSA significantly improves the performance of SSA without increasing, and even while decreasing, the time cost of the algorithm. This is another aspect that demonstrates the advantage of CSSA over the standard one.

Table 12 Results of CSSA compared to its peers in terms of \(Mean_{Time}\).

Furthermore, Figs. 3 and 4 illustrate the stability of CSSA in terms of \(Mean_{Acc}\) and \(Mean_{Feat}\) by means of boxplots. As can be seen from Fig. 3, CSSA obtains higher boxplots on all datasets except Exactly2. Moreover, CSSA has smaller box sizes on all datasets except PenglungEW, SonarEW, and SpectEW, indicating that CSSA is more stable in terms of \(Mean_{Acc}\) compared to its peers. Figure 4 also shows that CSSA is able to achieve a lower \(Mean_{Feat}\) on most datasets, guaranteeing a smaller boxplot size. Figure 5 shows the \(Mean_{Acc}\) and \(Mean_{Feat}\) of all competitors; CSSA achieves the highest \(Mean_{Acc}\) accompanied by the lowest \(Mean_{Feat}\).

Figure 3

Boxplot of \(Mean_{Acc}\).

Figure 4

Boxplot of \(Mean_{Feat}\).

Figure 5

Bar chart of \(Mean_{Acc}\) and \(Mean_{Feat}\).

Convergence curves of all competitors

The aforementioned experimental results effectively describe the subtle differences among the competing algorithms, but the behavior of each algorithm over the whole run must also be examined. The convergence behavior of all competitors is therefore further analyzed. Figure 6 visually compares the \(Mean_{Fit}\) trace of all competitors on the eighteen datasets, where all results are the mean of 30 independent runs per iteration. It is clear that CSSA is more effective than SSA on almost all datasets, exhibiting more accelerated convergence than its peers. For most datasets, the convergence trace of CSSA lies at the bottom of those of all the other algorithms, indicating that CSSA holds a competitive advantage among its rivals in terms of rapid convergence while jumping out of local optima. This may be due to the distinctive characteristics (especially ergodicity) of chaotic maps, which help cover the whole search space more conveniently. Thus, CSSA achieves better exploratory and exploitative behavior than its peers.

Figure 6

Convergence curves of CSSA and its peers.

Statistical test and analysis

Although it is evident from the previous analysis that CSSA has significant advantages over its peers, further statistical tests of the experimental results are required to rigorously establish stability and reliability. In this study, we analyze whether CSSA has a statistically significant advantage over its peers based on the p-value of Wilcoxon’s signed-rank test at a 5% significance level76. When p<0.05, CSSA has a significant advantage over the compared peer; otherwise, the two competitors are of comparable effectiveness.

Table 13 shows the results of Wilcoxon’s signed-rank test for CSSA over the other competitors in terms of \(Mean_{Fit}\), where “+” represents the number of datasets on which CSSA has a significant advantage over a peer, “\(\approx \)” indicates that CSSA is comparable to the corresponding competing algorithm, and “−” represents the number of datasets on which CSSA works worse than the algorithm it is being compared against. From Table 13, it is clear that CSSA has outstanding advantages over PSO, BA, HHO, and ASO on all eighteen datasets, and over the remaining algorithms (SSA, HGSO, LSHADE, CMAES, GOA, BSA, WOA, and ABC) on between 7 and 17 datasets each. Thus, CSSA significantly outperforms its peers on most datasets.
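
A minimal sketch of this test with SciPy, applied here to two synthetic 30-run fitness samples:

```python
import numpy as np
from scipy.stats import wilcoxon

def is_significant(fit_cssa, fit_rival, alpha=0.05):
    """Paired Wilcoxon signed-rank test over the per-run fitness values."""
    _, p = wilcoxon(fit_cssa, fit_rival)
    return p, p < alpha

# synthetic example: two 30-run fitness samples on one dataset
rng = np.random.default_rng(1)
p, sig = is_significant(rng.normal(0.10, 0.01, 30), rng.normal(0.12, 0.01, 30))
```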

Table 13 p-values of Wilcoxon’s signed-rank test on CSSA vs. its peers in terms of \(Mean_{Fit}\).

In addition, we further measure the statistical significance of CSSA relative to the other algorithms in terms of \(Mean_{Fit}\) by Friedman’s rank test77. Taking a significance level \(\alpha =0.05\), Friedman’s statistic is

$$\begin{aligned} \chi _{F}^{2}=\frac{12 N_{D}}{N_{A}(N_{A}+1)}\left( \sum _{k=1}^{N_{A}} R_{k}^{2}-\frac{N_{A}(N_{A}+1)^{2}}{4}\right) , \end{aligned}$$
(14)

which is undesirably conservative, and a better statistic is therefore derived as78

$$\begin{aligned} F_{F}=\frac{(N_{D}-1) \chi _{F}^{2}}{N_{D}(N_{A}-1)-\chi _{F}^{2}}, \end{aligned}$$
(15)

where \(N_{D}\) is the number of datasets, \(N_{A}\) is the number of compared algorithms, and \(R_{k}\) is the average ranking of algorithm k. Thus, we have \(N_{D} = 18\) and \(N_{A} = 13\), with \(R_{k}\) calculated from Tables 9, 10, 11, and 12. Table 14 shows \(R_{k}\), \(\chi _{F}^{2}\), and \(F_{F}\) for all algorithms under our four evaluation metrics. \(F_{F}\) obeys the F-distribution with degrees of freedom \(N_{A}-1\) and \((N_{A}-1)(N_{D}-1)\). The critical value is \(F(12,204)=1.80\), and since all \(F_{F}\) values are greater than it, there is a significant difference among the algorithms in favor of CSSA.

Table 14 Results of Friedman’s rank test on CSSA vs. its peers.

Friedman’s rank test alone is usually unable to determine which algorithms differ from each other, so Nemenyi’s test is also conducted74. This test compares the difference between the average rankings of each pair of algorithms with a critical difference CD. If the difference is greater than CD, the algorithm with the lower (better) average ranking is statistically superior; otherwise, there is no statistical difference between the two algorithms. CD is calculated as

$$\begin{aligned} CD = q_{\alpha }\sqrt{\frac{N_{A}(N_{A}+1)}{6N_{D}}}, \end{aligned}$$
(16)

where \(q_{\alpha }\) equals 3.31, given that \(N_{A}=13\) and the significance level \(\alpha =0.05\). Thus, \(CD = 4.30\), and the difference between two algorithms is significant when the difference between their average rankings is greater than this value.
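
A minimal sketch of Eqs. (14)–(16); plugging in this study’s setting reproduces CD = 4.30:

```python
import numpy as np

def friedman_nemenyi(avg_ranks, n_datasets, q_alpha=3.31):
    """Eqs. (14)-(16): chi^2_F, the F_F refinement, and Nemenyi's CD.

    avg_ranks : average rank R_k of each algorithm over the datasets
    q_alpha   : Studentized-range constant (3.31 for 13 algorithms at alpha = 0.05)
    """
    R = np.asarray(avg_ranks, dtype=float)
    N_D, N_A = n_datasets, R.size
    chi2_f = 12.0 * N_D / (N_A * (N_A + 1)) * (np.sum(R ** 2) - N_A * (N_A + 1) ** 2 / 4.0)
    f_f = (N_D - 1) * chi2_f / (N_D * (N_A - 1) - chi2_f)
    cd = q_alpha * np.sqrt(N_A * (N_A + 1) / (6.0 * N_D))
    return chi2_f, f_f, cd

# this study's setting: N_A = 13 algorithms over N_D = 18 datasets gives CD = 4.30
```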

Figure 7 shows the CD results for all competitors. The dots indicate the average rankings of the algorithms, and the horizontal line segment through each dot spans the critical difference. A significant difference between two algorithms is indicated by the absence of an intersection between their horizontal line segments. As shown, CSSA performs best in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), and \(Mean_{Feat}\), but less well in terms of \(Mean_{Time}\). CSSA intersects only SSA, ABC, and WOA in terms of \(Mean_{Fit}\), and only SSA, WOA, and HHO in terms of \(Mean_{Feat}\), indicating that CSSA is significantly different from most compared algorithms on these two metrics. On the other hand, Fig. 7b shows that CSSA is significantly different from PSO, BA, HHO, ASO, HGSO, and LSHADE in terms of \(Mean_{Acc}\), and Fig. 7d shows that CSSA has no significant advantage in \(Mean_{Time}\), where LSHADE instead has a significant advantage. Furthermore, there is a difference between CSSA and SSA, though it is not significant. Overall, since \(Mean_{Fit}\) synthesizes the ability of an algorithm to handle FS problems better than the other evaluation metrics, Wilcoxon’s signed-rank test, Friedman’s rank test, and Nemenyi’s test together show that CSSA performs significantly better than its peers.

Figure 7

Nemenyi’s test on CSSA against its peers in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\).

Merits of CSSA’s main components via an ablation study

In this experiment, five representative continuous benchmark functions are picked from the CEC benchmark suite to investigate the impact of the different improvements embedded into CSSA in terms of swarm diversity and convergence trace. Their characteristics and mathematical definitions are reported in Table 15.

Table 15 Five representative CEC benchmark functions with diverse characteristics.

Since CSSA is specifically proposed for FS problems, its search space is restricted to [0, 1] due to the chaotic maps. However, in order to fully demonstrate the advantages of its main components, CSSA should be tested in different search spaces on diverse benchmark functions. Therefore, we further analyze CSSA in comparison to CSSA without the chaotic initial swarm (NINICSSA), CSSA without the chaotic random parameters (NPARCSSA), and CSSA without the chaotic update of transgressive positions (NPOSCSSA). The parameter settings in this experiment for all algorithms are: a maximum of 100 iterations, a swarm size of 30, and \(D=50\) for the Rosenbrock, Ackley, and Rastrigin functions. All results are recorded as the mean of 30 independent runs.

Tables 16, 17, and 18 present the experimental results of CSSA against NINICSSA, NPARCSSA, and NPOSCSSA on the eighteen UCI datasets, respectively. In general, CSSA outperforms the other versions of CSSA in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), and \(Mean_{Feat}\); it is also clear that CSSA has a significant advantage over NPOSCSSA, winning 16, 11, and 15 times in \(Mean_{Fit}\), \(Mean_{Acc}\), and \(Mean_{Feat}\), respectively. On the other hand, in terms of \(Mean_{Time}\), CSSA has lower computational overhead than NINICSSA, NPARCSSA, and NPOSCSSA, owing to the fact that chaotic maps can generate random sequences simply and efficiently. In short, the three improvements proposed in this study are indispensable for boosting the overall performance of CSSA, and redefining transgressive positions with a chaotic map is especially important.

Table 16 Comparison of CSSA and NINICSSA in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\).
Table 17 Comparison of CSSA and NPARCSSA in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\).
Table 18 Comparison of CSSA and NPOSCSSA in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\).

Furthermore, we study the exploration merits added to CSSA by its main components. We therefore take the average distance of all sparrows from the swarm center as a measure of swarm diversity79 as

$$\begin{aligned} {\mathscr {D}}=\frac{1}{N} \sum _{i=1}^{N} \sqrt{\sum _{j=1}^{D}\left( x_{i,j}-\dot{{\textbf{x}}}_{j}\right) ^{2}}, \end{aligned}$$
(17)

where \(\dot{{\textbf{x}}}_{j}\) is the value at the j-th dimension of the swarm center \(\dot{{\textbf{x}}}\). A larger \({\mathscr {D}}\) indicates a greater dispersion of individuals in the swarm and thus higher swarm diversity; conversely, a smaller \({\mathscr {D}}\) indicates lower swarm diversity.
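
A minimal sketch of Eq. (17):

```python
import numpy as np

def swarm_diversity(X):
    """Eq. (17): mean Euclidean distance of all sparrows from the swarm centre."""
    centre = X.mean(axis=0)                 # dimension-wise mean position
    return np.linalg.norm(X - centre, axis=1).mean()
```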

Consequently, Fig. 8 compares CSSA with its ablated variants in terms of swarm diversity. As an algorithm gradually converges, individuals reach a similar state, and swarm diversity converges to a minimum as the iterations proceed79. It is obvious from Fig. 8 that SSA and NINICSSA always maintain the same swarm diversity on the Shekel function, indicating that the swarm does not evolve and falls into a local optimum, while the other CSSA variants with the chaotic initial swarm gradually converge, showing that initializing the swarm by a chaotic map helps the algorithm jump out of local optima. The diversity curves of the remaining functions show that the diversity of NPOSCSSA remains basically the same as that of SSA; the swarm diversity of NPOSCSSA and SSA stays high due to the presence of transgressive individuals. However, NPOSCSSA still has its own advantages over SSA. For example, NPOSCSSA converges normally on the Shekel function, indicating that, although this version makes no chaotic updates to transgressive sparrows, it can still utilize the chaotic maps in the initial swarm and random parameters to escape from local optima. On the other hand, the swarm diversity of NPARCSSA converges smoothly to the minimum point, similarly to SSA. It is possible that, as with SSA on the Shekel function, a similar stagnation occurs when NPARCSSA deals with more complex functions; its deficiencies simply do not surface on the limited set of functions tested, because NPARCSSA retains the chaotic initial swarm and chaotic position updates. In contrast, there is a clear trend in swarm diversity for CSSA when the initial swarm, transgressive positions, and random parameters are all amended by chaotic maps. In summary, each single improvement embedded into CSSA has its own merit and is indispensable for swarm diversity and for avoiding falling into local optima.

Figure 8

Swarm diversity curves.

Figure 9 shows that CSSA starts with high exploration and low exploitation, allowing it to initially explore the solution space comprehensively; as the iterations increase, the exploration ability of the algorithm gradually diminishes while the exploitation ability increases, allowing it to converge to the global optimal solution more quickly. As can be seen, on all five benchmark functions, the exploration capability of all algorithms except CSSA decreases sharply in the initial phase while the exploitation capability increases sharply. On the contrary, CSSA is able to maintain a decent trade-off by preserving high exploration capability in the initial stage and high exploitation capability later, enabling the algorithm to explore the solution space more fully and search feasible regions to find the global optimal solution.

Figure 9

Exploration-exploitation trade-off curves.

Overall, Figs. 8 and 9 show that: (i) NPOSCSSA has similar performance to SSA but has the ability to avoid local optima, as shown in the test results of the Ackley and Shekel functions; (ii) NINICSSA has a risk of premature convergence, although its convergence trend fluctuates; (iii) NPARCSSA has a smooth convergence trend like SSA, which risks the algorithm falling into a local optimum when dealing with more complex problems; and (iv) CSSA retains the above advantages while avoiding the shortcomings, allowing the algorithm to show the best results in terms of swarm diversity and the balance between exploration and exploitation capabilities.

CSSA vs. other state-of-the-art optimizers in the literature

Table 19 compares CSSA with other algorithms in the literature, including hybrid evolutionary population dynamics and GOA (BGOA-EPD-Tour)80, hybrid gravitational search algorithm (HGSA)81, improved HHO (IHHO)82, a self-adaptive quantum equilibrium optimizer with ABC (SQEOABC)83, binary coyote optimization algorithm (BCOA)84, chaotic binary group search optimizer (CGSO5)85, and chaos embed marine predator algorithm (CMPA)86.

In order to verify whether CSSA has a competitive advantage over similar algorithms, two recently proposed chaotic algorithms, i.e., CGSO5 and CMPA, are included among the compared algorithms. From Table 19, the \(Mean_{Acc}\) of CSSA is higher than that of CGSO5 and CMPA on all datasets, except for the CongressEW dataset, where it is inferior to CMPA. In addition, the comparison results with the other, non-chaotic algorithms also show that CSSA has outstanding advantages. In summary, the comparison with FS works from the literature demonstrates the usefulness and superiority of CSSA over several state-of-the-art methods.

Table 19 \(Mean_{Acc}\) of CSSA compared to other optimizers in the literature.

CSSA on high-dimensional microarray datasets: The additional experiment

To verify the scalability and robustness of CSSA in tackling FS problems, we further test three high-dimensional microarray datasets having up to 12000 features, namely, 11_Tumors, Brain_Tumor2, and Leukemia2. They all have high feature dimensionality and low sample sizes, as reported in Table 21. Since high-dimensional data can cause significant time overhead, we use the special experimental settings in Table 20. Tables 22, 23, 24, and 25 show the experimental results in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\), respectively. It is evident that CSSA has outstanding advantages over the other algorithms in terms of \(Mean_{Fit}\) and \(Mean_{Acc}\), but its performance in terms of \(Mean_{Feat}\) is relatively poor, which can be justified by the high \(Mean_{Acc}\) obtained. On the other hand, all algorithms have a huge overhead in terms of \(Mean_{Time}\), which is normally caused by the limitations of wrapper-based methods themselves. This can be improved by combining other methods (e.g., filter-based methods).

Table 20 Special settings for high-dimensional data experiments.
Table 21 High-dimensional microarray datasets.
Table 22 Comparison of CSSA against its peers in terms of \(Mean_{Fit}\) on high-dimensional datasets.
Table 23 Comparison of CSSA against its peers in terms of \(Mean_{Acc}\) on high-dimensional datasets.
Table 24 Comparison of CSSA against its peers in terms of \(Mean_{Feat}\) on high-dimensional datasets.
Table 25 Comparison of CSSA against its peers in terms of \(Mean_{Time}\) on high-dimensional datasets.

Discussion

In order to cope with the issues encountered in standard SSA, such as early loss of swarm diversity and hence easily falling into local optima, this study integrates chaotic maps into SSA to produce CSSA. The effectiveness of CSSA has been demonstrated through many comparative and analytical studies. The main purpose of this section is to give a brief summary of the strengths and weaknesses of CSSA.

CSSA has the following advantages:

  1.

    The improving effect of ten chaotic maps on SSA is researched comprehensively in this work, so the degree of contribution of diverse chaotic maps is examined from a global perspective. The best CSSA determined in this manner avoids the one-sidedness of a single chaotic map and can serve as a reference for subsequent research.

  2.

    CSSA improves the performance of SSA while reducing its computational cost. From Table 7, it can be seen that CSSA significantly improves the performance of the algorithm in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\) without highly increasing the computational cost.

  3.

    Tables 9, 10, 11, and 12 describe in detail the results of CSSA compared with twelve well-known algorithms in terms of \(Mean_{Fit}\), \(Mean_{Acc}\), \(Mean_{Feat}\), and \(Mean_{Time}\). Figures 3, 4, and 5 visualize the classification accuracy and feature reduction rate performance of all competitors. It can be seen that CSSA effectively reduces the \(Mean_{Feat}\) (0.4399) while achieving the highest \(Mean_{Acc}\) (0.9216). In addition, CSSA’s ability to handle truly high-dimensional data has been demonstrated through experiments on three microarray datasets with up to 12000 features.

  4.

    Furthermore, seven recently proposed methods selected from the literature are compared with CSSA, and the comparative study shows that our proposed method not only outperforms other non-chaotic algorithms but also has outstanding advantages among similar chaotic ones.

In addition, CSSA has its own limitations:

  1.

    Table 12 demonstrates that CSSA is not optimal in terms of \(Mean_{Time}\), which may be due to the fact that SSA was originally developed for continuous search spaces. Although the V-shaped function in Eq. (7) allows CSSA to deal with discrete problems, it still essentially evolves via a continuous approach. As a result, to improve overall performance and reduce computational costs, a more efficient SSA variant for discrete problems could be designed.

  2.

    It is vital to note that CSSA cannot successfully minimize the \(Mean_{Feat}\) when dealing with extremely high-dimensional data. Table 24 demonstrates that CSSA picks more than 5000 features (a nearly 50% reduction) on all three datasets, indicating that the algorithm cannot sufficiently reduce the selected feature size, which is not conducive to the analysis and extraction of valuable features. This issue can be overcome by combining filters (which are used to reduce and select high-quality features) and wrappers (which are used to improve the algorithm’s performance). CSSA, on the other hand, achieves superior \(Mean_{Fit}\) and \(Mean_{Acc}\), as seen in Tables 22 and 23, respectively.

Conclusion

In this paper, a new chaotic sparrow search algorithm (CSSA) is proposed and applied to FS problems. While the majority of the literature focuses on the influence of a single chaotic map on an algorithm, ten chaotic maps are investigated comprehensively in this study. Based on our findings, CSSA with the Chebyshev and Circle chaotic maps embedded into it delivers the best outcomes among the evaluated schemes by making a good trade-off between exploration and exploitation. According to the comparative research, CSSA offers a competitive edge in global optimization and in addressing FS problems when compared to twelve state-of-the-art algorithms, including LSHADE and CMAES, and seven recently proposed, relevant approaches from the literature. Furthermore, a post-hoc statistical analysis confirms CSSA’s significance on most UCI datasets and on the high-dimensional microarray datasets, demonstrating that CSSA has an exceptional ability to pick favorable features while achieving high classification accuracy.

However, when dealing with high-dimensional datasets, CSSA’s time cost is not satisfactory compared to its contemporaries, and the feature selection ratio is not sufficiently reduced. To address these concerns, we propose to integrate filters and wrappers in future work, in order to leverage their respective benefits in building a new binary SSA version that is more suitable for high-dimensional FS problems.