Selectively-informed particle swarm optimization

Particle swarm optimization (PSO) is a nature-inspired algorithm that has shown outstanding performance in solving many realistic problems. In the original PSO and most of its variants, all particles are treated equally, overlooking the impact of structural heterogeneity on individual behavior. Here we employ complex networks to represent the population structure of swarms and propose a selectively-informed PSO (SIPSO), in which the particles choose different learning strategies based on their connections: a densely-connected hub particle gets full information from all of its neighbors, while a non-hub particle with few connections follows only its single best-performing neighbor. Extensive numerical experiments on widely-used benchmark functions show that our SIPSO algorithm remarkably outperforms the PSO and its existing variants in success rate, solution quality, and convergence speed. We also explore the evolution process from a microscopic point of view, leading to the discovery of the different roles that the particles play in optimization. The hub particles guide the optimization process in the correct directions, while the non-hub particles maintain the necessary population diversity, resulting in the optimum overall performance of SIPSO. These findings deepen our understanding of swarm intelligence and may shed light on the underlying mechanism of information exchange in natural swarm and flocking behaviors.

Most existing PSO algorithms treat all particles equally, which prompts us to explore the impact of heterogeneous sight ranges: the hub particles (leaders) have a broad view of the population, whereas each non-hub particle (follower) has only a single source of information. The former lets the leaders guide the optimization process well, while the latter allows the followers to move without unnecessary interference. We find that our algorithm, the selectively-informed PSO (SIPSO), which takes the individuals' heterogeneity into account, can balance exploration and exploitation in the optimization process and thus achieves better performance.
In the following we briefly introduce the PSO and its typical variants and then describe our SIPSO algorithm in detail.

GPSO & LPSO.
For a minimization problem with $D$ independent variables and an objective function $f(x)$, the PSO algorithm represents the potential solutions with a flock of particles. Each particle $i$ has a position $x_i = [x_{i1}, x_{i2}, \ldots, x_{iD}]$ and a velocity $v_i = [v_{i1}, v_{i2}, \ldots, v_{iD}]$ in the $D$-dimensional space. The goal is to find a position $x_i$ of any particle $i$ that minimizes the objective function $f(x)$. Initially, the particles' positions and velocities are generated randomly. Then, at each time step (iteration), each particle updates its velocity and position according to the following equations 5:

$$v_i = \eta \left[ v_i + U(0, c_1)\,(p_i - x_i) + U(0, c_2)\,(p_{n,i} - x_i) \right] \tag{1}$$

$$x_i = x_i + v_i \tag{2}$$

Here $p_i$ is the best historical position found by particle $i$, $p_{n,i}$ is the best historical position found by $i$'s neighbors, and $c_1$ and $c_2$ are the acceleration coefficients. $U(a, b)$ is a random number drawn at each iteration from the uniform distribution on $[a, b]$. The coefficients $c_1$ and $c_2$ thus balance the impacts of each particle's own experience and that of its neighbors, and $\eta$ indicates the learning rate. Based on previous extensive analysis 14 we choose the settings $c_1 = c_2 = 2.05$ and $\eta = 0.7298$.

Previous studies 17,18,20-22 have found that the interaction topology of the particles has a great influence on the final optimization results. Two versions of the canonical PSO algorithm with different topologies are most commonly used: the GPSO with a fully connected network (Fig. 1(a)) and the LPSO with a ring (Fig. 1(b)). GPSO converges more rapidly than LPSO, but is more susceptible to being trapped at local optima 17.
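The canonical update can be illustrated with a minimal sketch. It assumes the standard constriction-type rule, in which the new velocity is the learning rate times the sum of the old velocity, a pull toward the particle's own best position weighted by $U(0, c_1)$, and a pull toward the neighborhood best weighted by $U(0, c_2)$. Drawing a fresh random number per dimension is an additional assumption, and the function name `canonical_step` is hypothetical.

```python
import random

C1 = C2 = 2.05   # acceleration coefficients quoted in the text
ETA = 0.7298     # learning rate quoted in the text


def canonical_step(x, v, p_self, p_nbest, rng=random):
    """Return (new_x, new_v) for one particle after one canonical PSO step.

    x, v     -- current position and velocity (lists of length D)
    p_self   -- best historical position of this particle
    p_nbest  -- best historical position found by its neighbors
    """
    new_v = []
    for d in range(len(x)):
        u1 = rng.uniform(0.0, C1)  # assumption: one draw per dimension
        u2 = rng.uniform(0.0, C2)
        new_v.append(ETA * (v[d]
                            + u1 * (p_self[d] - x[d])
                            + u2 * (p_nbest[d] - x[d])))
    # The position update uses the freshly computed velocity.
    new_x = [xd + vd for xd, vd in zip(x, new_v)]
    return new_x, new_v
```

When both best positions coincide with the current position, the two attraction terms vanish and the velocity is simply damped by the learning rate, which is a convenient sanity check for an implementation.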
Figure 1. (a) A fully connected network. Each node connects to all others. (b) A ring network with 20 nodes. Each node links to its nearest two neighbors. (c) A scale-free network with 20 nodes, in which the node size represents the node degree, i.e., the number of edges associated with the node. It shows that most nodes have low degrees, yet there exist a few high-degree nodes (hubs).

Table caption (fragment): … is the best solution found in the $j$-th run, and $x_{opt}$ is the (known) optimum solution for the given function $f$. Therefore, the smaller the value of $F$, the better the performance of the algorithm.

FIPSO. In the canonical PSO each particle is influenced by itself and the best-performing particle in its neighborhood. This "single-informed" strategy may ignore some important information from the remaining neighbors. Mendes et al. hence proposed a "fully-informed" version of PSO (FIPSO) 20,21, in which each particle adjusts its velocity according to the experiences of all its neighbors:

$$v_i = \eta \left[ v_i + \frac{1}{k_i} \sum_{j \in N(i)} U(0, c_1 + c_2)\,(p_j - x_i) \right]$$

where $N(i)$ is the node set of $i$'s neighbors, $k_i$ is the number of $i$'s neighbors (i.e., $k_i$ is $i$'s degree and $k_i = |N(i)|$), $p_j$ is the best historical position found by $j$, and the learning rate $\eta$ and the acceleration coefficients $c_1$ and $c_2$ are those of the canonical PSO. Studies 21,22 have revealed that, with appropriate parameter settings, the FIPSO can outperform the traditional PSO, but it is sensitive to changes in the topology. In some topologies the FIPSO may perform even worse than the canonical PSO.
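A minimal sketch of the fully-informed update follows. It assumes the commonly used form in which every neighbor's best position attracts the particle with a random weight drawn from $U(0, c_1 + c_2)$ and the attractions are averaged over the $k_i$ neighbors. The function name `fully_informed_step` and the per-dimension random draws are assumptions made for illustration.

```python
import random

C1 = C2 = 2.05   # acceleration coefficients quoted in the text
ETA = 0.7298     # learning rate quoted in the text


def fully_informed_step(x, v, neighbor_bests, rng=random):
    """Return (new_x, new_v) for one particle after one fully-informed step.

    neighbor_bests -- list with the best historical position of every neighbor
    """
    k = len(neighbor_bests)
    if k == 0:
        raise ValueError("a fully-informed particle needs at least one neighbor")
    new_v = []
    for d in range(len(x)):
        # Average the randomly weighted attractions toward all neighbors.
        pull = sum(rng.uniform(0.0, C1 + C2) * (p[d] - x[d])
                   for p in neighbor_bests) / k
        new_v.append(ETA * (v[d] + pull))
    new_x = [xd + vd for xd, vd in zip(x, new_v)]
    return new_x, new_v
```

Because every neighbor contributes, the result depends strongly on which particles are connected, which is in line with the topology sensitivity of FIPSO noted above.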

SFPSO & SFIPSO.
Recently, many natural and man-made networks have been found to exhibit the scale-free property, i.e., a power-law degree distribution 26,27. Examples include neural networks 28, citation networks 29, the World Wide Web 30, the Internet 31, software engineering 32, and on-line social networks 33. In scale-free networks, only a few nodes are densely connected hubs and most nodes are low-degree non-hub nodes, resulting in high heterogeneity of node degrees (Fig. 1c). This discovery has triggered interest in studying the impacts of underlying network structures on dynamical processes 34-40 and in introducing scale-free topologies into evolutionary optimization algorithms 19,41-43. In particular, Liu et al. investigated the influence of a scale-free population structure on the performance of PSO 19. Their results indicated that the scale-free PSO (SFPSO) outperforms the traditional GPSO and LPSO. In the following we also compare our algorithm to the fully-informed versions of SFPSO and GPSO (called SFIPSO and GFIPSO hereafter, respectively).
SLPSO. In most traditional PSO algorithms, a single learning mode is used for all particles, which may restrict the ability of a particular particle to deal with different situations. Li et al. proposed the self-learning PSO (SLPSO), which enables the particles to switch between four modes: exploitation, exploration, jumping out, and convergence 25. Each mode has a set of operations to update the particles' velocity and position. A common strategy was introduced to allow each particle to adaptively choose the most suitable mode, depending on the evolutionary stage and the local fitness landscape. Experimental comparisons showed that SLPSO outperforms several peer algorithms in terms of mean value, success rate and overall ranking, especially for some complex high-dimensional functions. Yet three key parameters of SLPSO need to be chosen very carefully through a parameter tuning approach, as these parameters significantly affect the algorithm's performance. Note that in SLPSO, although each particle is able to switch between different modes, the learning strategy for choosing suitable modes is identical for all particles.
Selectively-informed PSO. The algorithms described above assume that all particles are single-informed or fully-informed, or adopt the same strategy for switching between different modes, overlooking the heterogeneity of individuals. Here we propose the selectively-informed PSO (SIPSO) algorithm, which takes into consideration the heterogeneity of individuals' learning strategies. The population structure of our SIPSO is represented by a scale-free network (see Methods), and the learning strategy of each particle depends on its degree:

$$v_i = \begin{cases} \eta \left[ v_i + \dfrac{1}{k_i} \sum_{j \in N(i)} U(0, c_1 + c_2)\,(p_j - x_i) \right], & k_i > k_c \\[2ex] \eta \left[ v_i + U(0, c_1)\,(p_i - x_i) + U(0, c_2)\,(p_{n,i} - x_i) \right], & k_i \le k_c \end{cases}$$

where $k_i$ is the degree of particle $i$, $k_c$ is the threshold that determines whether a particle is fully-informed or single-informed, and the remaining symbols are those of the fully-informed and canonical updates. The densely-connected hubs ($k > k_c$) are provided with more information to better lead the optimization process. The non-hub particles ($k \le k_c$) are less affected, so they can move in the search space with more freedom, maintaining the diversity of the population. Note that when $k_c = k_{min} - 1$, all the particles are fully-informed and the algorithm reduces to SFIPSO; when $k_c = k_{max}$, all the particles take the canonical learning strategy, turning the algorithm into SFPSO. Here we are interested in the information selectivity, i.e., $k_{min} - 1 < k_c < k_{max}$. For example, in Fig. 1c, when $k_c = 5$ the grey nodes (particles) with degree higher than 5 are fully-informed and the remaining red nodes are single-informed.
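The degree-dependent choice of learning strategy can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: it assumes the standard fully-informed and canonical velocity rules, takes the "best-performing neighbor" to be the neighbor with the lowest personal-best objective value (minimization), and uses a hypothetical function name and data layout (`neighbors[i]` is the list of particles linked to particle `i`).

```python
import random

C1 = C2 = 2.05   # acceleration coefficients quoted in the text
ETA = 0.7298     # learning rate quoted in the text


def sipso_velocity(i, x, v, pbest, pbest_val, neighbors, k_c, rng=random):
    """Return the new velocity of particle i under selective informing.

    x, v       -- positions and velocities of all particles
    pbest      -- best historical position of every particle
    pbest_val  -- objective value at each pbest (lower is better)
    neighbors  -- neighbors[i] lists the particles linked to i
    k_c        -- degree threshold separating hubs from non-hubs
    """
    nbrs = neighbors[i]
    dims = range(len(x[i]))
    if len(nbrs) > k_c:
        # Hub: fully informed, averages over all neighbors.
        return [ETA * (v[i][d]
                       + sum(rng.uniform(0.0, C1 + C2) * (pbest[j][d] - x[i][d])
                             for j in nbrs) / len(nbrs))
                for d in dims]
    # Non-hub: follows only its single best-performing neighbor.
    best = min(nbrs, key=lambda j: pbest_val[j])
    return [ETA * (v[i][d]
                   + rng.uniform(0.0, C1) * (pbest[i][d] - x[i][d])
                   + rng.uniform(0.0, C2) * (pbest[best][d] - x[i][d]))
            for d in dims]
```

Setting `k_c` below the smallest degree sends every particle through the hub branch, and setting it to the largest degree sends every particle through the non-hub branch, which mirrors the two limiting cases SFIPSO and SFPSO described above.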

Results
Overall performance. We test the performance of our algorithm on eight widely-used benchmark functions $f_1$ to $f_8$ (see Methods) and compare it to seven other algorithms on three criteria: success rate, solution quality, and convergence speed (see Methods). Note that in SIPSO the optimal value of the degree threshold $k_c$ varies across test functions. We also show the results for a fixed threshold ($k_c^{fix} = 5$) over all the functions. Table 1 lists the comparison of success rates. Our algorithm SIPSO shows significant advantages, achieving 99% on $f_8$ and 100% on all the other functions. Even with the fixed threshold $k_c^{fix} = 5$, SIPSO achieves very satisfactory success rates. Table 2 lists the results in terms of solution quality. For each function, the best solutions are highlighted in bold, and "-" means that the corresponding algorithm fails to reach the acceptable solution even once. For functions $f_2$ to $f_4$ our SIPSO remarkably outperforms the other algorithms; for $f_1$, $f_5$, $f_6$ and $f_8$ SIPSO ranks 2nd among all the algorithms, while for $f_7$ it ranks 3rd. When the degree threshold is fixed at $k_c^{fix} = 5$, the solution quality still ranks in the top 3 of all the algorithms over the eight test functions. Table 3 shows the convergence speed of each algorithm, represented by the number of steps required to reach the goal value. Thus the smaller the number of required steps, the higher the convergence speed. The best cases are marked in bold. Our SIPSO has a relatively fast convergence speed on all the functions, ranking 2nd on $f_1$, $f_2$, $f_3$, $f_6$ and $f_8$, 3rd on $f_4$ and $f_7$, and 4th on $f_5$. SFIPSO has the fastest convergence speed on $f_1$, $f_2$, $f_3$, $f_6$ and $f_8$, and GFIPSO converges fastest on $f_4$, $f_5$ and $f_7$. It is worth noting that faster convergence does not necessarily mean a better optimization trial. In fact, overly fast convergence may lead to prematureness, i.e., being trapped at local optima. For example, as shown in Table 2, the solution qualities of SFIPSO and GFIPSO are poor for most benchmark functions, although their convergence is very fast. In the fully-informed algorithms, each particle's information can be quickly transferred to all other individuals in the swarm, so the algorithms converge rapidly, resulting in prematureness. In contrast, in our SIPSO only the hub particles are fully-informed, and many non-hub particles take the single-informed learning strategy to maintain the population diversity. Consequently, our SIPSO can achieve better performance with a satisfactory convergence speed.
The impact of $k_c$. As described above, for each function there is an optimal value of the threshold $k_c$ with which our algorithm SIPSO performs best. Hence we investigate the impact of $k_c$ on the performance for all eight benchmark functions. The results for solution quality, success rate and convergence speed are shown in Figs. 2 and 3. For the solution quality, SFPSO (the rightmost data point) outperforms SFIPSO (the leftmost data point) on all functions except $f_5$ and $f_7$, for which the order is reversed. However, on all the functions except $f_7$, neither SFIPSO nor SFPSO is able to obtain the best result. With $k_c$ between $k_{min}$ and $k_{max}$ our algorithm SIPSO achieves the best performance (Fig. 2). Similar results for the success rate are shown in Fig. 3(a). Our SIPSO has a high success rate on all functions with an appropriate $k_c$. As shown in Fig. 3(b), increasing the number of fully-informed particles can significantly improve the convergence speed, and our SIPSO has a moderate speed of convergence.
The microscopic point of view. To uncover the underlying mechanism of our algorithm, we explore the optimization process from a microscopic point of view. We compare our SIPSO ($k_{min} - 1 < k_c < k_{max}$) to SFIPSO ($k_c = k_{min} - 1$) and SFPSO ($k_c = k_{max}$), all of which run on scale-free networks, so that the influence of other factors is excluded. For the sake of simplicity, in the following we present the results for the function $f_1$. The results for the other functions are alike and not shown here.

First, we examine the mean fitness ($F_{mean}$) of the swarm population during an optimization process. The mean fitness is defined over all particles, where $N$ is the total number of particles, $x_i$ is the position of particle $i$, and $x_{opt} = 1$ is the optimum solution of $f_1$. As shown in Fig. 4(a), SFIPSO has the fastest convergence, as each particle uses full information from all of its neighbors, but it is trapped at some local optima in the early stage (< 150 iterations). Despite their relatively slow convergence, SIPSO and SFPSO are able to achieve higher-quality final solutions, and SIPSO is the best in terms of mean fitness.

Second, we compare the population diversity of SFPSO, SFIPSO and SIPSO, which indicates the extent of exploration during the searching process of the swarm. The population diversity $\sigma$ is defined relative to the swarm center, where $N$ is the total number of particles and $\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i$ is the mean position (center) of the swarm. Thus, the larger $\sigma$ is, the more diverse the swarm. A very small $\sigma$ means that all particles are aggregated together, diminishing the capability of exploration. As shown in Fig. 4(b), the diversity of SFIPSO decreases quickly to a very small value due to the information redundancy of the fully-informed learning. Consequently, SFIPSO is not able to escape once it gets stuck at a local optimum. Both SFPSO and SIPSO maintain a high level of diversity during the optimization, which ensures a thorough search of the parameter space and thus improves the probability of finding the global optimum.
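The two swarm-level quantities can be monitored with a short sketch. The exact formulas are not reproduced in the text above, so both definitions here are assumptions: the mean fitness is taken as the average gap between each particle's objective value and the optimum value, and the diversity as the root-mean-square distance of the particles from the swarm center. The function names are hypothetical.

```python
import math


def mean_fitness(positions, f, x_opt):
    """Average gap |f(x_i) - f(x_opt)| over all particles (assumed definition)."""
    f_opt = f(x_opt)
    return sum(abs(f(x) - f_opt) for x in positions) / len(positions)


def diversity(positions):
    """Root-mean-square distance to the swarm center (assumed definition)."""
    n = len(positions)
    dim = len(positions[0])
    # Mean position (center) of the swarm, computed per dimension.
    center = [sum(x[d] for x in positions) / n for d in range(dim)]
    squared = sum(sum((x[d] - center[d]) ** 2 for d in range(dim))
                  for x in positions)
    return math.sqrt(squared / n)
```

Under these assumed definitions, a swarm whose particles all sit at the same point has zero diversity, which matches the statement that a very small diversity means the particles are aggregated together.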
Furthermore, we investigate the fitness of particles with different degrees, averaged over the particles with degree $k$; the number of particles with degree $k$ is counted using $\delta(k_i, k) = 1$ if $k_i = k$, and 0 otherwise. The particles in SFPSO have only one information source, which is very unstable during the optimization process, so the fluctuations of the particles' fitness in SFPSO are violent (Fig. 5(a)). In SFIPSO, all particles are fully-informed, making the algorithm converge fast but prematurely (Fig. 5(b)). Our SIPSO combines the advantages of the two algorithms. The fitness of the hub particles decreases monotonically, indicating that the hubs play the role of guiding the swarm. In contrast, the non-hub particles have oscillating fitness, maintaining the necessary diversity of the swarm (Fig. 5(c)). The two different roles of the particles in SIPSO result in an appropriate trade-off between the convergence speed and the population diversity.

Discussion
Taking into account the heterogeneity of individuals' behaviors in flocking, we propose the Selectively-Informed Particle Swarm Optimization (SIPSO) algorithm. In SIPSO, the particles interact with their neighbors and change their searching direction and speed by learning from their own experiences and those of their neighbors. Each particle's learning strategy depends on its degree: the hubs are able to learn from all of their neighbors (fully-informed), while each non-hub particle learns from its single best-performing neighbor. Consequently, the hubs have a bird's-eye view of the swarm and can better lead the population; the non-hub particles are less influenced and can thus search the space with high freedom, maintaining the diversity of the population. We test the performance of our SIPSO on eight benchmark functions. The results show that SIPSO has a high success rate, high solution quality, and acceptable convergence speed. We examine the optimization process from a microscopic point of view and reveal that there are indeed two different roles that the particles play in SIPSO. Moreover, our algorithm is able to balance the population diversity and the convergence speed during optimization, improving the overall performance in comparison with seven other algorithms.
It is worth noting that we do not introduce adaptation into our SIPSO algorithm, i.e., all parameters including $k_c$ are set initially and do not change during the optimization process. Instead, we discriminate between nodes with different degrees, in contrast to SLPSO, which adopts adaptive strategies in search of the optimum. Despite the lack of adaptation, our SIPSO works very well on the benchmark test functions. This finding reveals the importance of considering the individuals' heterogeneity in particle swarm optimization. Nevertheless, as shown in previous works (e.g., refs. 24, 25), adaptation can improve PSO's performance. It is reasonable to expect that adaptively tuning the value of $k_c$ during the searching process could improve SIPSO's performance, which deserves future study.

Methods
Benchmark functions. To test the effectiveness of our algorithm comprehensively, we designed extensive experiments. We choose eight benchmark functions (Table 4) that have been widely used 17,18,20,21,44. Functions $f_1$ to $f_4$ are unimodal and relatively easy to solve. Functions $f_5$ to $f_8$ are multi-modal with a large number of local optima, so that an algorithm is prone to premature convergence. Functions $f_6$ and $f_7$ are the same Griewank function with different dimensions; $f_7$ is considered the more difficult of the two 18. In Table 4, column 2 shows the formula of the fitness function, column 3 shows the dimension $D$ of the problem, column 4 gives the range that the variables can take, column 5 presents the optimum values of the problems, and column 6 defines the goal value used to judge whether a run (trial) is successful.
Parameter settings. The parameters of the experiments are set as follows. The population size is 50. For each algorithm and each benchmark function, the experiment consists of 100 independent runs. The maximum number of iterations is 5000. For SFPSO, SFIPSO and SIPSO, the scale-free network has maximal degree 14 and minimal degree 2. We generate the scale-free networks with the Barabási-Albert model 46, which has two main mechanisms: growth and preferential attachment. Starting with $m_0$ fully-connected nodes, at each time step we add a new node to the network and connect it to $m$ existing nodes ($m < m_0$). The probability $P_i$ that the new node is connected to an existing node $i$ depends on $i$'s degree: $P_i = k_i / \sum_j k_j$, where $j$ runs over all the existing nodes. Here we set the parameters $m_0 = 4$ and $m = 2$.
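The growth and preferential-attachment procedure can be sketched with the standard library alone. This is a minimal illustration, not the authors' generator: the function name `barabasi_albert` is hypothetical, and duplicate targets are avoided by redrawing, which is one common convention and slightly perturbs the attachment probabilities within a single step.

```python
import random


def barabasi_albert(n, m0=4, m=2, rng=random):
    """Return {node: set(neighbors)} for a Barabási-Albert network of n nodes."""
    if not (1 <= m < m0 <= n):
        raise ValueError("require 1 <= m < m0 <= n")
    # Seed: m0 fully-connected nodes.
    adj = {i: set(range(m0)) - {i} for i in range(m0)}
    # Each node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` selects node i with probability k_i / sum_j k_j.
    stubs = [i for i in range(m0) for _ in adj[i]]
    for new in range(m0, n):
        targets = set()
        while len(targets) < m:          # redraw until m distinct targets
            targets.add(rng.choice(stubs))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            stubs.extend((new, t))       # both endpoints gain one degree
    return adj
```

With $m_0 = 4$ and $m = 2$, every node added during growth starts with degree 2, which matches the stated minimal degree. The maximal degree varies between realizations, so a network with maximal degree exactly 14 would have to be selected or generated repeatedly.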
Criteria. To compare the performance of different algorithms we use three criteria: solution quality, convergence speed, and success rate. The solution quality is the final fitness value at the end of 5000 iterations. The convergence speed is represented by the number of iterations required to reach the goal. Obviously, the larger the number of required iterations, the lower the convergence speed. The success rate is the fraction of successful runs. Both the solution quality and the convergence speed are average values over the successful runs.
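The three criteria can be computed from per-run records as in the sketch below. The record layout is an assumption made for illustration: each run is a pair of the final fitness and the iteration at which the goal value was first reached, with `None` marking a run that never reached the goal. The function name `summarize_runs` is hypothetical.

```python
from statistics import mean


def summarize_runs(runs):
    """Return (success_rate, solution_quality, convergence_steps).

    runs -- list of (final_fitness, steps_to_goal) pairs; steps_to_goal is
            None for a run that never reached the goal value.
    Quality and steps are averaged over successful runs only and are None
    when no run succeeded.
    """
    successful = [(fit, steps) for fit, steps in runs if steps is not None]
    success_rate = len(successful) / len(runs)
    if not successful:
        return success_rate, None, None
    quality = mean(fit for fit, _ in successful)
    steps = mean(s for _, s in successful)
    return success_rate, quality, steps
```

Averaging only over successful runs means that an algorithm with a low success rate can still report a good solution quality, so the three criteria are best read together.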