Introduction

The vehicle routing problem (VRP) with limited capacity serves as a complex extension of the classic traveling salesman problem (TSP). In this context, the objective is to outline k routes, optimizing for minimal cost or distance, to cater to clients, each with their predetermined locations and demands. It’s crucial that each vehicle starts and concludes its journey at a specified point, all while adhering to particular constraints. Numerous methodologies have been proposed to tackle the VRP challenge. These include linear programming, the ant lion optimizer (ALO), particle swarm optimization (PSO), modified hybrid particle swarm optimization (MHPSO), double population genetic algorithm (DPGA), whale optimization algorithm (WOA), grey wolf optimizer (GWO), genetic algorithm (GA), and the dragonfly algorithm (DA).

In the realm of transportation and logistics, the VRP stands as a paradigmatic NP-hard challenge. Despite being the subject of extensive academic investigation, characterizing the VRP remains elusive due to its intricate array of constraints and stipulations. These include factors like Chronological Span, Length, Collection and Drop-off, and Capability, as outlined by Laporte1. As a result, research endeavors addressing the VRP are tasked with focusing on pivotal parameters such as length2, cost3, and the intertwined factors of temporal duration and carbon emissions4. Liu et al.5 differentiated the VRP from the TSP by highlighting the former's provision for multiple routes. Each of these routes is constrained by a specific vehicle capacity and must traverse all nodes. Given the daunting complexity inherent to the VRP, research has chiefly gravitated towards heuristic and meta-heuristic strategies as the primary methodologies to derive workable solutions.

The VRP has ascended as a key subject in academic research, chiefly due to its pivotal role in transportation and logistics. Given the necessity to ensure punctual deliveries of large volumes of goods, the task often exceeds the capabilities of individual vehicles. Taking into account each vehicle's inherent capacity and load restrictions, devising astute delivery routes becomes essential to meet daily consumer demands. Researchers in this arena endeavor to calibrate the objective function, targeting an optimal solution that simultaneously minimizes costs, geographical span, time constraints, and carbon emissions. Such optimization efforts encompass a range of approaches, from tackling the VRP in contexts where goods are dispatched from a single depot6,7 to more intricate setups originating from multiple depots8,9.

The significance of optimization is evident across a myriad of fields, leading to a marked increase in the focus on metaheuristic techniques. One of the salient features of metaheuristics is their adaptability. From a broader perspective, metaheuristics can be delineated based on the degree of randomness they introduce during each optimization iteration. They can also be characterized based on their foundational inspirations, many of which are derived from swarm intelligence. Examples include the whale optimization algorithm (WOA)10, the grey wolf optimizer (GWO)11, and the African wild dog optimization algorithm (AWDO)12. These metaheuristic techniques find applications in diverse domains, such as the time–cost trade-off in construction projects13,14, dispatching of ready-mix concrete trucks15, optimization of construction site layouts16, VRP17, reduction of construction material costs18, logistics cost optimization19, and the design optimization of water distribution systems20.

The WOA, a metaheuristic optimization technique, was introduced by Mirjalili and Lewis10 in 2016. Deriving its inspiration from the intricate hunting behaviors of humpback whales, this method employs a set of candidate solutions, each representing a potential optimum. The WOA unfolds through a three-pronged schema of search strategies: exploration, exploitation, and convergence. In the exploration phase, the algorithm adopts a stochastic approach, identifying promising regions within the vast search space. As it shifts to the exploitation stage, it mirrors the humpback's bubble-net hunting tactics to close in on these pinpointed regions. Finally, in the convergence phase, the WOA concentrates on the fine-tuning of the best solution, progressively narrowing the search scope. Despite inherent limitations, such as sensitivity to parameter variations and a tendency towards premature convergence, the WOA is lauded for its versatility, user-friendly nature, and notable efficiency in diverse sectors, including engineering, electrical systems, and finance.

The WOA has garnered significant attention due to its wide applicability across diverse domains. Correspondingly, there has been a surge in research initiatives aimed at refining its optimization capabilities. Chakraborty, Saha21 unveiled a modified WOA (mWOAPR) to enhance the diagnosis of COVID-19 severity using chest X-ray images. Notably, their findings outperformed both the foundational and other advanced metaheuristic algorithms, especially in benchmarking and segmenting COVID-19 X-ray images. A subsequent study proposed an elite-based WOA variant (EBWOA)22, addressing certain limitations of the conventional WOA. This iteration demonstrated its efficacy across benchmark functions, IEEE CEC 2019 functions, design issues, and tangible cloud scheduling dilemmas. An additional WOA modification, optimized for high-dimensional problems, was introduced23, addressing challenges like inadequate exploration, compromised accuracy, and premature convergence. Building upon prior research, Chakraborty, Saha24 enriched the WOA (designated WOAmM) by incorporating a revised mutualism phase from the Symbiotic Organisms Search (SOS) algorithm. This enhancement specifically targeted premature convergence pitfalls. In a distinct development, a novel WOA iteration (m-SDWOA) was put forth25 amalgamating features from both the SOS and Differential Evolution (DE). This fusion harmoniously married exploration and exploitation, culminating in improved accuracy, diversity, and mitigation of early convergence. In another collaborative effort, Chakraborty, Sharma26 rolled out an optimized WOA version (ImWOA) with aspirations to magnify diversity, exploration, and solution precision. Their evaluations rendered promising results across a spectrum of optimization tasks, including image segmentation, particularly when benchmarked against rudimentary algorithms and newer WOA iterations. 
Lastly, a fusion of success-history-based adaptive differential evolution (SHADE) with a customized WOA was presented27, culminating in the SHADE-WOA hybrid. This avant-garde optimization technique manifested exemplary results, both in standard benchmarks and practical engineering design tasks, as corroborated by comprehensive statistical examinations.

In 2014, Mirjalili et al.11 pioneered the GWO, a metaheuristic approach inspired by the behavioral dynamics of grey wolves. When compared to prevailing metaheuristic techniques like PSO, DE, GSA, and FEP28, GWO stands out, particularly during its exploitation phase. The algorithm demonstrates an innate ability to deftly navigate the solution space, outperforming in avoiding local optima in a significant majority of the 29 functions examined11. Nevertheless, the GWO, despite its astute update mechanism, is not devoid of challenges. Researchers have identified difficulties in balancing exploration and exploitation29 and pointed out its limited success in tackling issues related to non-linear equation systems and unconstrained optimization29. This underscores the pressing need to further refine and enhance GWO to overcome these intrinsic shortcomings. While attempts to diversify the population through random initialization of the grey wolves’ population have been made, such a strategy is not without pitfalls—a concern later addressed30.

The main issues with WOA, as identified in28,29,30,31, have motivated the authors of this paper to propose a hybridized approach using GWO. The hybrid method starts by initializing the population according to both WOA and GWO, which creates population diversity, and then sorts the preliminary results. Next, the study uses the leadership hierarchy inherent in GWO when applying WOA’s bubble-net attack strategy. In the exploitation phase, the proposed algorithm selects the top three search agents (the alpha, beta, and delta wolves) from the entire population, and the other search agents modify their positions according to the positions of these leaders. The best of the other search agents is then sought through the p-factor to improve the performance of the WOA. The performance of hGWOA is demonstrated on unimodal and multimodal benchmark functions. The approach addresses three weaknesses of WOA: entrapment in local optima, incomplete improvement of the solution after each iteration, and low performance in the exploitation phase.

The large-scale capacity vehicle routing problem (CVRP) stands at the crux of effective transportation and logistics management. Though myriad solutions have been proposed to address this problem, the performance of many such methodologies in confronting extensive CVRPs leaves much to be desired. Bridging this gap, we present the hGWOA, a state-of-the-art hybrid optimizer. By seamlessly fusing the strengths of the GWO and WOA, the hGWOA promises to deliver potent solutions specifically tailored for medium to large-scale CVRPs, and beyond this, for a diverse range of optimization challenges inherent in real-world transportation systems. The deployment of this innovative model not only accentuates optimization efficacy but also empowers decision-makers, bestowing upon them the capability to derive astute, well-informed, and strategic solutions that address the complex nuances of transportation logistics.

The remainder of this study is structured as follows: section “Literature review” provides a comprehensive review of the existing literature on the vehicle routing problem. In section “Model development”, we present the specifics of our proposed hybrid grey wolf and whale optimization algorithm. Section “Computational experiments” evaluates the algorithm’s performance and effectiveness in comparison to existing models. Lastly, section “Conclusion” summarizes our findings, highlighting the study’s contributions and suggesting potential areas for future research in the field.

Literature review

The VRP has been the subject of rigorous investigation for over six decades, with a plethora of strategies and objectives proposed32,33,34,35. One prevalent approach to the VRP factors in both distance and customer demands. This strategy employs the “3-opt” framework in tandem with mixed-integer linear programming for uniformly sized vehicles, and binary linear programming when dealing with fleets of varied sizes2. Additionally, research endeavors in this arena have delved into optimizing processes like the loading and unloading of goods2, refining travel and service intervals36, curtailing operational costs such as vehicular wear, fuel consumption, and refrigeration expenses3, and emphasizing the reduction of carbon footprints4,5.

Capacity limitations are frequently observed in various research studies, acting as a fundamental constraint in vehicle routing problems. The CVRP has been the subject of numerous methodologies developed to address its complexities. These methodologies are broadly classified into two categories: exact methods and heuristic methods, each possessing its own unique attributes and advantages. The ant colony algorithm (ACO) was first introduced by Dorigo et al.37 as a simulation-based optimization technique that mirrors the food-seeking behavior of real-world ants. This algorithm has been widely employed to address the travelling salesman problem (TSP) and other intricate combinatorial challenges. The fundamental premise of the ACO is that the paths traversed by ants represent potential solutions to the optimization dilemma. As time progresses, there is a systematic increase in the concentration of pheromones on the more optimal paths. Consequently, a higher number of ants are inclined to select shorter routes, paving the way to pinpointing the optimal solution. To enhance the efficiency of the ACO, various adaptations have been suggested by researchers. Notably, Dorigo et al.37 proposed the ant colony system (ACS) as a refined version of the original algorithm. Moreover, Yu et al.38 presented an augmented ACO equipped with an intensified local search capability. Furthering the innovations in this field, Chen and Shi39 put forward a hybrid methodology that melds local search techniques with the foundational principles of the ant colony algorithm, specifically targeting the multi-compartment vehicle routing challenge.

The CVRP has captivated the attention of researchers aiming to augment the efficacy of transportation systems. A plethora of algorithms addressing this conundrum have been proposed, including contributions by Pham and Nguyen17, Azad40. Korayem et al. 41 introduced an inventive approach that amalgamates K-means clustering with grey wolf optimization, aiming for adept group formation and routing. On a similar note, Ng et al.42 unveiled the multiple-colonies artificial bee colony methodology, which employs a re-routing paradigm to optimize CVRP solutions. Another notable contribution is by Wei et al.43,44 who infused two-dimensional packing constraints into the Simulated Annealing framework for CVRP problem-solving. This adaptation not only modifies the neighborhood structure but also augments the solution’s quality. They further expanded on this by developing a method that accentuated CVRP optimization through the integration of two-dimensional packing constraints. Delving into more intricate challenges, Tao and Wang45 tackled the three-dimensional loading CVRP (3L-CVRP) by embedding three-dimensional packing and loading capacity constraints within the tabu search algorithm. In a parallel stride, Zhang et al.46 devised a random local search technique focusing on the same constraints. Both research endeavors furnish competent solutions for the 3L-CVRP, underscoring distinct search strategies tailored to specific constraints. Akpinar47 championed a hybrid approach, harnessing the strengths of both large-scale neighborhood search and ant colony algorithms to refine the optimization process. Furthermore, Sze et al.48 presented a two-phase hybrid approach with an adjustable locality mechanism, embedding a large neighborhood search to diversify the solution pool. In another noteworthy contribution, Akhand et al.49 integrated adaptive scanning and velocity speculation into the particle swarm optimization (PSO) technique, enhancing path optimization. 
They further honed the PSO method, tailoring it for the optimization of garbage collection routes. Collectively, these methodologies illuminate pathways for refining transport systems, providing robust solutions that bolster transportation operations’ efficiency.

Reed et al.50 employed ACS to devise routing strategies for vehicles in cyberspace. They further broadened its application by integrating multi-chambered vehicles designed for waste sorting. Remarkably, their methodology led to a significant cost reduction of 15% in a management science project undertaken at E. I. Du Pont, Inc34. In another innovative approach, Narasimha et al.51 presented a VRP formulation centered on minimizing the journey time of the vehicle traversing the longest route. This perspective is especially pertinent in situations demanding rapid emergency responses. Furthermore, a subset of scholars has broadened the scope of VRP models to incorporate diverse parameters. These include customer satisfaction, environmental emissions, and cost optimization7,8,52.

Amidst rising apprehensions regarding global warming, the mitigation of carbon emissions has taken center stage in the discourse on the VRP. In response to these environmental challenges, many nations have instated taxes predicated on the carbon emissions produced by transport vehicles. This has underscored the imperative of cultivating efficient solutions to address these emission concerns. Consequently, there has been a marked surge in research endeavors over recent years, focusing on optimizing carbon emissions within the context of the VRP5,53. Given the intricate nature and expansive scale of the VRP, the quest for optimal resolutions often relies on heuristic and meta-heuristic methodologies. Such strategies are pivotal in sculpting efficient and environmentally sustainable transportation frameworks.

Beyond the scope of the traditional VRP, the dynamic vehicle routing problem (DVRP) has emerged as a significant area of interest. In the DVRP, new orders surface while goods are in transit, necessitating real-time route modifications54. To address this dynamic challenge, researchers have turned to strategies such as the PSO method and adaptive neighborhood search. Moreover, in a bid to minimize carbon emissions, the MDGVR problem has been introduced. This problem centers around eco-friendly vehicles that commence their routes from various depots but conclude at a singular, primary warehouse9. A proposed resolution for this particular challenge hinges on the deployment of a two-stage ACS methodology.

This research presents a new methodology, denoted as hGWOA, crafted to tackle the distance optimization challenges inherent to CVRP, aiming to reduce associated logistics expenses. To ascertain the efficacy of hGWOA, it was juxtaposed with several established algorithms, namely GWO, independent WOA, DA, and ALO. This comparative analysis utilized both classical benchmark test functions and CEC2017 test functions. The results underscore that hGWOA's performance is notably superior to its counterparts. Following this, the hGWOA algorithm was employed on two emblematic CVRP scenarios, further elucidated in section “Computational experiments”.

Model development

CVRP description and mathematical model

In the domain of operations research and logistics, the CVRP problem's significance is widely acknowledged55. This problem centers on crafting an optimal plan for transporting goods from a central warehouse to a set group of clients using a vehicle fleet, with the subsequent return of the fleet to the base. Shan and Wang56 have clearly defined this challenge, emphasizing two key constraints: firstly, the strict carrying capacity of each cargo vehicle, ensuring the total goods volume or weight on any given route does not exceed the vehicle's limits; and secondly, the requirement for each client to be visited only once, ensuring efficient and timely deliveries. The overarching goal of the CVRP is to minimize the entire journey distance of the fleet during its operations17.

Consider:

$$ \begin{gathered} D = total~\;distance\;~travelled~\;by\;~all\;~units \hfill \\ x_{{ijt}} = \left\{ {\begin{array}{*{20}c} {1,~\;vehicle\;~t~\;depart\;~from\;~i~\;to~\;j} \\ {0,~\;otherwise} \\ \end{array} } \right.;y_{{it}} = \left\{ {\begin{array}{*{20}c} {1,~\;customer\;~i~\;is~\;served\;~by\;~unit~\;t} \\ {0,~\;otherwise} \\ \end{array} } \right. \hfill \\ \end{gathered} $$

Objective function:

$$ Min D = \mathop \sum \limits_{i = 0}^{k} \mathop \sum \limits_{j = 0}^{k} \mathop \sum \limits_{t = 1}^{h} c_{ij} x_{ijt} $$
(1)
$$ \mathop \sum \limits_{i = 0}^{k} x_{ijt} = y_{jt} ;j = 1,2, \ldots ,k;t = 1,2, \ldots ,h $$
(2)
$$ \mathop \sum \limits_{j = 0}^{k} x_{ijt} = y_{it} ;i = 1,2, \ldots ,k;t = 1,2, \ldots ,h $$
(3)
$$ \mathop \sum \limits_{i = 0}^{k} g_{i} y_{it} \le q_{t} ;t = 1,2, \ldots ,h $$
(4)
$$ \mathop \sum \limits_{t = 1}^{h} y_{it} = \left\{ {\begin{array}{*{20}c} {1; i = 1,2,3, \ldots ,k} \\ {h;i = 0} \\ \end{array} } \right\} $$
(5)

where \(c_{ij}\) represents the cost from customer i to customer j. The symbol \(g_{i}\) stands for the demand of the ith client, with i taking values from 1 through k, where k is the total number of clients. The letter h represents the total number of units. Lastly, \(q_{t}\) indicates the capacity of the tth unit, with t ranging from 1 to h.

Equation (1) defines the objective function for the VRP. Within this equation, \(x_{ijt}\) is a binary variable indicating the route's selection status. It is assigned a value of 1 if the route is chosen and 0 otherwise. The VRP's primary goal is to minimize the cumulative distance traveled, epitomized by the sum of the distances covered by each unit. Equations (2) and (3) are constraints ensuring that there’s a unique path linking each unit to every client. Specifically, Eq. (2) mandates that each client is visited only once, whereas Eq. (3) stipulates that each unit must visit a minimum of one client. The unit capacity constraint is introduced in Eq. (4), restricting the volume of goods transported along a particular route. The sum of goods delivered to every client along a route must stay within the unit’s designated capacity. Lastly, Eq. (5) dictates that a singular unit exclusively services each client. In contrast, the warehouse receives attention from h units, where h denotes the specific number of units assigned to the warehouse.
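To make the model concrete, the following minimal Python sketch evaluates a candidate set of routes against the objective in Eq. (1) and the capacity and single-visit requirements of Eqs. (4) and (5). It is an illustration only: the function names, the use of Euclidean distance as the cost \(c_{ij}\), and the assumption that all units share one capacity are ours and are not taken from the model above.

```python
import math

def route_distance(route, coords, depot=0):
    """Length of one closed route: depot -> clients in order -> depot."""
    stops = [depot] + list(route) + [depot]
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(stops, stops[1:]))

def evaluate(routes, coords, demand, capacity):
    """Return (total distance D, feasibility flag) for a list of routes.

    Assumptions for illustration: index 0 is the depot, costs are
    Euclidean distances, and every unit has the same capacity.
    """
    visited = [c for r in routes for c in r]
    clients = set(range(1, len(coords)))
    # Eq. (5): every client is served by exactly one unit
    once = len(visited) == len(set(visited)) and set(visited) == clients
    # Eq. (4): the load on each route stays within the unit capacity
    within = all(sum(demand[c] for c in r) <= capacity for r in routes)
    # Eq. (1): total distance travelled by all units
    total = sum(route_distance(r, coords) for r in routes)
    return total, once and within

# Hypothetical instance: one depot and three clients
coords = [(0, 0), (0, 3), (4, 3), (4, 0)]
demand = [0, 2, 2, 3]
total, feasible = evaluate([[1, 2], [3]], coords, demand, capacity=4)
```

In this hypothetical instance, the first route forms a 3-4-5 triangle and the second is an out-and-back trip, so a single route serving all three clients would violate the capacity limit while the two-route plan does not.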

Hybrid whale optimization algorithm model for CVRP

Whale Optimization Algorithm (WOA)

In 2016, Mirjalili and Lewis10 unveiled the WOA, a pioneering metaheuristic optimization technique. Inspired by the intricate hunting behaviors of humpback whales, the WOA facilitates proficient exploration and exploitation of the search space to pinpoint optimal solutions. As illustrated in Fig. 1, the WOA operationalizes through three distinct phases: encircling the prey, navigating the spiral bubble trap, and the subsequent prey hunt.

Figure 1. Bubble-net feeding strategy of humpback whales.

Encircling prey

Humpback whales have a unique ability to detect and encircle their prey. However, given that the exact position of the optimal solution within the search space remains unknown a priori, the WOA algorithm predicates the notion that the current best candidate solution either signifies the target prey or is in proximity to the optimal solution. Upon the identification of the best-performing search agent, the other agents endeavor to recalibrate their positions in alignment with this top-scoring agent. This behavior is encapsulated mathematically in Eqs. (6) and (7):

$$ \vec{D} = \left| {\vec{C} \times \vec{X}^{*} \left( t \right) - \vec{X}\left( t \right)} \right| $$
(6)
$$ \vec{X}\left( {t + 1} \right) = \vec{X}^{*} \left( t \right) - \vec{A} \times \vec{D} $$
(7)

In Eqs. (6) and (7), t denotes the current iteration, and \(\vec{A}\) and \(\vec{C}\) are coefficient vectors. \({X}^{*}\) indicates the position vector of the best solution found up to the present iteration, while \(X\) signifies the position vector of the current search agent. The symbol | | denotes the absolute value. It's important to highlight that \( X^{*}\) must be updated in every iteration in which a better solution emerges.

The calculation for the vectors \( \vec{A}\) and \(\vec{C}\) is as follows:

$$ \vec{A} = 2\vec{a} \times \vec{r} - \vec{a} $$
(8)
$$ \vec{C} = 2 \times \vec{r} $$
(9)

where a undergoes a decremental variation, starting from an initial value of 2 and culminating at a value of 0 as the iterations ensue. This decrement is manifest in both the exploration and exploitation phases. In addition, the variable r represents a vector whose elements are randomly generated, with values ranging between 0 and 1.
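The encircling update can be sketched in a few lines of Python. This is an illustrative sketch of Eqs. (6) to (9), not the authors' implementation: the function names are hypothetical, the random numbers are drawn independently for each dimension, and the linear form of the decrease of a from 2 to 0 is an assumption (the text above only states the start and end values).

```python
import random

def a_schedule(t, max_iter):
    """Decrease a from 2 to 0; the linear form is an assumption."""
    return 2.0 * (1 - t / max_iter)

def encircle(x, x_best, a, rng=random):
    """One encircling-prey move, Eqs. (6)-(9), applied per dimension."""
    new_x = []
    for xi, bi in zip(x, x_best):
        A = 2 * a * rng.random() - a      # Eq. (8), value in [-a, a]
        C = 2 * rng.random()              # Eq. (9), value in [0, 2]
        D = abs(C * bi - xi)              # Eq. (6)
        new_x.append(bi - A * D)          # Eq. (7)
    return new_x

# Example: an agent moving around the best-known position mid-run
candidate = encircle([1.0, 2.0], [3.0, 4.0], a_schedule(5, 10))
```

When a reaches 0, the coefficient A vanishes and the agent lands exactly on the best-known position, which is the limiting case of the shrinking behavior.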

Figure 2a offers a graphical illustration of the application of Eq. (7) to a two-dimensional problem. It elucidates the method by which a search agent's position is updated in relation to the most recent solution's position. Through modifications to the vectors \(\vec{A}\) and \(\vec{C}\), the search agent can traverse various regions proximate to the highest-performing solution. Figure 2b extrapolates this notion to a three-dimensional context, highlighting the potential update trajectories of a search agent. Importantly, the random vector (\(\overrightarrow{r}\)) empowers the search agent to probe any location within the search domain, as delineated by the pivotal points in Fig. 2. As a result, Eq. (7) aids in refining a search agent’s position near the apex-performing solution, simulating the dynamics of encircling prey.

Figure 2. 2D and 3D position vectors and their possible subsequent placements (X* is the top-performing solution obtained so far).

Bubble-net attacking method (exploitation phase)

To formulate a mathematical representation of the bubble-net foraging tactics observed in humpback whales, two distinct methodologies have been proposed:

  • Constriction and Encompassing Strategy: This approach endeavors to emulate the behavior through modifications to the parameter and vectors in Eq. (8). Specifically, the magnitude of ‘a’ is diminished, which consequently reduces the variation amplitude of \(\overrightarrow{A}\). Here, \(\overrightarrow{A}\) is an unpredictable value confined to the interval [− a, a]. As the iterations progress, the value of a is systematically reduced from 2 to 0. By assigning random values to \(\overrightarrow{A}\) within the range of [− 1, 1], it becomes feasible to position a search agent anywhere between its originating position and the location of the best-performing agent. Figure 3a graphically illustrates the potential positions that can be achieved within a 2D plane, spanning from \((X, Y)\) to \(({X}^{*}, {Y}^{*})\), contingent on the constraint \(0\le A\le 1\).

  • Spiral Updating Position Approach: As illustrated in Fig. 3b, this methodology commences by computing the Euclidean distance between the whale's position \(\left( {X, Y} \right)\) and its prey's position \(\left( {X^{*} , Y^{*} } \right)\). The subsequent step involves devising a spiral equation, designed to mimic the helical trajectory often exhibited by humpback whales as they converge on their target. The derived equation is articulated as:

    $$ \vec{X}\left( {t + 1} \right) = \vec{D} \times e^{bl} \times \cos \left( {2\pi l} \right) + \vec{X}^{*} \left( t \right) $$
    (10)
    $$ \vec{D} = \left| {\vec{X}^{*} \left( t \right) - \vec{X}\left( t \right)} \right| $$
    (11)

Figure 3. Bubble-net search mechanism implemented in WOA (X* is the top-performing solution).

In Eqs. (10) and (11), the vector \(\vec{D}\) denotes the distance between the ith whale and the prey, that is, the best solution obtained up to that point. The constant b defines the shape of the logarithmic spiral. The variable l is a random number drawn from a uniform distribution over the interval [− 1, 1], which gives the equation its stochastic component.

The collective behavior of humpback whales, characterized by their tendency to encircle prey in a narrowing loop while also adopting a spiral trajectory, is emulated in the model. Within this framework, a balanced probability of 50% is designated to either the contraction-encircling mechanism or the spiral model. This probabilistic approach dictates how the whales' positions are updated throughout the optimization procedure. The mathematical articulation of this model is presented as follows:

$$ \vec{X}\left( {t + 1} \right) = \left\{ {\begin{array}{*{20}l} {\vec{X}^{*} \left( t \right) - \vec{A} \times \vec{D}} \hfill & {if \;p < 0.5} \hfill \\ {\vec{D} \times e^{bl} \times \cos \left( {2\pi l} \right) + \vec{X}^{*} \left( t \right)} \hfill & {if \;p \ge 0.5 } \hfill \\ \end{array} } \right. $$
(12)
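The switch in Eq. (12) can be illustrated with the short Python sketch below, in which p is a random number in [0, 1] that selects between the two mechanisms. The function name, the default b = 1, and the per-dimension sampling of A and C are assumptions made for illustration.

```python
import math
import random

def bubble_net_update(x, x_best, a, b=1.0, rng=random):
    """One exploitation move following Eq. (12), applied per dimension."""
    p = rng.random()                      # selects one of the two mechanisms
    if p < 0.5:
        # Shrinking encirclement, Eqs. (6)-(9)
        new_x = []
        for xi, bi in zip(x, x_best):
            A = 2 * a * rng.random() - a
            C = 2 * rng.random()
            new_x.append(bi - A * abs(C * bi - xi))
        return new_x
    # Spiral move, Eqs. (10)-(11); l is uniform in [-1, 1]
    l = rng.uniform(-1.0, 1.0)
    factor = math.exp(b * l) * math.cos(2 * math.pi * l)
    return [abs(bi - xi) * factor + bi for xi, bi in zip(x, x_best)]

# Example with a seeded generator so the run is repeatable
rng = random.Random(0)
moved = bubble_net_update([1.0, 2.0], [3.0, 4.0], a=1.0, rng=rng)
```

Both branches are centered on the best-known position, so an agent that already sits on it with a = 0 stays where it is regardless of which branch is selected.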

A similar method, centered on the manipulation of vector \({\vec{\text{A}}}\), finds application in the pursuit of prey during the exploration phase. In this context, humpback whales engage in stochastic search behaviors influenced by the relative positions of their peers. Consequently, vector \(\overrightarrow{{\text{A}}}\) is endowed with random values exceeding 1 or descending below − 1, serving to compel a search agent to undertake substantial displacements from a reference whale. Diverging from the exploitation phase, where a search agent's position is updated based on the most successful agent discovered thus far, the exploration phase employs a different strategy. Here, the updating of a search agent's position hinges on the random selection of another search agent, rather than relying on the best-found agent. This mechanism, when coupled with \(\left| A \right| > 1\), underscores the significance of exploration, thereby empowering the WOA to conduct an extensive global search. The mathematical formulation is presented as follows:

$$ \vec{D} = \left| {\vec{C} \times \vec{X}_{rand} - \vec{X}} \right| $$
(13)
$$ \vec{X}\left( {t + 1} \right) = \vec{X}_{rand} - \vec{A} \times \vec{D} $$
(14)

where \(\vec{X}_{rand}\) denotes a stochastic position vector, which is selected from the existing population of whales.
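A minimal Python sketch of the exploration move in Eqs. (13) and (14) follows. The function name is hypothetical, and the sketch leaves it to the caller to invoke this move only when \(\left| A \right| > 1\); it simply shows that the reference point is a randomly chosen whale rather than the best one.

```python
import random

def explore(x, population, a, rng=random):
    """One exploration move, Eqs. (13)-(14), applied per dimension."""
    x_rand = rng.choice(population)       # random reference whale
    new_x = []
    for xi, ri in zip(x, x_rand):
        A = 2 * a * rng.random() - a      # Eq. (8)
        C = 2 * rng.random()              # Eq. (9)
        D = abs(C * ri - xi)              # Eq. (13)
        new_x.append(ri - A * D)          # Eq. (14)
    return new_x

# Example: three whales in two dimensions, early in the run (a = 2)
whales = [[0.0, 0.0], [4.0, -1.0], [2.0, 5.0]]
moved = explore(whales[0], whales, a=2.0, rng=random.Random(42))
```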

Grey Wolf Optimizer (GWO)

The GWO algorithm was introduced by Mirjalili et al.11 in 2014, drawing inspiration from the hunting and hierarchical leadership behavior of grey wolves. The hierarchy comprises four levels, denoted as alpha, beta, delta, and omega. The first three wolves represent the three best solutions within the population, while omega (ω) denotes the remaining members of the population, as illustrated in Fig. 4. Additionally, the algorithm models two distinct stages of the pack's behavior: the siege (encircling) stage and the hunt for prey stage.

Figure 4. Grey wolf population organization chart.

The siege phase is expressed as follows:

$$ \vec{d} = \left| {\vec{c} \times \vec{x}_{p}^{t} - \vec{x}^{t} } \right| $$
(15)
$$ \vec{x}^{{\left( {t + 1} \right)}} = \vec{x}_{p}^{t} - \vec{a} \times \vec{d} $$
(16)

where \(\vec{x}^{t}\) is the wolf's position in iteration t, \(\vec{x}_{p}^{t}\) is the prey’s position vector, \(\vec{d}\) is the distance between the wolf and the prey, and \(\vec{a}\) and \(\vec{c}\) are coefficient vectors, which are computed as follows:

$$ \vec{a} = 2l \times r_{1} - l $$
(17)
$$ \vec{c} = 2 \times r_{2} $$
(18)

where l is a control parameter that decreases from 2 to 0 over the iterations, and \(r_{1}\) and \(r_{2}\) are random vectors with components in [0, 1].

During the hunting phase, Mirjalili models the hunting behavior by assuming that alpha, beta, and delta have knowledge of the potential position of the prey based on their experience. This is expressed mathematically as follows:

$$ \vec{d}_{\alpha } = \left| {\vec{c}_{1} \times \vec{x}_{\alpha } - \vec{x}} \right|;\vec{d}_{\beta } = \left| {\vec{c}_{2} \times \vec{x}_{\beta } - \vec{x}} \right|;\vec{d}_{\delta } = \left| {\vec{c}_{3} \times \vec{x}_{\delta } - \vec{x}} \right| $$
(19)
$$ \vec{x}_{1} = \vec{x}_{\alpha } - \vec{a}_{1} \times \vec{d}_{\alpha } ; \vec{x}_{2} = \vec{x}_{\beta } - \vec{a}_{2} \times \vec{d}_{\beta } ; \vec{x}_{3} = \vec{x}_{\delta } - \vec{a}_{3} \times \vec{d}_{\delta } $$
(20)
$$ \vec{x}^{{\left( {t + 1} \right)}} = \frac{{\vec{x}_{1} + \vec{x}_{2} + \vec{x}_{3} }}{3} $$
(21)
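The hunting update of Eqs. (19) to (21) can be sketched in Python as below. The function name is hypothetical. The coefficient a is drawn in the symmetric form 2·l·r₁ − l used in the standard GWO formulation, with l the control parameter that decreases from 2 to 0; treating the coefficient this way, and sampling it independently per leader and per dimension, are assumptions of this sketch.

```python
import random

def gwo_update(x, alpha, beta, delta, l, rng=random):
    """Move one wolf toward the three leaders, Eqs. (19)-(21)."""
    new_x = []
    for j, xj in enumerate(x):
        estimates = []
        for leader in (alpha, beta, delta):
            a = 2 * l * rng.random() - l         # coefficient a (assumed form)
            c = 2 * rng.random()                 # Eq. (18)
            d = abs(c * leader[j] - xj)          # Eq. (19)
            estimates.append(leader[j] - a * d)  # Eq. (20)
        new_x.append(sum(estimates) / 3)         # Eq. (21)
    return new_x

# Example: a wolf at the origin pulled toward three leaders
moved = gwo_update([0.0, 0.0], [3.0, 1.0], [6.0, 2.0], [9.0, 3.0],
                   l=1.0, rng=random.Random(7))
```

With l = 0 every coefficient a is zero, so the new position reduces to the plain average of the alpha, beta, and delta positions.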

During the search and attack phase, a vector \(\vec{a}\) is randomly generated within the range of [-2a, 2a]. If \(\left| {\vec{a}} \right| < 1\), the wolves attack the prey, which is referred to as the exploitation stage. However, if \(\left| {\vec{a}} \right| > 1\), the wolves may abandon their current target and search for better prey57. Another parameter that influences the search for prey is the variable c, which takes a value within the range [0, 2]. The vector \(\vec{c}\) is updated randomly at every iteration to help the search avoid local optima. If \(c > 1\), the solution will converge towards the prey, whereas if \(c < 1\), the solution will move away from the prey in search of new targets.

Hybrid whale optimization algorithm model for CVRP

This section introduces a proposed methodology that combines the WOA and the GWO to enhance the efficiency of the WOA during its exploitation phase. This aims to attain superior solutions, drawing upon the insights discussed in the preceding sections regarding both WOA and GWO.

Although the standard WOA is proficient at identifying good solutions, its ability to keep refining them in subsequent iterations can be limited. To address this limitation, the WOA was combined with the GWO, giving rise to a novel algorithm termed hGWOA. This hybridization introduces two modifications to the conventional WOA.

First, a conditional constraint is embedded within WOA's exploitation phase to improve its hunting efficacy. As illustrated by Eq. (21), the parameters \(\vec{x}_{1}\), \(\vec{x}_{2}\), and \(\vec{x}_{3}\) are pivotal to the exploitation performance of the GWO. To avoid becoming trapped in local optima, particularly when each of a1, a2 and a3 lies between −1 and 1, a novel condition has been incorporated into the exploitation phase of hGWOA. Eqs. (19), (20), and (21) have been modified so that they can be used within this newly introduced condition, focusing on the parameters \(\vec{x}_{1}\), \(\vec{x}_{2}\), and \(\vec{x}_{3}\).

Second, a supplementary criterion has been introduced during the exploration phase of hGWOA to guide the current solution more effectively towards the most promising outcome, while preventing the whale from moving to a position worse than its previous one.

hGWOA begins by initializing a population of search agents, comprising both whales and wolves. Any agent whose position lies outside the boundaries of the search space is first moved back within them, after which the fitness function is computed for each agent. If an agent's fitness falls below the alpha_score (best_score), the alpha_score is updated to that agent's fitness. The algorithmic variables a, A, C, and L are then updated, and a random number p is generated.

When p falls below 0.5, the algorithm checks whether |A| < 1. If so, a new position for the agent is computed using Eq. (6); if the fitness of this new position is better than that of the current position, the agent's position is updated accordingly. If instead |A| ≥ 1, the new position is determined using Eq. (7), and the same fitness comparison decides whether the agent's position is updated. Alternatively, if p is greater than or equal to 0.5 and the variables a1, a2 and a3 all fall within the range of −1 to 1, the algorithm updates the current solution's position using Eq. (21).

Following these steps, the algorithm checks if any newly computed positions exceed the defined search space limits. If they do, corrective actions are taken to bring them within bounds. This process results in the calculation of updated fitness values for the agents, ultimately leading to the identification and reporting of the algorithm’s optimal fitness achievement.
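Two building blocks recur in the procedure described above: repairing positions that leave the search space, and accepting a new position only when it improves the fitness. The sketch below illustrates both for a minimization problem. Clipping to the nearest bound is one common repair rule and is an assumption here, since the text does not state which corrective action is used; the names `clamp_to_bounds`, `accept_if_better`, and the `sphere` test function are likewise illustrative.

```python
def clamp_to_bounds(position, lower, upper):
    """Move every out-of-range coordinate back to the nearest bound (assumed repair rule)."""
    return [min(max(v, lo), hi) for v, lo, hi in zip(position, lower, upper)]


def accept_if_better(current, candidate, fitness):
    """Greedy acceptance for minimization: keep the candidate only if it improves fitness."""
    return candidate if fitness(candidate) < fitness(current) else current


def sphere(x):
    """Illustrative fitness function: sum of squares, minimized at the origin."""
    return sum(v * v for v in x)


lower, upper = [-1.0] * 3, [1.0] * 3
candidate = clamp_to_bounds([-5.0, 0.5, 9.0], lower, upper)
print(candidate)                                              # [-1.0, 0.5, 1.0]
print(accept_if_better([0.1, 0.1, 0.1], candidate, sphere))   # [0.1, 0.1, 0.1]
```

In the example the repaired candidate is rejected because its fitness (2.25) is worse than that of the current position (0.03), so the agent stays where it is.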

The fundamental distinction between WOA and hGWOA is observed in the incorporation of Eqs. (19), (20), and (21) during the exploitation phase of WOA. This is further complemented by an innovative strategy introduced in the exploration stage to enhance the solution quality. The integration of these equations, coupled with this new strategy, amplifies the foraging efficiency of WOA. As a result, the optimal solution undergoes refinement in each iteration, bolstering the algorithm's resilience against local optima. Additionally, the introduction of this specific condition during the exploration phase augments the algorithm's search capability, reinforcing the robustness of existing solutions. Table 1 summarizes the parameters used, demonstrating an appropriate blend for the hGWOA, WOA, and GWO algorithms. Concurrently, Table 2 and Fig. 5 present the pseudo-code and flowchart for the hGWOA approach, respectively.

Table 1 Parameter settings of the hGWOA, GWO and WOA.
Table 2 Pseudo-code of the proposed hGWOA method.
Figure 5 Flowchart of the proposed hGWOA method.

The hGWOA algorithm showcases significant advancements in integrating both global and local search strategies within the search space. This hybrid approach generates a succession of stochastic solutions during its initial phase, optimizing the quest for the ideal solution. Additionally, the hGWOA methodology utilizes an iterative framework, enabling the effective pinpointing and harnessing of unexplored regions within the search domain. Consequently, this leads to the revelation of novel and promising solutions.

Computational experiments

Convergence behaviours on classical benchmark function

A detailed evaluation of the hGWOA’s optimization prowess was executed, using classical benchmark test functions that are widely acknowledged in the field. Comparative analyses pitted hGWOA against four prominent optimization methodologies: GWO, WOA, DA, and ALO. The benchmark test functions deployed in this study were categorized based on their distinctive traits into three groups: uni-modal, multi-modal, and fixed-dimensional composite functions with multiple local optima. Comprehensive depictions of these functions can be found in Tables 3, 4 and 5.

Table 3 Uni-modal test functions.
Table 4 Multi-modal test functions.
Table 5 Fixed functions with multiple local optima.

For a rigorous and impartial comparative analysis, each algorithm was run 30 times for every benchmark function. Following this, statistical evaluations were conducted to determine both the central tendency and variability of data from these 30 runs. The research framework utilized 60 search agents, each limited to a maximum of 500 iterations. Tables 6, 7 and 8 present the statistical results, encompassing mean values (ave) and standard deviations (std), of the hGWOA approach, comparing its performance to other notable algorithms, including DA, ALO, GWO, and WOA.
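The ave and std columns can be reproduced from the raw results of the independent runs. The sketch below assumes a hypothetical callable `run_once` that performs one complete optimization run and returns the best fitness found. It uses the sample standard deviation, which is also an assumption, since the text does not say whether the sample or the population formula was used.

```python
import random
import statistics


def summarize_runs(run_once, n_runs=30, seed=0):
    """Run an optimizer n_runs times and return (mean, sample standard deviation).

    run_once is a hypothetical callable taking a random.Random instance and
    returning the best fitness of one complete run.
    """
    results = [run_once(random.Random(seed + i)) for i in range(n_runs)]
    return statistics.mean(results), statistics.stdev(results)


def random_search_on_sphere(rng, n_agents=60, n_iterations=500):
    """Stand-in optimizer for illustration only: pure random search on x**2."""
    return min(rng.uniform(-100.0, 100.0) ** 2 for _ in range(n_agents * n_iterations))


ave, std = summarize_runs(random_search_on_sphere, n_runs=30)
print(f"ave = {ave:.3e}, std = {std:.3e}")
```

Giving each run its own seeded generator keeps the 30 runs independent while making the whole experiment repeatable.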

Table 6 Results of different algorithms on uni-modal functions.
Table 7 Results of different algorithms on multi-modal functions.
Table 8 Results of different algorithms on fixed functions.

Uni-modal functions have a single global extremum, which makes them a suitable benchmark for evaluating how well an algorithm exploits the search space. The results in Table 6 show that hGWOA surpasses the other nature-inspired algorithms, namely ALO, DA, WOA, and GWO, on the uni-modal functions: it outperforms GWO, WOA, and DA on all seven functions and ALO on six of the seven.

In contrast to uni-modal functions, multi-modal functions are distinguished by the presence of a singular optimal global point accompanied by multiple local optima. These characteristics make multi-modal functions particularly apt benchmarks for assessing the search space exploration competence of hGWOA. A close examination of the outcomes from the multi-modal test functions, as presented in Table 7, underscores hGWOA's superior performance relative to WOA, GWO, ALO and DA. Notably, hGWOA's efficacy surpasses that of DA across all six instances, outperforms WOA in four of the six, eclipses ALO in five of the six, and bests GWO in three of the six scenarios. Such outcomes attest to hGWOA's skill in adeptly navigating around local optima and its thorough probing of the search space. This exceptional performance accentuates the algorithm’s potential significance in academic research, particularly in the domain of exhaustive search space exploration.

Composite benchmark test functions combine various uni-modal and multi-modal functions that have been subjected to transformations and perturbations, including rotation, translation, and bias. These functions share a consistent search domain with numerous local optima, which makes them particularly useful for assessing the balance between exploration and exploitation within the search space. Table 8 shows the results of hGWOA on the composite benchmark functions (F14–F23). Based on these empirical findings, hGWOA surpasses the other population-based optimization techniques in efficiency, underlining its ability to strike a balance between exploring and exploiting the search space. This is further reflected in its consistently superior mean values, which illustrate its balanced handling of the trade-off between exploration and exploitation.

The convergence behaviour of hGWOA is compared with that of DA, ALO, WOA, and GWO in Figs. 6, 7, and 8. The convergence curves were obtained with 30 search agents over 150 iterations. They show that hGWOA converges better than the other algorithms on a majority of the standard functions, suggesting that it is more likely to reach the optimum than the other algorithms under examination.

Figure 6 Convergence behavior of GWO, WOA, ALO, DA, and hGWOA for unimodal test functions.

Figure 7 Convergence behavior of GWO, WOA, ALO, DA, and hGWOA for multi-modal functions.

Figure 8 Convergence behavior of hGWOA, GWO, DA, ALO, WOA for composite functions.

Convergence behaviours on CEC2017 benchmark function

The CEC2017 test functions form a suite of benchmark functions introduced during the 2017 IEEE Congress on Evolutionary Computation (CEC) competition, with an emphasis on real-parameter optimization. These benchmarks are highly esteemed within the evolutionary computation community and related fields. They serve as pivotal tools for evaluating and comparing the performance of optimization algorithms. Evolving from the benchmark collections of prior years, the CEC2017 suite has been rigorously designed to offer a wide array of challenges to optimization techniques.

Contrasted with the 23 traditional benchmark functions, the CEC2017 functions are viewed as more representative of realistic optimization scenarios. Their expansive coverage encompasses both unimodal and multi-modal landscapes, and they span separable as well as non-separable functions. Moreover, these functions feature shifted and rotated variants, providing an exhaustive testbed for algorithmic evaluations. Such a versatile set of testing scenarios allows researchers to evaluate the merits and limitations of various optimization algorithms across diverse contexts.

In this context, the efficacy of hGWOA has been assessed using the IEEE CEC2017 test suite (ref. 58). The suite is broadly classified into four distinct categories: unimodal, multimodal, hybrid, and composition. Table 9 offers detailed definitions of the CEC2017 benchmark problems. To increase the level of complexity and thoroughly evaluate the proposed method’s ability to address intricate optimization challenges, all functions within the CEC2017 suite have been configured to be 30-dimensional.

Table 9 CEC2017 benchmark functions.

For a comprehensive and unbiased evaluation, each algorithm was executed 30 times for every benchmark function. Following these runs, statistical analyses were carried out to evaluate both the central tendency and the dispersion of the data from these 30 trials. In this study, 60 search agents were used, each restricted to a maximum of 500 iterations. The results of the hGWOA approach are presented in Table 10, alongside those of other prominent algorithms such as DA, ALO, GWO, and WOA. A detailed examination of the data in Table 10 demonstrates that hGWOA consistently surpasses its nature-inspired peers, namely ALO, DA, WOA, and GWO, in the unimodal, multimodal, hybrid, and composition domains.

Table 10 Results of different algorithms on classical benchmark test functions.

Different versions of the CVRP

For the TSP as delineated in references55,59, the computational complexity is recognized to escalate exponentially as the number of cities increases. To elucidate, a TSP encompassing n cities entails considering \(\frac{1}{2}(n-1)!\) feasible routes. Taking an illustrative example where n = 16, the total number of potential routes amounts to an overwhelming \(6.54 \times 10^{11}\). This vast array of route permutations renders the TSP exceptionally computation-intensive. In light of this, when considering the VRP, which essentially comprises multiple intertwined TSPs, the computational complexity is magnified substantially.
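The route count quoted for n = 16 can be checked directly. The short sketch below evaluates (n − 1)!/2 with exact integer arithmetic; the function name `tsp_route_count` is illustrative.

```python
import math


def tsp_route_count(n):
    """Number of distinct tours of a symmetric TSP with n cities: (n - 1)! / 2."""
    if n < 3:
        raise ValueError("the formula applies to n >= 3 cities")
    # (n - 1)! is even for n >= 3, so integer division is exact.
    return math.factorial(n - 1) // 2


print(tsp_route_count(16))           # 653837184000
print(f"{tsp_route_count(16):.2e}")  # 6.54e+11
```

Adding a single city to reach n = 17 multiplies the count by 16, which illustrates how quickly exhaustive enumeration becomes impractical.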

Case study 1

In the first case study addressing the CVRP, a central warehouse serves eight distinct customers. The deliveries are made by two trucks, each with a capacity of eight units of goods. The Euclidean distances, along with the delivery requirements of each customer, are tabulated in Table 11. The primary objective of this case study is to minimize the cumulative distance travelled by the two delivery trucks while satisfying all the constraints of the CVRP.
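To make the objective and the constraints concrete, the sketch below evaluates a candidate CVRP solution: it sums the length of every depot-to-depot route and checks that each customer is served exactly once and that no truck exceeds its capacity. The data are a toy instance invented for illustration, not the values of Table 11, and the function names are assumptions.

```python
def route_distance(route, dist, depot=0):
    """Length of one route that starts and ends at the depot."""
    path = [depot] + list(route) + [depot]
    return sum(dist[a][b] for a, b in zip(path, path[1:]))


def evaluate_solution(routes, dist, demand, capacity, depot=0):
    """Return (feasible, total_distance) for a list of routes, one per truck."""
    total = sum(route_distance(r, dist, depot) for r in routes)
    customers = sorted(i for i in range(len(dist)) if i != depot)
    visited = sorted(c for r in routes for c in r)
    each_served_once = visited == customers
    within_capacity = all(sum(demand[c] for c in r) <= capacity for r in routes)
    return each_served_once and within_capacity, total


# Toy instance (illustrative only): depot at 0 and four customers on a line.
positions = [0, 1, 2, 3, 4]
dist = [[abs(p - q) for q in positions] for p in positions]
demand = [0, 1, 1, 1, 1]

feasible, total = evaluate_solution([[1, 2], [3, 4]], dist, demand, capacity=2)
print(feasible, total)  # True 12
```

A metaheuristic such as hGWOA would call an evaluation of this kind as its fitness function, typically after decoding each agent's continuous position into a set of routes.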

Table 11 Customer Euclidean distance and delivery requirements of 8-customer problem60.

Table 12 delineates the results derived from a diverse array of algorithms applied to the given problem. This includes methodologies as proposed in reference60, complemented by outcomes from distinct algorithms like DA, GWO, ALO, and WOA. Notably, the mean percentage deviation (%dev) for the hGWOA stands out as superior. It registers a more favorable performance than WOA (0.44%), DA (1.51%), ALO (2.14%), GWO (1.44%), MHPSO (1.74%), DPGA (2.73%), and SGA (4.03%).

Table 12 Results of different algorithms on 8-customer problem.

While all considered algorithms yielded commendable results, the average outcomes from hGWOA surpassed the rest, underscoring its superior stability in both the exploitation and exploration phases. Fig. 9 shows hGWOA's favorable data distribution relative to its counterparts. The best solution achieved a total distance of 67.5 units. The routes computed by hGWOA for the two vehicles are tabulated in Table 13 and shown graphically in Fig. 10.

Figure 9 Boxplot of hGWOA, WOA, DA, ALO, MHPSO, DPGA, SGA and GWO on 8-customer problem.

Table 13 Routing of vehicles and distance using hGWOA algorithm on 8-customer problem.
Figure 10 Best solution for the CVRP of 8-customer problem.

For implementation, the chosen algorithms were coded in Java and tested on a personal computer equipped with an Intel(R) Core(TM) i7-1165G7 processor clocked at 2.80 GHz. Each algorithm underwent 20 runs, employing 20 search agents and 50 iterations for all the CVRP scenarios.

Case study 2

In the second case study addressing the CVRP, the intrinsic complexity of the TSP is addressed by leveraging data from Azad’s40 study. That study focuses on a hub-and-spoke delivery system serving 25 cement customers and approaches the CVRP with a genetic algorithm, deploying a fleet of five delivery trucks, each with a capacity of 1500 bags. Using the coordinate data provided in Table 14, we derive matrices that represent the distances among customers and their specific demands. These matrices are presented in Tables 15 and 16, respectively. The primary aim is to optimize delivery routes for the 25 customers with a fleet of five trucks, thereby minimizing the total distance traveled while still adhering to the fundamental constraints inherent in the CVRP.
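Deriving a Euclidean distance matrix from planar coordinates is a small computation, sketched below. The coordinates are placeholders chosen for illustration rather than the values of Table 14, and the function name `distance_matrix` is an assumption.

```python
import math


def distance_matrix(coords):
    """Symmetric matrix of Euclidean distances between all pairs of points."""
    return [
        [math.hypot(x1 - x2, y1 - y2) for (x2, y2) in coords]
        for (x1, y1) in coords
    ]


# Placeholder coordinates: index 0 plays the role of the depot.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
for row in distance_matrix(coords):
    print(row)
# [0.0, 5.0, 10.0]
# [5.0, 0.0, 5.0]
# [10.0, 5.0, 0.0]
```

Precomputing the matrix once means that every fitness evaluation during the search only needs table look-ups.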

Table 14 The coordinates of 25 customers and their respective demands per customer40.
Table 15 Customer Euclidean distance and delivery requirements of 25-customer problem.
Table 16 Customer Euclidean distance and delivery requirements of 25-customer problem (continued).

Table 17 presents the results from various algorithm implementations. Notably, the %dev best solution achieved by the hGWOA algorithm surpasses that of other optimization techniques. It outperforms WOA by 6%, DA by 16%, ALO by 26%, GWO by 4%, and GA by 31%. Moreover, Fig. 11 provides a visual representation that highlights the superior data distribution of hGWOA compared to other algorithms. This study's findings accentuate the efficacy of the hGWOA algorithm in obtaining the optimal solution, with a total distance of 571.24 units. Table 18 details the delivery routes for the five trucks as determined by hGWOA, and a graphical representation is provided in Fig. 12.

Table 17 Results of different algorithms on 25-customer problem.
Figure 11 Boxplot of hGWOA, WOA, DA, ALO and GWO on 25-customer problem.

Table 18 Routing of vehicles and distance using hGWOA algorithm on 25-customer problem.
Figure 12 Best solution for the CVRP of 25-customer problem.

For experimental implementation, the algorithms were coded in Java and executed on a personal computer powered by an Intel(R) Core(TM) i7-1165G7 processor with a clock speed of 2.80 GHz. Each algorithm was tested over 20 runs, employing 60 search agents, for a total of 200 iterations in all CVRP scenarios.

Real CVRP in Viet Nam

In the real case study addressing the CVRP, authentic delivery data from a cement supplier servicing 30 customers was examined within a hub-and-spoke distribution framework. This data was pivotal in tackling the intricacies associated with the TSP. The supplier operated a fleet of six delivery trucks, each with a capacity of 700 bags. Using the given coordinates, we derived the distance matrix between customers together with each customer's specific demand, as shown in Tables 19 and 20. The primary goal was to serve all 30 customers with the six trucks while minimizing the total travel distance and adhering to the constraints of the CVRP.

Table 19 Customer Euclidean distance and delivery requirements of 30-customer problem.
Table 20 Customer Euclidean distance and delivery requirements of 30-customer problem (continued).

Table 21 consolidates the performance metrics of the different algorithms tested. Significantly, the hGWOA algorithm emerged as the frontrunner, with its best %dev solution outperforming other optimization techniques: WOA by 20.2%, DA by 31.8%, ALO by 36.6%, and GWO by 19.5%. Figure 13 provides a visual comparison, highlighting the superior data distribution of hGWOA compared to other algorithms. This analysis reaffirmed the effectiveness of the hGWOA algorithm in optimizing delivery routes, achieving a cumulative distance of 791.24 units. The delivery routes for the six trucks, as determined by hGWOA, are delineated in Table 22 and further illustrated in Fig. 14.

Table 21 Results of different algorithms on 30-customer problem.
Figure 13 Box plot of hGWOA, WOA, DA, ALO and GWO on 30-customer problem.

Table 22 Routing of vehicles and distance using hGWOA algorithm on 30-customer problem.
Figure 14 Best solution for the CVRP of 30-customer problem.

For the computational studies, algorithms were developed in Java and executed on a personal computer powered by an Intel(R) Core(TM) i7-1165G7 processor operating at 2.80 GHz. Each algorithm was subjected to 20 runs, using 60 search agents, and covered 200 iterations for every CVRP test scenario.

Conclusion

This study unveils a novel approach to global optimization by merging the WOA method with GWO techniques. This strategic combination aims to seamlessly merge the exploratory capabilities of WOA with the search space exploitation proficiencies inherent to GWO, targeting optimal outcomes. The resulting hybrid algorithm, termed hGWOA, has been meticulously assessed using both classical test functions and CEC2017 benchmark test functions. The empirical results underscore hGWOA’s marked advantage over both GWO and WOA in achieving global optimization.

Additionally, this research employs the innovative hGWOA algorithm to tackle the Routing Logistics Challenge faced by limited-capacity cement trucks, referred to as the CVRP. Through computational evaluations across various contexts—namely, two unique case studies and a practical project—it is evident that hGWOA excels in crafting high-quality solutions to CVRP optimization issues. Based on these findings, hGWOA emerges as a promising meta-heuristic approach, suitable not only for the CVRP dilemma but also for a spectrum of related optimization challenges.

Directions for future research

This study emphasizes the application of the hGWOA method specifically to address CVRP issues. However, in real-world materials transportation scenarios, VRP challenges often encompass a myriad of factors, including delivery timelines, carbon emissions, fuel consumption metrics, and prevailing road traffic conditions. It is therefore anticipated that subsequent research endeavors will deploy the hGWOA methodology to grapple with intricate and multifaceted VRP conundrums that simultaneously align with customer stipulations.

Upon comparative evaluation with established swarm-based optimization algorithms, specifically DA, ALO, WOA, and GWO, the hGWOA paradigm manifests a commendable balance between exploration and exploitation capacities. Moreover, it showcases competitive prowess across diverse magnitudes of the CVRP. A limitation, however, arises when scaling to larger problem sets, wherein hGWOA occasionally grapples with local optimization pitfalls. As a forward-looking initiative, forthcoming research aims to concoct a composite model wherein hGWOA operates synergistically with ancillary techniques. These might encompass adaptive weighting customizations, Yin-Yang-centric learning mechanisms, mutation procedures, and crosstalk interventions. Such an integrative approach aims to bolster hGWOA's effectiveness in navigating optimization challenges, particularly within transportation management, and extending to broader technical spheres.

The hybrid model hGWOA may converge slowly, especially when dealing with high-dimensional or complex optimization problems. Techniques such as adaptive parameter settings, dynamic population sizing, or hybridization with other optimization algorithms could be employed to accelerate convergence and improve efficiency. In addition, the performance of this model may deteriorate when applied to extremely large-scale optimization problems. Hence, future research could consider implementing problem-specific adaptations, parallel processing, or divide-and-conquer strategies to make hGWOA more suitable for handling larger problem instances.