Introduction

Applications with high computational and data demands, such as climate modelling, drug discovery, genomics, bioinformatics, financial modelling, data analytics, and healthcare informatics, are fueling the demand for computational grids1,2,3,4,5,6,7,8,9,10. Computational grids have emerged as powerful computational paradigms, facilitating large-scale, distributed computing through the utilization of interconnected computing and storage resources. The optimal allocation of tasks to resources in computational grids becomes increasingly intricate due to various constraints, including resource heterogeneity, dynamic workload characteristics, system dynamics, and adherence to user Quality of Service (QoS) parameters, such as latency and cost.

Grid service providers typically aim to maximize profits, while users seek to minimize execution costs, communication costs, and turnaround time for their applications. Efficient task schedulers are one way to reconcile these objectives: they make intelligent decisions about task allocation and resource management within specified constraints. Although task scheduling is an NP-complete problem11, designing efficient scheduling algorithms for computational grids is essential to meeting user-defined QoS requirements.

Task scheduling algorithms are designed around either single- or multi-objective functions. Single-objective algorithms optimize one specific objective (e.g., minimizing makespan, cost, or energy) using heuristics, metaheuristics, or mathematical optimization techniques to find near-optimal scheduling sequences; they return the single solution that minimizes or maximizes that objective. Because they ignore other objectives, they often lead to imbalanced resource utilization, increased energy consumption, and similar side effects, and they are generally unsuitable for scheduling complex real-time applications. Such algorithms have been built on meta-heuristics12, greedy strategies13, fuzzy models14, game theory15, bio-inspired methods16, and more. In real-world applications, however, several conflicting goals must be considered at once: maximizing resource utilization, minimizing turnaround time, and minimizing task execution cost are all crucial to system efficiency. Task scheduling algorithms based on multi-objective criteria address these limitations by optimizing multiple objectives simultaneously, producing a diverse set of trade-off solutions and giving users the flexibility to prioritize one or more criteria over the others.

Multi-objective optimization involves optimizing multiple conflicting objectives simultaneously. Common heuristic approaches for multi-objective task scheduling include genetic algorithms (NSGA, NSGA-II)17,18, particle swarm optimization (MOPSO)19, simulated annealing (MOSA)20, ant colony optimization (MOACO)21, and other multi-objective evolutionary algorithms (MOEAs)22. These methods leverage principles inspired by natural processes to explore the solution space and find trade-off solutions among conflicting objectives. In our proposed method, heuristics serve as general problem-solving strategies: rule-of-thumb methods that quickly identify effective solutions with respect to a defined objective function or set of criteria, and that are particularly valuable when an exhaustive search or an exact solution is impractical. Incorporating heuristics into our framework aims to strike a balance among competing objectives, minimizing turnaround time, execution cost, and communication cost while maximizing resource utilization, and yields practical, computationally efficient solutions in scenarios where finding an optimal solution is intractable. In this article, we propose a task scheduling algorithm based on a multi-objective optimization formulation whose objective functions minimize turnaround time (TAT), task execution cost, and data communication cost between resources, and maximize grid utilization in a heterogeneous multi-grid environment. The proposed framework is plugged into the GridSim architecture as shown in Fig. 1 (green). The framework contains five schedulers, namely 1. Greedy scheduler: prioritizes minimizing turnaround time, communication cost, and execution cost while maximizing grid utilization. 2. Greedy communication cost scheduler: minimizes communication cost by distributing tasks across computing resources within a single grid. 3. Greedy execution cost scheduler: minimizes execution cost by scheduling each task on the most suitable subset of computing resources based on their cost-to-performance ratio. 4. Greedy no fragmentation scheduler: treats tasks as non-fragmentable and schedules each task on an individual computing resource. 5. Random scheduler: schedules tasks on a random subset of computing resources.

We summarize our contributions as follows:

(1) Formulating a task scheduling framework with multiple objectives. (2) Integrating the proposed framework with the GridSim simulator and evaluating its performance. (3) Applying the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) to solve the proposed multi-objective optimization problem for task scheduling.

The rest of this paper is organized as follows. Section "Related work" describes the related work. Section "System model" describes the system model. In Sect. "Formulation of multi-objective optimization for task scheduling", objective functions are formulated for TAT, execution cost, communication cost and grid utilization. The task scheduling algorithm is presented in Sect. "Proposed task scheduling algorithm", and Sect. "Demonstration of the proposed task scheduling algorithm" walks through an example of the proposed algorithm. Results are discussed in Sect. "Results and discussion". The multi-objective decision-making problem is presented in Sect. "Formulation of the multi-Objective-decision-making problem". Finally, Sect. "Conclusion and future work" concludes the paper.

Related work

In this section, we present a brief discussion of existing multi-objective task scheduling frameworks, algorithms, and models. A Grid-based Evolutionary Algorithm (GrEA) is proposed in Ref.23 to tackle multi-objective optimization problems by exploiting a grid-based mechanism that boosts selection pressure in the optimal direction while maintaining an extensive and uniform distribution of solutions. A framework based on the Ant Colony Algorithm is designed in Ref.24 to evaluate multi-objective functions (makespan, cost, deadline violation rate, and resource utilization) for scheduling tasks in cloud computing. A new bio-inspired diversity metric, Pure Diversity (PD), is proposed in Ref.25 to assess the diversity performance of multi-objective evolutionary algorithms (MOEAs) for solving many-objective optimization problems (MaOPs). A MATLAB-based platform, PlatEMO, is developed for performing comparative experiments, embedding new algorithms, creating new test problems, and developing performance indicators26; it includes more than 50 multi-objective evolutionary algorithms and more than 100 multi-objective test problems. A multi-objective particle swarm optimizer (NMPSO) with a Balanceable Fitness Estimation (BFE) method was designed in Ref.27 to tackle MaOPs. A multi-objective optimization method based on the non-dominated sorting genetic algorithm (NSGA-II) is applied and tested on an IEEE 17-bus test system28, simultaneously minimizing two contradicting objective functions: voltage deviation at buses and total line loss. A multi-objective charging framework that incorporates a vehicle-to-grid (V2G) strategy is proposed to optimally manage the real power dispatch of electric vehicles; its objective functions minimize load fluctuation and the charging costs of EVs in residential areas29. The Partitional Clustering Method (PCM) and Hierarchical Clustering Method (HCM) are used in clustering-based evolutionary algorithms for tackling MaOPs30. For determining congestion thresholds in low-voltage (LV) grids, the authors in Ref.31 used a multi-objective particle swarm optimisation (MOPSO) approach paired with data analytics via affinity propagation clustering. A virtual machine migration method that maximizes host release while minimizing the number of virtual machine migrations is proposed in Ref.32. Task Scheduling for Deadline and Cost Optimization (DCOTS) is presented in Ref.33; this work ensures the fulfilment of user requirements while simultaneously aiming to maximize profitability for cloud providers. The objective functions for building a multi-objective cloud task scheduling model in Ref.34 include execution time, execution cost, and virtual machine load balancing; the task scheduling problem is then addressed using the multi-factor optimization (MFO) technique, and the characteristics of task scheduling are integrated with the multi-objective multi-factor optimization (MO-MFO) algorithm to formulate an assisted optimization task. A task scheduling technique based on a Hybrid Competitive Swarm Optimization Algorithm (HCSOA-TS) is proposed for the cloud computing platform35; it efficiently schedules tasks to maximize resource utilization and overall performance. A multi-objective task scheduling model for cloud computing, aimed at optimizing cloud computing tasks, is constructed using the Cat Swarm Optimization (CSO) model36.
The task objectives for cloud computing were scrutinized, leading to a multi-objective task scheduling model with execution time and system load as the key scheduling objectives. The study in Ref.37 presents a parallel algorithm for task scheduling in which priority assignment to tasks and construction of the heap are executed concurrently. The authors in Ref.38 present an edge scheduling stage in which tasks are ordered by the latest start times of their successors instead of their sub-deadlines, with the goal of mitigating lateness in subsequent tasks.

In grid computing, the resource optimization problem is treated as a multi-objective optimization problem39, and PSO is used to search the problem space for possible solutions. To find non-dominated solutions to the multi-objective problem and to search for the best grid resources, the Functional Code Sieve algorithm is used. Similarly, various task scheduling algorithms40,41,42,43,44,45,46,47 based on multi-objective optimization have been studied.

Resource management and task scheduling are intricate operations in computational grids. To manage distributed resources and evaluate scheduling algorithms and their performance with different numbers of resources, a toolkit named GridSim has been proposed. GridSim aids in the mapping of user tasks to grid resources. Several task scheduling algorithms have been simulated using GridSim since its introduction48,49,50,51,52,53,54,55.

The Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) is a method used for multi-criteria decision analysis. It was initially introduced in Refs.56,57,58. A-TOPSIS, presented in Ref.59, aims to compare the performance of different algorithms based on means and standard deviations; this technique identifies the best and worst algorithms based on user-defined parameters. Another method, D-TOPSIS, is presented in Ref.60 and is more effective at representing uncertain information than other group decision support systems based on the classical TOPSIS method. Fuzzy TOPSIS61 is a multi-objective decision-making tool used to find a scheduling algorithm that can minimize response time and maximize throughput. In Ref.62, the authors combine the Heterogeneous Earliest Finish Time (HEFT) algorithm with the TOPSIS method to solve multi-objective problems. Thus, TOPSIS is a valuable decision-making technique because it provides a systematic and structured approach to evaluating and ranking alternatives based on multiple criteria, helping end users make well-justified choices in complex decision scenarios.

System model

Task model

The task scheduling framework consists of a task graph, a task scheduler, and a grid network. A task graph is the input to the task scheduler and is defined as a Weighted Directed Acyclic Graph (WDAG) \(WTG=(T, E)\), where T is the set of tasks and E is the set of edges describing the dependencies between tasks. The weight \(W(T_i)\) assigned to task \(T_i\) represents the size of the \(i^{\text {th}}\) task, expressed in Million Instructions (MI).

Grid model

The grid network consists of a set of grid nodes G = \(\{G_1, G_2, G_3 ,...,G_m\}\) interconnected by a high-speed network. Each grid node \(G_i\) = \(\{r_{i1}, r_{i2}, r_{i3},...,r_{ip}\}\) contains p heterogeneous processing elements, which are internally connected by a high-speed communication network. The processing speed (CPU speed) of each processor is expressed in Million Instructions Per Second (MIPS). Each computational grid contains a local scheduler, whose function is to manage the execution of tasks assigned to the grid's resources by the task scheduler. The local scheduler is also responsible for periodically collecting information about the computational resources and communicating it to the task scheduler.
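For concreteness, the task and grid models above can be expressed as a small data model. The sketch below is illustrative only (class and field names are ours, not part of GridSim); it assumes task sizes in MI and processing speeds in MIPS, as defined above.

```java
// Illustrative data model for the task and grid models described above.
// Class and field names are our own; task sizes are in MI, machine speeds in MIPS.
import java.util.ArrayList;
import java.util.List;

class Task {
    final int id;
    final long lengthMI;                              // W(T_i): task size in Million Instructions
    final List<Integer> parents = new ArrayList<>();  // incoming edges of the WDAG (dependencies)

    Task(int id, long lengthMI) { this.id = id; this.lengthMI = lengthMI; }
}

class GridMachine {
    final int gridId, machineId;
    final int mips;                                   // processing speed in MIPS

    GridMachine(int gridId, int machineId, int mips) {
        this.gridId = gridId; this.machineId = machineId; this.mips = mips;
    }

    // Execution time (seconds) of a task or task fragment of the given length on this machine.
    double execTime(long fragmentMI) { return (double) fragmentMI / mips; }
}

class GridNode {
    final int id;
    final List<GridMachine> machines = new ArrayList<>();  // r_i1 ... r_ip

    GridNode(int id) { this.id = id; }
}
```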

Figure 1
figure 1

Proposed multi-layer architecture.

Simulation model

GridSim66

We have employed a Java-based discrete-event toolkit called GridSim to simulate our multi-objective task scheduling framework. This versatile toolkit offers a comprehensive suite of features for modelling and simulating resources and network connectivity, accommodating various capabilities and configurations. Among its capabilities are primitives for composing applications, information services for resource discovery, and interfaces for task allocation to resources and managing their execution. These capabilities enable us to simulate resource brokers or grid schedulers, facilitating the evaluation of scheduling algorithms’ performance. It’s worth noting that GridSim does not prescribe any specific application model, but in our proposed framework, we have adopted a Directed Acyclic Graph (DAG) as the application model. Within the GridSim environment, individual tasks can exhibit differing processing times and input file sizes. To represent these tasks and their requirements, we utilize Gridlet objects. Each Gridlet encapsulates comprehensive information related to a job, including execution management details such as job length (measured in MIPS), disk I/O operations, input and output file sizes, and the job’s originator. In the context of GridSim, a Processing Element (PE) stands as the smallest computing unit, configurable with varying capacities denoted in Million Instructions per Second (MIPS). Multiple PEs can be combined to construct a machine, and in a similar fashion, machines can be aggregated to form a grid. Grids can allocate Gridlets in either a time-sharing mode (common in single-processor Grids) or a space-sharing mode (typical for multi-processor Grids).
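As a concrete illustration of this task representation, the snippet below creates a few Gridlet objects using the GridSim Gridlet constructor (gridlet ID, length in MI, input file size, output file size); the lengths, file sizes, and dependency edges shown are illustrative values, not taken from our experiments. Because GridSim itself does not model inter-task dependencies, the DAG edges are kept outside the Gridlets for the broker to enforce, as discussed later.

```java
// Hedged sketch: representing DAG tasks as GridSim Gridlets.
// Assumes the GridSim toolkit is on the classpath; the constructor used is
// Gridlet(int gridletID, double gridletLength /* in MI */, long inputFileSize, long outputFileSize).
import gridsim.Gridlet;
import java.util.ArrayList;
import java.util.List;

public class TaskGraphGridlets {
    public static void main(String[] args) {
        List<Gridlet> tasks = new ArrayList<>();
        for (int id = 1; id <= 4; id++) {
            Gridlet g = new Gridlet(id, 60, 300, 300); // 60 MI task, illustrative I/O sizes
            g.setUserID(0);                            // the originator of the job
            tasks.add(g);
        }
        // GridSim has no notion of task dependencies, so the WDAG edges are stored
        // separately (parent -> child) for the Resource Broker to enforce; the edges
        // below are purely illustrative.
        int[][] edges = { {1, 2}, {1, 3}, {2, 4}, {3, 4} };
        System.out.println(tasks.size() + " gridlets, " + edges.length + " dependency edges");
    }
}
```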

Existing GridSim architecture

Proposed multi-layer architecture and abstractions are shown in Fig. 1. The layered structure of this system begins with the foundational run-time machinery, known as the JVM (Java Virtual Machine). This JVM is versatile, catering to both single and multiprocessor systems, including clusters. Moving up to the second layer, we encounter a fundamental discrete-event infrastructure that relies on the interfaces offered by the first layer. This infrastructure is actualized through SimJava, a well-regarded Java library for discrete event simulation. The third layer delves into the simulation of essential grid entities, encompassing resources and information services, among others. Here, the GridSim toolkit employs the discrete event services provided by the underlying infrastructure to simulate these core resource entities. Ascending to the fourth layer, our attention turns to the simulation of resource aggregators, often referred to as grid resource brokers or schedulers. Finally, the fifth and topmost layer is dedicated to application and resource modelling across various scenarios. It harnesses the services furnished by the two lower-level layers to evaluate scheduling strategies, resource management policies, heuristics, and algorithms.

Life cycle of a GridSim simulation

Prior to commencing a simulation, we establish the resource entities (including PEs, Machines, and Grids) that will be available throughout the simulation. Upon GridSim’s initiation, these resource entities autonomously enroll themselves with the Grid Information Service (GIS) entity by dispatching relevant events.

Furthermore, at the onset of the simulation, a user initiates the process by submitting their job to a Resource Broker. The resource broker plays a pivotal role in the simulation, encompassing several responsibilities. It first employs information services to identify accessible resources for the user. Subsequently, it performs task-to-resource mapping (scheduling), orchestrates the staging of application components and data for processing (deployment), initiates job execution, and ultimately aggregates the results. Beyond these tasks, the resource broker also takes on the crucial role of monitoring and tracking the progress of application execution.

Our resource broker implementation

All the application models we have explored rely on task inter-dependencies, which are precisely defined using Directed Acyclic Graphs (DAGs). Regrettably, GridSim does not inherently accommodate the execution of tasks that are constrained by these inter-dependencies. In response to this limitation, our Resource Broker implementation extends support for such scenarios by ensuring that the order of task execution adheres to the specified dependency constraints. Our Resource Broker defines a versatile task Scheduler interface, offering seamless integration with various schedulers. This interface serves as a plug-and-play mechanism, enabling the utilization of multiple schedulers introduced in our work (GS, GCPS, GEPS, GNFS), all of which adhere to this common interface. Furthermore, our task scheduling framework introduces an innovative concept called task fragmentation, allowing tasks to be divided for execution across multiple computing resources. To facilitate this, our resource broker incorporates a Gridlet Fragmentation Service. When a gridlet is scheduled to run on more than one Processing Element, it is initially fragmented into multiple smaller virtual gridlets. These virtual gridlets are then individually executed by the allocated Processing Elements. Upon their completion, the Gridlet Fragmentation Service reunites them into the original single gridlet. Another novel concept introduced by our task scheduling framework involves partial dependencies among tasks. However, GridSim does not inherently enable the Resource Broker to monitor task progress during execution. To address this, we have implemented a pinger service within the Resource Broker and individual Processing Elements. This pinger service allows the Broker to stay informed about a gridlet’s execution progress, enabling it to schedule child tasks once a parent task has reached a predefined threshold percentage of execution, as dictated by the parent-child dependency.
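The fragmentation idea can be sketched as follows. The class below is illustrative only: the name VirtualGridlet and the equal-split policy are our assumptions for exposition, not the broker's actual implementation. It splits a gridlet into one fragment per allocated Processing Element and reports the parent as complete only when every fragment has finished.

```java
// Illustrative sketch of the Gridlet Fragmentation Service described above.
// Names and the equal-split policy are assumptions made for exposition.
import java.util.ArrayList;
import java.util.List;

class VirtualGridlet {
    final int parentId;       // id of the original gridlet
    final long lengthMI;      // fragment length in MI
    boolean finished = false; // set by the Processing Element when the fragment completes

    VirtualGridlet(int parentId, long lengthMI) {
        this.parentId = parentId;
        this.lengthMI = lengthMI;
    }
}

class GridletFragmentationService {

    // Split a gridlet of totalMI into one fragment per allocated Processing Element.
    List<VirtualGridlet> fragment(int gridletId, long totalMI, int numPEs) {
        List<VirtualGridlet> fragments = new ArrayList<>();
        long base = totalMI / numPEs, remainder = totalMI % numPEs;
        for (int i = 0; i < numPEs; i++) {
            fragments.add(new VirtualGridlet(gridletId, base + (i < remainder ? 1 : 0)));
        }
        return fragments;
    }

    // Reunite the fragments: the original gridlet is complete only when all fragments are.
    boolean isComplete(List<VirtualGridlet> fragments) {
        return fragments.stream().allMatch(f -> f.finished);
    }
}
```

A similar progress check, driven by the pinger service described above, can be used to release a child task once its parent crosses the required dependency threshold.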

Lastly, we have enhanced the Resource Broker with the capability to gather performance statistics, including Turnaround Time, Resource Utilization, Execution Price, and Communication Price. These statistics provide valuable insights into the system’s performance.

Formulation of multi-objective optimization for task scheduling

We formulate the task scheduling problem for precedence-constrained task graphs as a multi-objective optimization problem whose goal is to minimize TAT, execution price, and communication price while maximizing grid utilization, represented as argmin(TAT, EP, CP, \(-GU\)).

The objective function for TAT is defined and formulated as shown in Eq. (1).

$$\begin{aligned} TAT = \sum _{i=1}^{n} \sum _{j=1}^{m} \sum _{k=1}^{p[j]} X_{ij_{k}} \times \tau _{ij_{k}} \end{aligned}$$
(1)

where \({X_{ij_{k}}}={\left\{ \begin{array}{ll} 1, &\quad \text {if task } T_{i} \text { is scheduled on the } j\text {th grid on its } k\text {th resource} \\ 0, &\quad \text {otherwise}. \end{array}\right. }\)

\(\tau _{{ij}_{k}}=\) Execution time of Task \(T_i\) on k’th resource of grid j

$$\begin{aligned} GU= \frac{\sum _{i=1}^{n} W_{T_i}}{\left( \sum _{j=1}^{m} \sum _{k=1}^{p[j]} W_{jk} \right) \times TAT} \end{aligned}$$
(2)

Grid Utilization is formulated in Eq. (2).

$$\begin{aligned} EP = \sum _{i=1}^{n} \sum _{j=1}^{m} \sum _{k=1}^{p[j]} \left( X_{ij_{k}} \times \tau _{ij_{k}} \times Price_{E_{kj}} \right) \end{aligned}$$
(3)

Task execution price and communication price are defined and formulated in Eqs. (3) and (4), respectively. The rest of the paper uses price and cost interchangeably.

$$\begin{aligned} CP = \sum _{i=1}^{n} \left( \binom{M_{i}}{2} \times MAX_{j=1}^{m} \left( \tau _{ij} \right) \times Price_C \right) \end{aligned}$$
(4)

Where

\({\tau _{ij}}={\sum }_{k=1} ^{p[j]} {X_{ijk}}*{\tau _{ijk}}\)

and

\({M_{i}}={\sum }_{j=1} ^{m} X_{ij}\)

where \({X_{ij}}={\left\{ \begin{array}{ll} 1, &\quad \text {if task } T_{i} \text { is scheduled on any machine of grid } G_j \\ 0, &\quad \text {otherwise}. \end{array}\right. }\)
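To make Eqs. (1) to (4) concrete, the sketch below evaluates them for a given allocation matrix X. It is a direct transcription of the formulas; the variable names, the per-machine execution price array, and the flat communication price are ours.

```java
// Hedged transcription of Eqs. (1)-(4); variable names are ours.
// x[i][j][k]   : 1 if task i runs on machine k of grid j, else 0
// tau[i][j][k] : execution time (s) of task i on machine k of grid j
// wT[i]        : task length (MI); wG[j][k] : machine speed (MIPS)
// priceE[j][k] : execution price per second of machine k in grid j; priceC : communication price
class ScheduleMetrics {

    static double[] evaluate(int[][][] x, double[][][] tau, long[] wT, int[][] wG,
                             double[][] priceE, double priceC) {
        double tat = 0, ep = 0, cp = 0;
        for (int i = 0; i < x.length; i++) {
            int gridsUsed = 0;       // M_i : number of grids task i is spread over
            double maxGridTime = 0;  // MAX_j(tau_ij)
            for (int j = 0; j < x[i].length; j++) {
                double tauIJ = 0;    // tau_ij = sum_k x_ijk * tau_ijk
                for (int k = 0; k < x[i][j].length; k++) {
                    tat += x[i][j][k] * tau[i][j][k];                // Eq. (1)
                    ep  += x[i][j][k] * tau[i][j][k] * priceE[j][k]; // Eq. (3)
                    tauIJ += x[i][j][k] * tau[i][j][k];
                }
                if (tauIJ > 0) gridsUsed++;
                maxGridTime = Math.max(maxGridTime, tauIJ);
            }
            long pairs = (long) gridsUsed * (gridsUsed - 1) / 2;     // binomial(M_i, 2)
            cp += pairs * maxGridTime * priceC;                      // Eq. (4)
        }
        double totalWork = 0, totalCapacity = 0;
        for (long w : wT) totalWork += w;
        for (int[] grid : wG) for (int w : grid) totalCapacity += w;
        double gu = tat > 0 ? totalWork / (totalCapacity * tat) : 0; // Eq. (2)
        return new double[] { tat, gu, ep, cp };
    }
}
```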

Proposed task scheduling algorithm

The proposed multi-objective task scheduling algorithm is described in Algorithm 2. The algorithm generates an optimized schedule sequence (task-id, [grid-ID, machine-ID], execution start-time and end-time) according to multiple objectives (TAT, EC, CC, and RU).

The input to the algorithm is the number of tasks (n), the task dependency graph (weighted adjacency matrix WTG[1, ..., n][1, ..., n]), the task lengths (\(W_T[1,..., n]\)), the number of grids (m), the number of machines p[1, ..., m] in each grid, the processing capacity of each grid in MIPS (\(W_G[1,..., m]\)), and the user's objective optimization criterion (see 2 for choices). The algorithm's output is the optimized task schedule sequence (steps 1 and 2). Step 3 generates all possible combinatorial subsets of Grid-Machines onto which a task can be allocated, depending on the user's objective optimization criterion, as follows: if the criterion is GS, this step generates all possible subsets of Grid-Machines; if it is GCPS, it generates combinatorial sets of Grid-Machines in which all machines of a set belong to the same grid; if it is GEPS, it generates combinatorial sets of Grid-Machines that offer the lowest task execution price (other Grid-Machines are ignored); and if it is GNFS, it generates singleton sets containing the individual Grid-Machines.

The algorithm then executes in a loop (from Step 7) until all the tasks have been scheduled. On every iteration of the loop, the algorithm first identifies (in Step 8) tasks whose parent task dependency constraints have been met and are thus available for scheduling. Step 4 then uses function (5) to select the best task and Grid-Machine combination for scheduling. Steps 11 to 13 append this Task-Grid-Machine allocation to the schedule sequence and update the information about available Grid-Machines and unscheduled tasks. Finally, Steps 14 and 15 enter a blocking wait until one or more Grid-Machines become available, after which the algorithm begins another iteration of the Step 7 loop.
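Since Algorithm 3 (Generate processing element combinations) appears only as a figure, the following hedged sketch illustrates how the Grid-Machine subsets of Step 3 could be generated for each objective type. The GEPS price filter is simplified here to "keep only the machines with the lowest execution price"; the actual implementation may use a cost-to-performance ratio as described in the introduction.

```java
// Hedged sketch of the subset-generation step (function f_g / Algorithm 3).
// GS   : all non-empty subsets of Grid-Machines
// GCPS : only subsets whose machines all belong to the same grid
// GEPS : subsets drawn from the cheapest machines only (simplified price filter)
// GNFS : singleton subsets only
// Machines are identified by {gridId, machineId} pairs; suitable for small machine pools.
import java.util.ArrayList;
import java.util.List;

class GridMachineSubsets {

    static List<List<int[]>> fG(String objectiveType, int[] p, double[][] priceE) {
        List<int[]> pool = new ArrayList<>();
        double minPrice = Double.MAX_VALUE;
        for (int j = 0; j < p.length; j++)
            for (int k = 0; k < p[j]; k++) minPrice = Math.min(minPrice, priceE[j][k]);
        for (int j = 0; j < p.length; j++)
            for (int k = 0; k < p[j]; k++)
                if (!objectiveType.equals("GEPS") || priceE[j][k] == minPrice)
                    pool.add(new int[]{ j, k });

        // Enumerate non-empty subsets of the pool via a bitmask, then filter per objectiveType.
        List<List<int[]>> subsets = new ArrayList<>();
        for (int mask = 1; mask < (1 << pool.size()); mask++) {
            List<int[]> subset = new ArrayList<>();
            for (int b = 0; b < pool.size(); b++)
                if ((mask & (1 << b)) != 0) subset.add(pool.get(b));
            if (objectiveType.equals("GNFS") && subset.size() != 1) continue;
            if (objectiveType.equals("GCPS")
                    && subset.stream().mapToInt(m -> m[0]).distinct().count() > 1) continue;
            subsets.add(subset);
        }
        return subsets;
    }
}
```

For the demonstration scenario of Fig. 2b (one machine in \(G_1\) and two in \(G_2\)), this enumeration yields 7 subsets for GS, 4 for GCPS, and 3 for GNFS, matching Tables 3, 6 and 9.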

Algorithm 1
figure a

Multi-objective task scheduler.

Algorithm 2
figure b

Generic single-objective task scheduler.

Algorithm 3
figure c

Generate processing element combinations.

Function to determine the preference to schedule a task on a set of GridMachines

$$\begin{aligned} f_{s}(T_i, G_jM_k) = \frac{W_{T_i}}{MAX_{i=1}^{n}(W_{T_i})} \times \frac{d^+(T_i)}{MAX_{i=1}^{n}(d^+(T))} \times \frac{W_{G_j}}{MAX_{j=1}^{m}(W_{G_j})} \end{aligned}$$
(5)
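Equation (5) can be implemented directly. In the sketch below, we assume \(d^+(T_i)\) denotes the out-degree of task \(T_i\) in the task graph and \(W_{G_j}\) the processing capacity of the candidate grid, each normalized by its maximum; the names are ours.

```java
// Hedged implementation of the preference function f_s of Eq. (5).
// wT[i]     : length of task i in MI
// outDeg[i] : out-degree d+(T_i) of task i in the task graph
// wG[j]     : processing capacity W_Gj of grid j in MIPS
class PreferenceFunction {

    static double fS(int i, int j, long[] wT, int[] outDeg, double[] wG) {
        double maxW = 1, maxD = 1, maxG = 1;   // start at 1 to guard against division by zero
        for (long w : wT)  maxW = Math.max(maxW, w);
        for (int d : outDeg) maxD = Math.max(maxD, d);
        for (double g : wG)  maxG = Math.max(maxG, g);
        return (wT[i] / maxW) * (outDeg[i] / maxD) * (wG[j] / maxG);
    }
}
```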

Demonstration of the proposed task scheduling algorithm

To enhance comprehension of the proposed Algorithm 2, we illustrate its functionality through an example with concise input parameters. This demonstration covers four distinct user objective types (GS, GEPS, GCPS, GNFS).

Consider an application whose workload is characterized by a task graph comprising four tasks, each containing 60 million instructions (MI). This task graph is represented as a Directed Acyclic Graph (DAG), as shown in Fig. 2a. Similarly, a grid network, depicted in Fig. 2b, comprises two grids: \(G_1\) housing Grid-Machine \(G_1M_1\) and \(G_2\) hosting Grid-Machines \(G_2M_1\) and \(G_2M_2\). Each Grid-Machine possesses a processing capacity of 2 million instructions per second (MIPS). These specifications, summarized in Table 1, serve as the inputs for Algorithm 2. In the following subsections, we illustrate the iterations executed by the proposed scheduling algorithm and the corresponding helper functions for each distinct objectiveType.
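As a quick sanity check on these inputs: a non-fragmented task of 60 MI on any single 2 MIPS Grid-Machine requires \(60/2 = 30\) s, whereas the same task fragmented evenly across the two machines of \(G_2\) completes in \(60/(2 \times 2) = 15\) s of wall-clock time, at the price of inter-fragment communication.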

Figure 2
figure 2

A typical scenario for a proposed scheduling algorithm demonstration.

Table 1 Input parameters to the scheduling algorithm 2.
Table 2 Function \(f_g()\) to generate possible subsets of Grid machines to allocate tasks to.

Objective type: greedy scheduler

Function \(f_g()\) (described in Table 2) generates 7 possible combinations of Grid-Machine subsets to allocate tasks for the Greedy Scheduler objectiveType, as illustrated in Table 3.

Function \(f_s()\) (described in Eq. (5)) computes a preference matrix for scheduling each task on each of the generated Grid-Machine subsets, as shown in Table 4. Then, Algorithm 2 computes the schedule sequence of task allocations onto Grid-Machines, as shown in Table 5.

Table 3 Grid-Machine subsets generated by \(f_g()\) for objectiveType=G.
Table 4 Function \(f_s(task, gmSubset)\) output for objectiveType=G.
Table 5 Schedule sequence of tasks allocations to grid-machines by Greedy scheduler for objectiveType=G.

Objective type: greedy communication price scheduler

Function \(f_g()\) (described in Table 2) generates 4 possible combinations of Grid-Machine subsets to allocate tasks for the Greedy Communication Price Scheduler objectiveType, as illustrated in Table 6. Function \(f_s()\) (described in Eq. (5)) computes a preference matrix for scheduling each task on each of the generated Grid-Machine subsets, as shown in Table 7. Then, Algorithm 2 computes the schedule sequence of task allocations onto Grid-Machines, as depicted in Table 8.

Table 6 Grid-Machine subsets generated by \(f_g()\) for objectiveType=\(G_{CP}\).
Table 7 Function \(f_s(task, gmSubset)\) output for objectiveType=\(G_{CP}\).
Table 8 Schedule sequence of tasks allocated to Grid-machines for objectiveType=\(G_{CP}\).

Objective type: greedy no fragmentation scheduler

Function \(f_g()\) (described in Table 2) generates 3 possible combinations of Grid-Machine subsets to allocate tasks for the Greedy No Fragmentation Scheduler objectiveType, as illustrated in Table 9. Function \(f_s()\) (described in Eq. (5)) computes a preference matrix for scheduling each task on each of the generated Grid-Machine subsets, as shown in Table 10. Then, Algorithm 2 computes the schedule sequence of task allocations onto Grid-Machines, as shown in Table 11.

Table 9 Grid-machine subsets generated by \(f_g()\) for objectiveType=\(Greedy_{NF}\).
Table 10 Function \(f_s(task, gmSubset)\) output for objectiveType=\(G_{NF}\).
Table 11 Schedule sequence of tasks allocated to grid-machines for objectiveType=\(G_{NF}\).

Objective type: greedy execution price scheduler

Function \(f_g()\) (described in Table 2) generates 4 possible combinations of Grid-Machine subsets to allocate tasks for the Greedy Execution Price Scheduler objectiveType, as illustrated in Table 12. Function \(f_s()\) (described in Eq. (5)) computes a preference matrix to schedule a task on each of the generated Grid-Machine subsets, as shown in Table 13. Then, Algorithm 2 computes the schedule sequence of task allocations onto Grid-Machines, as shown in Table 14.

Table 12 Grid-machine subsets generated by \(f_g()\) for objectiveType=\(Greedy_{EP}\).
Table 13 Function \(f_s(task, gmSubset)\) output for objectiveType=\(G_{EP}\).
Table 14 Schedule sequence of tasks allocated to Grid-Machines for objectiveType=\(G_{EP}\).

Results and discussion

Simulation setup

The proposed multi-objective task scheduling framework is simulated using GridSim. Simulations are carried out on three types of task graphs: standard task graphs, random task graphs, and scientific task graphs, on an Ubuntu operating system with an AMD Ryzen 5 processor.

The framework includes five distinct task schedulers, each designed to optimize different target objectives:

1. Greedy scheduler: Prioritizes minimizing turnaround time, communication cost, and execution cost while maximizing grid utilization. 2. Greedy Communication Cost scheduler: Focused on minimizing communication cost by distributing tasks across computing resources within a single Grid. 3. Greedy Execution Cost scheduler: Aims to minimize execution cost by scheduling each task on the most suitable subset of computing resources based on their cost-to-performance ratio. 4. Greedy No Fragmentation scheduler: Aims to schedule tasks on individual computing resources, resulting in zero task fragmentation. 5. Random scheduler: Schedules tasks on a random subset of computing resources.

Table 15 explicates the notations used in the mathematical models and algorithms. Table 16 delineates the symbols representing various scheduling algorithms, while Table 17 furnishes a catalogue of scientific application graphs used in the current study.

Table 15 Key notation definitions.
Table 16 Schedulers and symbols.
Table 17 Scientific application graphs.

The proposed task scheduling algorithm is evaluated using standard, random and scientific task graphs.

Standard task graphs

Our earlier research, as presented in63, demonstrated theorems for standard unit size task graphs on a homogeneous grid network for turnaround time. Similarly, in64, we stated theorems for grid utilization. In this article, we have formulated mathematical models for homogeneous standard-weighted task graphs on a homogeneous grid network for both fragmented and non-fragmented versions of the task graphs. These formulations are defined in Tables 18 and 19 respectively.

Table 18 TAT for weighted fragmented standard task graphs.
Table 19 TAT for weighted non-fragmented standard task graphs.

TAT obtained from the proposed algorithm is tabulated in Table 20. The results include both theoretical and simulated values for various standard task graphs (with and without fragmentation) for a given number of tasks, grids, and processing elements. Here each task contains a uniform number of instructions (\(W_{Ti}=20000\) MI) and each grid contains homogeneous processing elements (\(W_{GR}=500\) MIPS). The computed TAT is on par with our mathematical formulations. From the results, it is evident that TAT increases as the number of tasks increases. Similarly, the computed values of turnaround time, execution cost, communication cost, and resource utilization obtained by the proposed schedulers for pipeline, star, ternary, independent, and fully connected task graphs with a varying number of task nodes and a given number of grid resources are tabulated in Tables 21, 22 and 23, respectively. From these results, it is found that the greedy scheduler optimizes for the fastest turnaround time along with grid utilization, but the trade-off is a high communication cost. However, the greedy communication cost scheduler, with a slightly slower TAT, incurs the lowest communication cost. In the absence of task fragmentation, the greedy scheduler achieves optimal grid utilization while incurring zero communication cost. Additionally, the execution cost remains consistent across the different task schedulers when standard graphs are processed on a homogeneous grid network. As the graphs become increasingly independent (such as star graphs or independent graphs), most schedulers yield similar turnaround times owing to the reduced dependency constraints. The random scheduler inherently achieves TAT, resource utilization, and communication cost in between the extremes achieved by the other schedulers. Another interesting observation is that the greedy scheduler achieves maximum resource utilization and minimum turnaround time, albeit by incurring the highest communication and execution costs.

Table 20 Simulated/computed TAT for standard task graphs.

Random task graphs

Random task graphs with diverse levels of connectivity (0%, 25%, 50%, 75%, and 100%) are generated using algorithm-264. The outcomes of our proposed algorithm, encompassing TAT, resource utilization, execution cost, and communication cost, are depicted in Figs. 3 through 4. From these results, we can conclude that the turnaround time increases with both the number of tasks and the degree of task dependency, as shown in Figs. 5, 6, 7 and 8. Also, when tasks are scheduled without fragmentation, TAT increases compared to when tasks are fragmented.

Figure 3
figure 3

Random task graph with 0% connectivity - TAT and resource utilization.

Figure 4
figure 4

Random task graph with 100% connectivity - TAT and execution cost.

Figure 5
figure 5

Random task graph with 25% connectivity - TAT and resource utilization.

Figure 6
figure 6

Random task graph with 50% connectivity - TAT and resource utilization.

Figure 7
figure 7

Random task graph with 75% connectivity - TAT and resource utilization.

Figure 8
figure 8

Random task graph with 100% connectivity - TAT and resource utilization.

Resource utilization decreases when scheduling a random task graph with a higher degree of dependency without fragmentation (as seen in Figs. 5, 6, 7, and 8), in contrast to when tasks are fragmented. Additionally, it is noteworthy that all schedulers optimize turnaround time and resource utilization when there is no inter-dependency among the tasks, as illustrated in Fig. 3.

As the inter-dependency between tasks within a task graph increases (with connectivities of 0% as shown in Fig. 9, 25% in Fig. 10, 50% in Fig. 11, 75% in Fig. 12, and 100% in Fig. 13), it becomes evident that the greedy scheduler achieves the lowest turnaround time (TAT). However, this comes at the cost of higher communication expenses due to the fragmentation of tasks. Conversely, the greedy scheduler without task fragmentation incurs zero communication cost in all cases, effectively eliminating this expense from the scheduling process.

Figure 9
figure 9

Random task graph with 0% connectivity - TAT and communication cost.

Figure 10
figure 10

Random task graph with 25% connectivity - TAT and communication cost.

Figure 11
figure 11

Random task graph with 50% connectivity - TAT and communication cost.

Figure 12
figure 12

Random task graph with 75% connectivity - TAT and communication cost.

Figure 13
figure 13

Random task graph with 100% connectivity - TAT and communication cost.

Figures 4, 14, 15, 16 and 17 show that the greedy execution cost scheduler incurs the lowest execution cost. The computed values of turnaround time, execution cost, communication cost, and resource utilization obtained by the proposed schedulers for random task graphs with 25%, 50%, 75% and 100% dependency, with a varying number of task nodes and a given number of grid resources, are tabulated in Tables 24, 25 and 26, respectively.

Figure 14
figure 14

Random task graph with 0% connectivity - TAT and execution cost.

Figure 15
figure 15

Random task graph with 25% connectivity - TAT and execution cost.

Figure 16
figure 16

Random task graph with 50% connectivity - TAT and execution cost.

Figure 17
figure 17

Random task graph with 75% connectivity - TAT and execution cost.

Table 21 Performance of standard task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.
Table 22 Performance of standard task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.
Table 23 Performance of fully connected task graphs on AWS EC2 Type-1, Type-2, 2 Grids and 11 CPUs.
Table 24 Performance of scheduling algorithms on random task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.
Table 25 Performance of scheduling algorithms on random task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.
Table 26 Performance of scheduling algorithms on random task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.

Scientific task graphs

The performance of the proposed algorithm is also evaluated using scientific task graphs such as Montage, CyberShake, and LIGO. All workflows are generated by the Pegasus Workflow Generator65. The results shown in Figs. 18, 19, 20, 21 and 22 demonstrate the performance of the proposed algorithm.

Figure 18
figure 18

Scientific task graphs with resource utilization and communication cost.

Figure 19
figure 19

Scientific task graphs with resource utilization and execution cost.

Figure 20
figure 20

Scientific task graphs with resource utilization and TAT.

Figure 21
figure 21

Scientific task graphs with resource utilization and communication cost.

Figure 22
figure 22

Scientific task graphs with TAT and execution cost.

Our observation reveals that across all application task graphs, the greedy scheduler consistently generates schedules with the most optimal TAT and resource utilization. However, it’s important to note that this optimization is achieved at the expense of incurring the highest communication and execution costs compared to the schedules generated by the other schedulers.

A consistent trend that emerges across all schedulers is the inverse relationship between resource utilization and the extent of parallel execution of tasks, which is dictated by the inter-dependency constraints among tasks. For example, in the case of the Gaussian Elimination and Montage scientific application graphs, where tasks exhibit a high degree of inter-dependency, the scheduling sequences result in the lowest resource utilization. This highlights the influence of task inter-dependency on resource allocation and utilization in the scheduling process.

Similarly, the computed values of turnaround time, execution cost, communication cost, and resource utilization obtained by the proposed schedulers for different scientific task graphs are tabulated in Table 27.

Table 27 Performance of scheduling algorithms on scientific task graphs on AWS EC2 Type-1, Type- 2, 2 Grids and 11 CPUs.
Table 28 TOPSIS ranking of scheduling algorithms on standard task graphs.
Table 29 TOPSIS ranking of scheduling algorithms on random task graphs.
Table 30 TOPSIS ranking of scheduling algorithms on scientific task graphs.

Formulation of the multi-objective-decision-making problem

The generic multi-attribute-decision-making (MADM) problem

Scheduling tasks in a grid network can be conceptualized as a MADM problem. In the context of MADM, the goal is to assess and prioritize various alternative solutions denoted as \(A_i(i = 1, 2, 3, \dots , I)\), taking into account specific criteria. These criteria, represented as \(C_j(j = 1, 2, 3, \dots , J)\), encapsulate the factors that play a role in influencing the ranking of the alternative solutions within the set \(A_i\).

Each alternative solution, denoted as \(A_i\), undergoes an evaluation against each individual criterion, represented by \(C_j\). This evaluation process produces a performance rating matrix \(X = (x_{ij})_{(I \times J)}\).

$$\begin{aligned} X = \begin{array}{c} A_1 \\ A_2 \\ \vdots \\ A_I \end{array} \mathop {\left( {\begin{array}{cccc} x_{11} & x_{12} & \cdots & x_{1J}\\ x_{21} & x_{22} & \cdots & x_{2J}\\ \vdots & \vdots & \ddots & \vdots \\ x_{I1} & x_{I2} & \cdots & x_{IJ} \end{array} } \right) } \limits ^{{\begin{array}{cccc} C_1 & C_2 & \cdots & C_J \end{array} }} \end{aligned}$$

The user is tasked with specifying a set of weights, denoted as \(W = w_j(j = 1, 2, \dots , J)\), which serve as indicators of the user’s individual preferences for each criterion, \(C_j\).

Modeling task scheduling as an MADM problem

We model task scheduling problem as an MADM problem by:

  1. 1.

    Considering the schedule sequence output by each scheduler as the set of alternative solutions i.e. \(A = \{ a | a \subset \{GS, GCCS, GECS, GNFS\} \}\).

  2. 2.

    Considering the performance metrics of a schedule sequence as the set of criteria i.e. \(C = \{c | c \subset \{TAT, RU, CC, EC\}\}\).

  3. 3.

    Computing the performance rating of each scheduler (GS, GCCS, GECS, GNFS) against every criterion (TAT, RU, CC, EC).

    i.e.

    $$\begin{aligned} X = \begin{array}{c} GS \\ GCCS \\ GECS \\ GNFS \end{array} \mathop {\left( {\begin{array}{cccc} x_{11} & x_{12} & x_{13} & x_{14}\\ x_{21} & x_{22} & x_{23} & x_{24}\\ x_{31} & x_{32} & x_{33} & x_{34}\\ x_{41} & x_{42} & x_{43} & x_{44} \end{array} } \right) } \limits ^{{\begin{array}{cccc} TAT & RU & CC & EC \end{array} }} \end{aligned}$$
  4. 4.

    Collecting a user’s preferences for each criterion involves ranking these criteria in descending order of importance. Weights are then allocated using a Geometric Progression, with greater weights being assigned to criteria ranked higher in importance by the user.
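For example (the common ratio below is illustrative, since any ratio greater than 1 preserves the ordering of emphasis), a user who ranks the four criteria as TAT, RU, CC, EC in descending importance and uses a ratio of 2 would obtain normalized weights

$$\begin{aligned} W = \left( \tfrac{8}{15},\ \tfrac{4}{15},\ \tfrac{2}{15},\ \tfrac{1}{15} \right) , \end{aligned}$$

so the top-ranked criterion carries roughly half of the total weight.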

Solving the MADM problem

The task scheduling MADM problem is addressed using a well-regarded technique within the MADM field known as TOPSIS. TOPSIS operates on the principle that the optimal solution is the one closest to the positive-ideal solution while simultaneously being the farthest from the negative-ideal solution. Alternatives are ranked by computing an overall index based on their proximity to these ideal solutions.

The TOPSIS method comprises a series of steps, as follows:

  1. 1.

    Normalize the performance rating matrix.

    i.e. \(y_{ij} = \frac{x_{ij}}{\sqrt{ \sum _{i=1}^{I} x_{ij}^2 }}\)

    \(Y = \begin{bmatrix} y_{11} &{} y_{12} &{} \dots &{} y_{1J} \\ y_{21} &{} y_{22} &{} \dots &{} y_{2J} \\ \dots &{} \dots &{} \dots &{} \dots \\ y_{I1} &{} y_{I2} &{} \dots &{} y_{IJ} \end{bmatrix}\)

  2. 2.

    Determine the weighted, normalized performance rating matrix.

    i.e.

    \(V = \begin{bmatrix} v_{11} & v_{12} & \dots & v_{1J} \\ v_{21} & v_{22} & \dots & v_{2J} \\ \dots & \dots & \dots & \dots \\ v_{I1} & v_{I2} & \dots & v_{IJ} \end{bmatrix}\)

    Where \(v_{ij} = W_j * y_{ij}; (i = 1, 2, \dots , I; j = 1, 2, \dots , J)\)

  3. 3.

    Compute the positive and negative ideal solutions, \(A^+\) and \(A^-\), respectively.

    \(A^+ = [v_1^+, v_2^+, \dots , v_J^+]\), \(A^- = [v_1^-, v_2^-, \dots , v_J^-]\)

    where,

    \(v_j^+ = {\left\{ \begin{array}{ll} \max _{i=1}^{I}(v_{ij}), &\quad \text {if } j \text { is a benefit attribute} \\ \min _{i=1}^{I}(v_{ij}), &\quad \text {if } j \text { is a cost attribute} \end{array}\right. }\)

    \(v_j^- = {\left\{ \begin{array}{ll} \min _{i=1}^{I}(v_{ij}), &\quad \text {if } j \text { is a benefit attribute} \\ \max _{i=1}^{I}(v_{ij}), &\quad \text {if } j \text { is a cost attribute} \end{array}\right. }\)

  4. 4.

    Calculate the Euclidean distance from the positive and negative ideal solutions.

    \(S_i^+ = \sqrt{ \sum _{j=1}^{J} (v_{ij} - v_{j}^+)^2 }\)

    \(S_i^- = \sqrt{ \sum _{j=1}^{J} (v_{ij} - v_{j}^-)^2 }\)

  5. 5.

    Calculate the closeness of each alternative solution to the ideal solution. \(V_i = \frac{S_i^-}{S_i^- + S_i^+}\)

  6. 6.

    Determine the rank order of all alternatives on the basis of their relative closeness to the ideal solutions. The larger \(V_i\) is, the better the alternative solution \(A_i\); the best alternative is the one with the largest closeness to the ideal solution.
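The six steps above translate directly into code. The sketch below is an illustrative, self-contained implementation (not the paper's exact code); it treats TAT, CC, and EC as cost attributes and RU as a benefit attribute, and returns the closeness values from which the final ranking is read in descending order.

```java
// Hedged, self-contained sketch of the TOPSIS steps described above.
// x[i][j]    : performance of alternative i (e.g., GS, GCCS, GECS, GNFS) on criterion j (TAT, RU, CC, EC)
// w[j]       : user weights (e.g., from the geometric progression discussed earlier)
// benefit[j] : true for benefit attributes (RU), false for cost attributes (TAT, CC, EC)
class Topsis {

    static double[] closeness(double[][] x, double[] w, boolean[] benefit) {
        int I = x.length, J = x[0].length;
        double[][] v = new double[I][J];
        for (int j = 0; j < J; j++) {                 // Steps 1-2: normalize, then weight
            double norm = 0;
            for (int i = 0; i < I; i++) norm += x[i][j] * x[i][j];
            norm = Math.sqrt(norm);
            for (int i = 0; i < I; i++) v[i][j] = w[j] * (norm == 0 ? 0 : x[i][j] / norm);
        }
        double[] aPlus = new double[J], aMinus = new double[J];
        for (int j = 0; j < J; j++) {                 // Step 3: positive and negative ideal solutions
            double max = Double.NEGATIVE_INFINITY, min = Double.POSITIVE_INFINITY;
            for (int i = 0; i < I; i++) {
                max = Math.max(max, v[i][j]);
                min = Math.min(min, v[i][j]);
            }
            aPlus[j]  = benefit[j] ? max : min;
            aMinus[j] = benefit[j] ? min : max;
        }
        double[] c = new double[I];
        for (int i = 0; i < I; i++) {                 // Steps 4-5: distances and relative closeness
            double sPlus = 0, sMinus = 0;
            for (int j = 0; j < J; j++) {
                sPlus  += (v[i][j] - aPlus[j])  * (v[i][j] - aPlus[j]);
                sMinus += (v[i][j] - aMinus[j]) * (v[i][j] - aMinus[j]);
            }
            sPlus = Math.sqrt(sPlus);
            sMinus = Math.sqrt(sMinus);
            c[i] = (sPlus + sMinus) == 0 ? 0 : sMinus / (sPlus + sMinus);
        }
        return c;                                     // Step 6: rank alternatives by descending c[i]
    }
}
```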

TOPSIS results and discussion

To rank the task schedule sequences produced by the various schedulers, the TOPSIS method is employed. This method selects schedules according to the user's prioritized objectives: turnaround time, resource utilization, communication price, and execution price. Tables 28, 29, and 30 present the results of the TOPSIS algorithm when applied to standard, random, and scientific task graphs, respectively. We explore different possible priority orders that users may assign to each criterion. Notably, we find a consistent ranking pattern for schedule sequences across all types of graphs, including standard, random, and scientific graphs, which encompass fully connected, pipeline, star, ternary, and independent graph categories. Additionally, this ranking consistency persists even when the number of tasks varies (40, 121, 364, and 1039).

Weightage types 1, 2, 3, 4, as well as 7, 8, 9, and 10, exemplify situations where the user places the highest importance on turnaround time and resource utilization as criteria, while assigning less significance to communication cost and execution cost. In these scenarios, TOPSIS consistently ranks the greedy scheduler as the top solution. The second-best alternative solution is the greedy communication cost scheduler, which outperforms the other schedulers in terms of TAT and resource utilization.

However, in cases corresponding to weightage types 5, 6, 10, and 11, where the user’s preference primarily focuses on achieving an optimal communication cost, TOPSIS identifies the schedule generated by the greedy communication scheduler as the best solution. This scheduler minimizes communication costs to zero while maintaining TAT and resource utilization levels that are nearly on par with those achieved by the greedy scheduler. In this context, the output schedule sequence of the greedy scheduler is ranked last by TOPSIS, as it incurs the highest communication cost, contradicting the user’s prioritization of criteria desirability.

Conclusion and future work

In this paper, we presented a multi-objective task scheduling framework for scheduling different types of workflows on computational grids. The main objective of the proposed framework is to minimize the overall execution cost, including application turnaround time and communication cost, while maximizing grid utilization. The proposed scheduling framework is integrated with GridSim and validated through experiments conducted on weighted standard task graphs, weighted random task graphs, and scientific task graphs. Furthermore, we employed a multi-criteria decision method, the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS), to rank the output scheduling sequences based on different objective functions and the requirements of both users and service providers.

As part of future work, we plan to design a multi-objective task scheduling framework based on Large Language Models (LLMs) and compare the performance with NSGA-II in a computational cloud computing environment.