key: cord-0443664-r98rx6lq
authors: Materwala, Huned; Ismail, Leila
title: Performance and Energy-Aware Bi-objective Tasks Scheduling for Cloud Data Centers
date: 2021-04-25
journal: nan
DOI: nan
sha: 9770eb428562fc272fcb3efd8fc3cef7e0be8d14
doc_id: 443664
cord_uid: r98rx6lq

Cloud computing enables remote execution of users' tasks. The pervasive adoption of cloud computing in smart city services and applications requires timely execution of tasks adhering to Quality of Service (QoS). However, the increasing use of computing servers exacerbates the issues of high energy consumption, operating costs, and environmental pollution. Maximizing performance while minimizing energy consumption in a cloud data center is challenging. In this paper, we propose a performance and energy optimization bi-objective algorithm to trade off the conflicting performance and energy objectives. An evolutionary algorithm-based multi-objective optimization using system performance counters is proposed for the first time. The performance of the proposed model is evaluated using a realistic cloud dataset in a cloud computing environment. Our experimental results show higher performance and lower energy consumption than a state-of-the-art algorithm.

Cloud computing [1] has become a very promising paradigm for both consumers and service providers, allowing convenient, on-demand network access to a shared pool of configurable computing resources. With the advancement of technological paradigms such as the Internet of Things (IoT) and Big Data analytics for smart cities' applications, data center traffic is exploding with the rapid growth of cloud applications. It is predicted that global cloud data center traffic will increase from 6 zettabytes (ZB) in 2016 to 19.5 ZB by the year 2021 [2]. Furthermore, with the current COVID-19 pandemic situation, essential services such as healthcare, work, food, and education have moved online.
These services rely heavily on the cloud computing paradigm. Consequently, cloud computing infrastructure must maintain large-scale data centers, consisting of thousands of computing nodes that consume a large amount of electrical power. It is estimated that data centers will become the world's largest energy consumers, rising from 3% of total energy consumption in 2017 to 4.5% in 2025 [3]. The data center energy cost increases by 100% every 5 years [4]. High energy consumption not only incurs a high cost but also harms the environment. It is predicted that by 2025 data centers will account for nearly 3.5% of global carbon emissions [5]. According to a report by the Natural Resources Defense Council (NRDC), data centers are expected to emit nearly 100 million metric tons of carbon pollution per year [6]. Consequently, it becomes crucial to address the issue of cloud energy consumption. Several works in the literature have proposed energy-efficient tasks' scheduling algorithms using evolutionary algorithms in the cloud computing environment [7]-[12]. As tasks' scheduling in the cloud is an NP-hard problem, an evolutionary algorithm, such as the genetic algorithm [13], is well suited for task optimization problems due to its parallel and efficient global search. However, the tasks' performance, i.e., Quality of Service (QoS), should be considered while minimizing energy consumption. Very few works in the literature focus on multi-objective performance and energy-aware tasks' scheduling in the cloud using an evolutionary algorithm [8]-[12]. Moreover, none of these works considers the tasks' resource utilization in terms of system performance metrics, i.e., CPU, memory, disk, and network, while computing the energy consumption. This is crucial considering the dynamic nature of the tasks submitted to the cloud.
Correspondence: Leila Ismail (email: leila@uaeu.ac.ae)
In this paper, we develop an intelligent autonomous agent for performance and energy-aware bi-objective tasks' scheduling in a cloud data center based on the evolutionary algorithm. We consider the task's execution time as a measure of performance. The tasks' scheduling is modeled as a bi-objective optimization problem to minimize tasks' execution time and energy consumption. We use the Locally Corrected Multiple Linear Regression (LC-MLR) [14] power consumption model, which is based on CPU, memory, disk, and network utilization, to predict the computing server's power consumption. The predicted power is then used to compute the server's energy consumption. The performance of the proposed model is evaluated in terms of energy consumption and execution time using a realistic cloud dataset, in a cloud data center simulated with CloudSim 3.0.3 [15], a software tool for cloud computing simulation. The proposed model is compared with a genetic algorithm-based task scheduling model from the literature that uses a power model based on CPU and memory utilization values [8]. The rest of the paper proceeds as follows. Section 2 provides an overview of the related work. The cloud system model is presented in Section 3. Section 4 describes the optimization problem and its formulation using the evolutionary algorithm. The experiments and the performance evaluation are presented in Section 5. Section 6 concludes our work.

Several works in the literature have proposed the use of the evolutionary genetic algorithm for energy-efficient multi-objective tasks' scheduling in a cloud computing environment [8]-[12]. However, [11] and [12] do not mention the power model used for the computation of energy consumption. A hardware-based power model using the computing server's voltage and frequency is considered by [10]. However, a hardware-based power model often requires physical sensors for monitoring the hardware resources.
This leads to high hardware cost and sensors' energy consumption when the sensors are attached to thousands of servers in a cloud data center [16]. A software-based power model consisting of system performance metrics such as CPU, memory, disk, and/or network resources is used by [9] and [8]. However, the power model used by [9] is based only on CPU utilization, and the one used by [8] is based on CPU and memory utilization values. To the best of our knowledge, none of the works on performance and energy-optimized cloud tasks' scheduling based on the evolutionary genetic algorithm uses an energy consumption formulation based on system performance counters. In this work, we propose an evolutionary algorithm-based intelligent agent for task scheduling in cloud computing that minimizes the task's execution time and energy consumption. The energy consumption in the proposed bi-objective optimization method considers system performance counters. We compare the performance of our proposed model with the genetic algorithm-based bi-objective optimization model in the literature that uses a power model based on CPU and memory utilization values [8].

The cloud computing architecture consists of 'v' heterogeneous virtual machines (VMs) that operate on 'p' heterogeneous physical machines (PMs), as shown in Figure 1. The set of VMs is represented as V = {VM1, VM2, …, VMv} and the set of PMs is represented as P = {PM1, PM2, …, PMp}. The cloud users' tasks are submitted to the cloud broker, which implements an intelligent agent that schedules each task on a VM such that the energy consumption and the task execution time are minimized. The task analyzer monitors and records the resource and service requirements of the tasks submitted by the cloud users. The resource requirements of a task include the CPU, memory, disk, and network utilization values, while the service requirements involve performance metrics such as task deadline and execution time.
Based on the task's requirements in terms of CPU, memory, disk, and network, the agent calculates the execution time and energy consumption on each VM. For this purpose, the agent communicates with the VM manager, which is responsible for monitoring the resource utilization of running VMs. It reads the current energy consumption of the VMs, which is maintained by the energy consumption monitor of the cloud. The power consumption of executing a task on a computing server is predicted using a power model. We use the Locally Corrected Multiple Linear Regression (LC-MLR) power model, as stated in Equation 3, which predicts the power from the CPU, memory, disk, and network utilization. LC-MLR is selected in this paper because it is found to be accurate in a cloud computing environment [14]. The correction terms of the model are the errors calculated as the difference between the actual and the predicted power consumption values obtained from the MLR model.

The energy consumption of a task tj on VMi can then be calculated using Equation 6, based on the energy function proposed in [17]. When calculating the energy consumption of a new task on a VM, this function accounts for the increase in the energy consumption of the tasks already running on that VM, caused by the increase in their execution time. The formulation uses the new execution time of the task tj−1 that was ongoing on VMi when task tj is scheduled on VMi. The new execution time reflects the increment in the execution time of tj−1 as the processing speed of VMi is distributed among tj−1 and tj. The formulation also uses the execution time of tj when running in parallel with tj−1, the time during which tj is executed in parallel with tj−1, and the time during which tj is executed alone.

Let us consider a set of tasks, T = {t1, t2, …, tm}, that needs to be scheduled on a set of virtual machines, V. The scheduling of the tasks on the VMs is represented using a matrix S(m x v). For instance, Sji = 1 indicates that the task tj is scheduled on VMi for execution.
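As an illustration of the scheduling matrix S, the following Python sketch builds S for a small, made-up assignment and reads back the VM chosen for each task. The function names (`build_schedule_matrix`, `vm_of_task`, `tasks_per_vm`) and the example assignment are assumptions introduced here for illustration and do not come from the paper. Indices are 0-based in the code, whereas the text numbers tasks and VMs from 1.

```python
from typing import List


def build_schedule_matrix(assignment: List[int], num_vms: int) -> List[List[int]]:
    """Build an m x v binary matrix S with S[j][i] == 1 iff task j runs on VM i.

    `assignment[j]` is the (0-based) index of the VM chosen for task j.
    """
    matrix = []
    for task_index, vm_index in enumerate(assignment):
        if not 0 <= vm_index < num_vms:
            raise ValueError(f"task {task_index}: VM index {vm_index} out of range")
        row = [0] * num_vms
        row[vm_index] = 1  # exactly one 1 per row: one VM per task
        matrix.append(row)
    return matrix


def vm_of_task(matrix: List[List[int]], task_index: int) -> int:
    """Return the index of the VM on which the given task is scheduled."""
    return matrix[task_index].index(1)


def tasks_per_vm(matrix: List[List[int]]) -> List[int]:
    """Count how many tasks are mapped to each VM (column sums of S)."""
    return [sum(column) for column in zip(*matrix)]


# Hypothetical example: 4 tasks, 3 VMs.
assignment = [2, 0, 2, 1]
S = build_schedule_matrix(assignment, num_vms=3)
for row in S:
    print(row)
print("tasks per VM:", tasks_per_vm(S))
```

Storing one row per task makes it easy to check that a task is placed on a single VM (each row sums to 1), while the column sums give the number of tasks sharing each VM.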
The bi-objective optimization problem is to schedule tasks in a cloud computing environment such that the execution time and the energy consumption are minimized. These objectives are represented using a weighted sum cost function, as stated in Equation 7, in which the weight of the execution time objective and the weight of the energy consumption objective are complementary: they sum to 1 and each lies between 0 and 1. The bi-objective tasks' scheduling optimization problem can now be formulated with the objective given in Equation 8 and the constraints given in Equations 9 and 10. Equation 8 states the optimization objective, i.e., minimizing the cost function. Equation 9 states that each task should be executed on only one VM, and Equation 10 indicates that the total utilization of a VM should always be less than a threshold utilization to avoid performance degradation.

The task scheduling optimization problem in a cloud computing environment can be designed as an autonomous agent system where the agent schedules the tasks on the VMs to minimize the objective function. The task analyzer, the VM manager, the power consumption monitor, and the resource utilization monitor components in the cloud broker represent the sensors of the system environment, and the mapping of tasks to the VMs represents the actuator output. The agent's system environment for task scheduling is fully observable, stochastic, sequential, dynamic, discrete, and single-agent. The intelligent autonomous agent for task scheduling can be classified as a utility-based agent [18], as shown in Figure 2. This is because the tasks' scheduling problem involves conflicting optimization objectives with a trade-off between energy consumption and execution time.

The evolutionary genetic algorithm [13] is a search-based heuristic. The main components of the evolutionary algorithm are as follows:
• Initial tasks-VMs mapping (population): The mapping of tasks to the VMs is the initial population in cloud tasks' scheduling.
  Each solution in the population is represented as a chromosome. The chromosome for the tasks' scheduling problem can be considered as the mapping of tasks to VMs.
• Fitness function: The inverse of the cost function for task scheduling that minimizes the energy consumption and the execution time (Equation 7) is the fitness function for the problem under study.
• Crossover: The crossover operation is achieved by selecting two parent chromosomes and then creating a new mapping by exchanging some or all of the genes of the parents. Each element of a chromosome is known as a gene.
• Mutation: The operator that produces offspring by tweaking the genes of a single chromosome.

In this paper, we use an energy-efficient task scheduling algorithm, Modified Worst Fit Decreasing (MWFD), for the selection of the initial population. MWFD is chosen because it outperforms other energy-aware task scheduling algorithms [19]. Starting from a good initial population reduces the time needed to obtain a global solution. In MWFD, each task is assigned to a VM where the increase in power consumption after scheduling the task is the maximum. Algorithm 1 shows the pseudocode for population initialization. Algorithm 2 shows the pseudocode for bi-objective optimization using the evolutionary algorithm.

allocatedVM = null
foreach VM in VMList do
    if VM has enough resources for Task then
        powerAfterAllocation = Calculate power using Equation 3
        powerDiff = VM.getPower() - powerAfterAllocation
        if powerDiff > maxPower then
            allocatedVM = VM
            maxPower = powerDiff
if allocatedVM ≠ null then
    allocation.add(Task, allocatedVM)
Return allocation

Algorithm 2: Performance and energy-aware bi-objective tasks' scheduling using evolutionary algorithm
Input: TaskList, VMList
Output: Scheduling of Tasks
Generate the initial tasks and VMs mapping using Algorithm 1
while (non-termination condition) do
    SelectFitTasksVMsMapping // select initial tasks-VMs mapping
    Perform_crossover_NewTasksVMsMapping // create new scheduled mapping
    Perform_mutation
    foreach newMapping do // check each newly scheduled mapping
        if Fitness.newMapping