A Genetic Approach to the Job Shop Scheduling Problem with Interval Uncertainty
Hernán Díaz, Inés González-Rodríguez, Juan José Palacios, Irene Díaz, Camino R. Vela
Information Processing and Management of Uncertainty in Knowledge-Based Systems (2020-05-15). DOI: 10.1007/978-3-030-50143-3_52

Abstract. In this paper we tackle a variant of the job shop scheduling problem where task durations are uncertain and only an interval of possible values for each task duration is known. We propose a genetic algorithm to minimise the schedule's makespan that takes the problem's uncertainty into account during the search process. The behaviour of the algorithm is experimentally evaluated and compared with other state-of-the-art algorithms. Further analysis in terms of solution robustness proves the advantage of taking interval uncertainty into account during the search process with respect to considering only the expected processing times and solving the problem's crisp counterpart. This robustness analysis also illustrates the relevance of the interval ranking method used to compare schedules during the search.

Scheduling plays an important role in most manufacturing and production systems, as well as in most information processing environments, transportation and distribution settings and, more generally, in service industries [24]. One of the most relevant problems in scheduling is the job shop, both because it is considered a good model for many practical applications and because it poses a challenge to the research community due to its complexity. This complexity is the reason why approximate methods and, in particular, metaheuristic search techniques are especially suited for solving the job shop [28]. Traditionally, it has been assumed that scheduling problems are deterministic.
However, for many real-world problems, design variables such as processing times may be subject to perturbations or changes, depend on human factors, etc. The most common approach to handling uncertainty is that of stochastic scheduling, modelling the duration of tasks by probability distributions [24]. However, not only does this present some tractability issues, but probability distributions are also better suited to modelling the variability of repetitive tasks than uncertainty due to lack of information [8]. Alternatively, fuzzy scheduling models uncertain durations as fuzzy numbers or fuzzy intervals, that is, possibility distributions representing more or less plausible values, in an approach that is computationally more appealing and presupposes less knowledge [9]. A third and simpler way of representing uncertain activity durations is intervals. Interval uncertainty is present as soon as information is incomplete, and it does not assume any further knowledge. It also represents a first step towards solving problems in the other uncertain frameworks. Indeed, an interval can be seen as a uniform probability distribution or as the support of an unknown probability distribution [1]. Likewise, an interval is not only a particular case of a fuzzy interval, but the α-cuts of fuzzy intervals are themselves intervals, so a fuzzy scheduling problem can be decomposed into multiple interval scheduling problems.

Despite its interest, research on the job shop scheduling problem with interval activity durations is still scarce. In [15] a job shop scheduling problem with interval processing times is considered and a population-based neighborhood search (PNS) is presented to optimise the makespan. A genetic algorithm is proposed in [16], but with the objective of minimising the total tardiness with respect to job due dates. Different variants of multiobjective interval job shop problems are considered in [17] and [19].
The former incorporates non-resumable jobs and flexible maintenance into the problem and proposes a multi-objective artificial bee colony algorithm to minimise both the makespan and the total tardiness. The latter considers a dual-resource constrained job shop with heterogeneous resources, and a dynamical neighbourhood search is proposed for the lexicographic minimisation of carbon footprint and makespan. Finally, we find a flexible job shop problem with interval processing times in [18], where a shuffled frog-leaping algorithm is adopted to minimise the makespan. Uncertain activity durations are also modelled as intervals in scheduling problems other than the job shop and its variants in [1, 11, 20, 26].

At a more theoretical level, several attempts have been made to study how to compute the earliest and latest starting times of all activities and, therefore, critical paths, over all duration scenarios in an activity-on-node network where the duration of every activity is an interval. This is essential to devise successful local search methods, as shown for the deterministic job shop. A summary of the main results together with a thorough literature review can be found in [3].

This paper constitutes a starting point for a systematic study of solving methods for the interval job shop scheduling problem with makespan minimisation. We propose a solution to the interval job shop scheduling problem using genetic algorithms. The paper is organised as follows. Section 2 briefly describes the interval job shop scheduling problem. In Sect. 3 we introduce the concept of ε-robustness used in this work, which measures the error of the prediction made by the a-priori makespan with respect to the makespan actually obtained upon execution. Section 4 details the basic schema of a genetic algorithm and the coding approach. Finally, in Sect. 5 an experimental study is developed to check the performance of the genetic algorithm in solving this problem.
In addition, we also test whether modelling uncertainty in processing times as intervals during the search process is worthwhile, and carry out a preliminary analysis of the influence of the different interval rankings. Some conclusions are drawn in Sect. 6.

A solution to this problem is a schedule s, i.e. an allocation of starting times for each task, which, besides being feasible (in the sense that all precedence and resource constraints hold), is optimal according to some criterion, most commonly minimising the makespan $C_{max}$, that is, the completion time of the last operation (and therefore, of the whole project).

In real-life applications, it is often the case that the time it takes to process a task is not exactly known in advance; instead, only some uncertain knowledge about the duration is available. If only an upper and a lower bound of each duration are known, an uncertain processing time can be represented as a closed interval of possible values, denoted $\mathbf{a} = [\underline{a}, \overline{a}] = \{x \in \mathbb{R} : \underline{a} \leq x \leq \overline{a}\}$. Let $\mathbb{IR}$ denote the set of closed intervals. The job shop problem with makespan minimisation essentially requires two arithmetic operations on $\mathbb{IR}$: addition and maximum. These are defined by extending the corresponding operations on real numbers [21], so given two intervals $\mathbf{a} = [\underline{a}, \overline{a}]$, $\mathbf{b} = [\underline{b}, \overline{b}] \in \mathbb{IR}$:

$$\mathbf{a} + \mathbf{b} = [\underline{a} + \underline{b},\; \overline{a} + \overline{b}], \qquad (1)$$
$$\max(\mathbf{a}, \mathbf{b}) = [\max(\underline{a}, \underline{b}),\; \max(\overline{a}, \overline{b})]. \qquad (2)$$

Another issue to be taken into account when processing times take the form of intervals is that of comparisons. Indeed, if several schedules are available, the "best" one would be the one with "minimal" value of the makespan (an interval). However, there is no natural total order in the set of intervals, so an interval ranking method needs to be considered among those proposed in the literature [7, 14]. In [7], the authors highlight the following three total orders in $\mathbb{IR}$ with certain nice behaviour (called admissibility in that work):

$$\mathbf{a} \leq_{Lex1} \mathbf{b} \Leftrightarrow \underline{a} < \underline{b} \lor (\underline{a} = \underline{b} \land \overline{a} \leq \overline{b}), \qquad (3)$$
$$\mathbf{a} \leq_{Lex2} \mathbf{b} \Leftrightarrow \overline{a} < \overline{b} \lor (\overline{a} = \overline{b} \land \underline{a} \leq \underline{b}), \qquad (4)$$
$$\mathbf{a} \leq_{YX} \mathbf{b} \Leftrightarrow \underline{a} + \overline{a} < \underline{b} + \overline{b} \lor (\underline{a} + \overline{a} = \underline{b} + \overline{b} \land \overline{a} - \underline{a} \leq \overline{b} - \underline{b}). \qquad (5)$$

Both (3) and (4) are derived from a lexicographical order of the interval extreme points, while the last one is proposed in [27].
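As an illustrative sketch (not the authors' code), the interval arithmetic and the total orders discussed above can be written as follows, representing an interval as a (lower, upper) tuple; all function names are ours.

```python
# A minimal sketch (not the authors' code) of interval arithmetic and of the
# interval rankings discussed above. Intervals are (lower, upper) tuples.

def iv_add(a, b):
    """Interval addition: add lower bounds and upper bounds separately."""
    return (a[0] + b[0], a[1] + b[1])

def iv_max(a, b):
    """Interval maximum: component-wise maximum of the bounds."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def leq_lex1(a, b):
    """Lexicographic order on (lower, upper) extreme points."""
    return (a[0], a[1]) <= (b[0], b[1])

def leq_lex2(a, b):
    """Lexicographic order on (upper, lower) extreme points."""
    return (a[1], a[0]) <= (b[1], b[0])

def leq_yx(a, b):
    """Xu-Yager order: compares midpoint sums, breaking ties by width."""
    return (a[0] + a[1], a[1] - a[0]) <= (b[0] + b[1], b[1] - b[0])

def leq_mp(a, b):
    """Midpoint order, equivalent to the possibility-degree ranking."""
    return a[0] + a[1] <= b[0] + b[1]
```

Note, for instance, that `leq_mp((1, 5), (2, 4))` and `leq_mp((2, 4), (1, 5))` both hold, since the two intervals have equal midpoints, whereas `leq_yx` breaks the tie in favour of the narrower interval (2, 4).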
Obviously, all three linear orders can be used to rank intervals. In [15], a different ranking method is used for the interval job shop:

$$\mathbf{a} \leq_{pd} \mathbf{b} \Leftrightarrow P(\mathbf{b} \geq \mathbf{a}) \geq 0.5, \qquad (6)$$

where $P(\mathbf{a} \geq \mathbf{b})$ is the possibility degree that $\mathbf{a}$ is greater than or equal to $\mathbf{b}$, as introduced in [13]:

$$P(\mathbf{a} \geq \mathbf{b}) = \min\left\{1, \max\left\{0, \frac{\overline{a} - \underline{b}}{(\overline{a} - \underline{a}) + (\overline{b} - \underline{b})}\right\}\right\}. \qquad (7)$$

It can easily be shown that this ranking is equivalent to the one induced by the interval midpoint:

$$\mathbf{a} \leq_{MP} \mathbf{b} \Leftrightarrow m(\mathbf{a}) \leq m(\mathbf{b}), \qquad (8)$$

where $m(\mathbf{a}) = (\underline{a} + \overline{a})/2$ for all $\mathbf{a} \in \mathbb{IR}$. It coincides with the classical Hurwicz criterion for interval comparison with α = 1/2 [12], used for interval scheduling in [1]. Also, since the interval's midpoint is the expected value of the uniform probability distribution on that interval, using the midpoint for comparing interval-valued objective functions is closely related to the stochastic dominance based on expectation used in stochastic scheduling [24].

A schedule s establishes an order π among tasks requiring the same machine. Conversely, given a task processing order π, the schedule s (starting times of all tasks) may be computed as follows. For every task o ∈ O with processing time $\mathbf{p}_o$, let $s_o(\pi)$ and $c_o(\pi)$ denote respectively the starting and completion times of o, let $PM_o(\pi)$ and $SM_o(\pi)$ denote the predecessor and successor tasks of o on its machine $\nu_o$ according to π, and let $PJ_o$ and $SJ_o$ denote respectively the predecessor and successor tasks of o in its job ($PM_o(\pi) = 0$ or $PJ_o = 0$ if o is the first task to be processed on its machine or in its job). Then the starting time $s_o(\pi)$ of o is an interval given by

$$s_o(\pi) = \max(s_{PJ_o} + \mathbf{p}_{PJ_o},\; s_{PM_o(\pi)} + \mathbf{p}_{PM_o(\pi)}),$$

and clearly $c_o(\pi) = s_o(\pi) + \mathbf{p}_o$. If there is no possible confusion regarding the processing order, we may simplify notation by writing $s_o$ and $c_o$. The completion time of the last task to be processed according to π thus calculated will be the makespan, denoted $C_{max}(\pi)$ or simply $C_{max}$.
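The recursion above can be sketched as a simple schedule evaluator. The following is a toy illustration under our own naming (not the paper's implementation): tasks are fed in a topological order consistent with π, and the machine predecessor of a task is simply the last task scheduled so far on the same machine.

```python
def iv_add(a, b):
    """Interval addition on (lower, upper) tuples."""
    return (a[0] + b[0], a[1] + b[1])

def iv_max(a, b):
    """Interval maximum on (lower, upper) tuples."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def build_schedule(seq, durations, machines):
    """seq: tasks (job, l) in a topological processing order.
    durations[(j, l)]: interval processing time as a (lower, upper) tuple.
    machines[(j, l)]: machine required by the task.
    Returns the interval completion times and the interval makespan."""
    completion = {}        # (j, l) -> interval completion time c_o
    last_on_machine = {}   # machine -> completion of its last scheduled task
    makespan = (0, 0)
    for j, l in seq:
        job_ready = completion.get((j, l - 1), (0, 0))               # c of PJ_o
        mach_ready = last_on_machine.get(machines[(j, l)], (0, 0))   # c of PM_o(pi)
        start = iv_add((0, 0), iv_max(job_ready, mach_ready))        # s_o(pi)
        end = iv_add(start, durations[(j, l)])                       # c_o = s_o + p_o
        completion[(j, l)] = end
        last_on_machine[machines[(j, l)]] = end
        makespan = iv_max(makespan, end)
    return completion, makespan
```

For a two-job toy instance (job 1 on M1 then M2, job 2 on M2 then M1), the resulting makespan is an interval containing every crisp makespan obtainable under that task ordering.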
We obtain an interval-valued schedule in the sense that the starting and completion times of all tasks and the makespan are intervals, interpreted as the possible values that these times may take. However, notice that the task processing ordering π that determines the schedule is crisp; there is no uncertainty regarding the order in which tasks are to be processed.

We are now in a position to formulate the Interval Job Shop Scheduling Problem, or IJSP for short, as follows:

$$\min\nolimits_R\; C_{max} \qquad (9)$$

subject to:

$$C_{max} = \max_{1 \leq j \leq n} c_{o(j, m_j)}, \qquad (10)$$
$$\underline{c}_o = \underline{s}_o + \underline{p}_o \quad \forall o \in O, \qquad (11)$$
$$\overline{c}_o = \overline{s}_o + \overline{p}_o \quad \forall o \in O, \qquad (12)$$
$$\underline{s}_{o(j,l)} \geq \underline{c}_{o(j,l-1)} \quad 1 \leq j \leq n,\; 1 < l \leq m_j, \qquad (13)$$
$$\overline{s}_{o(j,l)} \geq \overline{c}_{o(j,l-1)} \quad 1 \leq j \leq n,\; 1 < l \leq m_j, \qquad (14)$$
$$\underline{s}_o \geq \underline{c}_{o'} \lor \underline{s}_{o'} \geq \underline{c}_o \quad \forall o \neq o' \in O : \nu_o = \nu_{o'}, \qquad (15)$$
$$\overline{s}_o \geq \overline{c}_{o'} \lor \overline{s}_{o'} \geq \overline{c}_o \quad \forall o \neq o' \in O : \nu_o = \nu_{o'}, \qquad (16)$$

where the minimum $\min_R C_{max}$ in (9) is the smallest interval according to a given ranking R in the set of intervals $\mathbb{IR}$. Constraint (10) defines the makespan as the maximum completion time of the last task of each job. Constraints (11) and (12) establish the relationship between the starting and completion time of each task. Constraints (13) and (14) correspond to precedence relations between tasks within each job, and constraints (15) and (16) establish that the execution of two tasks requiring the same machine cannot overlap. Notice that the completion time of each job $J_j$ in the resulting schedule s is the completion time of the last task in that job, given by $C_j = c_{o(j, m_j)}$.

The resulting problem will be denoted $J|\underline{p}_o \leq p_o \leq \overline{p}_o|C_{max}$, following the three-field notation schema for scheduling problems. Clearly, the IJSP is NP-hard, since setting all processing times to crisp numbers yields the classical JSP, which is itself NP-hard [24].

A solution to the IJSP provides an interval of possible values for the starting time of each task and, hence, an interval of possible values for the makespan. In fact, it is impossible to predict what the exact starting and completion times will be until the project is actually executed. This idea is the basis for a semantics for fuzzy schedules from [10] by which solutions to a job shop problem with uncertainty should be understood as a-priori solutions.
Only when tasks are executed according to the ordering π provided by the schedule shall we know their real durations and, hence, obtain an a-posteriori solution with deterministic times. It would be expected that the predictive schedule does not differ much from the one actually executed. This is strongly related to the idea of a robust schedule as one that minimises the effect of executional uncertainties on its performance [4]. This high-level definition is subject to many different interpretations when it comes to specifying robustness measures [25]. Here, we adapt the concept of ε-robustness, first proposed for fuzzy scheduling problems in [22] and inspired by the work on stochastic scheduling from [5].

The rationale behind this concept is to measure the predictive error of the a-priori makespan, the interval $C_{max}$, compared to the actual makespan $C^{ex}_{max}$ obtained after execution. Notice that $C^{ex}_{max}$ is a real number that corresponds to a specific realisation of task processing times $P^{ex} = \{p^{ex}_o \in \mathbf{p}_o : o \in O\}$, usually called a configuration in the literature. Assuming that tasks are executed without unnecessary delays at their earliest possible starting times (as explained in Sect. 2.2), it is clear that $C^{ex}_{max} \in C_{max}$. Thus, the prediction is always accurate in terms of bounds for the possible makespan values after execution. Now, if we are to give a single value as predicted makespan based on the interval $C_{max}$, in the absence of further information it seems natural to consider the expected or mean value of the uniform distribution on that interval, $E[C_{max}] = (\underline{C}_{max} + \overline{C}_{max})/2$. We can then measure the error of the prediction made by the a-priori makespan as the (relative) deviation of the executed makespan with respect to this expected value.
In consequence, for a given ε ≥ 0, a predictive schedule with makespan interval value $C_{max}$ will be considered ε-robust if the relative error made by $E[C_{max}]$ with respect to the makespan $C^{ex}_{max}$ of the executed schedule is bounded by ε, that is:

$$\frac{|C^{ex}_{max} - E[C_{max}]|}{E[C_{max}]} \leq \epsilon.$$

Clearly, the smaller the bound ε, the more accurate the a-priori prediction is or, in other words, the more robust the interval schedule is.

Although the expression for the expected value $E[C_{max}]$ is the same as that of the interval's midpoint used in the ranking criterion $\leq_{MP}$, this is a mere coincidence. In general, a robustness measure must be independent of the ranking method used to compare schedules. In particular, $E[C_{max}]$ represents a prediction based on $C_{max}$ in the absence of further knowledge on how values are distributed in that interval, whereas the midpoint $m(C_{max})$ is a weighted average of the optimistic $\underline{C}_{max}$ and pessimistic $\overline{C}_{max}$ makespan values, representing a decision maker's equilibrium between those two extreme attitudes.

Finally, this measure of robustness depends on a specific configuration $P^{ex}$ of task processing times obtained upon execution of the predictive schedule s. In the absence of real data regarding executions of the project, as is the case with the usual synthetic benchmark instances for the job shop, we may resort to Monte-Carlo simulations. The idea is to simulate K possible configurations $P^k = \{p^k_o \in \mathbf{p}_o : o \in O\}$, $k = 1, \ldots, K$, for the task processing times, using uniform probability distributions to sample a possible duration for every task. For each configuration k, let $C^k_{max}$ denote the exact makespan obtained after executing tasks according to the ordering provided by s. Then, the average ε-robustness of the predictive schedule across the K possible configurations, denoted $\overline{\epsilon}$, can be calculated as:

$$\overline{\epsilon} = \frac{1}{K} \sum_{k=1}^{K} \frac{|C^k_{max} - E[C_{max}]|}{E[C_{max}]}.$$

This value provides an estimate of how robust the predictive schedule s is across different processing-time configurations. Again, the lower $\overline{\epsilon}$, the better.
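The Monte-Carlo estimation of the average ε-robustness can be sketched as below. This is a hedged illustration under our own naming (not the paper's code): one sampled configuration is executed in the given task order at earliest starting times, and its deterministic makespan is compared with the expected a-priori makespan.

```python
import random

def expected_makespan(cmax):
    """Expected value of the uniform distribution on the interval cmax."""
    return (cmax[0] + cmax[1]) / 2

def executed_makespan(seq, machines, sample):
    """Deterministic makespan when each task's duration is fixed to sample[(j, l)],
    executing tasks in order seq at their earliest possible starting times."""
    completion, last_on_machine = {}, {}
    cmax = 0.0
    for j, l in seq:
        start = max(completion.get((j, l - 1), 0.0),
                    last_on_machine.get(machines[(j, l)], 0.0))
        completion[(j, l)] = start + sample[(j, l)]
        last_on_machine[machines[(j, l)]] = completion[(j, l)]
        cmax = max(cmax, completion[(j, l)])
    return cmax

def avg_eps_robustness(seq, durations, machines, cmax_interval, K=1000, seed=0):
    """Average relative deviation of K sampled makespans from E[C_max]."""
    rng = random.Random(seed)
    e = expected_makespan(cmax_interval)
    total = 0.0
    for _ in range(K):
        # one configuration: sample every duration uniformly in its interval
        sample = {o: rng.uniform(lo, hi) for o, (lo, hi) in durations.items()}
        total += abs(executed_makespan(seq, machines, sample) - e) / e
    return total / K
```

Since every executed makespan lies within the interval makespan, the estimate is bounded by the maximal relative deviation of the interval's endpoints from its midpoint.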
Genetic algorithms have proved to be a very useful tool for solving job shop problems, either on their own or combined with other metaheuristics [28]. Roughly speaking, a genetic algorithm starts by building a set of initial solutions or initial population $P_0$. This population is evaluated and the algorithm then begins an iterative process until a stopping criterion is met, typically a fixed number of iterations or of consecutive iterations without improvement. At each step i, individuals from the population $P_i$ are selected and paired for mating, and the recombination operators of crossover and mutation are applied to each pair with probability $p_{cross}$ and $p_{mut}$ respectively, creating a new population of offspring solutions $Off_i$. The new population is evaluated and a replacement operator is applied to merge $P_i$ and $Off_i$ into the new population $P_{i+1}$ for the next iteration. Once the stopping criterion is met, the best solution from the last population according to the interval ranking is selected and returned. Algorithm 1 summarises these steps. In this work, several well-known selection, recombination and replacement operators for job shop scheduling problems are tried in order to find the best setup for the genetic algorithm. The set of operators and their impact on solving this problem are detailed in Sect. 5.

A crucial part of designing such algorithms is how to encode and decode solutions. Following [6], we encode a solution as a permutation with repetition. This is a permutation of the set of tasks, where each task o(j, l) is represented by its job number j. For example, the topological order (o(2,1), o(1,1), o(2,2), o(3,1), o(3,2), o(1,2)) is encoded as (2 1 2 3 3 1). The decoding uses an insertion strategy: we iterate along the chromosome and schedule each task o(j, l) at its earliest feasible insertion position, as follows.
Let $\eta_k$ be the number of tasks already scheduled on machine $k = \nu_{o(j,l)}$ and let $\sigma_k = (0, \sigma(1,k), \ldots, \sigma(\eta_k, k))$ denote the partial processing order of the tasks already scheduled on machine k. Then a feasible insertion position q, $0 \leq q < \eta_k$, for o(j, l) is a position such that

$$\max\{\underline{c}_{\sigma(q,k)}, \underline{c}_{o(j,l-1)}\} + \underline{p}_{o(j,l)} \leq \underline{s}_{\sigma(q+1,k)} \quad \text{and} \quad \max\{\overline{c}_{\sigma(q,k)}, \overline{c}_{o(j,l-1)}\} + \overline{p}_{o(j,l)} \leq \overline{s}_{\sigma(q+1,k)},$$

and the earliest feasible insertion position is the smallest value $q^*$ verifying these inequalities. We set $s_{o(j,l)} = \max\{c_{\sigma(q^*,k)}, c_{o(j,l-1)}\}$ if $q^*$ exists, and $s_{o(j,l)} = \max\{c_{\sigma(\eta_k,k)}, c_{o(j,l-1)}\}$ otherwise.

The purpose of the experimental study is threefold: to assess the proposed genetic algorithm, to see whether considering the uncertainty in processing times during the search process is worthwhile, and to carry out a preliminary analysis of the influence of the different interval rankings. To test the algorithm, we consider 12 very well-known instances for the job shop problem: the classical instances FT10 (size 10 × 10) and FT20 (20 × 5), and the instances La21, La24, La25 (15 × 10), La27, La29 (20 × 10), La38, La40 (15 × 15), and ABZ7, ABZ8, ABZ9 (20 × 15) that form the set of 10 problems identified in [2] as hard to solve for the classical JSP. The processing times are modified to be intervals derived from the original crisp processing time of each operation. All experiments have been run on a Gold 6132 processor at 2.6 GHz with 128 GB RAM under Linux (CentOS v6.9), using a C++ implementation.

Regarding the algorithm's parameter configuration, we have run several tests to find the best setup. A first batch of experiments was conducted to test different recombination operators and probabilities as well as selection strategies. The considered operators are given in Table 1, with the best setup values in bold. In all cases the stopping criterion is set to 500 iterations. A second test, based on convergence, shows that the best population size is 250.
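The insertion-based decoding described in Sect. 4 above can be sketched as follows. This is our own minimal, simplified reconstruction (all identifiers are illustrative, not the paper's code): each gene is a job number, decoded into that job's next unscheduled task and inserted at the earliest machine position where it fits without delaying any already-scheduled task, on both interval bounds.

```python
def iv_add(a, b):
    """Interval addition on (lower, upper) tuples."""
    return (a[0] + b[0], a[1] + b[1])

def iv_max(a, b):
    """Interval maximum on (lower, upper) tuples."""
    return (max(a[0], b[0]), max(a[1], b[1]))

def decode(chromosome, durations, machines):
    """chromosome: job numbers with repetition, e.g. [2, 1, 2, 3, 3, 1].
    Returns interval start times for every task and the interval makespan."""
    next_task = {}             # job -> index of its next unscheduled task
    start, completion = {}, {}
    mach_seq = {}              # machine -> current partial processing order
    for j in chromosome:
        l = next_task.get(j, 1)
        next_task[j] = l + 1
        o, p = (j, l), durations[(j, l)]
        job_ready = completion.get((j, l - 1), (0, 0))
        seq = mach_seq.setdefault(machines[o], [])
        pos = len(seq)         # default: append at the end of the machine order
        for q in range(len(seq)):
            prev_c = completion[seq[q - 1]] if q > 0 else (0, 0)
            c = iv_add(iv_max(prev_c, job_ready), p)
            # feasible gap: the task finishes before the task currently at
            # position q starts, on both the lower and the upper bound
            if c[0] <= start[seq[q]][0] and c[1] <= start[seq[q]][1]:
                pos = q
                break
        prev_c = completion[seq[pos - 1]] if pos > 0 else (0, 0)
        start[o] = iv_max(prev_c, job_ready)
        completion[o] = iv_add(start[o], p)
        seq.insert(pos, o)
    makespan = (0, 0)
    for c in completion.values():
        makespan = iv_max(makespan, c)
    return start, makespan
```

For instance, if a machine is idle at the start because its first scheduled task waits for a job predecessor, a later gene whose task fits in that idle gap is inserted there rather than appended, shortening the decoded schedule.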
Figure 1 shows the average evolution of the expected makespan value across 30 runs of the algorithm on instance FT10. The dotted line corresponds to the expected makespan of the best solution in the population and the continuous line to the average of the whole population. It is clear that the algorithm reaches a convergence point within 500 iterations. The behaviour on the remaining instances is similar, so we adopt this number of iterations as the stopping criterion for the algorithm.

To assess the performance of the genetic algorithm (GA in the following), we compare it with the PNS algorithm proposed in [15], which to our knowledge constitutes the state of the art for the IJSP with makespan minimisation. The authors use $\leq_{pd}$ to rank intervals in PNS. Since $\leq_{pd}$ is equivalent to $\leq_{MP}$, for the sake of a fair comparison we also adopt the same ranking. GA is run on the same set of 17 instances as PNS, which are adapted versions of the well-known crisp instances ORB1-5, La16-25 and ABZ5-6, and the stopping criterion is set to 25 consecutive iterations without improvement. Table 2 shows, for each algorithm, the average expected makespan across all runs (20 runs for PNS and 30 for GA) together with runtimes, as well as a column with the relative difference between the performances of GA and PNS. We can see that, despite not having a neighbourhood-search component, GA outperforms PNS on 14 out of the 17 instances, and is marginally worse on the remaining 3 (0.4% worse on ORB2, 0.5% on La16 and 0.01% on La17). Overall, GA obtains an average improvement of 1.2% compared to PNS. Additionally, a t-test for paired samples is run to compare the results of GA and PNS (after both samples pass a Kolmogorov-Smirnov normality test), confirming that there are indeed significant differences between both algorithms at a significance level of 0.05. Regarding runtime, GA is 93.8% faster than PNS.
Notice, however, that the runtimes of PNS are those provided by the authors using their own machine, so comparisons in this sense must be made with caution.

We have used the set of 17 instances considered in [15] in Table 2 to compare GA with the state of the art. However, in the deterministic case, the original crisp instances have already been solved to optimality, and their fuzzy counterparts offer little room for improvement, as shown in [23]. For this reason, we now switch to the set of more challenging instances introduced at the beginning of this section for the remaining experimental results.

One may wonder whether solving the crisp problem that results from considering only the midpoints of the interval processing times yields results similar to using intervals, with the added advantage of having all the available tools for the deterministic JSP. Including uncertainty in the search process adds some difficulty to the problem: different concepts need to be adapted or redefined, and solving methods tailored to handle the uncertainty need to be proposed, usually with an increased complexity. It is also natural to ask whether the choice of a ranking method in the interval setting has any influence on the outcome. To try to answer these questions, we carry out a new set of experiments. For every IJSP instance we run GA 30 times with each of the four different ranking methods and 30 times on the instance's crisp counterpart. Notice that the objective function is an interval in the first four cases and a crisp value in the last one, so they are not directly comparable. Instead, we measure the $\overline{\epsilon}$-robustness of the 30 solutions obtained by GA in each case, using K = 1000 possible realisations, to compare the resulting solutions in terms of their quality as predictive schedules. Figure 2 depicts, for each instance, boxplots of the $\overline{\epsilon}$ values of the schedules resulting from the 30 runs of GA in each case.
We can see that, regardless of the ranking considered, solutions are more robust when intervals are taken into account during the search process. This is confirmed by several t-tests, showing that the ε-robustness of the interval schedules, regardless of the ranking, is significantly better than that of the crisp schedules at a significance level of 0.05 on all instances except La25. Regarding the choice of ranking method, according to the t-tests there is no significant difference between $\leq_{MP}$ and $\leq_{YX}$ on any instance. This is actually not surprising, since $\leq_{YX}$ can be understood as a refinement of $\leq_{MP}$, but it shows that this refinement does not necessarily translate into more robust schedules. Also, there are no significant differences between $\leq_{MP}$ and $\leq_{Lex1}$ on any instance except ABZ9, La21 and La24. More interestingly, the ranking $\leq_{Lex2}$ yields solutions significantly more robust than those obtained using any other ranking on all instances except FT20, La24 and La40, where it is not significantly better than $\leq_{MP}$ and $\leq_{YX}$. We may conclude that solving the interval JSP results in more robust schedules than solving a simpler deterministic counterpart, and that the choice of interval ranking method does have an influence on the outcome.

In this work we have developed an approach to solving the IJSP using a GA. Results show that the GA is competitive with existing methods from the literature. In addition, incorporating interval uncertainty in the search process yields more robust solutions than solving an alternative crisp problem. On the other hand, the choice of interval ranking method plays an important role in the final solution's performance. Further work needs to be done to obtain more powerful search methods specifically designed for handling interval uncertainty, and to thoroughly analyse the influence of different ranking methods in order to make a proper choice.
References

[1] Single machine scheduling problem with interval processing times to minimize mean weighted completion time
[2] A computational study of the job-shop scheduling problem
[3] Temporal analysis of projects under interval uncertainty
[4] Executing production schedules in the face of uncertainties: a review and some future directions
[5] A theoretic and practical framework for scheduling in stochastic environment
[6] A generalized permutation approach to job shop scheduling with genetic algorithms
[7] Generation of linear orders for intervals by means of aggregation functions
[8] Representing partial ignorance
[9] On possibility/probability transformations
[10] Semantics of schedules for the fuzzy job shop problem
[11] Evolutionary multi-objective blocking lot-streaming flow shop scheduling with interval processing time
[12] A class of criteria for decision-making under ignorance. Cowles Commission Discussion Paper
[13] A nonlinear interval number programming method for uncertain optimization problems
[14] A comparative study of different order relations of intervals
[15] Population-based neighborhood search for job shop scheduling with interval processing time
[16] Interval job shop scheduling problems
[17] Multi-objective artificial bee colony for interval job shop scheduling with flexible maintenance
[18] A novel shuffled frog-leaping algorithm for flexible job shop scheduling with interval processing time
[19] An effective neighborhood search for scheduling in dual-resource constrained interval job shop with environmental objective
[20] Schedule execution for two-machine flow-shop with interval processing times
[21] Introduction to Interval Analysis. Society for Industrial and Applied Mathematics
[22] Robust swarm optimisation for fuzzy open shop scheduling
[23] Benchmarks for fuzzy job shop problems
[24] Scheduling: Theory, Algorithms, and Systems, 5th edn
[25] Robustness in operational research and decision aiding: a multi-faceted issue
[26] Single machine scheduling problem with interval processing times and total completion time objective
[27] Some geometric aggregation operators based on intuitionistic fuzzy sets
[28] Review of job shop scheduling research and its new perspectives under Industry 4.0