key: cord-0044877-ndahujb9
authors: Expósito-Márquez, Airam; Expósito-Izquierdo, Christopher; Melián-Batista, Belén; Moreno-Vega, José Marcos
title: Fuzzy Greedy Randomized Adaptive Search Procedure and Simulation Model to Solve the Team Orienteering Problem with Time Windows
date: 2020-05-15
journal: Information Processing and Management of Uncertainty in Knowledge-Based Systems
DOI: 10.1007/978-3-030-50143-3_56
sha: 80445f7014b80a3db8c3241a37240090ae08646d
doc_id: 44877
cord_uid: ndahujb9

Tourism is a relevant economic activity that provides important resources (income, employment,...) to countries. When a tourist visits a country or city, he/she wants to know their points of interest. To do this, he/she must select some of the places according to his/her preferences and design routes to visit them. This problem can be adequately modeled as a Team Orienteering Problem with Time Windows (TOPTW). In this paper we propose a fuzzy GRASP and a multi-agent simulation model to solve the TOPTW. Our proposal incorporates two criteria to build the restricted candidate list. The computational results obtained show the validity of the proposal.

Tourism plays an important role in the economy of the countries and represents an important source of income. According to the World Tourism Organization 1 , 1400 million people moved around the world in 2019. A significant percentage of these tourists want to visit the points of interest of the country or city they visit. Travel agencies and specialized companies offer routes to visit these points that generally do not satisfy the preferences of all tourists. An interesting route for one tourist may not be suitable for another. It is therefore necessary to design tools that provide routes adapted to the preferences of each tourist. In addition to tourist preferences, other factors must be considered to build customized routes. Some of the most relevant are the importance of the points of interest, the duration of the routes and the opening hours.

The previous problem can be modeled as a Team Orienteering Problem with Time Windows (TOPTW). The TOPTW is an extension of the Team Orienteering Problem (TOP) [4] , in which, given a set of visiting points with a score and service time, the objective is to maximize the sum of the collected scores in a given number of routes, while guarantying a maximum travel time in each route. For solving the TOP, different exact an heuristic algorithms have been proposed [6] . In the TOPTW, each visiting point has to be visited within a given time window.

In this paper we present a Fuzzy Greedy Randomized Adaptive Search Procedure (GRASP) to solve the TOPTW. GRASP is a metaheuristic which consists of two phases. In the constructive phase, a solution is built by iteratively selecting an element of a restricted candidate list. In the improvement phase, the above solution is improved by applying a local search. These steps are repeated until a stopping rule is met. In our case, the restricted list of candidates is formed by the appropriate points of interest to be included in some route. Unlike the standard GRASP, Fuzzy GRASP uses two criteria to build this restricted candidate list. The first criterion evaluates the point of interest by attending to time increment of the route. The second criterion uses the score to evaluate the point of interest. On the other hand, we also propose a multi-agent simulation model aimed at handling the randomness of the optimization problem in practical scenarios. We consider that the scores associated to the points of interest are not deterministic, but can have low, moderate and high levels of variability instead. This means that some points have been clearly bad or good for most visitors, but others have not.

We use a classic set of instances to test our proposal. The obtained computational results show its good behavior. Fuzzy GRASP efficiently identifies high quality solutions in insignificant computational time.

The problem addressed in this paper is modelled as a multiple route-planning problem, specifically as a TOPTW. It consists of a set I of n points of interests (POIs), each with a given score, and a set K of m routes with the aim of visiting some of them in a given limited time. Each POI has associated a score or profit, a visit time, and a time windows. The starting and ending points are fixed, and represent tourists place of stay. However, not all POIs have to be visited. The number of routes corresponds to the number of stay days at destination. The objective function is to maximize the total collected score.

The POIs are identified by an index i, i = 1, 2, ..., n. Indexes 1 and n represent the starting and ending POIs. Each route is represented by an index k, k = 1, 2, ..., m. The score obtained in each POI is s i , i = 1, 2, ..., n. The time spent by the tourist when visiting POI i is r i , i = 1, 2, ..., n. The travel time from POI i to POI j is denoted by t ij , i, j = 0, 1, ..., n. T k max is the limit time for route k, k = 1, ..., m, considering travel, visit and waiting times. m is the number of tourist stay days. The time windows are represented by e i , l i , the opening and closing times of a POI i, respectively; i = 1, 2, ..., n.

The variable decisions of the mathematical model are: x k ij , y k i , and a k i , i, j = 1, ..., n, k = 1, 2, ..., m. x k ij is a binary variable that takes value 1 if route k goes from POI i to j, and 0 otherwise. The binary variable y k i is equal to 1 if POI i is visited in route k and y k j = 0 otherwise. The arrival time at POI i in route k is contained in variable a k i . On the other hand, the TOPTW can be formulated as a linear programming problem (LP) as follows:

Maximize:

The objective function is represented in Eq. (1). The (2) constraints guarantee that routes start and end at point 0. The constraints (3) ensure that every location is visited at most once. The subtour elimination and flow conservation rule in constraints (4) and (5) establish the connectivity and time of each tour, where M is a large constant. The constraints (6) guarantee the maximum time for each tour. The time windows constraints are identified in (7) . The condition variables are defined in (8), (9) and (10) .

The values of x k ij , k = 1, 2, ..., m, i, j = 0, 1, ..., n variables define the routes. With these variables, the selected POIs can be obtained by Eqs. (4) . The combination of (5) and (7) leads to the arrival time at each POI j is iteratively given by

for the previously point i in this day k; i.e., such that x k ij = 1.

The Team Orienteering Problem with Time Windows (TOPTW) [18] has been extensively addressed in the literature over the last decade. For a comprehensive survey on variants of the orienteering problem, we refer the reader to the paper by Gunawan et al. [6] . Due to the complexity of the TOPTW, most papers in the scientific literature tackle it by means of heuristic algorithms. Among the metaheuristic algorithms proposed for solving the TOPTW, there are implementations of Iterated Local Search, Local Search algorithm, Simulated Annealing, Variable Neighborhood Search, Large Neighborhood Search and Greedy Randomized Adaptive Search (GRASP).

A simple, fast and effective iterated local search that combines insert and shake steps to escape from local optima is presented in [18] . A variable neighborhood search algorithm for a multiple time windows extension has been addressed in [17] . A hybrid algorithm that combines GRASP with the evolutionary local search is designed in [10] . A Local Search algorithm is proposed in [9] . Two versions of a simulated annealing algorithm are developed in [11] . A variable neighborhood search procedure based on exploring granular instead of complete neighborhoods is presented in [9] . Granular exploration allows to obtain an efficient and effective algorithm. A Simulated Annealing with restart strategy is proposed in [12] for a multiple time windows extension. A Capacitated TOPTW, in which customers have associated also demands is presented in [2] . An iterated local search and a hybrid algorithm based on simulated annealing and iterated local search are proposed in [7] . A new variant of the TOPTW that includes mandatory visits is considered in [13] . In order to solve this problem, a multistart simulated annealing is developed. A large neighborhood search algorithm is proposed in [15] . A multi-objective version of the TOPTW, in which the score has to be maximized and the time needed for the itinerary of the tourist is minimized, is presented in [8] . It is developed an iterated local search into adjustment iterated local search, in which the construction phase is performed heuristically. The uncertainty theory is used in [19] to consider uncertain travel times when solving the TOPTW. In this regard, simulation is a powerful tool to handle these scenarios. They have been successfully applied to a wide variety of environments. This is illustrated in [5] and [3] , among others. Due to its versatility and capability to model specific features of the environment under analysis, simulation based on multiple agents [1] has become in one of the most relevant paradigms. For this reason, this is the approach considered in this paper.

In this work, we propose a Fuzzy GRASP for solving the TOPTW, in which the restricted candidate list used in the construction phase of GRASP is the fuzzy set of the high quality points of interest according to their travel time and score. Furthermore, we propose a multi-agent simulation model to deal with the randomness associated with the optimization problem in practical scenarios.

A Fuzzy GRASP (Greedy Randomized Adaptive Search Procedure) is proposed in order to solve the proposed problem and obtain high-quality solutions in reasonable time. The GRASP metaheuristic [14] , in its standard version, consists of two phases; a first construction phase, and a second local search improvement phase.

The construction phase obtains a feasible solution by iteratively selecting at random a new location from a Restricted Candidate List (RCL) with size given by the parameter RCLsize. Finally, the neighborhood of the solution is explored until a local minimum is found in the second phase. The Algorithm 1 shows the GRASP phases cited above with maxIterations as the maximum number of iterations of the GRASP procedure. In Algorithm 2 the construction phase of GRASP is shown. The initial solution has m = |K| empty routes. The procedure starts with the initial solution, and subsequently the construction mechanism adds step-by-step a new POI to the current partial solution guaranteeing the solution feasibility.

The mechanism inserts each new no visited POI i in the best possible position guaranteeing the feasibility of the partial route. The several feasible positions in which insert a new POI i is defined as the (i, j, k) triplet with y k j = 1. For the partial route k, the mentioned insertion consists in inserting the POI i after j. The variables y k i , x k ji and x k ih are set to 1 in order to get the new solution. Then for the only index h such that x k jh is 1; this variable x k jh is also set to 0. The variables a k i for this route k have to be computed using (11) and feasibility tested by the corresponding time limit constraints (6) and time window constraints (7) . A greedy travel time function f computes the travel time increase in the route produced by an insertion represented by the triplet defined previously. The f function is defined as f (i, j, k) = t k ji + t k ih − t k jh for the above index h.

A Restricted Candidate List (RCL) with candidate POIs to be visited in the solution is constructed using the greedy function f . The standard version of GRASP designs the RCL introducing RCLsize feasible insertion triplets (i, j, k) with the best quality values for function f . In this paper a fuzzy version of the GRASP is proposed.

The proposed fuzzy mechanism considers the fuzzy set of the high quality POIs of the RCL based on their score. μ(·) is the membership function of the for all location i ∈ I do 6:

Find the best feasible triplet (i, j, k) to insert this new location i in partialSolution according to greedy time function f (i, j, k) 7:

Add the feasible triplets (i, j, k) to the Candidate List CL 8: end for 9:

Create the Restricted Candidate List, RCL, with the best RCLsize triplets (i, j, k) from CL according to f 10:

Select a random triplet (i, j, k) from RCL 11:

Update the variables of route k by inserting the location i at position j 12: end while 13: return partialSolution 14: end GRASPConstructionPhase set stated by μ(i) = s i /s max with s i the score related with POI i and s max the highest score in RCL. RCL * is the set with the best locations in RCL created by the α-cut of RCL related with μ(·), with α ∈ [0, 1].

The new POI inserted in the solution is randomly selected from RCL * . The decision maker is responsible of fixing the α parameter. The choice of this parameter influences the quality of the POI that will be inserted into the solution. The Algorithm 3 shows the fuzzy GRASP construction phase.

Basically, the selection of the best insertion position of POIs in a tour composes the candidate list. The greedy function f evaluates the increase in time due to the POI insertion in partial route. The value of f determine the position of candidate in the list. The best position to insert a candidate POI for all routes is located by the procedure. The goal of the best position is minimize the total time after the insertion of POI. Finally the list is ordered ascending by the increase in time such that lowest time increase candidates are placed at candidate list top. The candidate list size is settled in RCLsize due to build the RCL. Subsequently, the α-cut, that recognizes the best score POIs, conforms RCL * . The decision maker fix the α value.

The proposed GRASP mix the minimization of travel times in the RCL composition and maximization score in the RCL * composition. Therefore the RCL * contains the best POIs that combines the two criteria mentioned. The POIs that will be insert in current partial solution are those with low travel time increase and high score. Later, in RCL * a POI is randomly selected and inserted into partial solution. The inserted POI is banned for future candidate lists. Finally, the feasible solution is obtained and construction phase ends.

The next phase is an improvement phase that uses a local search for this purpose. Commonly, a local search procedure performances iteratively by replacing for all location i ∈ I do 6:

Find the best position j to insert i in a partial route k of the partialSolution according to greedy time function f (i, j, k) 7:

Add the feasible triplets (i, j, k) to the Candidate List CL 8: end for 9:

Create the Restricted Candidate List, RCL, with the top RCLsize triplets (i, j, k) from CL 10:

Get

Select a random triplet (i, j, k) from RCL * 12:

Update the variables of route k by inserting the location i at position j 13: end while 14: return partialSolution 15: end GRASPFuzzyConstructionPhase the current solution with a improved solution achieved in the neighborhood. The stop criteria of local search is not to find a better solution in the neighborhood. The Algorithm 4 presents a standard local search procedure. The proposed local search applies exchange movements between POIs in order to minimize the total travel time. The POIs can be selected in the same or different routes. A bestimproving strategy is used, so that the search explores all neighborhoods in order to replace the current solution by the best solution found. Once the total travel time of the solution has been improved, the procedure tries to insert a new POI in the solution with the aim of maximize the solution score. Overall, the aim of the proposed solver is to maximize the total solution score. The procedure is an two-phases iterated algorithm that progress until the stop criteria is met. The introduced fuzzy mechanism in the standard GRASP allows to consider two criteria in construction of the RCL list; the minimization of travel time and the maximization of score.

In this paper, we also propose a simulation model aimed at handling randomness associated with the optimization problem in practical scenarios. This randomness is derived from the potential changes in the preferences of the visitors, inadequate treatment of the staff in the points or unadvertised changes in the time windows, among others. Thus, the scores of the points of interest are considered as random variables instead of constant values, as done in previous works found in the scientific literature so far.

We design a multi-agent simulation approach that uses asynchronous time and receives a feasible solution of the problem. This solution is provided by the solver previously described. It also receives the random variables that defines the behaviour of the scores. The model is composed of a hierarchy of agents, The base agent is governed by the state machine shown in Fig. 1 and is aimed at simulating a route of the solution. The nodes in the state machine represent states of the visitor during its route, whereas the arcs represent transitions between pairs of consecutive states. This way, a transition is applied when the triggering event occurs. As can be checked, the visitor moves to the next point of interest and waits until it is open. This is visited once the point is open. Then, next point is checked or the visitor goes to its end point. It is worth mentioning that the score obtained when visiting a point of interest is derived from sampling the random variable that defines it. At the same time, there is a manager agent which is responsible for assign routes to the base agents and managing the performance metrics (i.e., number of points visited, time in each point, etc.). Thus, the model is composed of one manager agent and m base agents. 

This section describes the results from the first computational experiment carried out in our study. The aim of this experiment is to evaluate the accuracy of the Fuzzy GRASP metaheuristic proposed to solve the TOPTW and get high-quality solutions.

The benchmark suite tackled along this section is based on the classical set of instances introduced by [16] for Vehicle Routing Problem with Time Windows. Vansteenwegen et al. [18] provide a set of locations with a given score which can be visited. A time windows and service time have been defined for each location. The limit number of routes match with the number of vehicles of the original Solomon data set. The number of locations used is equal to 100 and the limit time per route varies according to the specific instance. This benchmark considers travel times equal to euclidean distances. Regarding the geographical data, there are three kinds of instances: instances with clustered customers (C), instances with random customers (R) and a mixture of random and clustered customers (RC).

The instances data set used are described in Table 1 in more detail. The parameter of the proposed Fuzzy GRASP are:

1. The α value of α-cut of μ(s j ) 2. The size sRCL of the GRASP restricted candidate list

The parameter values of the proposed solver assess during the computational experiments are shown in Table 2 .

The GRASP procedure was run 1000 times for each instance and parameter combination used in computational experiments. The results are presented in Table 3 and show the comparison between the best solutions obtained by standard GRASP and Fuzzy GRASP, and the optimal solution. The score of the optimal solutions is the sum of scores for all locations. Column one and two give the name of the instance and the score of the optimal solution, respectively. The third and fourth columns show the best solutions scores of standard and Fuzzy GRASP, respectively. The gap between the best solutions of the standard and Fuzzy GRASP and the optimal solution are given by the fifth and sixth columns, respectively. Finally, the last column shows the execution time of Fuzzy GRASP. The computational experiments confirm that Fuzzy GRASP is highly efficient when solving the instances under analysis.

We have carried out several experiments dedicated to check the suitability of the simulation model proposed in Sect. 4.2. In this case, the score associated with the points of interest is modelled as a random variable, S, that does not follow any arbitrary non-negative probability distribution. In particular, a Log-Normal Table 4 shows the average computational results obtained by the proposed model when simulating the behaviour of 10 solutions selected at random. In this case, the score, travelled distance, total time, minimum time in the routes, maximum time in the routes, and slack time used to satisfy the time windows are shown when changing the level of variability in the scenarios. Due to the fact that the variability impacts only on the score of the points of interest, the remaining metrics are not altered in the different scenarios and their values are only provided for further analysis. In this case, the results indicate that the score obtained when visiting the involved points of interest in the routes increases when the variability of the scenarios is higher. 

This work proposes a Fuzzy GRASP metaheuristic and a multi-agent simulation model to solve the Team Orienteering Problem with Time Windows, in which the scores associated to the points of interest are not deterministic. In the first place, we develop a Fuzzy GRASP to solve the optimization problem with deterministic scores to obtain a set of high-quality solutions. Then, these solutions are used by the designed multi-agent simulation model to consider low, moderate and high levels of variability in the scores.

The computational experiments carried out in this work corroborate that the GRASP metaheuristic can reach a set of high-quality solutions in reasonable computational time. Moreover, the simulation model is a tool able to perform experiments about the impact of changes in the score of the points of interest on the quality of the solutions in realistic scenarios. However, there are still several promising lines for further research. In particular, uncertainty can be associated with travelling times between pairs of points of interest, and therefore the optimization and simulation approaches must be modified appropriately to handle this. At the same time, it is highly interesting to develop recommendation systems to propose visiting routes to the tourists according to their individual features.

Agent based modelling and simulation tools: a review of the state-of-art software

Solving the capacitated team orienteering problem with time windows through variable neighborhood search

Simulation optimization: a review of algorithms and applications

The team orienteering problem

Discrete event simulation for performance modelling in health care: a review of the literature

Orienteering problem: a survey of recent variants, solution approaches and applications

Well-tuned algorithms for the team orienteering problem with time windows

Solving multi-objective team orienteering problem with time windows using adjustment iterated local search

The team orienteering problem with time windows: an LP-based granular variable neighborhood search

Hybridized evolutionary local search algorithm for the team orienteering problem with time windows

A simulated annealing heuristic for the team orienteering problem with time windows

A simulated annealing heuristic for the multiconstraint team orienteering problem with multiple time windows

Solving the team orienteering problem with time windows and mandatory visits by multi-start simulated annealing

Optimization by GRASP: Greedy Randomized Adaptive Search Procedures

An effective large neighborhood search for the team orienteering problem with time windows

Algorithms for the vehicle routing and scheduling problems with time window constraints

Heuristics for the multiperiod orienteering problem with multiple time windows

Iterated local search for the team orienteering problem with time windows

Uncertain team orienteering problem with time windows based on uncertainty theory

Airam Expósito-Márquez would like to thank the Canary Government for the financial support he receives through his post-graduate grant.