title: Multi-objective Combinatorial Generative Adversarial Optimization and Its Application in Crowdsensing
authors: Guo, Yi-nan; Ji, Jianjiao; Tan, Ying; Cheng, Shi
date: 2020-06-22
journal: Advances in Swarm Intelligence
DOI: 10.1007/978-3-030-53956-6_38

As the number of decision variables in multi-objective combinatorial optimization problems grows, traditional evolutionary algorithms perform worse because generating offspring through a purely stochastic mechanism becomes inefficient. To address this issue, a multi-objective combinatorial generative adversarial optimization method is proposed that enables the algorithm to learn the implicit information embodied in the evolution process. After the optimal non-dominated solutions in the current generation are classified as real data, a generative adversarial network (GAN) is trained on them so as to learn their distribution. The Adam algorithm, which maintains an adaptive learning rate for each parameter, is introduced to update the main parameters of the GAN. An offspring reproduction strategy is then designed to form a new feasible solution from the decimal output of the generator. To further verify the rationality of the proposed method, it is applied to the participant selection problem in crowdsensing, and the corresponding offspring reproduction strategy is given in detail. Experimental results on crowdsensing systems with various numbers of tasks and participants show that the proposed algorithm outperforms the compared algorithms in both convergence and distribution.

Multi-objective combinatorial optimization problems (MOCOPs), such as the traveling salesman problem, vehicle routing planning, and the participant selection problem in crowdsensing, aim to find the optimal resource assignment while taking multiple objectives into consideration [1, 2]. To solve such problems, classical linear programming, metaheuristic search, evolutionary algorithms, and other population-based intelligent optimization algorithms have been introduced. However, as the number of decision variables increases, the exploration ability of these algorithms becomes limited, because generating diverse offspring by a stochastic mechanism is inefficient. Many researchers [3-7] have employed handcrafted strategies tailored to the characteristics of specific problems to improve performance, but such problem-specific methods depend heavily on human experience. Recent advances [8-11] in machine learning have shown its strong potential for helping engineers design optimization algorithms with learning ability that solve different problems with relatively good performance. Based on this, various studies [12-14] have tried to utilize neural networks to learn useful information about the fitness landscape or the distribution of individuals in order to generate more rational offspring. However, a large amount of training data must be provided to build such learning models, which is often difficult or expensive to obtain in practice.

The Generative Adversarial Network (GAN) proposed by Goodfellow in 2014 is able to learn high-dimensional distributions efficiently from limited training data through an adversarial learning mechanism [15], and consists of a generator and a discriminator [16]. The generator learns the distribution P_data(x) of the real data x and produces a new sample G(z) from noise z drawn from a prior distribution P_prior(z), while the discriminator works on identifying whether a sample is real or fake and outputs a discriminant probability D(x). More specifically, the generator tries to produce samples that look as real as possible, with the purpose of decreasing the accuracy of the discriminator; by contrast, the discriminator makes an effort to enhance its recognition ability. Both of them are trained in a two-player minimax game.
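In the standard formulation of Goodfellow et al. [16], which the description above follows, the value function of this minimax game is

    \min_G \max_D V(D, G) = \mathbb{E}_{x \sim P_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim P_{prior}(z)}[\log(1 - D(G(z)))]

where D is trained to assign high probability to real samples x and low probability to generated samples G(z), while G is trained to fool D by minimizing the same quantity.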
Various studies have been conducted on GAN-based optimization algorithms, in which offspring are generated along directions learned from the distribution of the better candidate solutions. Tan et al. [17] proposed a generative adversarial optimization framework for single-objective continuous optimization problems: a new solution is produced by the generator from a candidate and input noise, and the discriminator then predicts whether the fitness of the generated solution is better than that of the original candidate. He et al. [15] first presented a GAN-based multi-objective evolutionary algorithm, in which the candidates are classified into two datasets labeled as real and fake, respectively; after the generator is trained with these samples, each offspring is formed either by the generator or by genetic operators with equal probability. Despite these applications of GAN to continuous optimization problems, fewer works address combinatorial optimization. Probst [18] employed a GAN to build the probability model that approximates the distribution of the candidates in an estimation of distribution algorithm, and the offspring are sampled from this model in order to find the optimum of a scalar combinatorial optimization problem. Different from these works, a multi-objective combinatorial generative adversarial optimization algorithm (MOCGAO) is proposed for MOCOPs in this paper. The main contributions of this study are summarized as follows:

(1) MOCGAO takes advantage of the learning and generative abilities of GAN to generate superior solutions for MOCOPs.
(2) A classification strategy is developed that identifies the non-dominated solutions found during the evolution as the real samples for training GAN, since they carry the distribution information of the population.
(3) An offspring reproduction strategy is designed to form a new feasible solution from the decimal output of the generator for MOCOPs.
(4) The proposed method is applied to the participant selection problem in crowdsensing, and the corresponding offspring reproduction strategy is given in detail.

The rest of the paper is organized as follows. In Sect. 2, the key issues of multi-objective combinatorial generative adversarial optimization are presented in detail. The MOCGAO-based participant selection strategy for crowdsensing is illustrated in Sect. 3. The experimental results are compared and further analyzed in Sect. 4. Finally, Sect. 5 concludes the paper and outlines topics for future research.

According to the steps of MOCGAO shown in Algorithm 1, the initial population P(0) is constructed from randomly produced individuals together with the optimum of each single objective obtained by a greedy algorithm. The latter provide the extreme points for training GAN and thus speed up the convergence of the network. After GAN is trained on the classified individuals, the generator outputs the offspring Q(t). N individuals selected from the union of P(t) and Q(t) by the non-dominated sorting and elitist selection strategy of NSGA-II compose the population of the next generation P(t + 1). Finally, the Pareto-optimal solutions are returned once the termination condition is satisfied. Clearly, the key issues of the proposed method are how to classify the population, how to train GAN, and how to reproduce the offspring. For the detailed selection strategy of NSGA-II, refer to Ref. [19].

Both the real and the fake samples provide effective information for training GAN. In image processing [20-22] and text generation [23], the real data distributions of the images and texts are normally available in advance. For practical multi-objective optimization problems, however, it is difficult to know the fitness landscape, let alone the true Pareto-optimal solutions, before the evolution. To address this issue, He et al. [15] partitioned the population of each generation into two datasets of the same size and treated the one with better convergence and diversity as real samples. In that scheme, both the non-dominated individuals and evenly distributed dominated ones form the training data; the latter help the GAN to produce offspring with better diversity, but the distribution of the worse individuals is also learned by GAN, which slows down the convergence. Different from it, in MOCGAO only the non-dominated individuals are labeled as real and employed to form the real dataset x_r, while the rest of the population are classified into the fake dataset x_f. Because GAN iteratively learns only the distribution of the better solutions, the generated offspring are expected to approximate the true Pareto solutions more closely.
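A minimal, self-contained sketch of this classification step is given below in plain NumPy. The function names are illustrative rather than taken from the authors' code, and all objectives are assumed to be maximized, as in the participant selection model of Sect. 3:

    import numpy as np

    def dominates(a, b):
        # a dominates b under maximization: no worse on every objective, better on at least one.
        return np.all(a >= b) and np.any(a > b)

    def split_real_fake(objectives):
        # objectives: array of shape (pop_size, n_objectives).
        # Returns indices of the non-dominated individuals (real samples x_r)
        # and of the remaining individuals (fake samples x_f).
        n = objectives.shape[0]
        nondominated = [i for i in range(n)
                        if not any(dominates(objectives[j], objectives[i])
                                   for j in range(n) if j != i)]
        dominated = [i for i in range(n) if i not in nondominated]
        return nondominated, dominated

    # Example: four individuals evaluated on two objectives (both maximized).
    objs = np.array([[0.9, 0.2], [0.6, 0.7], [0.5, 0.5], [0.2, 0.9]])
    real_idx, fake_idx = split_real_fake(objs)   # real_idx -> [0, 1, 3], fake_idx -> [2]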
The generator and the discriminator of MOCGAO both consist of a feedforward neural network with a single hidden layer, as shown in Fig. 1. Each node in the hidden layer employs the ReLU activation function, while the nodes in the output layer adopt the sigmoid function. Noise obeying the continuous uniform distribution Z ∼ U(−1, 1) is employed as the input signal of the generator. The output G(z) is then transformed into an offspring x̃ by the offspring reproduction strategy Off(·), and the generated sample x̃ is used to compute the loss function of the generator. Subsequently, the real samples x_r, the fake samples x_f, and the generated samples x̃ are all employed to train the discriminator with a comprehensive loss function. The parameters of GAN are learned by the Adam algorithm [24], which maintains a separate adaptive learning rate for each parameter, instead of the traditional gradient descent method.

For most combinatorial optimization problems, a solution is normally encoded in binary or integer form. However, the output of the generator in GAN is a decimal between 0 and 1, which cannot directly represent an individual of a combinatorial optimization problem. To this end, a novel offspring reproduction strategy is presented, with the purpose of forming a valid individual from the decimal output of GAN. The number of neurons in the output layer of the generator is equal to the dimension of the decision variables. For a binary-coded individual, the output of each neuron is treated as the probability of the corresponding gene being set to 1: the outputs of all neurons of the generator are sorted in descending order, and the variable corresponding to the first not-yet-selected output is labeled 1. For an integer-coded individual, by contrast, the possible integer values of the decision variables are assigned to the genes corresponding to the sorted neurons in the same descending order; that is, the maximum integer value is first assigned to the gene whose neuron has the maximum output. As shown in Fig. 2, the same decimal output of the generator is therefore mapped to different offspring depending on the encoding scheme of the individual.
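The sketch below illustrates one possible reading of this reproduction strategy; it is not the authors' implementation. decode_binary assumes a problem in which exactly k genes must be set to 1 (k is problem dependent, as in the participant selection problem of Sect. 3), and decode_integer assumes a permutation-style encoding in which each available integer value is used exactly once:

    import numpy as np

    def decode_binary(g_out, k):
        # g_out: decimal generator outputs in [0, 1], one per decision variable.
        # k:     number of genes to be set to 1 (problem dependent).
        order = np.argsort(-g_out)            # indices sorted by descending output
        child = np.zeros(len(g_out), dtype=int)
        child[order[:k]] = 1                  # highest outputs are labeled 1
        return child

    def decode_integer(g_out, values):
        # values: possible integer values of the decision variables, one per gene.
        assert len(values) == len(g_out)
        order = np.argsort(-g_out)
        child = np.empty(len(g_out), dtype=int)
        # The largest value goes to the gene whose neuron has the maximum output,
        # the second largest to the next, and so on.
        for value, idx in zip(sorted(values, reverse=True), order):
            child[idx] = value
        return child

    g = np.array([0.1, 0.8, 0.4, 0.6])
    decode_binary(g, k=2)                      # -> [0, 1, 0, 1]
    decode_integer(g, values=[1, 2, 3, 4])     # -> [1, 4, 2, 3]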
To verify the effectiveness of the proposed MOCGAO algorithm, it is applied to the participant selection problem in crowdsensing, which allocates all sensing tasks to suitable participants selected from the set of sensing users. Assume that the sensing tasks, denoted by T = {t_1, t_2, ..., t_m}, are published on the crowdsensing system, and that the mobile users, denoted by U = {u_1, u_2, ..., u_n}, wish to participate as executors. Suppose that each task needs to be fulfilled by ξ users and that each user is assigned to at most one task. Let e_ij and a_ij be the sensing ability and the reward of user u_j for task t_i, respectively. The sensing ability is determined by the distance between the task and the user, and the reward depends on the reward-sharing mechanism between the user and his cooperative friend; more details can be found in Reference [25]. Based on this, a participant selection model that maximizes both the sensing quality of the tasks and the reward of the participants can be constructed over the Task-User matrix [x_ij]_{m×n}, shown in Fig. 3(a), where x_ij = 1 means that task t_i is allocated to the jth user. The Task-User matrix is converted into a binary vector, as shown in Fig. 3(b), in order to simplify the training of GAN.

According to this encoding scheme of the participant selection model, the output of each neuron in the output layer of the generator represents the probability of a user being assigned to a task. To obtain feasible offspring with the proposed offspring reproduction strategy, the number of output neurons is set to m*n. For each task, ξ unallocated users are selected in descending order of the corresponding probabilities, and the corresponding genes are set to 1. As shown in Fig. 4, two tasks are allocated among six users, and each task needs two participants. For task t_1, the two users with the highest probabilities, u_1 and u_6, are selected, so the first and sixth genes are set to 1. For task t_2, u_4 and u_5 are chosen; although u_6 has a higher probability than u_5 for this task, it is not employed here because each user may take at most one task.
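This task-wise decoding can be sketched as the following self-contained example. The function name, the row-major reading of the generator output, the fixed task processing order, and the numeric probabilities (chosen only to reproduce the outcome described for Fig. 4) are all illustrative assumptions, not details taken from the paper:

    import numpy as np

    def decode_participants(g_out, m, n, xi):
        # g_out: m*n generator outputs; entry (i, j) is read as the probability
        #        that user j is assigned to task i.
        # xi:    number of participants required by every task.
        prob = np.asarray(g_out).reshape(m, n)
        x = np.zeros((m, n), dtype=int)          # Task-User matrix, Fig. 3(a)
        allocated = set()                        # each user takes at most one task
        for i in range(m):                       # tasks processed in index order
            for j in np.argsort(-prob[i]):       # users in descending probability
                if j not in allocated:
                    x[i, j] = 1
                    allocated.add(j)
                    if x[i].sum() == xi:
                        break
        return x.reshape(-1)                     # flattened binary vector, Fig. 3(b)

    # The situation of Fig. 4: two tasks, six users, two participants per task.
    probs = [0.90, 0.10, 0.20, 0.30, 0.25, 0.80,    # task t_1
             0.20, 0.10, 0.15, 0.70, 0.50, 0.60]    # task t_2
    decode_participants(probs, m=2, n=6, xi=2)
    # t_1 -> {u_1, u_6}; t_2 -> {u_4, u_5}, since u_6 is already occupied by t_1.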
Extensive experiments are conducted to examine the performance of the proposed MOCGAO. The main experimental parameters are listed in Table 1, and the other parameters of the participant selection problem are set as in paper [25]. The hypervolume (HV) and coverage (C) metrics [26] are employed to evaluate the performance of the compared algorithms, and the best results are marked in bold. To verify the effectiveness of the Adam method, we compare the performance of MOCGAO with the Adam method (Adam) against that of MOCGAO with the gradient descent method (Gradient).

The statistical results listed in Table 2 show that, no matter how many users participate in completing the tasks, MOCGAO with the Adam method converges to the best Pareto solutions, because Adam adopts an adaptive learning rate for each parameter instead of the static one used in gradient descent, which is more efficient for training GAN.

The rationality of the greedy-based initialization method (Greedy) is analyzed by comparing it against MOCGAO with a random initialization strategy (Random), as listed in Table 3. Apparently, the extreme point of each objective provides more information about the true distribution of the Pareto front, which helps to speed up the training of GAN and to generate more promising offspring. Thus, MOCGAO with the greedy-based initialization finds better Pareto-optimal solutions, as indicated by the larger HV values.

Two representative multi-objective evolutionary algorithms, NSGA-II [19] and MOEA/D [27], are employed as comparison algorithms. Figure 5 depicts the Pareto fronts obtained by the three compared algorithms on the twelve instances with m ∈ {5, 10, 15, 20} tasks and n ∈ {100, 200, 300} users. It can be observed that the Pareto-optimal solutions generated by NSGA-II have the worst convergence on all instances, while MOEA/D tends to generate Pareto solutions concentrated in a small region, though with better convergence. In contrast, the Pareto fronts found by the proposed MOCGAO achieve the best performance in both convergence and distribution on 11 out of the 12 instances, the exception being the instance with m = 10 and n = 100. The statistical results of HV and C summarized in Tables 4 and 5 confirm these observations. Consequently, the proposed algorithm outperforms the compared algorithms owing to the better diversity of the offspring maintained by the improved GAN, especially for problems with a large number of participants.

To overcome the weakness of traditional evolutionary algorithms in solving multi-objective combinatorial optimization problems with large-scale decision variables, a generative adversarial network, with its strong learning and generative abilities, is introduced to construct a multi-objective combinatorial generative adversarial optimization algorithm. The extreme point of each objective obtained by the greedy algorithm is combined with randomly produced individuals to form the initial population, with the purpose of speeding up the training of GAN. During the evolution, the non-dominated solutions in the current generation are identified as real samples, while the rest are fake, and the classified solutions are employed to train GAN. More specifically, the Adam method with an adaptive learning rate is employed to update the parameters of GAN, and an offspring reproduction strategy is presented to obtain a feasible offspring from the decimal output of the generator. The proposed algorithm is then applied to the participant selection problem in crowdsensing, for which a detailed offspring reproduction strategy is given. Experiments conducted on crowdsensing systems with various numbers of tasks and participants show that the proposed algorithm is superior to the compared algorithms in both convergence and distribution. Combining evolutionary operators with GAN will be our future work, with the purpose of generating offspring with a more uniform distribution.
References
[1] Robust dynamic multi-objective vehicle routing optimization method
[2] Hybrid metaheuristics in combinatorial optimization: a survey
[3] Immune cooperation mechanism based learning framework
[4] Grey wolf optimizer-based approach to the tuning of PI-fuzzy controllers with a reduced process parametric sensitivity
[5] Accelerated cuckoo optimization algorithm for capacitated vehicle routing problem in competitive conditions
[6] Efficient resource allocation in cooperative coevolution for large-scale global optimization
[7] An evolutionary algorithm for large-scale sparse multiobjective optimization problems
[8] Model-based evolutionary algorithms: a short survey
[9] Ensemble prediction-based dynamic robust multi-objective optimization methods
[10] Evolutionary computation meets machine learning: a survey
[11] Novel interactive preference-based multi-objective evolutionary optimization for bolt supporting networks
[12] Deep reinforcement learning for multi-objective optimization
[13] Reinforcement learning for solving the vehicle routing problem
[14] ERNEAD: training of artificial neural networks based on a genetic algorithm and finite automata theory
[15] Evolutionary multi-objective optimization driven by generative adversarial networks
[16] Advances in Neural Information Processing Systems
[17] Generative adversarial optimization
[18] Generative adversarial networks in estimation of distribution algorithms for combinatorial optimization
[19] A fast and elitist multiobjective genetic algorithm: NSGA-II
[20] Photo-realistic single image super-resolution using a generative adversarial network
[21] Sharp and real image super-resolution using generative adversarial network
[22] Single image dehazing via conditional generative adversarial network
[23] Language generation with recurrent generative adversarial networks without pre-training
[24] Adam: a method for stochastic optimization
[25] MOEA/D-based participant selection method for crowdsensing with social awareness
[26] PlatEMO: a MATLAB platform for evolutionary multi-objective optimization
[27] MOEA/D: a multiobjective evolutionary algorithm based on decomposition