key: cord-0508865-0qvw615q authors: Martínez-Álvarez, F.; Asencio-Cortés, G.; Torres, J. F.; Gutiérrez-Avilés, D.; Melgar-García, L.; Pérez-Chacón, R.; Rubio-Escudero, C.; Riquelme, J. C.; Troncoso, A. title: Coronavirus Optimization Algorithm: A bioinspired metaheuristic based on the COVID-19 propagation model date: 2020-03-30 journal: nan DOI: nan sha: 70995e600ed966de4ab5a7e9a47386c333f2181b doc_id: 508865 cord_uid: 0qvw615q

A novel bioinspired metaheuristic is proposed in this work, simulating how the coronavirus spreads and infects healthy people. From an initial individual (the patient zero), the coronavirus infects new patients at known rates, creating new populations of infected people. Every individual can either die or infect and, afterwards, be sent to the recovered population. Relevant terms such as re-infection probability, super-spreading rate or traveling rate are introduced in the model in order to simulate the coronavirus activity as accurately as possible. The Coronavirus Optimization Algorithm has two major advantages compared to other similar strategies. First, the input parameters are already set according to the disease statistics, which spares researchers from initializing them with arbitrary values. Second, the approach is able to end after several iterations, without this value having to be set either. The infected population initially grows at an exponential rate but, after some iterations, the high number of recovered and dead people starts decreasing the number of infected people in new iterations. As an application case, it has been used to train a deep learning model for electricity load forecasting, showing quite remarkable results after few iterations.

The coronavirus (COVID-19), a new respiratory virus first reported in the city of Wuhan, China, has spread worldwide, with more than 100000 infected people reported so far.
Much remains unknown about the virus, including how many people may have very mild or asymptomatic infections, and whether they can transmit the virus. The precise dimensions of the outbreak are hard to know.

Bioinspired models typically mimic behaviors from nature and are known for their successful application in hybrid approaches. Viruses can infect people, and these people can either die, infect other people or simply recover after the disease. Vaccines and the immune defense system typically fight the disease and help to mitigate its effects while an individual is still infected. This behavior can be modeled to search for suboptimal solutions in large search spaces.

The Virus Optimization Algorithm (VOA) was proposed by Liang and Cuevas-Juárez in 2016 [13] and later improved in [14]. However, like many other metaheuristics, the results of its application are highly dependent on its initial configuration. Additionally, it simulates generic viruses, without adding individualized properties for particular viruses.

It is known that metaheuristics must deal with huge search spaces, even infinite ones in the continuous case, and must find suboptimal solutions in reasonable execution time. The rapid propagation of the coronavirus, along with its ability to infect most of the countries in the world impressively fast, has inspired the novel metaheuristic proposed in this work, named Coronavirus Optimization Algorithm (CVOA). The main CVOA advantages over other similar approaches can be summarized as follows:

1. Coronavirus statistics are known by the scientific community. In this sense, the rate of infection, the mortality rate or the re-infection probability are already known. That is, CVOA is parametrized with actual values for rates and probabilities, which spares the user from performing an additional study on the most suitable setup configuration.
2. CVOA can stop the exploration of solutions after several iterations, with no need to be configured.
That is, the number of infected people increases during the first iterations; however, after a certain number of iterations, the number of infected people starts decreasing, until an empty set of infected individuals is reached.
3. Another relevant contribution of this work is the proposal of a new codification, discrete and of dynamic length, specifically designed for hybridizing LSTM with CVOA (or any other metaheuristic).

As for the limitations of the current approach, there is mainly one. Since there is no vaccine currently, it has not been included in the procedure to reduce the number of candidate individuals to be infected. This involves an exponential increase of the infected population in the first iterations and, therefore, an exponential increase of the execution time for such iterations. This issue, however, is partially solved with the isolation condition, which simulates, to some extent, individuals that cannot be infected at a particular iteration.

A study case is included in this work to discuss the CVOA performance. CVOA has been used to configure a Long Short-Term Memory (LSTM) architecture [12], a widely used type of artificial recurrent neural network (RNN) in the field of deep learning [5]. Data from the Spanish electricity consumption have been used to validate the accuracy. The results achieved verge on an error of 0.45%, substantially outperforming other well-established methods such as random forest, gradient-boosted trees, linear regression or deep learning optimized with other algorithms. The code, developed in Python with a discrete codification, is available in the supplementary material (along with an academic version in Java for a binary codification).
Finally, the authors acknowledge the need for further study on the performance on well-known functions [11]. However, given the relevance that the coronavirus is acquiring throughout the world (declared a pandemic by the World Health Organization) and the remarkable results achieved when combined with deep learning, they wanted to share this preliminary work, hoping it inspires future research in this direction.

The rest of the paper is organized as follows. Section 2 discusses related and recent works. The proposed methodology is introduced in Section 3. Section 4 proposes a discrete codification to hybridize deep learning models with CVOA and provides some illustrative cases. The results achieved are reported and discussed in Section 6. Finally, the conclusions drawn and future work suggestions are included in Section 7.

There are many bioinspired metaheuristics to solve optimization problems. Although CVOA has been conceived to optimize any kind of problem, this section focuses on optimization algorithms applied to hybridize deep learning models.

It is hard to find consensus among researchers on which method should be applied to which problem and, for this reason, many optimization methods have been proposed during the last decade to improve deep learning models. Generally, the criterion for selecting a method is its associated performance from a wide variety of perspectives. Low computational cost, accuracy or even implementation difficulty can be accepted as one of these criteria.

One of the most widespread metaheuristics used to improve deep learning parameters is the genetic algorithm (GA). For instance, an LSTM network optimized with GA can be found in [4]. To evaluate the proposed hybrid approach, the daily Korea Stock Price Index data were used, outperforming the benchmark model. In 2019, a network traffic prediction model based on LSTM and GA was proposed in [3]. The results were compared to pure LSTM and ARIMA, reporting higher accuracy.
Multi-agent systems have also been applied to optimize deep learning models. The use of Particle Swarm Optimization (PSO) can be found in [15]. The authors proposed a model based on kernel principal component analysis and a back propagation neural network with PSO for midterm power load forecasting. The hybridization of deep learning models with PSO was also explored in [8] but, this time, the authors applied the methodology for image classification purposes.

Ant colony optimization (ACO) models have also been used to hybridize deep learning. Thus, Desell et al. [6] proposed evolving deep recurrent neural networks using ACO, applied to the challenging task of predicting general aviation flight data. The work in [7] introduced a method based on ACO to optimize LSTM recurrent neural networks. Again, the field of application was flight data records obtained from an airline, containing flights that suffered from excessive vibration.

Some papers exploring the Cuckoo Search (CS) properties have been published recently as well. In [18], CS was used to find suitable heuristics for adjusting the hyper-parameters of another LSTM network. The authors claimed an accuracy above 96% for all the datasets examined. Nawi et al. [17] proposed the use of CS to improve the training of RNN in order to achieve fast convergence and high accuracy. The results obtained outperformed those of other metaheuristics.

The use of the artificial bee colony (ABC) optimization algorithm applied to LSTM can also be found in the literature. Hence, an LSTM optimized with ABC to forecast the bitcoin price was introduced in [21]. The combination of ABC and RNN was also proposed in [2] for traffic volume forecasting. This time the results were compared to standard backpropagation models.

From the analysis of these works, it can be concluded that there is an increasing interest in using metaheuristics in LSTM models.
However, not as many works as for artificial neural networks can be found in the literature, and none of them is based on a virus propagation model. These two facts, among others, justify the application of CVOA to optimize LSTM models.

This section introduces the CVOA methodology. Thus, Section 3.1 describes the steps. Section 3.2 suggests how parameters must be set. Section 3.3 shows the CVOA pseudocode. Finally, Section 3.4 comments on the pseudocode.

Step 1. Generation of the initial population. The initial population consists of one individual, the so-called patient zero (PZ). As in the coronavirus epidemic, it identifies the first human being infected.

Step 2. Disease propagation. Depending on the individual, several cases are evaluated:
1. Some of the infected individuals die. They die according to the coronavirus death rate (P_DIE). For simplicity, it is considered that such individuals cannot infect new individuals.
2. The individuals surviving the coronavirus will infect new individuals (intensification). Two types of spreading are considered, according to a given probability (P_SUPERSPREADER):
• Ordinary spreaders. Infected individuals will infect new ones according to the coronavirus spreading rate (SPREADING_RATE).
• Super-spreaders. Infected individuals will infect new ones according to the coronavirus super-spreading rate (SUPERSPREADING_RATE).
3. There is another consideration, since diversification must be ensured. Both ordinary spreaders and super-spreaders can travel and explore quite dissimilar solutions. Therefore, individuals have a probability of traveling (P_TRAVEL), which allows the disease to propagate to solutions that may be quite different (TRAVELER_RATE). If an individual is not a traveler, new solutions will change according to an ORDINARY_RATE. One individual can be both super-spreader and traveler.

Step 3. Updating populations. Three main populations are maintained and updated for each generation. 1. Dead population.
If any individual dies, it is added to this population and can never be used again.
2. Recovered population. After each iteration, infected individuals (after spreading the coronavirus according to the previous step) are sent to the recovered population. It is known that there is a reinfection probability. Hence, an individual belonging to this population could be re-infected at any iteration, provided that it meets the reinfection criterion (P_REINFECTION). Another situation has to be considered as well, since individuals can be isolated, simulating how the population has behaved according to local government policies. For the sake of simplicity, it is considered that an isolated individual is sent to the recovered population as well when meeting an isolation probability (P_ISOLATION).
[...] iteration, according to the procedure described in the previous steps.

Since CVOA simulates the coronavirus disease propagation, most of the rates (propagation, re-infection or death) are already known. This fact prevents the researcher from wasting time in selecting values for such rates and makes CVOA a metaheuristic that is quite easy to execute. The suggested values and associated discussion are listed below:
1. P_DIE. An infected individual can die with a given probability. Currently, this rate is set at almost 5% by the scientific community. Therefore, P_DIE = 0.05.
2. P_SUPERSPREADER. It is the probability that an individual spreads the disease to a greater number of healthy individuals. It is known that this situation affects 10% of the population; therefore, P_SUPERSPREADER = 0.1. After this condition is evaluated, two situations can be found:
• ORDINARY_RATE. If the infected individual is not a super-spreader, then the infection rate is 2.5. It is suggested that this rate varies from 0 to 5.
• SUPERSPREADER_RATE.
If the infected individual turns out to be a super-spreader, then he/she infects up to 15 healthy individuals, as reported by the scientific community. It is suggested that this rate varies from 6 to 15.
3. P_REINFECTION. It is known that a recovered individual can be re-infected. The current reported rate is 14%. Therefore, P_REINFECTION = 0.14.
4. P_ISOLATION. This value is uncertain because countries are adopting different policies to isolate people. This parameter helps to reduce the exponential growth of the infected population after each iteration. Therefore, a high value must be assigned to it. It is suggested that P_ISOLATION = 0.

This section provides the pseudo code of the most relevant functions for the CVOA, along with some comments to better understand them.

This is the main function and its pseudo code can be found in Algorithm 1. Once the new population is formed, it is evaluated whether any solution outperforming the current one has been found and, in such a case, the best individual is updated.

    dead ← die(infectedPopulation)
    for all i ∈ infectedPopulation do
        aux ← infect(i, recovered, dead)
        if notnull(aux) then
            newInfectedPopulation ← aux
        end if
    end for
    currentBestIndividual ← selectBestIndividual(newInfectedPopulation)
    if fitness(currentBestIndividual) > fitness(bestIndividual) then

The effective generation of the new infected individuals must be carried out in the function replicate, whose pseudo code is not provided because it depends on the codification and the nature of the problem to be optimized. This function must return a set of new infected individuals, according to the aforementioned rates. Specific information on how this codification and replication are done for LSTM models is given in Section 4. The pseudo code for the described procedure can be found in Algorithm 3.

This function is called from the main function. It evaluates all individuals in the infected population and determines whether they die or not, according to the given P_DIE.
Those meeting this condition are sent to the dead list. The corresponding algorithm describes this procedure.

This is an auxiliary function used to find the best fitness in a list of infected individuals. Its pseudo code is shown in Algorithm 5.

    if fitness(i) > bestFitness then
        bestFitness ← fitness(i)

The PZ, as described previously, is the individual of the first iteration in the CVOA algorithm. Following the particular hybridization proposed, a random individual is created taking into account the codification defined above. First, a random value for the learning rate of the PZ is generated. Specifically, a number between 0 and 5 is drawn at random from a uniform distribution. Such limits are those indicated in Figure 1, according to the possible encoded values of the learning rate element. The same process is carried out to produce a random value for the dropout element. In this case, a random number between 0 and 8 is generated. Second, a random number of layers is generated for the element L of the patient zero individual. This number of layers is a random number between 2 and 11. Note that the first layer is reserved for the input layer of the neural network, as discussed before. Finally, for each one of the L layers, a random number of units is generated between 0 and 11, covering the possible encoded values for the number of units previously defined (see Figure 1).

The infection procedure described here corresponds to the functionality of replicate(), introduced in line 4 of Algorithm 3. This procedure takes an individual as input and returns an infected individual according to the following procedure. The first step is to determine whether the element L of the infected individual will be mutated. The probability that such a mutation occurs has been set to 1/3, so that every element has the same probability to mutate.
If the mutation occurs, then the element L of the individual is modified according to the process described in Section 4.4. When an individual is infected at the position of the element L, the list of elements that encodes the number of units per layer (LAYER 1, ..., LAYER L) must be resized accordingly. If the new number of layers after the infection is lower than its previous value, then the last leftover elements are removed. For instance, if the initial individual is [...]

The process carried out to change the value of a specific element of an individual is described below. First, a signed change amount C ∈ {−2, −1, +1, +2} is randomly determined using the following criteria. A random real number P between 0 and 1 is generated using a uniform distribution. If P < 0.25, then the change amount will be C = −2. Else if P < 0.5, then the change amount will be C = −1. Else if P < 0.75, then the change amount will be C = +1. Else, the change amount will be C = +2. Once the amount of change is determined, the new value for the infected element is computed. If its previous value is V, then the new value after the single position mutation will be V' = V + C. If the new value V' exceeds the limits defined for the individual codification, it is set to the maximum or minimum allowed value accordingly.

This section provides an overview of how populations evolve over time, and how the search space is explored to reach the optimum value for a given fitness function. To conduct this experimentation, a simple binary codification has been used. Tables 1-5 summarize the results achieved for each of these lengths, respectively. Every experiment has been launched 50 times, determining that, on average, the optimum value was found after 11, 12, 14, 15 and 17 iterations, respectively. Each table shows the results of a particular execution meeting this criterion.
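The single-position mutation described above (a signed change amount C followed by clamping to the codification limits) can be sketched in a few lines of Python. This is a minimal illustration rather than the released implementation: the function names and the explicit `low`/`high` arguments are assumptions introduced here, and the example bounds 0 and 5 are simply the range quoted for the learning rate element.

```python
import random


def change_amount(p):
    """Map a uniform draw p in [0, 1) to the signed change amount C."""
    if p < 0.25:
        return -2
    if p < 0.5:
        return -1
    if p < 0.75:
        return 1
    return 2


def mutate_value(value, low, high, rng):
    """Compute V' = V + C and clamp it to the allowed range [low, high].

    `low` and `high` are hypothetical arguments standing for the limits
    of the encoded element being mutated.
    """
    new_value = value + change_amount(rng.random())
    return max(low, min(high, new_value))


if __name__ == "__main__":
    rng = random.Random(0)
    # Learning rate element, encoded between 0 and 5 in the text.
    print([mutate_value(5, 0, 5, rng) for _ in range(5)])
```

Passing the limits explicitly keeps the same function usable for every element of the individual, since each element has its own range of encoded values.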
[Table fragment; column headers lost in extraction]
  ...               7          4
   6     2     33     5          4
   7     2     38    19          4
   8     3     48    13          1
   9     3     61    16          1
  10     4     77    18          1
  11     5     95    20          0

Table 6

Finally, Figure 3 shows the number of recovered and dead people. These two curves accumulate these numbers, since dead and recovered people are sent to their respective lists and are no longer infected (except for those in the recovered list, which can be reinfected with the given probability P_REINFECTION).

[Table fragment; column headers lost in extraction]
   9     40     787     586    7963684
  10     69    1369    1100    7873636
  11    124    2461    2129     597529
  12    230    4579    3957      68121
  13    428    8499    7211      17956
  14    789   15644   13305         36
  15   1454   28807   24167         36
  16   2662   52622   43184          1
  17   4821   95116   76288          0

This section reports the results achieved by hybridizing a deep learning model with CVOA. Section 6.1 describes the study case selected to prove the effectiveness of the proposed algorithm. Section 6.2 describes the dataset used. Section 6.3 discusses the results achieved and includes some comparative methods.

The forecasting of future values fascinates human beings. Being able to understand how certain variables evolve over time has many benefits in many fields. Electricity demand forecasting is no exception, since there is a real need for planning the amount to be generated or, in some countries, to be bought. The use of machine learning to forecast such time series has been intensive during the last years [16]. But, with the development of deep learning models, and of LSTM in particular, much research is being conducted in this application field [1].

The time series considered in this study is related to the electricity consumption in Spain from January 2007 to June 2016, the same as used in [19]. It is a time series spanning 9 years and 6 months with a high sampling frequency (10 minutes), resulting in 497832 measurements in total, stored in a 33 MB file. As in the original paper, the prediction horizon is 24, that is, this is a multi-step strategy with h = 24. The number of past samples used for the prediction of these 24 values is 168.
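To make the multi-step setup concrete, the sketch below builds (input, target) pairs in which 168 past values are used to predict the next 24. It is only an illustration in Python: the function name `make_windows` and the stride of one sample between consecutive windows are assumptions, since the text states the window sizes but not how consecutive windows are spaced.

```python
def make_windows(series, history=168, horizon=24):
    """Split a series into (history -> horizon) training pairs.

    Each input holds `history` consecutive values and its target holds
    the `horizon` values that immediately follow. Consecutive windows
    are shifted by one sample (an assumption made for this sketch).
    """
    inputs, targets = [], []
    last_start = len(series) - history - horizon
    for start in range(last_start + 1):
        split = start + history
        inputs.append(series[start:split])
        targets.append(series[split:split + horizon])
    return inputs, targets


if __name__ == "__main__":
    toy_series = list(range(200))  # stand-in for the consumption values
    x, y = make_windows(toy_series)
    print(len(x), len(x[0]), len(y[0]))
```

With this layout a single model outputs all 24 values at once, which is the multi-step strategy with h = 24 described in the text.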
Furthermore, the dataset was split into 70% for the training set and 30% for the test set and, in addition, 30% of the training set has been selected as the validation set, in order to find the optimal parameters. The training set covers the period from January [...]

This section reports the results obtained by hybridizing LSTM with CVOA, by means of the codification proposed in Section 4, to forecast the Spanish electricity dataset described in Section 6.2. Linear regression (LR), decision tree (DT), gradient-boosted trees (GBT) and random forest (RF) models have been used with a parametrization setup according to that studied in [10, 9]. A deep neural network optimized with a grid search (DNN-GS) according to [19] has also been applied. Finally, another deep neural network optimized with random search (DNN-RS), and its version smoothed with a low-pass filter (DNN-RS-LP) [20], have also been applied. These results, along with those of LSTM-CVOA, are summarized in Table 7, expressed in terms of the mean absolute percentage error (MAPE). It can be observed that LSTM-CVOA outperforms all evaluated methods, which have themselves shown particularly remarkable performance for this real-world dataset. Another relevant consideration is that the compared methods generated 24 independent models, one for each value forming h. So it would be expected that LSTM-CVOA performance would increase if independent models were generated for each of the values in h.

These results have been achieved with the individual {4, 0, 8}{9, 7, 2, 7, 2, 7, 10, 7}, which, once decoded, involves the following architecture parameters:
• Learning rate: 10E-04.
• Dropout: 0.
• Number of layers: 8.
• Units per layer: [250, 200, 75, 200, 75, 200, 275, 200]

7 Conclusions and future works

This work has introduced a novel bioinspired metaheuristic, based on the coronavirus behavior. On the one hand, CVOA has three major advantages.
First, its close relation to the coronavirus spreading model spares researchers from making any decision about the input values. Second, it ends after a certain number of iterations due to the exchange of individuals between healthy and dead/recovered lists. Third, a novel discrete and dynamic codification has been proposed to hybridize deep learning models. On the other hand, it exhibits some limitations. Such is the case of the exponential growth of the infected population as time (iterations) goes by.

Additional experimentation must be conducted in order to assess its performance on standard F functions and find out the search space shapes in which it can be more effective.

Some actions must be taken to reduce the size of the infected population, which grows exponentially, after a number of iterations. In this sense, a vaccine must be implemented. This would involve adding healthy individuals to the recovered list at a given VACCINE_RATE.

Another suggested research line is using dynamic rates. For instance, the observation of the preliminary effects of the isolation in countries like China or Italy suggests that the INFECT_RATE could be simulated as a Poisson process, but more time and more country recoveries are required to confirm this trend.

A parallel and distributed version is proposed for future works. CVOA can be easily transformed into a multi-agent metaheuristic, in which different agents search for the best solution in a collaborative way. In this sense, the simulation of the existence of more strains can be considered. So far, it is known that there exists one strain much more aggressive than the original one. This could be modeled as a new agent with a different initial setup (a higher DEATH_RATE, for instance), sharing the recovered or dead lists.

Finally, for the particular multi-step forecasting problem analyzed, it would be desirable to generate independent models for each of the values that form the prediction horizon h.
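For orientation, the overall loop summarized in this work can be sketched in Python over a binary codification. This is a minimal sketch under stated assumptions, not the released code: the fitness function (number of 1 bits, minimized), the bit-flip `replicate`, the rule that skips children already infected, and the values chosen for `P_TRAVEL` and `P_ISOLATION` are all hypothetical, since the text gives no usable value for the last two. Only `P_DIE`, `P_SUPERSPREADER`, `P_REINFECTION` and the two spreading ranges are taken from the text.

```python
import random

# Rates quoted in the text.
P_DIE = 0.05
P_SUPERSPREADER = 0.1
P_REINFECTION = 0.14
# Assumptions: no usable value is given in the text for these two.
P_TRAVEL = 0.1
P_ISOLATION = 0.7


def fitness(bits):
    """Hypothetical fitness to minimize: number of 1 bits (0 is optimal)."""
    return sum(bits)


def replicate(bits, n_children, n_flips, rng):
    """Create children by flipping n_flips distinct random positions."""
    children = []
    for _ in range(n_children):
        child = list(bits)
        for pos in rng.sample(range(len(child)), n_flips):
            child[pos] ^= 1
        children.append(tuple(child))
    return children


def cvoa(n_bits=12, max_iter=30, seed=0):
    rng = random.Random(seed)
    patient_zero = tuple(rng.randint(0, 1) for _ in range(n_bits))
    infected, recovered, dead = {patient_zero}, set(), set()
    best, history = patient_zero, []
    for iteration in range(max_iter):
        if not infected:
            break  # no infected individuals left: the search stops
        new_infected = set()
        for individual in infected:
            if rng.random() < P_DIE:
                dead.add(individual)  # dead individuals do not spread
                continue
            if rng.random() < P_SUPERSPREADER:
                n_children = rng.randint(6, 15)
            else:
                n_children = rng.randint(0, 5)
            # Travelers flip several bits (diversification), others one.
            if rng.random() < P_TRAVEL:
                n_flips = rng.randint(2, max(2, n_bits // 2))
            else:
                n_flips = 1
            for child in replicate(individual, n_children, n_flips, rng):
                if child in dead or child in infected or child in new_infected:
                    continue
                if child in recovered:
                    if rng.random() < P_REINFECTION:
                        recovered.discard(child)
                        new_infected.add(child)
                elif rng.random() < P_ISOLATION:
                    recovered.add(child)  # isolated: sent to recovered
                else:
                    new_infected.add(child)
        recovered |= infected - dead  # survivors recover after spreading
        best = min(new_infected | {best}, key=fitness)
        infected = new_infected
        history.append((iteration, len(dead), len(recovered),
                        len(infected), fitness(best)))
    return best, history


if __name__ == "__main__":
    best, history = cvoa()
    for row in history:
        print(row)
```

Each row of `history` holds the iteration index, the sizes of the dead, recovered and infected populations, and the best fitness found so far, which makes it easy to inspect how the populations evolve from one iteration to the next.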
Along with this paper, an academic version in Java for a binary codification is provided, with a simple fitness function (https://github.com/DataLabUPO/CVOA_academic). Additionally, the code in Python for the deep learning approach is also provided, with a more complex codification and the suggested implementation, according to the pseudocode provided (https://github.com/DataLabUPO/CVOA_LSTM).

[1] Deep learning framework to forecast electricity demand
[2] Recurrent Neural Network Training using ABC Algorithm for Traffic Volume Prediction
[3] Network Traffic Prediction Based on LSTM Networks with Genetic Algorithm
[4] Genetic Algorithm-Optimized Long Short-Term Memory Network for Stock Market Prediction
[5] Deep learning on big, sparse, behavioral data
[6] Evolving deep recurrent neural networks using ant colony optimization
[7] Using ant colony optimization to optimize long short-term memory recurrent neural networks
[8] Particle swarm optimization of deep neural networks architectures for image classification. Swarm and Evolutionary Computation
[9] Multi-step forecasting for big data time series based on ensemble learning. Knowledge-Based Systems
[10] Scalable forecasting techniques applied to big electricity time series
[11] Handbook of metaheuristics
[12] Stock market prediction using optimized deep-convlstm model
[13] A novel metaheuristic for continuous optimization problems: Virus optimization algorithm. Engineering Optimization
[14] A self-adaptive virus optimization algorithm for continuous optimization problems
[15] Midterm power load forecasting model based on kernel principal component analysis and back propagation neural network with particle swarm optimization
[16] A survey on data mining techniques applied to electricity-related time series forecasting
[17] A New Optimized Cuckoo Search Recurrent Neural Network (CSRNN)
[18] Auto Tuning of RNN Hyper-parameters using Cuckoo Search Algorithm
[19] A scalable approach based on deep learning for big data time series forecasting
[20] Random hyper-parameter search-based deep neural network for power consumption forecasting
[21] Artificial Bee Colony-Optimized LSTM for Bitcoin Price Prediction

The authors would like to thank the Spanish Ministry of Economy and Competitiveness for the support under project TIN2017-88209-C2.