key: cord-0058066-l7i59r1m
title: Neuroevolutionary Approach to Metamodel-Based Optimization in Production and Logistics
authors: Jackson, Ilya
date: 2021-01-06
journal: Reliability and Statistics in Transportation and Communication
DOI: 10.1007/978-3-030-68476-1_8

Considering the recent success of neuroevolutionary approaches in automated machine learning, their application to metamodeling appears promising. This paper examines whether it is feasible and efficient to combine an artificial neural network with a genetic algorithm in order to automate the metamodeling of logistic and production systems. In addition, the possibility of using the proposed approach to derive optimal control parameters is discussed.

The recent COVID-19 pandemic has tested the sustainability of supply chains to its limits and revealed serious vulnerabilities of logistic and production systems across the globe. As a result, the evolution of traditional ways of creating and distributing value has been notably accelerated [1]. Problems in production and logistics are industry-specific and contain non-standard features. This fact, along with complexity, dimensionality and stochasticity, frequently leads to analytic intractability. That is why approximate solutions, simulation-based optimization and derivative-free "black-box" optimization are common choices for solving real-world problems in production and logistics. Nevertheless, realistic and detailed models require a sheer amount of computational resources. In light of this fact, it is quite reasonable to take advantage of an alternative model that is specifically built in order to approximate the original one with a tolerable degree of accuracy. Such an approach is widely known as metamodeling [2].
Nowadays, such machine learning (ML) algorithms as artificial neural networks (ANN), support-vector machines, gradient boosting algorithms and many others demonstrate state-of-the-art performance in such nontrivial tasks as image recognition, machine translation, speech recognition, and fitness approximation. This undisputable success could not bypass metamodeling. Smarter solutions that take advantage of ML-based metamodels significantly accelerate optimization search and thus lead to faster decision-making, allowing companies to maintain business continuity in the face of disruptions and uncertainties, which is especially important in post-pandemic times.

Despite the promising potential of ML algorithms in metamodeling, there is a serious challenge that cannot be ignored. Although ML algorithms are universal and task-unspecific, they are sensitive to a plethora of design decisions, for example, data preparation and pre-processing, feature selection and extraction, model selection and hyperparameter optimization [3]. That is why data scientists, artificial intelligence developers and other experts are deeply involved in the development, fine-tuning and customization of ML models. This procedure is labor-intensive and requires additional time of highly qualified specialists along with financial resources. Automation of machine learning (AutoML) clearly has the potential to make advanced ML-based metamodels accessible to domain scientists working in the field of logistics and production, which can be considered a "democratization" of ML. However, recent publications emphasize that widespread industrial applications require AutoML solutions to be tailored for solving a particular class of problems within the domain [3] [4] [5]. There are dozens of ML-based solutions for metamodeling of production and logistic systems. Jarosz et al.
compared such metamodeling approaches as ANN, support-vector machines and random forest based on a two-stage production process of blister copper [6]. Yang et al. have demonstrated the advantages of ML-based metamodeling techniques for the optimization of additive manufacturing systems [7]. A recent paper reports on a computational framework for industrial lot-sizing. The framework incorporates the concepts of parallel computing, metaheuristics and ML (a gradient tree boosting regressor) for solving an economic lot-size problem [8]. Notwithstanding the astonishing performance of ML metamodels, there is a serpent in this Eden: although the vast majority of ML models are universal function approximators, human experts still play a significant role. Taking into consideration the recent success of neuroevolutionary approaches in neural architecture search and hyperparameter optimization, their application to metamodeling in logistics and production becomes promising.

Neuroevolution comprises a neural and an evolutionary component. In the proposed computational framework, a genetic algorithm (GA) is chosen as the evolutionary component and a multilayer perceptron (MLP) as the neural one. The computational framework for automated metamodeling relies on an extension of the universal approximation theorem. The theorem states that any feed-forward fully connected ANN with at least one hidden layer, a finite number of neurons and a non-linear activation function can approximate nonlinear input-output maps [9]. The MLP is the most general feed-forward ANN, and other feed-forward networks can be considered specific cases of it. For this reason, the MLP is used as the baseline for a metamodel. The MLP derives its computational efficiency and robustness from its distributed structure and its capability to generalize. Generalization refers to the ability to produce correct outputs for inputs that have not been encountered during the learning phase.
In the proposed computational framework, the search space defines the plethora of attainable structures. Since this paper is focused on a proof of concept and does not attempt to develop an all-embracing solution, the search space contains only the most vital components of the MLP, such as depth, width, activation functions, optimizer and learning rate. Mimicking Darwinian natural selection, the GA is the driving force behind the neuroevolution. The GA samples, crosses over, mutates and resamples the population of highly fit MLP-based metamodels in order to form a new, even fitter population. As a result, instead of an exhaustive search for nearly optimal topologies, the GA improves the solution step by step, starting from the most promising topologies. The implemented GA is distinguished by Gray-code chromosome encoding, uniform crossover and tournament selection. These are the components that define the search strategy.

An appropriate chromosome representation for the particular problem domain can drastically accelerate the search. Space efficiency makes binary representation an apparent choice; however, chromosome representations vary noticeably even among binary schemes. The choice falls on the reflected Gray code, because such an encoding scheme guarantees gradualism: exactly one bit has to be flipped in order to reach the nearest value [10]. This fact prevents problems related to large Hamming cliffs (Fig. 1). Uniform crossover compensates for several serious drawbacks of more traditional crossover operators. Since the number of crossover points is not constant and the decision to exchange bits is made independently for every single bit, the operator does not tend to split up the fit sections of a chromosome. Additionally, uniform crossover makes premature convergence less likely to happen [11]. Tournament selection is a standard selection operator in contemporary neuroevolutionary techniques.
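The three operators named above can be sketched compactly. The following is a minimal illustration (not the paper's exact implementation) of reflected Gray coding, uniform crossover and tournament selection; it also shows the Hamming-cliff problem that Gray coding avoids.

```python
import random

def binary_to_gray(n: int) -> int:
    """Reflected Gray code: adjacent integers differ in exactly one bit."""
    return n ^ (n >> 1)

def gray_to_binary(g: int) -> int:
    """Invert the reflected Gray code back to a plain binary integer."""
    mask = g >> 1
    while mask:
        g ^= mask
        mask >>= 1
    return g

def uniform_crossover(p1, p2, rng):
    """Decide independently for every bit which parent it comes from,
    so the operator does not tend to split up fit chromosome sections."""
    return [a if rng.random() < 0.5 else b for a, b in zip(p1, p2)]

def tournament_select(population, fitness, k, rng):
    """Pick the fittest of k randomly sampled individuals."""
    contenders = rng.sample(range(len(population)), k)
    return population[max(contenders, key=lambda i: fitness[i])]

# Hamming cliff in plain binary: 7 (0111) -> 8 (1000) flips four bits,
# while the Gray codes of 7 and 8 differ in exactly one bit.
rng = random.Random(42)
child = uniform_crossover([0] * 8, [1] * 8, rng)
```

For instance, stepping from 7 to 8 in plain binary flips four bits at once, whereas in Gray code every unit step flips exactly one bit, which preserves the gradualism mentioned above.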
It is distinguished by several significant benefits over the alternatives; for instance, it is simple to implement and robust to noisy fitness functions [12].

In order to formalize automated metamodeling via neuroevolution, let us consider a nonlinear input-output map represented by the "black-box" function y = f(x), where x stands for the vector of input parameters and y is the output. Although f(x) is a "black-box", the set of observations s = {(x_i, y_i)}_{i=1}^{N} is given. Since an MLP-based metamodel can be built from such components as layers, activation functions, optimizers and the respective hyperparameters, this information is encoded into a single chromosome a ∈ A. After that, the GA searches for an a* ∈ A that allows the MLP-based metamodel F(a*, x) to approximate the original model, such that ||F(a*, x) − f(x)|| < ε, where ε is a positive value small enough for the problem under consideration (Fig. 2).

In the following numerical studies, the search space encompasses the number of hidden layers from 0 to 10 and the number of neurons in each layer from 10 to 200 with a discretization step of 1. Additionally, the search space includes learning rates from 0.01 to 0.3 with a discretization step of 0.01 and all the activation functions and optimizers available in Keras 2.3.0 [13]. The GA thus orchestrates the evolutionary morphism to find a structure in the feasible region that produces the metamodel with the highest coefficient of determination (R^2). The total number of trainable parameters, which includes both neurons and weights, is restricted to 20,000, because the obtained metamodel has to be more computationally efficient than the original one. An accurate metamodel can significantly ease the computational burden. It can be trained once on a dataset derived from experiments with the original model; after that, optimization search can be conducted iteratively. There is a plethora of suitable optimization algorithms, but the GA is a convenient choice for demonstration.
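To make the encoding concrete, the sketch below decodes a Gray-coded chromosome into the search space described above (depth 0-10, width 10-200, learning rate 0.01-0.30, categorical choices of activation and optimizer). The bit widths, field layout and the modular clipping scheme are illustrative assumptions, not the paper's exact encoding; the activation and optimizer lists are small subsets of what Keras offers.

```python
# Subsets of the Keras activations/optimizers named in the text (assumption).
ACTIVATIONS = ["relu", "tanh", "sigmoid", "elu"]
OPTIMIZERS = ["sgd", "adam", "rmsprop", "adagrad"]

def gray_to_binary(g: int) -> int:
    """Invert the reflected Gray code."""
    mask = g >> 1
    while mask:
        g ^= mask
        mask >>= 1
    return g

def bits_to_int(bits):
    """Interpret a list of 0/1 values as a Gray-coded integer."""
    return gray_to_binary(int("".join(map(str, bits)), 2))

def decode(chromosome):
    """Map a 21-bit chromosome to MLP hyperparameters (hypothetical layout)."""
    depth = min(bits_to_int(chromosome[0:4]), 10)            # 0..10 hidden layers
    width = 10 + bits_to_int(chromosome[4:12]) % 191         # 10..200 neurons/layer
    lr = 0.01 * (1 + bits_to_int(chromosome[12:17]) % 30)    # 0.01..0.30, step 0.01
    activation = ACTIVATIONS[bits_to_int(chromosome[17:19]) % 4]
    optimizer = OPTIMIZERS[bits_to_int(chromosome[19:21]) % 4]
    return {"depth": depth, "width": width, "learning_rate": lr,
            "activation": activation, "optimizer": optimizer}
```

A decoded dictionary like this would then be handed to a model builder that stacks `depth` dense layers of `width` neurons and compiles the network with the selected optimizer and learning rate.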
The computational framework for metamodeling is already equipped with a GA, which is universal and can be applied both for neuroevolution and for the subsequent optimization. The logic behind metamodel-based optimization with a GA is the same as for the network morphism. To be more specific, by applying genetic operators to the parental solutions, a set of new candidate solutions is obtained. After that, the produced offspring compete for a place in the next generation based on their fitness. This process continues until the termination criterion is met (Fig. 3).
Fig. 3. The logic behind GA-driven metamodel-based optimization.

This research takes into consideration the ongoing reproducibility crisis. Since the reproducibility of experiments is the cornerstone of the scientific method, all the source code of the models, metamodels and computational framework is available in the GitHub repository [14]. The models of the considered systems are implemented in the form of discrete-event simulations. The first model simulates an inventory control system and is referred to as model 1. It is mainly based on recent work [15]. The model can be characterized as multiproduct and stochastic, with lost sales, discounts and perishability. The second model simulates a production-inventory system and is referred to as model 2. This model adopts Markov-modulated demand, in which the demand process is related to an underlying Markov chain [16]. For both models, the averaged net profit is considered the output of interest (dependent variable). The procedure begins with the generation of random inputs in the feasible range. Model 1 comprises 151 independent variables and is replicated 35 times. Model 2 comprises 81 independent variables and is replicated 44 times. Storing the vector of inputs and the average output as an observation, the datasets s_1 and s_2 of 600 observations are generated. Both datasets are split into training and test subsets using 10-fold cross-validation.
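The GA loop described above can be sketched as follows. This is a minimal, self-contained illustration rather than the repository's implementation; the surrogate function is a toy stand-in for the trained metamodel's net-profit prediction, and the population size (100) and generation count (31) mirror the optimization settings reported later in the paper.

```python
import random

def gray_to_binary(g):
    """Invert the reflected Gray code used for chromosome decoding."""
    mask = g >> 1
    while mask:
        g ^= mask
        mask >>= 1
    return g

def evolve(fitness, n_bits=16, pop_size=100, generations=31,
           p_cx=0.30, p_mut=0.05, k=5, seed=42):
    """Minimal GA: tournament selection, uniform crossover and bit-flip
    mutation, maximizing `fitness` over Gray-coded bit strings."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        new_pop = []
        while len(new_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = max(rng.sample(range(pop_size), k), key=lambda i: scores[i])
            p2 = max(rng.sample(range(pop_size), k), key=lambda i: scores[i])
            child = list(pop[p1])
            if rng.random() < p_cx:  # uniform crossover: per-bit coin flip
                child = [a if rng.random() < 0.5 else b
                         for a, b in zip(pop[p1], pop[p2])]
            # Bit-flip mutation.
            child = [1 - b if rng.random() < p_mut else b for b in child]
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)

def surrogate(bits):
    """Toy stand-in for a metamodel's net-profit prediction:
    a concave function of the decoded input with its maximum at x = 10."""
    x = gray_to_binary(int("".join(map(str, bits)), 2)) / 65535 * 20
    return -(x - 10.0) ** 2

best = evolve(surrogate)
```

In the actual framework, `fitness` would be a call to the trained MLP metamodel evaluated at the decoded control parameters, which is exactly what makes metamodel-based optimization cheap compared with re-running the simulation.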
As soon as s is generated, metamodeling can be formulated as a multivariate regression problem. Therefore, the metamodeling procedure can be reduced to minimization of the mean squared error (MSE) between the actual output y and the prediction made by the metamodel F(a, x). The data is standardized prior to training. The AutoML procedure is driven by a GA with empirically adjusted parameters, namely a tournament size of 5, a crossover probability of 0.30 and a mutation probability of 0.05. It is worth pointing out that slight variations of these hyperparameters did not notably affect either the convergence speed or the likelihood of premature convergence. The evolution lasts 10 generations, and each generation is populated with 40 MLPs. In 10 generations, metamodels with average coefficients of determination of 0.81 and 0.82 have been trained (Fig. 4). The variation of the coefficients of determination across the cross-validation slices is insignificant, which means that the MLPs have been capable of generalizing the nonlinear relations between the inputs and the output (Fig. 5). Furthermore, violin plots demonstrate that the original samples generated by the simulation and the samples recreated using the trained metamodels belong to the same population (Fig. 6). Table 1 contains several accuracy-related statistics. Besides, ordinary least squares regression is used in order to demonstrate the presence of nonlinearity.

In the context of regression analysis, the role of residuals should not be disparaged: residuals provide a general approach to indicating the quality of the metamodels and diagnosing possible problems. The expected value of the residuals for both metamodels is close to 0, namely −0.011 and 0.013 for model 1 and model 2 respectively. This implies that neither metamodel is systematically biased toward over-prediction or under-prediction.
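The quantities used in this evaluation are standard and can be written out explicitly. The sketch below (an illustration, not the repository code) shows the standardization applied before training, the MSE training loss, the coefficient of determination R^2 used as the neuroevolution fitness, and a simple way to form 10-fold cross-validation splits.

```python
import numpy as np

def standardize(X):
    """Column-wise standardization applied prior to training."""
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

def mse(y_true, y_pred):
    """Mean squared error minimized by every candidate metamodel."""
    return float(np.mean((y_true - y_pred) ** 2))

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

def kfold_indices(n, k=10, seed=0):
    """(train, test) index pairs for k-fold cross-validation."""
    folds = np.array_split(np.random.default_rng(seed).permutation(n), k)
    return [(np.concatenate(folds[:i] + folds[i + 1:]), folds[i])
            for i in range(k)]
```

With n = 600 observations, 10-fold cross-validation trains each candidate metamodel on 540 observations and scores it on the held-out 60, and the fitness reported to the GA is the R^2 averaged over the ten slices.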
After the distributions of the residuals are plotted against the theoretical standard normal, a possible issue related to heavy tails immediately catches the eye (Fig. 7). Nevertheless, the Anderson-Darling normality test is confidently passed in both cases (Table 2). The null hypothesis that a sample is drawn from a normally distributed population is tested. Since the calculated statistic is smaller than the critical value for both samples, the null hypothesis cannot be rejected at the significance level of 0.05. It is also worth testing whether the residuals are homoscedastic. In a practical sense, heteroscedastic residuals imply that the predictive power of the model differs across different sections of the data. Applying the Goldfeld-Quandt test for heteroscedasticity, the null hypothesis that the two subsamples of the dataset have the same variance is not rejected for the residuals of either metamodel (Table 3).

For model 1 and model 2, all the inputs except the control parameters are randomly generated, producing 30 test scenarios. For each scenario, the output (net profit) is subject to maximization. The net profit achieved by metamodel-based optimization is compared with the net profit obtained by classical simulation-based optimization driven by the same GA. Both metamodel-based and simulation-based optimization last 31 generations with 100 candidate solutions in each generation. The percentage error is used for comparing the accuracy (Fig. 8). Mean percentage errors of 5.21% and 3.23% can be interpreted as insignificant taking into account the complexity, the computational budget and the presence of stochastic noise. It is important to keep in mind that MLPs are approximations and an error is inevitable. MLP-based metamodels derived using neuroevolution have demonstrated the capability to learn and generalize complex nonlinear relations between variables.
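Both diagnostics can be reproduced with a few lines. The sketch below uses SciPy's Anderson-Darling implementation and a hand-rolled Goldfeld-Quandt-style variance-ratio F-test (the paper does not specify its exact implementation, so treat this as one reasonable reading of the procedure).

```python
import numpy as np
from scipy import stats

def passes_normality(residuals):
    """Anderson-Darling normality check at the 5% level. For dist='norm',
    scipy reports critical values at [15%, 10%, 5%, 2.5%, 1%], so index 2
    corresponds to a significance level of 0.05."""
    result = stats.anderson(residuals, dist="norm")
    return bool(result.statistic < result.critical_values[2])

def goldfeld_quandt(residuals, split=0.5):
    """Goldfeld-Quandt-style test (sketch): compare the variances of the
    first and second parts of the residuals with a two-sided F-test.
    A large p-value means homoscedasticity is not rejected."""
    n = len(residuals)
    a, b = residuals[: int(n * split)], residuals[int(n * split):]
    f = np.var(a, ddof=1) / np.var(b, ddof=1)
    df1, df2 = len(a) - 1, len(b) - 1
    p = 2 * min(stats.f.cdf(f, df1, df2), stats.f.sf(f, df1, df2))
    return f, p
```

Note that the textbook Goldfeld-Quandt procedure sorts observations by a regressor and drops a middle band before comparing the two variances; the simplified split above is sufficient to convey the logic of Table 3.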
Therefore, they can be applied for metamodeling of complex real-world systems in logistics and production. The experiments have been conducted as a proof of concept; the accuracy of the metamodels can be improved further by performing a more exhaustive search. However, a more exhaustive search is associated with an increase in the computational budget and the common trade-off between accuracy and computational efficiency. Optimization using MLP-based metamodels has several distinct advantages over the classical approaches. Firstly, the output of an MLP-based metamodel is always deterministic. Secondly, MLPs are, by definition, computationally less expensive, which is especially notable for shallow architectures that use activation functions from the linear-unit family.

References
1. COVID-19 and shattered supply chains: Reducing vulnerabilities through smarter supply chains. IBM Institute for Business Value
2. The construction and implementation of metamodels
3. Automated Machine Learning
4. AutoML: A survey of the state-of-the-art
5. Evolutionary neural AutoML for deep learning
6. Metamodeling and optimization of a blister copper two-stage production process
7. A domain-driven approach to metamodeling in additive manufacturing
8. Economic lot-size using machine learning, parallelism, metaheuristic and simulation
9. Approximating continuous functions by ReLU nets of minimal width
10. Representation and hidden bias: Gray vs. binary coding for genetic algorithms
11. Evolution strategies and other methods
12. Genetic algorithms, tournament selection, and the effects of noise
13. Keras: The Python deep learning library
14. GitHub repository "metainventory"
15. Simulation-optimisation approach to stochastic inventory control with perishability
16. Neuroevolutionary approach to metamodeling of production-inventory systems with lost-sales and Markovian demand