key: cord-0780915-12cedlb1 authors: Melin, Patricia; Sánchez, Daniela; Monica, Julio Cesar; Castillo, Oscar title: Optimization using the firefly algorithm of ensemble neural networks with type-2 fuzzy integration for COVID-19 time series prediction date: 2021-01-13 journal: Soft comput DOI: 10.1007/s00500-020-05549-5 sha: 7bba03aa09e8e9aa31164d765fb8398ab839f121 doc_id: 780915 cord_uid: 12cedlb1 In this paper, the latest global COVID-19 pandemic prediction is addressed. Each country worldwide has faced this pandemic differently, reflected in its statistical number of confirmed and death cases. Predicting the number of confirmed and death cases could allow us to know the future number of cases and provide each country with the necessary information to make decisions based on the predictions. Recent works are focused only on confirmed COVID-19 cases or a specific country. In this work, the firefly algorithm designs an ensemble neural network architecture for each one of 26 countries. In this work, we propose the firefly algorithm for ensemble neural network optimization applied to COVID-19 time series prediction with type-2 fuzzy logic in a weighted average integration method. The proposed method finds the number of artificial neural networks needed to form an ensemble neural network and their architecture using a type-2 fuzzy inference system to combine the responses of individual artificial neural networks to perform a final prediction. The advantages of the type-2 fuzzy weighted average integration (FWA) method over the conventional average method and type-1 fuzzy weighted average integration are shown. In recent months, we have observed the behavior of the latest global pandemic, the COVID-19 virus, and how it has affected countries worldwide with different consequences. 
There are countries with a high rate of confirmed and death cases, such as China, Brazil, and the USA, as well as countries that managed to keep their numbers of confirmed and death cases low (HDX 2020). The COVID-19 virus has motivated numerous investigations related to finding risk factors, symptoms, treatments, predictions, and sequelae. In Zhang et al. (2020), the authors describe the characteristics of COVID-19 patients with type-2 diabetes and analyze the risk factors for severity. For their analysis, they collected information about demographics, symptoms, treatments, and outcomes of COVID-19 patients with diabetes. They concluded that patients with type-2 diabetes are more susceptible to COVID-19. In Sakalli et al. (2020), the authors determine the frequency and severity of symptoms in COVID-19 disease, especially the loss of the senses of smell and taste. Patients with a positive COVID-19 diagnosis were questioned about general information such as age, sex, date of symptoms, and smoking history, and also about their most apparent symptoms. They conclude that the loss of smell and taste is a symptom related to COVID-19. In Jin et al. (2020), the authors analyzed the clinical use and efficacy of clinically approved drugs. They analyzed drug development progress for the treatment of COVID-19 in China, intending to provide information on epidemic control to other countries.

Communicated by V. E. Balas.

Regarding prediction, recent works have addressed either a specific country or the prediction of confirmed COVID-19 cases worldwide. In Torrealba-Rodriguez et al. (2020), the authors presented the modeling and prediction of confirmed cases of COVID-19 in Mexico, proposing mathematical and computational models. They proposed the Gompertz, logistic, and inverse artificial neural network models to predict the next eight days (from May 9 to 16). In Salgotra et al.
(2020), the time series forecast of COVID-19 is analyzed for India using genetic programming. In their work, they analyze the COVID-19 information about confirmed and death cases for the whole country and for the states most affected by the pandemic: Maharashtra, Gujarat, and Delhi. To perform this analysis, they applied gene expression programming (GEP) to generate reliable models for predicting the next 10 days. In Shastri et al. (2020), the authors proposed deep learning models to analyze COVID-19 cases in India and the USA, using recurrent neural networks. According to their results, the confirmed and death cases for both countries will rise in the next 30 days. In Kırbas et al. (2020), confirmed COVID-19 cases of Denmark, Belgium, Germany, France, the UK, Finland, Switzerland, and Turkey are modeled with autoregressive integrated moving average (ARIMA), nonlinear autoregression neural network (NARNN), and long short-term memory (LSTM) approaches. They conclude that their LSTM model provides a better prediction of the next 14 days.

In previous works, we applied intelligent techniques such as ensemble neural networks (ENN), fuzzy logic (FL), and self-organizing maps (SOM) to analyze COVID-19 information. In Melin et al. (2020), an analysis of the coronavirus pandemic evolution by self-organizing maps (a type of unsupervised neural network) is performed. The achieved results allowed the countries to be grouped depending on their rate of confirmed, recovered, and death cases. These kinds of results support decisions about strategies for pandemic control around the world. In Melin et al. (2020), we applied ensemble neural networks to predict COVID-19 confirmed and death cases of 12 states in Mexico. For each state, the ensemble neural network is formed with three neural networks, and to combine their responses, a type-1 fuzzy inference system is used to apply weighted average integration.
The achieved results were compared with the individual performance of each neural network. In most results, the proposed integration achieved better results than conventional monolithic neural networks when predicting 10 future days. However, we also aim to propose a general method that can be applied to other countries. An essential part of developing a method applicable to other countries is to find optimal architectures of ensemble neural networks. These architectures will allow predicting according to the cases of each country, i.e., there are countries whose cases are on a constant increase and others that have days when the number of cases unexpectedly shoots up. Hence, it is crucial to find an optimal architecture for the behavior of each country. For this reason, it was decided to use an optimization technique. In this work, a firefly algorithm is proposed because we have already applied this optimization technique to pattern recognition in previous work, specifically to human recognition using biometric measures (Sánchez et al. 2017). This optimization technique provided better neural network architectures than other optimization techniques, such as the genetic algorithm (GA) (Goldberg 1989; Sánchez and Melin 2014), gray wolf optimizer (GWO) (Mirjalili et al. 2014; Sánchez et al. 2017), and particle swarm optimization (PSO) (Eberhart and Kennedy 1995; Eberhart and Shi 2000; Sánchez et al. 2020), when the amount of data for the training phase of the neural networks is decreased. In this work, the number of neural networks that form the ensemble neural network and their architecture parameters, such as the number of hidden layers, neurons, and goal error, are optimized. We propose a type-2 fuzzy integration to improve performance over other integration techniques, such as the conventional average and the type-1 fuzzy weighted average.
The optimization of ensemble neural network architectures with a firefly algorithm is proposed to improve on the results of conventional monolithic neural networks and to try to correctly predict more days than previous works. The effectiveness of the proposed method is shown by comparing its results for confirmed and death COVID-19 cases of 26 countries: Austria, Belgium, Bolivia, Brazil, China, Ecuador, Finland, France, Germany, Greece, India, Iran, Italy, Mexico, Morocco, New Zealand, Norway, Poland, Russia, Singapore, Spain, Sweden, Switzerland, Turkey, UK, and the USA. The main contribution of the proposed method is the optimization of the ensemble neural network architecture and the combination of responses using a type-2 fuzzy inference system to assign a weight to each prediction, and in this way to be able to achieve an efficient prediction of 20 future days (from 06/28/2020 to 07/17/2020).

This paper is organized as follows. The intelligent techniques applied in this work are briefly described in Sect. 2. In Sect. 3, the proposed method is described. In Sect. 4, the achieved experimental results are presented and explained. The statistical comparisons of results are presented in Sect. 5. The conclusions are finally given in Sect. 6.

In this section, a brief description of the techniques applied in the proposed method is presented.

An artificial neural network is a popular intelligent technique that simulates abilities of the human brain, such as its capability to learn and to generalize information. Its cells are emulated with interconnected units (known as neurons), which manage weights. These weights store knowledge during the learning process (Aggarwal 2018). Figure 1 shows an artificial neuron $j$ with inputs ($x_1, x_2, \ldots, x_n$) and associated weights ($w_1, w_2, \ldots, w_n$) called synaptic weights.
The inputs, multiplied by their synaptic weights, are added together as:

$$net_j = \sum_{i=1}^{n} w_i x_i$$

This summation is the activation of the neuron $j$. The output of the neuron $j$ is finally computed by an activation function, and this output is the input of another neuron (except in the output layer). In ANNs, the activation function is usually nonlinear (for example, hyperbolic tangent or sigmoid). This allows better learning of complex patterns and nonlinear behaviors. A conventional artificial neural network has three kinds of layers: input, hidden, and output, where each layer contains neurons interconnected among layers. The input layer transmits the input information; the network can have one or several hidden layers that send information to the output layer, which produces a final result (Gurney 1997; Haykin 1998). In Fig. 2, an example of an artificial neural network is shown. The neurons of the input and hidden layers are connected to all neurons in the next layer. The information is propagated through the network up to the output layer.

An ensemble neural network is composed of various monolithic artificial neural networks (also known as modules). All the artificial neural networks are trained for the same task (Hansen and Salomon 1990; Soto et al. 2015), so each neural network becomes an expert on the same problem and provides its own answer. These answers can differ; in this work, for example, each artificial neural network provides a different prediction even though each one has learned the same information. For this reason, to obtain a final answer or decision, each answer is combined with the other answers using an integration unit. Figure 3 shows a representation of an ensemble neural network. We used this kind of neural network because it has been an excellent tool for time series prediction (Soto et al. 2015): each neural network gives us a prediction, and through an integration method, a final prediction is obtained.
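As a minimal sketch of the simplest integration unit, the conventional average used later as a baseline, the following Python code combines the forecasts of several modules by taking their mean at each time step. The function name `average_integration` and the sample forecasts are hypothetical and are not taken from the experiments.

```python
import numpy as np


def average_integration(module_predictions):
    """Combine module forecasts with a plain (unweighted) mean per time step."""
    preds = np.asarray(module_predictions, dtype=float)  # shape: (modules, steps)
    return preds.mean(axis=0)


# Hypothetical forecasts of three modules for four future days.
module_predictions = [
    [100.0, 110.0, 120.0, 130.0],
    [90.0, 105.0, 125.0, 140.0],
    [110.0, 115.0, 115.0, 120.0],
]
ensemble_prediction = average_integration(module_predictions)
print(ensemble_prediction)
```

Every module contributes equally here, regardless of how well it performed; the fuzzy weighted integrations discussed in this paper replace that equal weighting with weights derived from each module's prediction error.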
Fuzzy logic, proposed by L.A. Zadeh in 1965, is an intelligent technique successfully used to model complex systems and derive useful fuzzy relations or rules (Zadeh 1965; Zadeh 1998). In Boolean logic, an element belongs absolutely to a set (1) or not (0). In type-1 fuzzy logic, the element can partially belong with a membership grade represented with a crisp number in [0, 1]. An example of a type-1 membership function is shown in Fig. 4. A type-1 fuzzy set $A$ is characterized by a type-1 membership function $\mu_A(x)$, where $x \in X$ in a universe of discourse $X$ (Castro et al. 2007). It can be represented as a set of ordered pairs of an element $x$ and its membership value, given as:

$$A = \{(x, \mu_A(x)) \mid x \in X\}$$

L.A. Zadeh also proposed the concept of a type-2 fuzzy set in 1975 (Zadeh 1975). The membership of an element is defined with a fuzzy membership function, i.e., the membership grade for each element of the set is a fuzzy set in [0, 1]. This type of fuzzy logic is recommended for application in situations where it is complicated to assign a crisp number in [0, 1] as in type-1 fuzzy logic (Al-Jamimi and Saleh 2019; Melin and Castillo 2005). A type-2 fuzzy set $\tilde{A}$ can be defined as:

$$\tilde{A} = \{((x, u), \mu_{\tilde{A}}(x, u)) \mid \forall x \in X, \forall u \in J_x \subseteq [0, 1]\}$$

where the domain of the fuzzy variable is denoted by $X$. The primary membership of $x$ is denoted by $J_x \subseteq [0, 1]$, and the secondary membership is a type-1 fuzzy set denoted by $\mu_{\tilde{A}}(x, u)$. The uncertainty is represented by a region known as the footprint of uncertainty (FOU). There is an interval type-2 membership function if $\mu_{\tilde{A}}(x, u) = 1$, $\forall u \in J_x \subseteq [0, 1]$, as Fig. 5 shows with a uniform shading for the footprint of uncertainty (FOU) with its upper $\overline{\mu}_{\tilde{A}}(x)$ and lower $\underline{\mu}_{\tilde{A}}(x)$ membership functions (Mittal et al. 2020).

Fig. 6 Structure of a type-2 fuzzy inference system
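To make the upper and lower membership functions concrete, the sketch below evaluates one common interval type-2 shape: a Gaussian with a fixed spread and a mean that is uncertain within an interval, which is the kind of membership function this paper adopts for its fuzzy variables. The formulas are the standard textbook ones for that shape; the function names and the sample parameters (0.4, 0.6, 0.1) are illustrative assumptions only.

```python
import numpy as np


def gaussian(x, mean, sigma):
    """Type-1 Gaussian membership grade."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)


def it2_gaussian_uncertain_mean(x, m1, m2, sigma):
    """Return (lower, upper) membership grades for a mean in [m1, m2]."""
    if not m1 < m2:
        raise ValueError("expected m1 < m2")
    x = np.asarray(x, dtype=float)
    # Upper MF: Gaussian centered at m1 on the left, a flat top of 1
    # between the two means, Gaussian centered at m2 on the right.
    upper = np.where(
        x < m1,
        gaussian(x, m1, sigma),
        np.where(x > m2, gaussian(x, m2, sigma), 1.0),
    )
    # Lower MF: pointwise minimum of the two extreme Gaussians.
    lower = np.minimum(gaussian(x, m1, sigma), gaussian(x, m2, sigma))
    return lower, upper


# Hypothetical parameters on a normalized [0, 1] universe of discourse.
xs = np.linspace(0.0, 1.0, 11)
lower, upper = it2_gaussian_uncertain_mean(xs, m1=0.4, m2=0.6, sigma=0.1)
for x, lo, up in zip(xs, lower, upper):
    print(f"x={x:.1f}  lower={lo:.3f}  upper={up:.3f}")
```

The band between `lower` and `upper` at each `x` is the footprint of uncertainty; widening the gap between `m1` and `m2` widens that band.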
An interval type-2 fuzzy set can be defined as:

$$\tilde{A} = \{((x, u), 1) \mid \forall x \in X, \forall u \in J_x \subseteq [0, 1]\}$$

The union of all the primary memberships $J_x$ contained in the FOU can be defined as:

$$FOU(\tilde{A}) = \bigcup_{x \in X} J_x$$

The $FOU(\tilde{A})$ is delimited by the upper membership function (UMF) and the lower membership function (LMF) defined as:

$$\overline{\mu}_{\tilde{A}}(x) = \overline{FOU(\tilde{A})}, \quad \underline{\mu}_{\tilde{A}}(x) = \underline{FOU(\tilde{A})}, \quad \forall x \in X$$

A basic structure of a type-2 fuzzy inference system (T2FIS) has the components shown in Fig. 6. These components are: (a) fuzzifier: in this process, the crisp input values are converted to fuzzy values, (b) inference: fuzzy reasoning is applied to obtain a type-2 fuzzy output, (c) defuzzifier: it maps the output to crisp values, (d) type reducer: it transforms a type-2 fuzzy set into a type-1 fuzzy set, and (e) rule base: it contains fuzzy if-then rules and a membership function set known as the database (Karnik et al. 1999a, b). The decision process is conducted by an inference system using the fuzzy if-then rules. These fuzzy rules define the connection between the fuzzy input and output variables. The inference system evaluates all the rules from the rule base and combines the weighted consequents of all the relevant rules into a single fuzzy set using the aggregation operation (Castillo et al. 2008; Karnik et al. 1999b).

The firefly algorithm was initially proposed in Yang (2009) and Yang and He (2013), and is based on the behavior and flashing of fireflies. Three basic principles are used in this algorithm: (1) the fireflies are unisex, so a firefly can be attracted to any other firefly regardless of sex; (2) the attractiveness of a firefly is proportional to its brightness, so for any pair of fireflies, the less bright one moves toward the brighter one, and if both have the same brightness, the firefly moves randomly; and (3) the objective function determines the brightness of a firefly. The variation of attractiveness $\beta$ with the distance $r$ is proposed in Yang and He (2013) and given by the equation:

$$\beta = \beta_0 e^{-\gamma r^2}$$

where $\beta_0$ is the attractiveness at $r = 0$ and $\gamma$ is the light absorption coefficient.
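The attraction principle can be sketched in a few lines of Python. The code below is a toy illustration of one firefly iteration following the standard formulation of Yang (2009), not the implementation used in this work: the objective (a sphere function), the number of fireflies, and the values of `beta0`, `gamma`, and `alpha` are all hypothetical.

```python
import numpy as np


def attractiveness(r, beta0, gamma):
    """Attractiveness at distance r: beta0 * exp(-gamma * r^2)."""
    return beta0 * np.exp(-gamma * r ** 2)


def firefly_step(positions, brightness, beta0, gamma, alpha, rng):
    """Move each firefly toward every brighter firefly (one iteration)."""
    new_positions = positions.copy()
    n, dim = positions.shape
    for i in range(n):
        for j in range(n):
            if brightness[j] > brightness[i]:
                diff = positions[j] - new_positions[i]
                beta = attractiveness(np.linalg.norm(diff), beta0, gamma)
                # Random perturbation scaled by alpha helps avoid stagnation.
                noise = alpha * (rng.random(dim) - 0.5)
                new_positions[i] += beta * diff + noise
    return new_positions


def sphere(points):
    """Toy objective to minimize (assumption, for illustration only)."""
    return np.sum(points ** 2, axis=1)


rng = np.random.default_rng(0)
positions = rng.uniform(-5.0, 5.0, size=(6, 2))
for _ in range(20):
    brightness = -sphere(positions)  # lower objective means brighter
    positions = firefly_step(positions, brightness,
                             beta0=1.0, gamma=0.1, alpha=0.05, rng=rng)
print(sphere(positions).min())
```

In an architecture search such as the one described in this paper, each position vector would encode the ensemble architecture and the brightness would come from the prediction error, rather than from a toy function.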
The movement of a firefly $i$ toward a brighter firefly $j$ in the next iteration is defined by the equation:

$$x_i^{t+1} = x_i^t + \beta_0 e^{-\gamma r_{ij}^2} (x_j^t - x_i^t) + \alpha_t \epsilon_i^t$$

where $x_i^t$ represents the position of firefly $i$ in iteration $t$, $\beta_0 e^{-\gamma r_{ij}^2} (x_j^t - x_i^t)$ represents the attraction between firefly $j$ and firefly $i$, and $\epsilon_i^t$ is a vector of random numbers whose randomization parameter is $\alpha_t$, defined by:

$$\alpha_t = \alpha_0 \delta^t$$

where $\alpha_0$ is the initial randomness scaling factor and $\delta$ is a value between 0 and 1. The values for $\alpha$, $\beta$, and $\delta$ applied in this work are based on the recommendations of other works. To avoid local minima, this algorithm uses a random array, which allows moving the fireflies and avoids stagnation.

The proposed method combines ensemble neural networks, type-2 fuzzy integration, and the firefly algorithm, and its general architecture is described in this section. The proposed method consists of ensemble neural networks (ENNs), where the predictions of each artificial neural network (also known as module) are combined using a type-2 fuzzy weighted average, and a firefly algorithm is applied to optimize the ensemble neural network architecture. In Fig. 7 work is similar to a feedforward network and has connections directly from the input layer to the subsequent layers (Budak et al. 2020). The prediction error of the neural network $k$, $k = \{1, 2, 3, \ldots, m\}$, is given by the equation:

$$MSE_k = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_{ki})^2$$

where $y_i$ is the real value at time $i$, $\hat{y}_{ki}$ is the prediction of the neural network $k$ at time $i$, and $N$ is the number of data points of the testing set. The $m$ value (number of neural networks or modules) is defined by the optimization technique. In this work, type-2 fuzzy logic is applied, where a Mamdani type-2 fuzzy inference system is proposed to combine the responses of the ensemble neural network.

Fig. 16 Average convergence of confirmed cases for China using (a) 30%, (b) 20%, and (c) 10% for testing phase
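The two quantities that drive the integration, the prediction error of each module and the weighted combination of the module predictions, can be sketched as follows. The weights below are hypothetical constants standing in for the outputs of the fuzzy inference system, and dividing by the sum of the weights is an assumption about how the weighted average is normalized; the data values are invented for illustration.

```python
import numpy as np


def module_mse(y_true, y_pred):
    """Mean squared error of one module over the testing set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))


def weighted_average_integration(predictions, weights):
    """Weighted average of module predictions (normalized by the weight sum)."""
    preds = np.asarray(predictions, dtype=float)  # shape: (modules, steps)
    w = np.asarray(weights, dtype=float)          # shape: (modules,)
    if w.sum() <= 0:
        raise ValueError("weights must have a positive sum")
    return (w[:, None] * preds).sum(axis=0) / w.sum()


# Hypothetical testing data and predictions of two modules.
y_true = [10.0, 12.0, 14.0]
predictions = [
    [10.0, 12.0, 14.0],  # module 1: perfect on this toy data
    [12.0, 14.0, 16.0],  # module 2: off by 2 at every step
]
errors = [module_mse(y_true, p) for p in predictions]
# Hypothetical weights: the module with the lower error gets the larger weight.
weights = [0.75, 0.25]
final_prediction = weighted_average_integration(predictions, weights)
print(errors, final_prediction)
```

With equal weights this reduces to the conventional average, which is why the quality of the weights is what separates the integration methods compared in this work.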
The number of inputs and outputs is determined by the number of neural networks that form the ensemble neural network. The fuzzy inference system has as inputs the prediction error (MSE) of each module (from module #1 to module #m). The outputs are the weights used to combine the predictions, which allows obtaining a final prediction of the ensemble neural network. In Fig. 8, an example of the type-2 fuzzy inference system for three modules is presented. The fuzzy if-then rules are automatically generated depending on the number of inputs (modules) of the FIS; each variable (inputs and outputs) has three Gaussian membership functions, and their linguistic labels are "low," "medium," and "high." The range of each fuzzy output variable is 0 to 1. Meanwhile, for the inputs, the range adapts depending on the errors of the neural networks, i.e., the range is generated based on the prediction error (MSE, normalized values between 0 and 1) of the neural networks, where the errors (MSE) are sorted, and the minimal and maximal values are taken to establish the range of all the fuzzy input variables. As the input ranges are adaptable, a new type-2 fuzzy inference system is generated for each evaluation of the ensemble neural network. In this work, type-2 Gaussian symmetric membership functions with uncertain mean are used and given by Eq. 12. An example of this kind of membership function is shown in Fig. 9. It is important to emphasize that the firefly algorithm does not optimize the fuzzy inference system. Only the prediction error (MSE) of each neural network that forms the ensemble neural network is used to establish the ranges of the fuzzy input variables. The minimal and maximal range of the fuzzy input variables is given by Eqs. 13 and 14. Meanwhile, the values of the fuzzy output variables are established as shown in Fig. 10. The difference between $R_{min}$ and $R_{max}$ is defined by Eq. 15, where $m_1 < m_2$.
Sigma is represented by $\sigma$, and the values of $m_{1,k}$ and $m_{2,k}$ represent, respectively, mean1 and mean2, where $k = 1, 2, 3$ indexes the membership functions of each fuzzy input variable. The $\sigma$ value for the input variables is established using Eq. 16. The separation between mean1 and mean2 is defined by Eq. 17. The mean values for each of the three membership functions used in each fuzzy variable are given by Eqs. 18-23. An example of the fuzzy output variable design is shown in Fig. 11, where $R_{min}$ is equal to 0, and $R_{max}$ is equal to 1. Equations 18-23 are applied to generate the fuzzy input variable parameters. The total number of possible fuzzy if-then rules is given by the equation:

$$\text{number of rules} = 3^m$$

where $m$ is the number of inputs (modules) forming the ensemble neural network; the fuzzy if-then rules are formed to combine all neural network predictions based on their prediction error. An example of fuzzy if-then rules when the ENN has two modules ($m = 2$) is the following:

1. If ($e_1$ is small) and ($e_2$ is small), then ($w_1$ is high) and ($w_2$ is high).
2. If ($e_1$ is small) and ($e_2$ is medium), then ($w_1$ is high) and ($w_2$ is medium).
3. If ($e_1$ is small) and ($e_2$ is high), then ($w_1$ is high) and ($w_2$ is low).
4. If ($e_1$ is medium) and ($e_2$ is small), then ($w_1$ is medium) and ($w_2$ is high).
5. If ($e_1$ is medium) and ($e_2$ is medium), then ($w_1$ is medium) and ($w_2$ is medium).
6. If ($e_1$ is medium) and ($e_2$ is high), then ($w_1$ is medium) and ($w_2$ is low).
7. If ($e_1$ is high) and ($e_2$ is small), then ($w_1$ is low) and ($w_2$ is high).
8. If ($e_1$ is high) and ($e_2$ is medium), then ($w_1$ is low) and ($w_2$ is medium).
9. If ($e_1$ is high) and ($e_2$ is high), then ($w_1$ is low) and ($w_2$ is low).

As was previously mentioned, the type-2 fuzzy inference system has as inputs the MSE values of each neural network.
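The pattern in the nine example rules (a small error maps to a high weight, a medium error to a medium weight, and a high error to a low weight, independently for each module) suggests how such a rule base could be generated automatically for any number of modules. The sketch below enumerates every combination of the three error labels; it is an illustration of that pattern, and the names `generate_rules` and `format_rule` are hypothetical rather than taken from the authors' implementation.

```python
from itertools import product

ERROR_LABELS = ("small", "medium", "high")
# Consequent mapping read off the example rules: lower error, higher weight.
WEIGHT_FOR_ERROR = {"small": "high", "medium": "medium", "high": "low"}


def generate_rules(m):
    """Enumerate all rules for m modules as (antecedent, consequent) pairs."""
    rules = []
    for antecedent in product(ERROR_LABELS, repeat=m):
        consequent = tuple(WEIGHT_FOR_ERROR[label] for label in antecedent)
        rules.append((antecedent, consequent))
    return rules


def format_rule(antecedent, consequent):
    """Render one rule in the textual form used in the list above."""
    ifs = " and ".join(f"(e{k} is {lab})" for k, lab in enumerate(antecedent, 1))
    thens = " and ".join(f"(w{k} is {lab})" for k, lab in enumerate(consequent, 1))
    return f"If {ifs}, then {thens}."


rules = generate_rules(2)
for number, (antecedent, consequent) in enumerate(rules, 1):
    print(f"{number}. {format_rule(antecedent, consequent)}")
```

Because every input has three labels, the rule count grows as a power of three with the number of modules (9 rules for two modules, 27 for three), which is worth keeping in mind when the optimizer is allowed to choose larger ensembles.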
After the defuzzification, the type-2 FIS has as outputs the corresponding weights (as numeric values) for each neural network according to its prediction error (MSE), which are used to obtain a final prediction given by the equation:

$$\hat{y} = \frac{w_1 \hat{y}_1 + w_2 \hat{y}_2 + \cdots + w_m \hat{y}_m}{w_1 + w_2 + \cdots + w_m}$$

where $w_1$ is the weight of module #1, $w_2$ is the weight of module #2, and so on up to $w_m$, which is the weight of module $m$; $\hat{y}_1$ is the prediction of module #1, $\hat{y}_2$ is the prediction of module #2, and so on up to $\hat{y}_m$, which is the prediction of module $m$. The main contribution of this method is to know which and how many neural networks are needed to perform a good prediction. The firefly algorithm aims at finding optimal ensemble neural network architectures. The architecture consists of the number of neural networks (modules) and, for each one, parameters such as the number of hidden layers, the number of neurons, and the goal error. The backpropagation algorithm used in the training phase to perform the learning process is the Levenberg-Marquardt (LM) algorithm. This algorithm has achieved better results with artificial neural networks applied to time series forecasting. In this work, three feedback delays are also applied. The objective function is to minimize the MSE of the ensemble neural network (testing set) and is given by the equation:

$$MSE = \frac{1}{N} \sum_{i=1}^{N} (Y_i - P_i)^2$$

where $Y_i$ is the real value at time $i$, $P_i$ is the prediction of the ensemble neural network at time $i$, and $N$ is the number of data points of the testing set. In Table 1, the minimum and maximum values of the search space used to establish the ensemble neural network architecture are shown. These parameters are based on previous works, where pattern recognition was applied (Sánchez et al. 2017a, b). In Table 2, the parameters used to perform the evolutions of this algorithm are shown. The values of the number of fireflies and the maximum number of iterations are based on Sánchez and Melin (2014) and Sánchez et al. (2017), and for parameters such as $\alpha$, $\beta$, and $\delta$, the values are based on the parameters recommended in Yang (2009) and Yang and He (2013). In Fig.
12, the diagram of the proposed method is illustrated.

The proposed method is applied to the prediction of the COVID-19 time series for confirmed and death cases of 26 countries. The optimized results are obtained using 30%, 20%, and 10% of the information as the testing set (black points in the graphs), because we wanted to know how much information is necessary to achieve a good generalization, leaving the rest (70%, 80%, and 90%, respectively) for the learning phase (blue points in the graphs), divided into the training and validation sets (80/20). The results achieved by the proposed method are compared against the conventional average method and the type-1 fuzzy weighted average integration proposed in Melin et al. (2020), performing 30 runs per country in each test. Each neural network (module) of the ensemble neural network performs a prediction of the next 20 days (pink points in the graphs). To integrate their predictions in the type-1 and type-2 fuzzy average integration tests, the weights of Eq. 25 are used to obtain a final prediction of the next 20 days. It is essential to mention that the prediction error presented in the following tables is based on the testing set. In this work, we present figures that compare the predicted confirmed and death cases for the next 20 days with the real values. In Table 3, the best architectures for confirmed cases for China are presented, where for all the tests, the best architecture uses three modules. The best result is obtained when 30% of the data points are used for the testing phase with three fitting neural networks. In Fig. 15, the prediction of each module for the confirmed cases for China is shown, where 30% of data points are used for the testing phase, and the conventional average method is applied as integration. In Fig.
15a, the prediction of the next 20 days (pink points) tends to decrease, which indicates a bad future prediction, but because the other modules have a good prediction, the final integration improves, as Fig. 15d shows. The average convergence for each test for confirmed cases for China is shown in Fig. 16, where the runs with the type-2 fuzzy integration have a better performance than the other methods. The type-1 FWA integration has a convergence very similar to the average method, except when 10% is used for the testing phase, where the average method obtains better performance. The average predictions of the next 20 days of each test for confirmed cases for China are shown in Fig. 17. As these results show, the type-2 fuzzy logic (20% testing set) is the test whose predictions were closest to the real data up to the eighth day (Day #166, 07/05/2020). This occurs because the previously confirmed cases were increasing slowly, which caused the neural networks to learn this pattern, and for all the techniques, it was difficult to predict more days. We can notice that the Y-axis advances in increments of 100 cases. However, at the end of the 20 days, the type-1 FWA integration was closer to the number of real cases. In Table 4, the best architectures for death cases for China are shown. The function fitting neural network prevails as the best neural network. For death cases, the best architecture has four modules using type-2 FWA integration, and 30% of the data points are used for the testing phase. In Fig. 18, the prediction of each module for the death cases for China is shown, where 30% of data points are used for the testing phase, with type-2 FWA as the integration method. In Fig. 18b and c, the prediction of the next 20 days tends to decrease, but the other modules allowed the type-2 FWA integration to produce a more stable prediction, as Fig. 18e shows.
The type-2 fuzzy variables generated for this ensemble neural network are shown in Fig. 19. The average convergence for each test for death cases for China is shown in Fig. 20, where the behavior of the runs with the three integration methods seems similar, but the type-2 fuzzy integrator achieved better results than the conventional average method and the type-1 FWA. The average predictions of the next 20 days of each test for death cases for China are shown in Fig. 21, and as these results show, the type-2 fuzzy logic (10% testing set) is the test whose predictions were closest to the real data up to the seventeenth day (Day #175, 07/14/2020). In Table 5, the best architectures for confirmed cases for the USA are presented. The best architecture has four modules using the type-1 FWA as integration. In Fig. 22, the prediction of each module for the confirmed cases for the USA is shown, where 30% of data points are used for the testing phase with type-1 FWA as the integration method. Figure 22a shows how the prediction begins ascending, but it begins to descend after a few days. This situation does not affect the final result shown in Fig. 22d because the other modules had a better prediction, which allowed the final prediction of the next 20 days to rise as expected. The average convergence for each test for confirmed cases for the USA is shown in Fig. 23, where the behavior of the runs with the three integration methods seems similar when 30% of data points are used as the testing set, but the type-1 FWA achieved a better average than the other integration methods. When 20% and 10% of data points are used for the testing phase, the type-2 FWA had better performance. The type-1 FWA integration and the average method had a very similar convergence. The average predictions of the next 20 days of each test for confirmed cases for the USA are shown in Fig. 24.
As these results show, the type-2 fuzzy logic (20% testing set) is the test whose predictions were closest to the real data up to the thirteenth day (Day #171, 07/10/2020). In Table 6, the best architectures for death cases for the USA are shown, where for all the tests, the best architecture uses three modules. The cascade-forward neural network prevails in these results, where type-2 FWA integration is applied. In Fig. 25, the prediction of each module for death cases for the USA is shown, where 30% of data points are used for the testing phase with type-2 FWA as the integration method. The prediction of the next 20 days for each module is good, although for modules 2 and 3 (Fig. 25b and c, respectively), the prediction has a faster ascent. The type-2 FWA integration allowed a good final prediction, shown in Fig. 25d, with a more gradual increase. The type-2 fuzzy variables generated for this ensemble neural network are shown in Fig. 26. The average convergence for each test for death cases for the USA is shown in Fig. 27, where the runs with the type-2 FWA integration have a better performance only when 30% of the data points are used for the testing phase. In the other tests, the average method achieved better performance. The average predictions of the next 20 days of the tests for death cases for the USA are shown in Fig. 28. As these results show, the type-2 fuzzy logic (30% testing set) is the test whose predictions were closest to the real data up to the ninth day (Day #167, 07/06/2020). In Table 7, the best architectures for confirmed cases for Mexico are shown. The function fitting neural network prevails as the best neural network, where the best architecture has four modules using type-2 FWA integration. In Fig.
29, the prediction of each module for the confirmed cases for Mexico is shown, where 10% of data points are used for the testing phase, with a type-2 fuzzy inference system as the integration method. The prediction of the next 20 days shown in Fig. 29b-d exhibits a faster increase in confirmed cases. The combination with the prediction of Module #1, shown in Fig. 29a, allows a better final prediction using the type-2 fuzzy weighted integration. The type-2 fuzzy variables generated for this ensemble neural network are shown in Fig. 30. The average convergence for each test for confirmed cases for Mexico is shown in Fig. 31, where the behavior of the runs with the three integration methods also seems similar when 30% of data points are used for the testing phase, but the type-2 fuzzy integrator achieved a better average than the other integrations in all the tests. In Fig. 31b, the average method and type-1 FWA behaved very similarly. Meanwhile, in Fig. 31c, type-1 FWA had the worst performance. The average predictions of the next 20 days of each test for confirmed cases for Mexico are shown in Fig. 32. As these results show, the type-2 fuzzy logic (30% testing set) is the test whose predictions were closest to the real data up to the tenth day (Day #168, 07/07/2020). In Table 8, the best architectures for death cases for Mexico are presented. In this case, the best architecture has four modules using the average method. In Fig. 33, a prediction of each module for death cases for Mexico is shown, with type-2 FWA as the integration method. We want to show how a type-2 FWA allows us to have a good prediction even when a module (in this case, module #2, shown in Fig. 33a) had a bad performance. The advantage of the proposed integration can be observed in the predictions shown in Fig. 36.
The type-2 fuzzy variables generated for this ensemble neural network are shown in Fig. 34. The average convergence for each test for death cases for Mexico is shown in Fig. 35. The runs with the type-2 fuzzy integration perform better than those with the other methods. The average method and type-1 FWA seem to perform similarly, although in Fig. 35c the average method had a better result. The average predictions of the next 20 days of each test for death cases for Mexico are shown in Fig. 36. As these results show, the test with type-2 fuzzy logic (30% testing set) produced the predictions closest to the real data up to the sixth day (Day #164, 07/03/2020).

This section presents a summary of the results obtained with the conventional average method and the type-1 and type-2 fuzzy weighted averages. The tests were performed using 30%, 20%, and 10% of the data points for the testing phase for confirmed and death COVID-19 cases of 26 countries.

Table 9 shows the results (MSE) achieved by the three integration methods for confirmed cases using 30% of the data points for the testing phase. As the best averages (indicated in bold in the table) show, most countries obtain a better result with the type-2 FWA integration. Type-1 FWA performed better for only two countries, New Zealand and the USA, and the conventional average method performed best only for France. In Fig. 37, the results of confirmed cases using a testing set of 30% are graphically illustrated.

Table 10 shows the results (MSE) achieved by the integration methods for death cases using 30% of the data points for the testing phase. As the best averages (indicated in bold in the table) show, all the countries obtain a better result with the type-2 fuzzy weighted average integration. In Fig. 38, the death case results using a testing set of 30% are graphically illustrated.
Table 11 shows the results achieved by the three integration methods for confirmed cases using 20% of the data points for the testing phase. As the best averages (indicated in bold in the table) show, most countries obtain a better result with the type-2 FWA. The average method and type-1 FWA each performed better for only one country: New Zealand and Switzerland, respectively. In Fig. 39, the results of confirmed cases using a testing set of 20% are graphically shown.

Table 12 shows the results achieved by the three integration methods for death cases using 20% of the data points for the testing phase. As the best averages (indicated in bold in the table) show, most countries obtain a better result with the type-2 FWA integration. The conventional average method achieved better performance for only two countries, New Zealand and the USA. In Fig. 40, the death case results using a testing set of 20% are graphically shown.

Table 13 shows the results achieved by the three integration methods for confirmed cases using 10% of the data points for the testing phase. As the best averages (indicated in bold in the table) show, most countries obtain a better result with the type-2 FWA integration. The conventional average method performed better only for Bolivia and the UK, and type-1 FWA integration performed best only for Finland and Switzerland. In Fig. 41, the results of confirmed cases using a testing set of 10% are graphically shown.

Table 14 shows the results achieved by the three integration methods for death cases using 10% of the data points for the testing phase, with the best averages indicated in bold in the table. Again, most countries obtain a better result with the type-2 FWA integration. The conventional average method performed better only for Morocco and the USA, and type-1 FWA performed best only for New Zealand. In Fig. 42, the death case results using a testing set of 10% are graphically shown.
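The testing-set percentages and the MSE measure reported in Tables 9 to 14 can be illustrated with a minimal sketch. It assumes that the testing set is the most recent portion of the series (a chronological hold-out); the helper names `split_series` and `mse` and the synthetic series are illustrative and do not come from the paper.

```python
from typing import List, Sequence, Tuple


def split_series(series: Sequence[float], test_fraction: float) -> Tuple[List[float], List[float]]:
    """Hold out the last `test_fraction` of a time series for testing.

    Assumption: the hold-out is chronological (the newest points are tested).
    """
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    n_test = max(1, round(len(series) * test_fraction))
    cut = len(series) - n_test
    return list(series[:cut]), list(series[cut:])


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Synthetic 20-point series (hypothetical data, for illustration only).
series = [float(i) for i in range(1, 21)]
test_sizes = {}
for fraction in (0.30, 0.20, 0.10):
    train, test = split_series(series, fraction)
    test_sizes[fraction] = len(test)
    print(f"{int(fraction * 100)}% testing set: {len(train)} training points, {len(test)} testing points")

# MSE of a hypothetical prediction that misses only the last of three points.
example_mse = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(f"example MSE: {example_mse:.4f}")
```

A larger testing fraction leaves fewer points for training, which is one reason the best integration method can differ between the 30%, 20%, and 10% tests.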
The results shown above indicate that the type-2 FWA method gives, on average, better results in most tests. In the next section, tests are performed to prove its effectiveness statistically.

In this section, the results of the Wilcoxon signed-rank tests are presented. The critical values are shown in Table 15, where the different values of α are shown depending on the statistical significance. For this work, a 0.10 level is used. The averages shown for each country in each test are used to perform these statistical tests.

In Table 16, the results of the Wilcoxon test statistic for confirmed cases are shown, comparing the conventional average method and the type-2 FWA integration proposed in this work. To reject the null hypothesis at the 0.10 level of significance, the result in the column named "W" must be equal to or smaller than the critical value (column named "W0"). As the results show, the type-2 FWA integration improved on the results of the conventional average method. In Table 17, the results of the Wilcoxon test statistic for death cases are presented. As the results show, the type-2 FWA integration also improved on the results of the conventional average method for death cases.

In Table 18, the results of the Wilcoxon test statistic for confirmed cases are shown, comparing the type-1 FWA and the type-2 FWA integration proposed in this work. As the results show, the type-2 FWA integration improved on the results of the type-1 FWA integration. In Table 19, the results of the Wilcoxon test statistic for death cases are presented. As the results show, the type-2 FWA integration also improved on the results of the type-1 FWA integration for death cases.
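The decision rule described above (reject the null hypothesis when W is equal to or smaller than the critical value) can be sketched as follows. This is a minimal illustration and not the authors' code: the eight paired MSE values are invented, and the critical value 5 is the one commonly tabulated for n = 8 pairs at a two-sided 0.10 level, which is an assumption here. The paper's tests use the 26 country averages and the critical values of its Table 15.

```python
from typing import List, Sequence


def wilcoxon_w(x: Sequence[float], y: Sequence[float]) -> float:
    """Wilcoxon signed-rank statistic W = min(W+, W-) for paired samples."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    # Paired differences; pairs with zero difference are discarded.
    diffs = [a - b for a, b in zip(x, y) if a != b]
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks: List[float] = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return min(w_plus, w_minus)


def rejects_null(w: float, w_critical: float) -> bool:
    """Reject the null hypothesis when W does not exceed the critical value."""
    return w <= w_critical


# Hypothetical average MSE per country (illustrative values, not from the paper).
mse_average = [0.90, 0.75, 0.60, 0.82, 0.55, 0.70, 0.95, 0.40]
mse_type2 = [0.50, 0.45, 0.65, 0.40, 0.30, 0.52, 0.35, 0.38]

w = wilcoxon_w(mse_average, mse_type2)
W_CRITICAL_N8 = 5  # assumed table value for n = 8, two-sided 0.10 level
print(f"W = {w}, reject null hypothesis: {rejects_null(w, W_CRITICAL_N8)}")
```

In this invented sample, only one pair favors the average method and it has the second-smallest absolute difference, so W = 2 and the null hypothesis is rejected.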
In this paper, a firefly algorithm (FA) is proposed to find optimal ensemble neural network architectures, using type-2 fuzzy logic to improve the weighted average used as the integration method, to predict confirmed and death COVID-19 cases of 26 countries. The FA finds essential architecture parameters, such as the number of artificial neural networks and their types (feedforward, function fitting, or cascade-forward neural network). As an integration method, we proposed a type-2 fuzzy inference system to calculate the weights for an average method. Its input ranges are based on the prediction error (MSE) of the artificial neural networks that form the ensemble neural network; i.e., in each evaluation performed by the firefly algorithm, a type-2 fuzzy system is created specifically to integrate the ensemble neural network being evaluated. The input of the fuzzy inference system is the corresponding MSE. After defuzzification, the outputs are the weights (numeric values) assigned to each prediction according to its MSE to obtain a final prediction (testing set and the next 20 days). The results obtained by the proposed integration are compared against a conventional average method and the type-1 fuzzy weighted average. The results show that the type-2 fuzzy weighted average obtained better results (MSE) than the other integration techniques when a final prediction of the testing set is performed, and that its prediction of the next days is the closest to the real data. The other integration methods performed better in only a few countries (1 or 2), which demonstrates the stability of the proposed integration.
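The difference between the conventional average and a weighted average driven by each module's MSE can be illustrated with a simplified sketch. In the proposed method the weights come from a type-2 fuzzy inference system after defuzzification; here that system is replaced by plain inverse-MSE weights, which is a stand-in assumption and not the authors' method. All function names and numbers are hypothetical.

```python
from typing import List, Sequence


def average_integration(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Conventional average: every module contributes equally on each day."""
    return [sum(day) / len(day) for day in zip(*predictions)]


def weighted_integration(predictions: Sequence[Sequence[float]],
                         weights: Sequence[float]) -> List[float]:
    """Weighted average of the module predictions, day by day."""
    if len(predictions) != len(weights):
        raise ValueError("one weight per module is required")
    total = sum(weights)
    return [sum(w * p for w, p in zip(weights, day)) / total
            for day in zip(*predictions)]


def inverse_mse_weights(mses: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Stand-in for the fuzzy system: a lower MSE gives a higher weight."""
    return [1.0 / max(m, eps) for m in mses]


# Three hypothetical modules predicting three future days; module 2 overshoots.
module_predictions = [
    [100.0, 110.0, 120.0],
    [140.0, 160.0, 180.0],
    [102.0, 112.0, 122.0],
]
module_mses = [1.0, 100.0, 1.0]  # invented testing-set errors

plain = average_integration(module_predictions)
weighted = weighted_integration(module_predictions, inverse_mse_weights(module_mses))
print("average :", plain)
print("weighted:", [round(v, 2) for v in weighted])
```

Because the poorly performing module receives a small weight, the weighted result stays close to the two accurate modules, while the plain average is pulled upward by the overshoot.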
In conclusion, the presented results show that the type-2 fuzzy weighted average integration gives a good prediction of the next days, even when a module has a bad result, as in the case of Mexico. The results also show that the number of correctly predicted future days may vary by country and by the percentage of data used for the training phase of the ensemble neural network. In some results, only six days are predicted correctly; in others, up to 17 days. Ensemble neural networks prove to be a useful tool when a good integration unit is applied, as in this work. As future work, the optimization of the fuzzy if-then rules will be considered, as well as the percentage of data used for the training phase of the ensemble neural network. Other optimization techniques will also be used to compare ensemble neural network architectures and reaffirm our proposed integration.

Funding This research work did not receive funding.

Conflict of interest All the authors of the paper have no conflict of interest.

Ethical approval This article does not contain any studies with human participants or animals performed by any of the authors.