key: cord-0935410-gfu4bp2x authors: Melin, Patricia; Monica, Julio Cesar; Sanchez, Daniela; Castillo, Oscar title: Analysis of Spatial Spread Relationships of Coronavirus (COVID-19) Pandemic in the World using Self Organizing Maps date: 2020-05-21 journal: Chaos Solitons Fractals DOI: 10.1016/j.chaos.2020.109917 sha: 5478334a757501d84632af5917543d067359a8f0 doc_id: 935410 cord_uid: gfu4bp2x
In this paper we analyze the spatial evolution of the coronavirus pandemic around the world using a particular type of unsupervised neural network, the self-organizing map. Based on the clustering abilities of self-organizing maps, we are able to spatially group together countries that are similar according to their coronavirus cases. This lets us analyze which countries are behaving similarly and could therefore benefit from similar strategies in dealing with the spread of the virus. Publicly available datasets of coronavirus cases around the globe from recent months have been used in the analysis. The conclusions obtained could be helpful in deciding the best strategies for dealing with this virus. Most previous papers dealing with Coronavirus data have viewed the problem from its temporal aspect, which is also important, but is mainly concerned with forecasting the numeric information. However, we believe that the spatial aspect is also important, so the main contribution of this paper is the use of unsupervised self-organizing maps for grouping together similar countries in their fight against the Coronavirus pandemic, and thus proposing that strategies for similar countries could be established accordingly.
Recently we have witnessed the rapid spread of the Coronavirus around the globe, beginning in China, then spreading to Korea and Japan, and after that to Europe and America.
In particular, in Europe, Italy and Spain have been hit very hard by the spread of the virus, with many confirmed cases and deaths. After that, on the American continent, the United States has also been hit very hard. It is therefore critical to understand all the facets of this problem, in order to cope with its complexity and at the same time limit its negative impact on the health of the population around the world, as well as the economic implications for the countries. Due to the importance of finding ways to control the propagation of the virus, many papers have been put forward in recent months on different aspects of this problem, and in particular several authors have attempted to apply computational intelligence techniques in this area. As a sample of these works we can mention the ones below.
The coronavirus disease (COVID-19) is a highly transmissible viral infection caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which originally appeared in Wuhan, China, and has subsequently propagated around the world. The intermediate source of origin and transfer to humans is not known, but rapid human-to-human transfer has been confirmed in many experiments. At present there is not yet a clinically approved antiviral drug or vaccine that can be used against it. At the end of 2019, the city of Wuhan, China, the original epicenter of COVID-19, experienced an outbreak of a novel coronavirus that killed more than eighteen hundred and infected thousands of individuals within the first two months of the epidemic [9]. More recently, the epicenter has moved to cities in Europe and then in America. The most notable symptoms found in patients (according to the collected experimental data) are dry cough, dyspnea, fever and bilateral lung infiltrates on imaging.
Initially all the cases were associated with Wuhan's Huanan Seafood Wholesale Market, which trades in seafood and a wide variety of live animal species. Due to the many cases reported up to January 30th, 2020, the World Health Organization (WHO) declared the Chinese outbreak of COVID-19 to be a Public Health Emergency of International Concern, posing a high risk to countries with vulnerable health systems around the world [10]. There have recently been several studies aimed at understanding the patterns of COVID-19. One of these uses a dataset of X-ray medical images from patients with common bacterial pneumonia and with confirmed COVID-19 disease to identify patterns that may lead to automatic diagnosis of the disease using convolutional neural networks; the results demonstrate that the method has a significant effect on the automatic detection and diagnosis of COVID-19 [11]. Another interesting study investigates the cases of COVID-19 in China using dynamic statistical techniques [12]. Other examples are the prediction of commercially available antiviral drugs that may act on the novel coronavirus using a deep learning model [13], and the early prediction of the 2019 novel coronavirus outbreak in mainland China based on a simple mathematical model [14]. Also, the paper in [15] offers pointers to, and describes, a range of practical online/mobile GIS and mapping dashboards and applications for tracking the 2019/2020 coronavirus epidemic and associated events as they unfold around the world. In addition, in [16] the authors proposed applying the concept of cartograms to visualize both the expansion and spread of COVID-19. Finally, some research has been done using Artificial Intelligence (AI), for example the study in [17], in which the authors proposed the use of machine learning algorithms, applied to a mobile phone-based web survey, to identify possible COVID-19 cases more quickly.
Several AI techniques are also applied in analyzing data and in decision-making processes in healthcare. This means that AI-driven tools can help in identifying COVID-19 outbreaks, as well as in forecasting the nature of their spread across the world [18]. However, most of the previous works deal with the temporal aspect of the problem, which means that they attempt to predict or forecast the coronavirus numeric data in different ways. Of course, this facet of the problem is also important, as governments want to know the estimated future numbers of coronavirus cases to make the right decisions regarding the funds to be assigned to solving the problem. On the other hand, it is our firm belief that the spatial aspect is also very important, so the main contribution of this paper is the use of unsupervised self-organizing Kohonen maps for grouping together similar countries in their fight against the Coronavirus pandemic, so that strategies for similar countries could be established accordingly. In our opinion, this contribution is important because it could complement the temporal perspective developed in most previous papers by providing the spatial component needed for a more complete solution to the Coronavirus problem. The remaining contents of the paper are structured as follows. Section 2 outlines the fundamental concepts of self-organizing maps, which are a particular form of unsupervised neural network. Section 3 describes the problem at hand and the proposed methodology. Section 4 summarizes the simulation results achieved with the proposed approach. Finally, Section 5 offers the conclusions and possible future work.
The self-organizing map (SOM), also called the Kohonen map, is a model used to explore and visualize patterns in high-dimensional datasets. This model was first introduced by Teuvo Kalevi Kohonen in 1982.
SOM is a clustering technique that identifies groups in a dataset without having to use traditional statistical techniques. The SOM consists of only two layers: the input layer and the output layer [1]. The goal of this neural network is to map all input data objects with n attributes (n dimensions) to the output layer in such a way that related objects are placed close to each other. The SOM is based on unsupervised training, where there is no given output target; the objective of the algorithm is to find the set of centroids (neurons) that represent the clusters, subject to topological restrictions. Topology refers to the arrangement of the centroids on the output grid; the most commonly used grid topologies are hexagonal and rectangular. Each data object in the dataset is assigned to a centroid. Each neuron in the SOM grid is closely related to its neighbors, and each input is connected to each output node by means of a connection weight. The weights from the N input nodes to the M output nodes are initialized randomly to small values [2]. The activation of the output units according to Kohonen is shown in Eq. 1, and the modification of the weights is shown in Eq. 2:
$$O_j = F_{\min}(d_j) = F_{\min}\left(\sum_i (X_i - W_{ij})^2\right) \qquad (1)$$
$$\Delta W_{ij} = O_j \, \eta \, (X_i - W_{ij}) \qquad (2)$$
where $O_j$ = activation of output unit j, $X_i$ = activation value from input unit i, $W_{ij}$ = lateral weight connecting input unit i to output unit j, $d_j$ = neurons in the neighborhood, $F_{\min}$ = unity function returning 1 or 0, and $\eta$ = gain term decreasing over time. The lateral connections enable the SOM to learn "competitively", meaning that the neurons in the output layer compete for the classification of the input patterns. At the beginning of training, the input patterns are presented to the SOM, and the output neuron with the nearest weight vector is the winner that represents that cluster. Equation 1 shows how the Euclidean distance is used to select the winning neuron [3]. In Figure 1, the SOM neural network structure is illustrated with its neighborhoods around the winning neuron.
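The competitive training procedure described above can be sketched in Python with NumPy. This is only a minimal illustration under stated assumptions, not the authors' implementation: the 2x2 rectangular grid, the linearly decreasing gain and neighborhood radius, the toy data, and the names `winner` and `train_som` are all hypothetical choices. The winner is the neuron with the smallest squared Euclidean distance to the input, and only neurons inside the winner's neighborhood (a 0/1 membership) have their weights moved toward the input.

```python
import numpy as np

def winner(weights, x):
    """Return the grid index (row, col) of the neuron closest to input x."""
    # Squared Euclidean distance from x to every neuron's weight vector
    d = np.sum((weights - x) ** 2, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)

def train_som(data, rows=2, cols=2, epochs=100, lr0=0.5, radius0=1.0, seed=0):
    """Train a small rectangular SOM (all hyperparameters are assumptions)."""
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    # Small random initial weights, one weight vector per grid neuron
    weights = rng.uniform(0.0, 0.1, size=(rows, cols, n_features))
    # Grid coordinates of every neuron, used to measure neighborhood distance
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)          # gain term decreasing over time
        radius = radius0 * (1.0 - frac)  # neighborhood shrinking over time
        for x in rng.permutation(data):
            bmu = np.array(winner(weights, x))
            grid_dist2 = np.sum((grid - bmu) ** 2, axis=-1)
            # 1 for neurons inside the winner's neighborhood, 0 otherwise
            in_hood = (grid_dist2 <= radius ** 2).astype(float)
            # Move the weights of neighborhood neurons toward the input
            weights += lr * in_hood[..., None] * (x - weights)
    return weights

# Hypothetical toy data: four samples with two normalized attributes each
data = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]])
weights = train_som(data)
assignments = [winner(weights, x) for x in data]
```

Each entry of `assignments` is the grid position of the neuron that represents the corresponding sample, so samples sharing a position belong to the same cluster.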
Artificial neural networks such as the SOM have been widely used in many applications, such as the identification of groundwater salinity sources [4], the determination of plant communities based on bryophytes [5], and the prediction of arthritis [6]. However, here the SOM is
In Figure 2 a sample of a SOM neural network used for clustering and classification of the countries is shown. In the case of the 32 states of Mexico, two of the most prevalent diseases in the population, hypertension and diabetes, were also studied, in order to find similarities and form groupings of states between these diseases and Covid-19. The database of these diseases was obtained from the open data web page of the Mexican Institute of Social Security (IMSS) [9]. The proposed method based on Kohonen self-organizing maps was used to form groupings or clusters of countries in the world, and the clusters were then classified into 4 classes according to the severity of the number of Coronavirus cases: Very High, High, Medium and Low (indicated by red, orange, yellow and green, respectively, in the maps). In Table 1
In Figure 5 we show a plot of the clusters formed with the SOM method, indicating the classes for the Covid-19 recovered cases for the period from January 22, 2020 to May 13, 2020. In addition, the same analysis can be done for the spatial distribution of deaths due to Coronavirus around the globe. In Figure 6 we show a plot of the clusters formed with the SOM method, indicating the classes for the Covid-19 death cases for the same period, January 22, 2020 to May 13, 2020. We were also interested in taking this spatial analysis down to the country level, and for this we applied it to Mexico. In this case, we consider the 32 states of Mexico, and the SOM method clusters states according to their similarities to other states, producing a colored map similar to the world map.
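The four-class severity labelling mentioned earlier in this section (Very High, High, Medium, Low, shown as red, orange, yellow and green) could be sketched as follows. The text does not state how clusters are mapped to classes, so the rule used here, ranking clusters by their mean number of cases, is an assumption, as are the function name `label_clusters` and the example numbers.

```python
import numpy as np

SEVERITY = ["Very High", "High", "Medium", "Low"]
COLORS = {"Very High": "red", "High": "orange", "Medium": "yellow", "Low": "green"}

def label_clusters(total_cases, cluster_ids):
    """Assign a severity class to each region from its cluster's mean case count.

    Assumed rule: the cluster with the highest mean gets "Very High", the next
    one "High", and so on. With fewer than four clusters, only the top labels
    are used.
    """
    total_cases = np.asarray(total_cases, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    clusters = np.unique(cluster_ids).tolist()
    if len(clusters) > len(SEVERITY):
        raise ValueError("at most four clusters can be labelled")
    # Mean number of cases per cluster
    means = {c: total_cases[cluster_ids == c].mean() for c in clusters}
    # Clusters ordered from most to least affected
    ranked = sorted(clusters, key=lambda c: means[c], reverse=True)
    to_label = {c: SEVERITY[i] for i, c in enumerate(ranked)}
    return [to_label[c] for c in cluster_ids.tolist()]

# Hypothetical case counts and SOM cluster ids for six regions
cases = [1000, 900, 50, 40, 5, 300]
clusters = [0, 0, 2, 2, 3, 1]
labels = label_clusters(cases, clusters)
colors = [COLORS[label] for label in labels]
```

The resulting `colors` list can then be used to shade each region on a map.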
In Figure 7 we can find the clustering of states in Mexico according to the confirmed Coronavirus cases during the period from February 27, 2020 to May 13, 2020. In Table 2, the states of Mexico are ordered by number of cases in the clusters, and then alphabetically within the clusters. In addition, the same analysis can be done for the spatial distribution of deaths due to Coronavirus in the states of Mexico. In Figure 8 we
In this case, we were also interested in the possible relation between the propensity for Coronavirus deaths and the chronic degenerative diseases hypertension and diabetes. Based on this, we also applied SOM clustering to the publicly available data on these cases in Mexico [18]. In Figure 9 we
Once again, if we compare Figures 8 and 10, we find a similarity between the states with higher numbers of deaths and the states with higher numbers of diabetes cases, indicating a relation between these variables. In this regard, we believe a model could be constructed that uses the number of cases of hypertension and diabetes to estimate the number of Coronavirus cases, reflecting the interaction among these variables.
In this paper, an analysis of the spatial evolution of the coronavirus pandemic around the world using a particular type of unsupervised neural network was presented. Based on the clustering abilities of self-organizing maps we were able to spatially group together countries that are similar according to their coronavirus cases, and in this way to analyze which countries are behaving similarly and thus could benefit from similar strategies in dealing with the spread of the virus. Publicly available datasets of coronavirus cases around the globe from recent months were used in the analysis. The conclusions obtained could be helpful in deciding the best strategies for dealing with this virus.
In addition, the proposed approach was tested on the spatial distribution of cases across Mexico and its relation to diabetes and hypertension cases. Most previous papers dealing with Coronavirus data have viewed the problem from its temporal aspect, which is also important, but is mainly concerned with forecasting the numeric information. However, we believe that the spatial aspect is also important, so the main contribution of this paper is the use of unsupervised self-organizing maps for grouping together similar countries in their fight against the Coronavirus pandemic, and thus proposing that strategies for similar countries could be established accordingly. As future work, we envision integrating both the spatial and temporal aspects of the Coronavirus spread problem in a unified manner to achieve a complete view of, and solution to, the problem. We can also consider applying other intelligent techniques (like fuzzy logic, evolutionary algorithms and swarm intelligence) that could help in dealing better with this complex problem. Finally, we could also consider other recent approaches, such as the ones presented in [20, 21], and other recent interesting works related to evolving fuzzy models and chaos, as in [22-26]. In summary, we envision many potentially beneficial lines of research that could be pursued.
The authors of the above manuscript whose names are listed above certify that they have NO affiliations with or involvement in any organization or entity with any financial interest (such as honoraria; educational grants; participation in speakers' bureaus; membership, employment, consultancies, stock ownership, or other equity interest; and expert testimony or patent-licensing arrangements), or nonfinancial interest (such as personal or professional relationships, affiliations, knowledge or beliefs) in the subject matter or materials discussed in this manuscript.
Patricia Melin proposed the method and the experiments that were performed. Julio Cesar Monica analyzed and implemented the proposed method, and contributed to the simulations. Daniela Sanchez validated the implementation and the results. Oscar Castillo worked on the neural model and the explanation of results. All authors documented the results and prepared the manuscript, and worked on enhancing the quality of the writing.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Clustering the ecological footprint of nations using Kohonen's self-organizing maps
Data Science: Concepts and Practice
Data mining using rule extraction from Kohonen self-organising maps
Self-organizing maps for the identification of groundwater salinity sources based on hydrochemical data
Determination of plant communities based on bryophytes: The combined use of Kohonen artificial neural network and indicator species analysis
Prediction of arthritis using a modified Kohonen mapping and case based reasoning
COVID-19 infection: origin, transmission, and characteristics of human coronaviruses
World Health Organization declares global emergency: A review of the 2019 novel coronavirus (COVID-19)
Covid-19: Automatic detection from X-Ray images utilizing Transfer Learning with Convolutional Neural Networks
Investigating the Cases of Novel Coronavirus Disease (COVID-19) in China Using Dynamic Statistical Techniques
Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model
Early Prediction of the Novel Coronavirus Outbreak in the Mainland China based on Simple Mathematical Model
Geographical tracking and mapping of coronavirus disease COVID-19/severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) epidemic and associated events around the world: how 21st century GIS technologies are supporting the global fight against outbreaks and epidemics
Visualising the expansion and spread of coronavirus disease 2019 by cartograms
Identification of COVID-19 can be quicker through artificial intelligence framework using a mobile phone-based survey in the populations when Cities/Towns are under quarantine
AI-driven tools for coronavirus outbreak: Need of active learning and cross-population Train/Test models on Multitudinal/Multimodal data
Preliminary bioinformatics studies on the design of a synthetic vaccine and a preventative peptidomimetic antagonist against the SARS-CoV-2 (2019-nCoV, COVID-19) coronavirus
Analysis and Forecast of COVID-19 spreading in China
Anticipated Synchronization and Zero-Lag Phases in Population Neural Models
Surrogate model based optimization of traffic lights cycles and green period ratios using microscopic simulation and fuzzy rule interpolation
Evolving Fuzzy Models for Prosthetic Hand Myoelectric-based Control
Evolving fuzzy models for prosthetic hand myoelectric-based control using weighted recursive least squares algorithm for identification
Fuzzy granular gravitational clustering algorithm for multivariate data