key: cord-0805185-pp2c7yxy
authors: Rahman, Md Arafatur
title: Data-Driven Dynamic Clustering Framework for Mitigating the Adverse Economic Impact of Covid-19 Lockdown Practices
date: 2020-07-03
journal: Sustain Cities Soc
DOI: 10.1016/j.scs.2020.102372
sha: 29fce84a42b62cc035d387f695dcee33047d5852
doc_id: 805185
cord_uid: pp2c7yxy

The COVID-19 disease has once again demonstrated that the impact of a pandemic goes beyond a biomedical event, with potentially rapid, dramatic, and sweeping disruptions to the management and conduct of everyday life. Not only do the rate and pattern of contagion threaten our sense of healthy living, but the safety measures put in place to contain the spread of the virus may also require social distancing. Three different measures to counteract this pandemic situation have emerged, namely: i) vaccination, ii) herd immunity development, and iii) lockdown. As the first measure is not ready at this stage and the second is largely considered unreasonable on account of the enormous number of fatalities, a vast majority of countries have practiced the third option despite its potentially immense adverse economic impact. To mitigate such an impact, this paper proposes a data-driven dynamic clustering framework for moderating the adverse economic impact of the COVID-19 flare-up. Through an intelligent fusion of healthcare and simulated mobility data, we model lockdown as a clustering problem and design a dynamic clustering algorithm for localized lockdown that takes into account the pandemic, economic and mobility aspects. We then validate the proposed algorithm by conducting extensive simulations using the Malaysian context as a case study. The findings show the promise of dynamic clustering for reducing lockdown coverage, economic loss, and military unit deployment, and assess the potential impact of uncooperative civilians on the contamination rate.
The outcome of this work is anticipated to pave a way for significantly reducing the severe economic impact of the COVID-19 spreading. Moreover, the idea can be exploited for potentially the next waves of corona virus-related diseases and other upcoming viral life-threatening calamities.
The world population currently faces one of the most life-threatening disasters of the past century with the emergence of a novel Coronavirus variation, commonly referred to as COVID-19 or 2019-nCoV [1]. With early confirmed cases of viral pneumonia dating back to December 2019 in Wuhan, China, this virus-induced disease has now affected at least 2.6 million people all over the world [2], with the death toll rate at 20% [2], within just 4-5 months. While the current mortality rate of 3.4% is lower than that of Coronavirus predecessors such as SARS (at 9.6%) [3] and MERS (at 34.4%) [4], COVID-19 is distinguished by rapid spreading due to both its viral properties (e.g., long incubation period, asymptomatic super spreaders, droplet carriers) [5] and current socio-demographic globalization and urbanization (ubiquitous mobility and affordable transportation). This has led the World Health Organization (WHO) to declare the outbreak a global pandemic [6] due to its scale, impact, and pressure on public health systems across the continents.

Given the wide prevalence of the COVID-19 outbreak, scientists and policymakers across the globe have continuously worked to mitigate its evolving impact, with varying measures planned and adopted in different places. The first is the effort towards vaccine development and vaccination [7]. While this approach enjoys consensus among the vast majority of governments, COVID-19 vaccines are not yet ready and it may take 12-18 months to complete their development [8]. Even once they are ready, making them affordable and available to populations across different economic levels may take some time before extensive deployment.
The second measure relies on the development of herd immunity [9] by allowing the virus to spread in the population until a sufficient number of people are infected and develop immunity to the virus. This method is seen as controversial since it may overload the already pressurized healthcare systems, resulting in many deaths. Therefore, only a few countries [9] are still consistently pursuing this route, despite various calls to revise the policy.

The third method is to impose a variety of strict social and physical distancing policies, resulting in "lockdown" situations, in order to contain and/or limit the spread of the virus. It corresponds to isolating physical environments (areas, places, cities, or even countries) from inward and outward traveling as well as regulating resident mobility within the given areas. The rationale is that the virus has already infected part of the population, while the accurate proportion of infectious individuals (hidden and asymptomatic cases) is not fully understood. This measure has proven effective in countries such as France [10] and China [11], where the curves for confirmed cases and deaths have started flattening. For a variety of reasons, including its mitigation effectiveness in several countries, lockdown policies seem to be attracting growing popularity across a vast number of governments. A drastic lockdown measure imposes a national-level mobility restriction, with examples being practiced in countries such as Malaysia, Italy, and France [12], while a more moderate restriction applies at provincial or regional levels. A major consideration for imposing lockdown in a given area is often a steady increase of uncontrollable and newly-appearing cases. Many countries are initially reluctant to directly impose a lockdown because of the adverse economic prospects caused by restricted mobility [13].
In many cases, informal laborers and daily-earning workers are among the first to feel the impact, which in turn can lead to protests [14] and civil unrest [14].

In this work, we investigate a data-driven framework to potentially reduce the severe economic impact of lockdown through improved resident mobility embedded within a systematically changing boundary of the lockdown area. We envision near real-time soft lockdown boundaries that reflect mobility-health risks and can be directly linked with permitted economic activities regulated by the authorities. More specifically, our contributions include the following.

• We devise a data capture framework that integrates public healthcare and telecommunication data records for subsequent epidemic decision making. While the former record has been subject to extensive recent research, utilization of the latter is motivated by gathering population mobility through people's interactions with cellular networks, as logged by cellular providers. The cumulative data can be huge (Big Data) due to the availability and usage of smart mobile devices throughout the population.
• Based on the captured data, we suggest an artificial intelligence (AI) engine to generate a dynamic lockdown map with flexible data-driven updating capability in order to determine the lockdown boundaries.
• Using the infection data as well as the mobility data, we partition a given location into a set of lockdown and non-lockdown clusters with a soft transition between the extreme cases. The cluster resolution (in terms of area) will depend on the data resolution available at that particular location.
• By first assuming full cooperation from the population during this national crisis, we assess the effectiveness of this dynamic lockdown proposal in terms of the lockdown coverage area, the reduced economic loss, and the associated lockdown-deployment resources, highlighting the promises and potential issues.
• Through simulation, we further examine the impact of the non-cooperative segment of the population (a.k.a. noise) on the system effectiveness in terms of the average number of contaminations, demonstrating that such noise can significantly change the decision on the size and distribution of the lockdown clusters.

Due to the possible recurrence of pandemic situations, including those caused by corona viruses [15], we see that such a dynamic lockdown mechanism can assist in devising a suitable strategy, not only in the current situation but also in future pandemics. This solution is well suited to sustainable smart cities and societies.

The rest of the paper is organized as follows. Section 2 reviews related works, with a focus on non-clinical research for COVID-19. Section 3 discusses in detail our proposed data-driven framework along with a dynamic clustering algorithm in support of reducing harsh economic consequences. Section 5 describes our simulation to assess the effectiveness of the framework and algorithm and discusses the results. Section 6 concludes this work with a summary of key findings.

A number of works have been carried out to battle the outbreak of COVID-19 and pandemic situations in general. The majority of them lean towards the medical domain, such as potential vaccines [16, 17], early detection of COVID-19 using X-ray images [18], and so on. In addition, many technologies have been incorporated to mitigate the adverse impacts of COVID-19; we discuss that domain in this section.

Since the outbreak of COVID-19, several statistical, dynamic and numerical models have been proposed to dissect its escalation dynamics [19, 20, 21]. In [22], the authors proposed a forecasting model to predict COVID-19 patients based on a plain mathematical model using limited epidemiological information.
Even though epidemiological models help assess the dynamics of escalation, they require knowledge of certain parameters and rely upon different hypotheses. To surmount the constraints of those methods, the authors of [23] developed an AI model for real-time prediction of new and accruing cases of COVID-19 across China. This model also predicts potential data trends and areas of COVID-19 escalation in China and stratifies them into different clusters. Such machine learning models, otherwise known as AI-based predictive models, are quite handy for classifying COVID-19 positive and negative patients and would help mitigate the spread of this virus [24], provided that they are utilized appropriately.

The main component in constructing an AI-based predictive model is information, a.k.a. data or big data. Big data innovation has experienced rapid development and accomplished incredible achievements in various sectors. Big data would be useful in combating this kind of pandemic situation and offers huge opportunities for supporting sustainable development goals. By 2050, two-thirds of the world population is expected to be urbanized [25]. Advanced big data practices would help ensure the security and safety of those people and work towards meeting the sustainable development goals. Different national agencies are trying to utilize satellite imagery, mobile phone data, and health records as sources for compiling statistical indicators on mobility, road safety, healthcare issues, etc. [26]. Countries like Taiwan have adopted such techniques and mastered the pandemic with the help of big data analytics [27]. Moreover, in the fight against COVID-19, GIS (Geographic Information Systems) and big data innovations have played a major role in many scenarios such as rapid visualization, spatial tracking, geographical segmentation, balancing of resources, etc.,
which provided exact information to support decision making, assess the effects of this pandemic, and control it properly [28]. These data are retrieved from different sources and thus tend to be framed in a structured (e.g., tabular data) or unstructured (e.g., images, videos) form. Handling unstructured data is arduous and sometimes inconvenient for predictive models. Although structured data are flexible to use, AI models are sensitive to extreme values, also known as outliers, which are common in real-life tabular data sets. In some cases, noise (e.g., inconsistency, missing or null values, outliers) caused by behavioral patterns and by unusual events can further complicate the prediction task. To deal with extreme values in time-series datasets, the authors of [29] proposed an algorithm based on Markov switching models that observes both regular and extreme values and outputs the result accordingly. This algorithm outperforms several state-of-the-art methods on detection tasks.

It is worth mentioning that data collection is another crucial aspect of any data-driven approach, and a time-consuming procedure as well. This can undermine the purpose and advantages of AI in such pandemic circumstances. To reduce this shortcoming, the authors of [24] proposed collecting recent travel history alongside common symptoms of individuals using a cellular phone-based online survey. Such collected data can be fed into an AI framework to speed up the preliminary identification of COVID-19 contamination. A risk map is further designed to isolate potential high-risk areas early and reduce the chance of expansion. Besides AI, visualization-based solutions are also provided by other researchers.
In [30, 31] the authors introduced a map-based information system for contagious disease outbreaks that visualizes clusters of different risk levels (e.g., normal, caution and danger) on a map and provides statistical analysis.

Another major concern during and after a pandemic is economic perturbation. The timing of COVID-19 was unfortunate, and it caused countless travel cancellations, casino shutdowns, and so on [32]. The consequences of a pandemic typically affect a large portion of the workforce as well. Motivated by this fact, the authors of [33] enhanced the dynamic inoperability input-output model (DIIM). Unlike the aforementioned works, in this paper we present a data-intelligence framework for alleviating adverse economic distress during and after a pandemic situation. The framework aims to deal with public data, ensure data security, and construct a real-time dynamic clustering algorithm with the aid of AI.

A data-driven framework (as depicted in Figure 1), which alleviates the adverse economic repercussions of lockdown during the COVID-19 outbreak, is proposed to overcome the drawbacks of conventional lockdown policies. The framework is detailed in this section.

The proposed framework has three primary components that are associated with each other: i) the core telecommunication networks (CTNs), ii) the COVID-19 control center (CCC), and iii) the stakeholders. The CTNs are the base of the proposed framework, whereas the CCC is responsible for receiving instructions from the stakeholders (e.g., National Health Organization and Health Ministry), upon which it will accumulate information from the end-users through the CTNs. This information will be further processed inside the CCC, and the outcome will be shared with stakeholders and end-users through the CTNs. The workflow of these phases is elaborated in detail in the following sub-sections.
A core communication network will act as a bridge between the CCC and end-users owing to its two-way communication characteristic. In this segment, the healthcare information with geo-location and the telecommunication records of end-users will be extracted and sent to the CCC. Other network techniques can also be utilised for developing such communication networks [34, 35, 36]. Moreover, the CTNs will convey further updates or decisions provided by the CCC or stakeholders to the end-users. The network components and architecture are discussed in the following.

Network Components: To operate the overall data exchange tasks, three network components will be used, which are elaborated below.

• End Device: We choose cell phones as end devices due to their accessibility. Healthcare information (i.e., COVID-19 related features) and telecommunication records (i.e., user location and movement tracking information) will be retrieved from the end-user through a mobile application.
• Base Station: A channel or route is needed to transmit data from one location to another, and base stations provide those channels. In this framework, base stations are telecommunication towers that are connected to the CTNs. With the aid of the mobile application, end-user information will reach the nearest telecommunication tower, a.k.a. the Base Station, and be further transferred to the CTNs.

Network Architecture: All the network components will be connected based on this architecture to transfer and receive information. The end device will be connected with the base station in a single-hop communication manner according to the star topology. Then the base station will connect to the CTNs through a single- or multi-hop communication mechanism. Finally, the CTNs will store the information in the data center.
This architecture has a bidirectional data flow: end device-to-base station-to-CTNs-to-data center and data center-to-CTNs-to-base station-to-end device. The former flow can be exploited for data acquisition from the end device, i.e., people's health information related to COVID-19. Based on this information, suspected COVID-19 cases will be identified using data analytics (DA). The latter flow will be exploited to inform people about their lockdown map and other COVID-related information from the stakeholders.

The CCC holds the core position in the proposed framework. It is accountable for data streaming and generating final decisions. The operation within this unit comprises five sub-phases, which are discussed in the following.

1. Data Center: Data transmitted from stakeholders and CTNs will be stored and preserved in the data center. The data transmission approach is designed as two-way communication between the stakeholders and the CTNs. The purpose of two-way communication is to deliver final decisions to the stakeholders and end-users and to accumulate data from those zones as well. Besides, depending on the situation, information and messages from stakeholders will be conveyed to end-users through the data center. Once the data are stored properly, the data center will hand over this information to the data security sub-phase to ensure the privacy of individuals.

2. Data Security: This is an important part that will assure the CIA security triad (i.e., Confidentiality, Integrity and Availability) in order to develop secure data collection and trustworthy networks [37, 38]. The personal healthcare information of a user will be kept confidential (C), cannot be changed by others (I), and can be accessed at any time (A). A blockchain-enabled healthcare security system named MediChain was presented in [39], which can be further enhanced and exploited here.
One of the key issues of MediChain is scalability, which needs to be addressed in the proposed system since a large number of users may be involved. Three features can be incorporated into MediChain to overcome the scalability issue, namely the multi-transactional block, the multi-node approach, and the lightweight framework.

(a) Multi-Transactional Block: The first feature will facilitate quick and easy processing of data from the client to the data center. Transactional data are not restricted to a limited block size, so that multiple transactions within the same block can share the same unique hash code. Thus the time and computing power required to process the data will be substantially reduced.

Once data security is ensured, these data will be passed to the data analytics unit.

3. Data Analytics: In this sub-phase, a behavioral analysis of aggregated COVID-19 data from the National Health Organisation (NHO) and the CTNs will be executed. Additionally, an AI engine will be devised here to predict potential COVID-19 carriers. The methodology of this predictive model is described below.

(a) Workflow: Our methodology (as depicted in Figure 2) for predictive modeling starts with generating hypotheses depending on the problem statement, which in this case is detecting potential COVID-19 patients. We sketch an underlying AI framework to start off the procedure. As collecting data from the NHO and the CTNs is a continuous process, our proposed approach distinguishes between previously processed data and current data. Current data will go through data integration, take part in the exploratory data analysis, and afterward enter the data preparation section. Previously processed data, in turn, will be used for data augmentation to increase the number of training examples in the data preparation section. Once the data are processed, we will extract significant features from the dataset, a step otherwise known as feature engineering.
This will help our model to predict more precisely. From that point onward, we will train a deep neural network (NN) model by feeding it those processed data. The hyperparameter values will be tuned for optimal solutions and our model will be evaluated further. We will perform this process repeatedly until we reach the desired result. Then, we will check whether it satisfies our current objective. Negative feedback will send the procedure back to the beginning, and it will go through the same processes until the result matches the end goal. We will then deploy the model for real-world usage. Here, two functionalities will be provided: one is real-time prediction of potential COVID-19 patients, and the other is data visualization of different COVID-19 scenarios. Moreover, periodic monitoring and debugging will also be available in our proposed workflow to keep our model up to date.

Trying every possible model and evaluating each of them to find the best one is impossible. Thus, in practice, it is better to make some reasonable assumptions about the data, try out a few reasonable models, and find the best amongst them. We will introduce an NN into our model, which is an integral part of deep learning. Deep learning is a branch of machine learning that utilizes multiple layers to progressively extract higher-level features from the raw input. A deep neural network (DNN) is an artificial neural network (ANN) with numerous layers between the input and output layers. Regardless of whether the relationship is linear or non-linear, a DNN finds the mathematical manipulation that transforms the input into the output. The network moves through the layers, computing the probability of each output [41].
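As a rough illustration of the layered forward computation described above, the following minimal sketch passes one feature vector through a small fully connected network. It assumes ReLU activations on every layer except the last and a sigmoid on the last layer, which are the activation choices adopted later in this section. The layer sizes, the random untrained weights, and the example feature values are hypothetical placeholders, not values from the paper.

```python
import math
import random


def relu(x):
    """Rectified Linear Unit: passes positive values, zeroes out the rest."""
    return max(0.0, x)


def sigmoid(x):
    """Numerically stable logistic function, mapping any real number to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W . inputs + b) for each output unit."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Hypothetical random initialisation; a real model would learn these values."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(features, layers):
    """Apply ReLU on all layers but the last, and sigmoid on the last layer."""
    activations = features
    for weights, biases in layers[:-1]:
        activations = dense(activations, weights, biases, relu)
    weights, biases = layers[-1]
    return dense(activations, weights, biases, sigmoid)[0]


rng = random.Random(0)
layer_sizes = [4, 8, 8, 1]  # hypothetical: 4 input features, two hidden layers, 1 output
layers = [init_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]

# Hypothetical feature vector, e.g. symptom flags and a body temperature reading.
probability = forward([1.0, 0.0, 37.8, 1.0], layers)
label = int(probability >= 0.5)  # 1 = flagged as a potential case, 0 = not flagged
```

Because the weights here are untrained, the resulting probability carries no meaning; the sketch only shows how a binary prediction is read off the final sigmoid unit.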
The reason behind using a DNN in our model is that DNNs work better than basic machine learning algorithms on complex data with a huge number of features, generally known as high-dimensional data [42], and in our case there is a possibility of having a large number of input features. Training examples will be the input for the input layer, and the number of hidden layers will be defined according to the problem. We use ReLU (Rectified Linear Unit) [43] as the activation function for all hidden layers aside from the last one. As our concern is binary classification, we use the sigmoid [43] function for the last layer. Finally, we will get the predicted outcome from our model in the output layer.

4. Dynamic Clustering (DC): A clustering algorithm will be forged in this sub-phase to delineate the lockdown strategy. Unlike other clustering algorithms, the proposed algorithm is dynamic, meaning that the shrinking and expanding of different clusters will vary and be updated regularly based on the data. At first, using the infection data and mobility data processed in the Data Analytics sub-phase, a dynamic lockdown cluster will be initialized. This dynamic lockdown cluster will be forwarded to the Decision Engine and remain unchanged until new data arrive from DA. If new information about COVID-19 patients arrives from DA, it will be passed through a decision parameter that checks the similarity of the new data with the previous data. If they are similar and no change is required, the lockdown cluster will remain the same. On the other hand, if any dissimilarity is found, the lockdown cluster will be altered according to the new information. The algorithm is discussed in detail in Section 4.

5. Decision Engine: This phase of the CCC will monitor the entire procedure of the different sub-phases and is also responsible for generating end decisions. DC will provide the cluster blueprint, and the Data Center will provide the necessary information and requirements requested by the stakeholders.
Taking those into account, the Decision Engine will reach a decision and forward it to the Data Center. As the Data Center communicates with both the stakeholders and the CTNs, it will inform them about the decision.

6. Stakeholders: Stakeholders are the end decision-makers as well as the information providers. Stakeholders could be any Government Organizations, such as the Health Ministry or the National Health Organization (NHO), or Non-Government Organizations (NGOs). They will provide significant information and requirements to the CCC. The CCC will process these and convey the results back to the stakeholders. Analyzing those results from the CCC, stakeholders will decide what to do depending upon the circumstances.

In order to deploy the aforementioned framework, there are several aspects that we need to take into account, such as i) the network infrastructure deployment cost, ii) the system design cost, and iii) the cost of ensuring security. In terms of the first aspect, no additional cost is incurred, as this project will exploit existing telecommunication networks. Moreover, the network utilization cost could be waived through the support of telecommunication companies for the Government in taking control of this pandemic. For example, many telecommunication companies have already offered free services worldwide. Moving to the second aspect, as this paper envisions that the proposed system will be implemented by the Government, we assume it already has data centers to store all the information. We only need to consider the cost of system design, which is expected to be affordable for the Government. To ensure security, a scalable security platform needs to be developed, and its cost will be comparable to that of existing security platforms. Therefore, the implementation cost will not be a big issue and the system complexity should not be high either.

From Section 3, we get a clear overview of our proposed model.
However, in this paper, we divide our methodology into five phases, which are described below. As mentioned earlier in Section 2, the conventional lockdown system is insufficiently flexible. The emerging opportunities of "Big Data" seem like potential alternatives. Many researchers have been exploiting this idea and proposing applications that make use of big data and machine learning to minimize economic catastrophe [44] in different situations. However, few works have addressed reducing the economic loss in a pandemic situation such as the COVID-19 outbreak. The existing non-clinical solutions are static and usually one-dimensional, making them less suitable for mitigating the economic impact across different parameters in a pandemic situation. In contrast, a data-driven dynamic clustering algorithm would be much more efficient than a static lockdown strategy in such scenarios. As in many other approaches, public healthcare information and geographic location are used for designing the dynamic clustering algorithm in this paper. Machine learning and big data utilization for inventing vaccines [45] or for any clinical operations are out of the scope of this paper.

The clustering algorithm is illustrated in Algorithm 1, where we initialize a lockdown map and assume it will be updated every day. To start with, we define a set of variables, as depicted in Algorithm 1-Line#2. Here, TGSA, an 'N by N' array, refers to the total grid size of the area, where N is the area size in km^2 and each grid is 1 km^2. MGSCA is the minimum grid number of clusters. LIASC is the location information of active and suspected cases, which will be collected from the Core Telecommunication Networks. LP is the total number of days in the lockdown period. LCIM is the lockdown cluster information map. We search TGSA from day one until the end of the lockdown period. At first, there is no lockdown cluster information map, thus we initialize LCIM as zero.
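As a reading aid for Algorithm 1, whose remaining steps are walked through next, the following minimal sketch builds a per-grid case count for each day and then colours clusters by comparing their case totals against a red and an orange threshold. It is an illustration under stated assumptions rather than the authors' implementation: clusters are assumed here to be fixed square blocks of `cluster_size` by `cluster_size` grids, the function names are hypothetical, and the coordinates and threshold values in the example are made up.

```python
import math


def count_cases_per_grid(case_locations, n):
    """Build an n x n case-count map (playing the role of LCIM) from (x_km, y_km) points.

    Each cell stands for one 1 km^2 grid; points outside the area are ignored.
    """
    lcim = [[0] * n for _ in range(n)]  # initialised as zero, as in the text
    for x_km, y_km in case_locations:
        row, col = math.floor(y_km), math.floor(x_km)
        if 0 <= row < n and 0 <= col < n:
            lcim[row][col] += 1
    return lcim


def create_clustering_map(lcim, cluster_size, red_threshold, orange_threshold):
    """Colour each cluster red, orange or green from its total case count.

    Assumption: clusters are square blocks of cluster_size x cluster_size grids.
    """
    n = len(lcim)
    zones = {}
    for top in range(0, n, cluster_size):
        for left in range(0, n, cluster_size):
            cases = sum(
                lcim[r][c]
                for r in range(top, min(top + cluster_size, n))
                for c in range(left, min(left + cluster_size, n))
            )
            if cases > red_threshold:
                zone = "red"      # hard lockdown
            elif cases > orange_threshold:
                zone = "orange"   # soft lockdown
            else:
                zone = "green"    # safe zone
            zones[(top // cluster_size, left // cluster_size)] = zone
    return zones


def run_lockdown_period(daily_case_locations, n, cluster_size,
                        red_threshold, orange_threshold):
    """Recompute the clustering map once per day of the lockdown period."""
    return [
        create_clustering_map(
            count_cases_per_grid(locations, n),
            cluster_size, red_threshold, orange_threshold,
        )
        for locations in daily_case_locations
    ]


# Hypothetical two-day example on a 4 x 4 grid with 2 x 2 clusters.
daily_locations = [
    [(0.2, 0.3), (0.8, 0.1), (1.5, 1.5), (3.1, 3.9)],  # day 1
    [(3.1, 3.9)],                                      # day 2
]
daily_maps = run_lockdown_period(daily_locations, n=4, cluster_size=2,
                                 red_threshold=2, orange_threshold=0)
```

In this toy run the top-left cluster holds three cases on day 1 and none on day 2, so its colour changes between the two daily maps, which is the kind of shrinking and expanding of lockdown zones that the dynamic approach aims for.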
After that, as the search progresses, we count the number of patients using a function CNASC and store the result in a variable called CPD in Line#7 of Algorithm 1. We then enter the value of CPD into LCIM. In Line#13, we introduce a function named CreateClusteringMap to generate clusters. This function takes three arguments: the lockdown cluster information map, the threshold for the red zone (TRZCC), and the threshold for the orange zone (TOZCC). We define a variable Cluster, which holds the number of active and suspected cases in each cluster. If this Cluster value surpasses the threshold of the red zone, a red color will be assigned to the cluster, which means it should be under hard lockdown. If the value of the cluster exceeds the threshold of the orange zone, an orange color will be assigned to it. If a cluster fails to exceed either of the thresholds mentioned above, it will be marked with a green color, indicating a safe zone.

17: for cl = 1; cl ≤ Total-Cluster; cl++ do
18:   if Cluster(cl) > TRZCC then
19:     Cluster(cl) will be marked as Red-Zone, i.e., Hard lockdown;
20:   else if Cluster(cl) > TOZCC then
21:     Cluster(cl) will be marked as Orange-Zone, i.e., Soft lockdown;

To analytically support our experiments, we have calculated the R factor, often described as the basic reproduction number R naught (R_0). The parameter R_0 is characterized as the number of cases that are expected to occur on average in a homogeneous population as a result of infection by a single individual when the population is susceptible at the start of an epidemic. For example, if a person develops a disease and infects three others, R_0 is 3. From an epidemiological model called SIR [46], we find that R_0 is based mainly on three factors, namely susceptibility, infectivity, and removal rate. Susceptibility is the size of the population that is susceptible to the disease.
Infectivity is the rate of infection, and the removal rate is the rate at which cases are closed, either by recovery or by death. There are many ways to calculate the basic reproduction number. However, a commonly used equation for calculating $R_0$ is
$$R_0 = \frac{\beta S}{\gamma},$$
where β represents the transmission rate, γ is the removal rate, and S denotes the susceptible fraction of the population. At the start of an epidemic, it is assumed that the whole population is susceptible, thus the value of S is 1. In our experiment, we used this value for the period when the Movement Control Order (MCO) was not imposed. After the MCO was imposed, we assume that 70% of people maintained social distancing and followed government orders. Thus, 30% of the population was susceptible during the MCO and the value of S became 0.3. As COVID-19 is an airborne disease, the transmission rate β is high. From [47], we find that the transmission rates in Malaysia during the MCO and non-MCO periods are 0.58 and 0.69, respectively. From [48], we took a total of 45 days of data, from 3 March 2020 to 17 April 2020, to calculate $R_0$. We collected the number of closed cases and calculated the removal rate γ from it, where $\theta_r$ is the total number of recoveries and $\theta_d$ is the total number of deaths.
From Figure 3, we can notice that the basic reproduction number reached its peak around the 15th day, when the MCO was not imposed, and sharply declined in the MCO period. The value of this reproduction number depends on many factors, one of which is the mobility of infectious individuals. Towards the end of the paper, we investigate the impact of such mobility when the movement control order is imposed. For this purpose, our analysis leverages mathematical findings from graph theory. More specifically, by modelling the infection spread as a multiple random walks problem and an interacting particle system [49], we inspect the graph cover time, which provides an estimate of how quickly the infection can spread in the community. Specific results and discussion are given in Section 5.4.
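As a rough numerical illustration of the reproduction-number calculation described above, the following Python sketch assumes the common SIR form $R_0 = \beta S / \gamma$. The values of β and S are those quoted in the text; the removal rate `GAMMA_EXAMPLE` is a hypothetical placeholder rather than a figure from this study, and the function name is introduced here for illustration only.

```python
def basic_reproduction_number(beta: float, susceptible_fraction: float, gamma: float) -> float:
    """Return R0 under the assumed SIR form R0 = beta * S / gamma."""
    if gamma <= 0:
        raise ValueError("the removal rate gamma must be positive")
    return beta * susceptible_fraction / gamma


# Transmission rates quoted in the text from [47].
BETA_NO_MCO = 0.69
BETA_MCO = 0.58

# Susceptible fractions stated in the text.
S_NO_MCO = 1.0
S_MCO = 0.3

# HYPOTHETICAL removal rate, for illustration only; in the study, gamma is
# derived from the closed-case data of [48].
GAMMA_EXAMPLE = 0.1

r0_no_mco = basic_reproduction_number(BETA_NO_MCO, S_NO_MCO, GAMMA_EXAMPLE)
r0_mco = basic_reproduction_number(BETA_MCO, S_MCO, GAMMA_EXAMPLE)

print(f"R0 without MCO: {r0_no_mco:.2f}")
print(f"R0 with MCO:    {r0_mco:.2f}")
```

Whatever value γ takes, the ratio between the two results depends only on β and S, so under these assumptions the MCO scenario yields roughly a quarter of the non-MCO reproduction number.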
In all the experiments, we have used the same device and MATLAB software for simulation. We have followed scenarios based on the Malaysian context. Malaysia has a land area of 329,613 km², comprising 13 states. However, this total area includes oceans, mountains, forests, etc., where people can hardly be found. Thus, in our simulation, we use 25% of the total land area, corresponding to roughly 83,000 km². For simplicity, we assume this portion to be square and divide it into grids. We then represent the 13 states as 13 squares with certain area values, whose total is equivalent to one-fourth of the entire land area of Malaysia. Finally, we generate random numbers to place victims inside those 13 states. To yield results from our simulation, we retrieve data from the Malaysian Government website [48], where the key information for this experiment is the number of COVID-19 active cases. Although we need other data, such as the number of suspected cases and their locations, we only highlight the active cases, as portrayed in Figure 4, due to the unavailability of the aforementioned data. Besides, some parameters play a pivotal role in the simulations, namely the minimal, moderate, and maximum coverage areas and the six weeks of movement control order (in days). However, these parameters may vary from country to country with respect to factors such as population density, land area, and spreading factors.
From Figure 4, we can observe that the number of active cases initially grows exponentially and afterward declines linearly. As the incubation period is around 3 to 16 days, the basic reproduction number (as depicted in Figure 3) explains why the active cases (as depicted in Figure 4) grow exponentially and then eventually decline. We will see how this phenomenon plays a role in our dynamic lockdown strategy in the experiments presented in Section 5.
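The grid setup and the zone-assignment rule of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the MATLAB code used in the experiments: the area size, cluster size, number of cases, and both thresholds are hypothetical values, and the helper names are introduced here for illustration only.

```python
import random


def count_cases_per_cluster(case_locations, area_size, cluster_size):
    """Count cases in each cluster_size x cluster_size block of 1-km^2 grids."""
    blocks = -(-area_size // cluster_size)  # ceiling division
    counts = [[0] * blocks for _ in range(blocks)]
    for x, y in case_locations:
        counts[y // cluster_size][x // cluster_size] += 1
    return counts


def classify_zone(case_count, red_threshold, orange_threshold):
    """Apply the red/orange/green rule described for Algorithm 1."""
    if case_count > red_threshold:
        return "red"      # hard lockdown
    if case_count > orange_threshold:
        return "orange"   # soft lockdown
    return "green"        # safe zone


def create_clustering_map(counts, red_threshold, orange_threshold):
    """Label every cluster with its zone color."""
    return [[classify_zone(c, red_threshold, orange_threshold) for c in row]
            for row in counts]


# HYPOTHETICAL example values (not taken from the paper).
AREA_SIZE_KM = 30
CLUSTER_SIZE_KM = 5
RED_THRESHOLD = 3
ORANGE_THRESHOLD = 1

rng = random.Random(0)  # fixed seed so the sketch is repeatable
cases = [(rng.randrange(AREA_SIZE_KM), rng.randrange(AREA_SIZE_KM))
         for _ in range(60)]
counts = count_cases_per_cluster(cases, AREA_SIZE_KM, CLUSTER_SIZE_KM)
zone_map = create_clustering_map(counts, RED_THRESHOLD, ORANGE_THRESHOLD)

for row in zone_map:
    print(" ".join(f"{zone:6s}" for zone in row))
```

Re-running the counting and classification steps on each day's case locations would give a daily updated map, which is the sense in which the clustering is dynamic.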
In this phase, we have highlighted the impact of dynamic clustering with respect to several aspects, such as the lockdown coverage area, the reduced economic loss, and the number of resources involved, for instance the number of military units, doctors, vaccines, centers, and so on. Finally, we will also examine the "noise" component in a lockdown practice, which corresponds to asymptomatic patients and uncooperative civilians. Along with that, we have validated the results through an analytical model, explained in Section 4.3. Results of these experiments are disclosed in the following section.
In this section, we have performed four major experiments with respect to our proposed algorithm. The outcomes of these experiments are discussed below.
In the first experiment, we have compared the performance of the static and dynamic lockdown strategies (as depicted in the accompanying figure) with respect to a specific coverage area and the movement control order (in days). In the dynamic lockdown strategy, we categorize the coverage area into three parts, namely minimal, moderate, and maximum, corresponding to five, seven, and nine kilometers, respectively. As expected, in this simulation we can see that the coverage area evolves dynamically, following the trend in active cases shown in Figure 4. The static strategy places the entire territory under lockdown, whereas the dynamic strategy with maximum coverage spans a bit more than half of the static lockdown area. This captures an interesting trade-off of the dynamic lockdown strategy. Although the minimal coverage area cuts down a lot of lockdown space and gives people the flexibility to move, the risk increases and there is a greater chance of community spreading of COVID-19. This will play an important role in relation to the noise, which will be discussed in Section 5.4. In contrast, the maximum coverage area emphasizes safety at the cost of restricting people's movement.
Here we will reveal the economic impact of using the dynamic lockdown strategy.
We intend to show how we can quantify the losses occurring during the COVID-19 outbreak by utilizing the three approaches of our dynamic lockdown framework. A recent report by the Malaysian Government [50], for example, shows a loss of RM 2.4 billion per day because of this situation. If we use the static lockdown strategy, where all areas are shut down, we see from Figure 6 that nothing can be recuperated. However, using our approach, we can save up to 80 percent of the economic loss in week one. For the maximum, moderate, and minimal coverage areas, we can restore 60%, 70%, and 80% of the losses, respectively. This is reasonable because, for the minimal approach, the coverage area is smaller and the savings are therefore larger. We further notice that the reduced economic loss also changes dynamically and follows the trend in Figure 4.
Even though numerous resources contribute to dealing with this pandemic, due to the unavailability of accurate information, in this experiment we have only highlighted the deployment of military resources. We have kept the simulation setup the same as before, with a few changes made to obtain useful findings. The Royal Military (7500 ATM units) is helping the Police in operating 405 roadblock checkpoints across Malaysia [51]. These involvements are based on schedules, each with a duration of 6 hours. For the simplicity of this experiment, we have calculated these involvements per day. We randomly place roadblocks, each covering a radius of around 3 kilometers. If an infected person is present in that region, the roadblock remains active; otherwise, it is shifted somewhere else. The findings of this simulation are delineated in Figure 7.
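The roadblock rule described above can be sketched in Python as follows. This is only an illustrative reading of the rule: the 3 km radius comes from the text, whereas the area size, the positions, and the function names are hypothetical, and relocation is assumed to be uniformly random over a square area.

```python
import math
import random

ROADBLOCK_RADIUS_KM = 3.0  # coverage radius stated in the text


def is_active(roadblock, infected_positions, radius_km=ROADBLOCK_RADIUS_KM):
    """A roadblock is active if any infected person lies within its radius."""
    rx, ry = roadblock
    return any(math.hypot(px - rx, py - ry) <= radius_km
               for px, py in infected_positions)


def update_roadblocks(roadblocks, infected_positions, area_km, rng):
    """Keep active roadblocks in place; move inactive ones to a random spot."""
    updated = []
    for roadblock in roadblocks:
        if is_active(roadblock, infected_positions):
            updated.append(roadblock)
        else:
            # ASSUMPTION: an inactive roadblock is relocated uniformly at random.
            updated.append((rng.uniform(0, area_km), rng.uniform(0, area_km)))
    return updated


# HYPOTHETICAL example values (not taken from the paper).
rng = random.Random(1)
AREA_KM = 50.0
infected = [(10.0, 10.0), (40.0, 25.0)]
roadblocks = [(11.0, 12.0), (30.0, 30.0)]

roadblocks = update_roadblocks(roadblocks, infected, AREA_KM, rng)
active = sum(is_active(rb, infected) for rb in roadblocks)
print(f"{active} of {len(roadblocks)} roadblocks are active")
```

Counting the active roadblocks per simulated day, as in the last two lines, gives one possible proxy for the daily military deployment tracked in this experiment.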
Herein we notice that our dynamic lockdown strategy has performed as well as expected and has managed to reduce the military deployment rate down to practically 67%. It is worth mentioning that the characteristic of this graph changes dynamically, following the trend in Figure 4. This is also logical, since more casualties need more military support; thus, when the active cases are at their peak, the requirement for military units is likewise high.
Even though our previous experiments have delineated the benefits of using our framework, in this experiment we will show a potential shortcoming of our strategy. As our lockdown strategy works with public data, there is a possibility of uncooperative civilians. These people might hide information or not bother to provide relevant data. Besides, asymptomatic patients can likewise be present in these situations. We model those uncooperative civilians and asymptomatic patients as "noise", which can have a devastating effect due to its spreading nature. In this experiment, we will show how effective our framework is in dealing with this. According to [52], in one cycle a person with a normal flu can infect up to 1.3 persons. However, in the case of COVID-19, a noise can infect up to 4 persons, which is about triple the rate of normal contamination. In this simulation, we have considered one cycle only and simulated the movement of each noise using a random walk mobility model. Figure 8 depicts the effects of noise (numbered from 20 to 80) on the average number of contaminations for the dynamic lockdown strategy. Note that without any lockdown, 20 noises can spread the infection to up to 80 persons, whereas in the dynamic lockdown, as shown in Figure 8, the contamination drops to nearly 10 persons with the maximum coverage area.
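The noise experiment above can be illustrated with a small Python sketch. It combines the no-lockdown upper bound quoted in the text (4 persons per noise per cycle) with a simple random walk on a grid. Treating locked-down grid cells as impassable is an assumption made here for illustration, as are the grid size, the step count, and the function names.

```python
import random

CONTACTS_PER_CYCLE = 4  # per-cycle figure quoted in the text for COVID-19
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def worst_case_contamination(noise_count, contacts_per_cycle=CONTACTS_PER_CYCLE):
    """Upper bound on contaminations in one cycle without any lockdown."""
    return noise_count * contacts_per_cycle


def random_walk_cells(start, steps, grid_size, locked_cells, rng):
    """Return the set of grid cells visited by one noise individual.

    ASSUMPTION: a step into a locked-down or off-grid cell is simply skipped.
    """
    x, y = start
    visited = {start}
    for _ in range(steps):
        dx, dy = rng.choice(MOVES)
        nx, ny = x + dx, y + dy
        inside = 0 <= nx < grid_size and 0 <= ny < grid_size
        if inside and (nx, ny) not in locked_cells:
            x, y = nx, ny
            visited.add((x, y))
    return visited


# HYPOTHETICAL example values (not taken from the paper).
rng = random.Random(42)
free_walk = random_walk_cells((5, 5), 200, 11, set(), rng)
fence = {(4, 5), (6, 5), (5, 4), (5, 6)}  # locked cells around the start
fenced_walk = random_walk_cells((5, 5), 200, 11, fence, rng)

print("No-lockdown bound for 20 noises:", worst_case_contamination(20))
print("Cells reached without lockdown:", len(free_walk))
print("Cells reached when fenced in:", len(fenced_walk))
```

The number of distinct cells a walker reaches serves here as a crude proxy for exposure; a fully fenced-in walker never leaves its starting cell, which mirrors the intuition that larger locked-down areas limit community spreading.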
The more area is in a locked-down state, the lower the possibility of community spreading. Although the number of passive contacts rises as the noise goes up, the dynamic lockdown strategy appears to be a more reasonable choice than the no-lockdown strategy.
An analytical explanation for the noise findings in Fig. 8 can be drawn from graph theory. By modelling individual grid units as vertices and the mobility pattern as edges, we can cast the infection spread as a multiple random walks problem. By assuming k infectious individuals as noises at the beginning of the random walks and assuming their independent movements in an r-regular graph (r ≥ 3), we can deduce that the cover time over the graph is given, with high probability, by [49]
$$C_G(k) \sim \frac{\vartheta_r}{k}\, n \log n \quad \text{for } k = o(n/\log^2 n),$$
and
$$C_G(k) = O\!\left(\frac{n}{k} \log n + \log n\right) \quad \text{for any other } k,$$
where $\vartheta_r = \frac{r-1}{r-2}$ and n denotes the number of vertices. The cover time offers a simplified view of the speed of the virus spread. As the number of noises (non-complying individuals) k increases, the time required to cover every vertex in the graph decreases. By a continuity argument, this means that at any given time after this movement begins, the probability that people residing in the vertices of the graph are exposed to the virus grows with k. This might explain why the charts in Fig. 8 consistently escalate when the noise value is increased.
Due to our lack of access to the required data, some of these experiments have been performed using randomly simulated data. With accurate and undistorted data, such as the exact location of individuals, we could have derived more meaningful findings.
In this paper, we have developed a data-driven dynamic clustering framework for mitigating the adverse economic impact of COVID-19.
Our proposed framework has three central components, namely Data Analytics, Dynamic Clustering, and Data Security. In our analysis, we have mainly focused on Dynamic Clustering and its impact so that both stakeholders and users can infer the benefits. A clustering algorithm has been proposed and simulated extensively across four scenarios to identify its promises and shortcomings. We have noticed that our clustering algorithm significantly improves the relevant performance metrics: by almost 50 percent in the lockdown coverage experiment, 60-80 percent in the potential reduction of economic loss, and 20 percent in the military unit utilization. An actual scheme will be further re-designed by taking real data into consideration, which will serve as a future direction of this work. Using our proposed framework as a baseline, future work may include the investigation of specific machine learning techniques in which relevant features can be extracted and selected for training machine learning models.
Incubation period of 2019 novel coronavirus (2019-ncov) infections among travellers from wuhan, china Coronavirus cases Sars-cov-2: fear versus data Does sars-cov-2 has a longer incubation period than sars and mers? Covid-19 infection: origin, transmission, and characteristics of human coronaviruses Flattening the exponential growth curve of covid-19 in ghana and other developing countries; divine intervention is a necessity, Divine Intervention Is A Necessity Vaccine designers take first shots at covid-19 How long will it take to have a vaccine for covid-19?
Herd immunity-estimating the level required to halt the covid-19 epidemics in affected countries The impact of social distancing and epicenter lockdown on the covid-19 epidemic in mainland china: A datadriven seiqr model study, medRxiv Countries continue with restrictions Govt reluctant to implement lockdown due to economic concerns Coronavirus lockdown protest: What's behind the us demonstrations? The deadly Coronaviruses: The 2003 SARS pandemic and the 2020 novel Coronavirus epidemic in china Immune responses in covid-19 and potential vaccines: Lessons learned from sars and mers epidemic Preliminary identification of potential vaccine targets for the covid-19 coronavirus (sars-cov-2) based on sars-cov immunological studies Covid-net: A tailored deep convolutional neural network design for detection of covid-19 cases from chest radiography images Early transmission dynamics in wuhan, china, of novel coronavirus-infected pneumonia Nowcasting and forecasting the potential domestic and international spread of the 2019-ncov outbreak originating in wuhan, china: a modelling study Reporting, epidemic growth, and reproduction numbers for the 2019 novel coronavirus (2019-ncov) epidemic Early prediction of the 2019 novel coronavirus outbreak in the mainland china based on simple mathematical model Artificial intelligence forecasting of covid-19 in china Identification of covid-19 can be quicker through artificial intelligence framework using a mobile phone-based survey in the populations when cities/towns are under quarantine Urban big data and sustainable development goals: Challenges and opportunities The big (data) bang: opportunities and challenges for compiling sdg indicators Response to covid-19 in taiwan: big data analytics, new technology, and proactive testing Covid-19: Challenges to gis with big data Prospective infectious disease outbreak detection using markov switching models Peacock: A map-based multitype infectious disease outbreak information system Dove: 
An infectious disease outbreak statistics visualization system Going viral-covid-19 impact assessment: A perspective beyond clinical practice Estimating workforce-related economic impact of a pandemic on the commonwealth of virginia Lcpc error correction code for iot applications Software-defined wireless sensor networks in smart grids: An overview A cyber-enabled mission-critical system for post-flood response: Exploiting tv white space as network backhaul links Trustdata: Trustworthy and secured data collection for event detection in industrial cyber-physical system Scalable machine learning-based intrusion detectionsystem for iot-enabled smart cities Md Zakirul Alam Bhuiyan, Privacy-friendly platform for healthcare data in cloud based on blockchain environment The lack of a priori distinctions between learning algorithms Logistic regression and artificial neural network classification models: a methodology review A survey on deep learning for big data Activation functions: Comparison of trends in practice and research for deep learning Monitoring the impact of economic crisis on crime in india using machine learning Covid-19 coronavirus vaccine design using reverse vaccinology and machine learning When will it be over?": An introduction to viral reproduction numbers, r0 and re, The Centre for Evidence-Based Medicine develops Covid-19 epidemic in malaysia: Impact of lock-down on infection dynamics, medRxiv Covid-19: Malaysia outbreak monitor: Live updates Multiple random walks and interacting particle systems Majlis keselamatan negara (rasmi) An intensive-care expert broke down just how contagious the coronavirus is, showing how one person could end up infecting 59,000 in a snowball effect