key: cord-294468-0v4grqa7
authors: Kasilingam, Dharun; Prabhakaran, S.P Sathiya; Dinesh Kumar, R; Rajagopal, Varthini; Santhosh Kumar, T; Soundararaj, Ajitha
title: Exploring the Growth of COVID‐19 Cases using Exponential Modelling Across 42 Countries and Predicting Signs of Early Containment using Machine Learning
date: 2020-08-04
journal: Transbound Emerg Dis
DOI: 10.1111/tbed.13764
sha:
doc_id: 294468
cord_uid: 0v4grqa7

The COVID‐19 pandemic disease is spread by SARS‐CoV‐2, a single‐stranded RNA virus that belongs to the 7th generation of the coronavirus family. The virus follows an unusual replication mechanism, and its extreme ease of transmission has put many countries under lockdown. With uncertainty about developing a cure or vaccine for the infection in the near future, the onus currently lies on healthcare infrastructure, policies, government activities, and the behaviour of the people to contain the virus. This research uses exponential growth modelling to understand the spreading patterns of the COVID‐19 virus and identifies countries that showed early signs of containment up to 26th March 2020. Predictive supervised machine learning models are built using infrastructure, environment, policies, and infection‐related independent variables to predict early containment. COVID‐19 infection data across 42 countries are used. Logistic regression results show a significant positive relationship between healthcare infrastructure and lockdown policies on the one hand, and signs of early containment on the other. Machine learning models based on logistic regression, decision tree, random forest, and support vector machines are developed and show accuracies between 76.2% and 92.9% in predicting early signs of infection containment. Other policies and the decisions taken by countries to contain the infection are also discussed.

Coronaviruses, though uncommon, are serious pathogens responsible for infections that cause flu-like symptoms in infected individuals.
These symptoms sometimes resemble the cold and cough symptoms caused by the rhinovirus. Recently, the family has added its seventh-generation coronavirus, SARS-CoV-2 (Chengxin et al., 2020). The virus shares 79% identity with the Severe Acute Respiratory Syndrome (SARS) virus and 50% identity with the Middle East Respiratory Syndrome (MERS) virus, which caused epidemic outbreaks in 2003 and 2012, respectively (Salute, 2020). SARS-CoV-2, which causes COVID-19, mutated to transmit from animals to humans. This virus is believed to have been transmitted to humans from bats at a meat market in Wuhan, China (Rajendran et al., 2020; Shereen et al., 2020). In March 2020, WHO declared COVID-19 to be a pandemic, a pandemic being described as an infection that has spread across countries and international borders rather than within a local region or neighbouring countries. SARS-CoV-2 is a deadly coronavirus that is transmitted readily between humans and had already infected more than 530,000 people in 198 countries as of 26th March 2020, which led to global shutdowns (WHO, 2020). The fatality rate has varied among countries and age groups. Until June 2020, the fatality rate averaged 5.5%, with Italy recording the highest at 14.49%. The fatality rates of the US, Germany, and India were 5.56%, 4.72%, and 2.86% respectively until June 2020 (Our World in Data, 2020). Of the total deaths, less than 5% belonged to the age group below 45 years, indicating that the younger population is much more resilient to COVID-19 (Worldmeters, 2020). While these fatality rates are significantly lower than those of MERS (34.4%) and SARS (9.5%) (Petrosillo et al., 2020), COVID-19 is highly transmissible because asymptomatic people can be carriers and spreaders of the virus (Daw et al., 2020). The reproduction number R0 for SARS-CoV-2 has been found to be between 2.06 and 2.52 (Sheng et al., 2020).
A value of R0 greater than 1 indicates that the disease can invade the human population, and the higher the value, the more easily it spreads. SARS-CoV-2 is the largest single-strand RNA virus known to humankind; while other viruses have a single protein spike for attachment to the human cell, this coronavirus family has 10 to 12 spike proteins, which makes it easier for the virus to attach itself to the ACE-2 protein in humans (Paraskevis et al., 2020). The virus follows an unusual double-step replication mechanism, which leads to high rates of proliferation (Luan et al., 2020).

This article is protected by copyright. All rights reserved.

The incubation period is typically 2 to 14 days, and the infected person often does not have serious symptoms, showing instead the common symptoms associated with flu and pneumonia (Rodeny, 2020). General symptoms of pneumonia include fever, cough, chest pain, shortness of breath, fatigue, headache, myalgia, and arthralgia (Sattar SBA, 2020). In addition to symptoms of pneumonia, COVID-19 infected individuals may experience a loss of taste or smell, nausea, congestion, and diarrhoea (CDC, 2020). A few drugs are being recommended and used to manage the symptoms of COVID-19, but there are, as yet, no vaccines proven to be effective against the coronavirus family, including COVID-19 (Sexton et al., 2016; Gautret et al., 2020). In the absence of vaccines, it is imperative to check transmission of the virus by alternative means (Dey et al., 2020). Policy changes in pandemic and epidemic situations involve social distancing, lockdowns, travel restrictions, awareness campaigns, etc. Past research has speculated that environmental conditions of countries, such as temperature and humidity, also sometimes play a significant role in controlling pandemics (Lin et al., 2020).
Quantitative COVID-19 impact analyses are scarce in the literature given the recency of the pandemic, and more studies in this area are necessary given the seriousness of the infection. Epidemics are assumed to grow exponentially at an early stage, and the number of infections reduces over time due to interventions like lockdowns, travel restrictions, awareness programs, etc. Mathematical modelling studies using exponential growth analysis coupled with machine learning could provide a better prediction model for COVID-19 transmission (Keeling and Danon, 2009; Siettos and Russo, 2013; Victor, 2020). Such models must incorporate the various precautionary measures taken during the viral outbreak. The objective of the research is to develop a mathematical model using exponential growth analysis coupled with machine learning to predict worldwide COVID-19 early containment signs. Models have been developed based on data collected from 42 countries. The objectives of this work are twofold. First, it seeks to identify countries that were successful in early containment of the COVID-19 virus. Secondly, the research aims at building supervised machine learning models with high accuracies for predicting signs of early containment, with infrastructure availability, environmental factors, infection severity factors, and government policies of countries as independent variables. In the process of modelling, the significance of the above variables in containing the infection at early stages is also studied. This report also includes a discussion of other activities undertaken by the governments of various nations to flatten the infection curve and their corresponding effectiveness.

COVID-19 is believed to have originated in an animal meat market in Wuhan, China and it is thought to have been transmitted from bats (Shereen et al., 2020).
Within a few months, the virus spread rapidly across the world through the transmission of fluids and aerosol particles between humans. Initially, all diagnosed cases outside China had a travel history to the Wuhan market. Soon, community transfer caused exponential increases in infections in countries like Italy, US, UK, Korea, Japan, etc. The ability of the SARS-CoV-2 virus to double replicate with the spike protein has posed significant challenges to the development of vaccines (Shereen et al., 2020). While hydroxychloroquine and azithromycin have been recommended by some researchers to treat COVID-19-infected people, few clinical trials have validated the claim (Gautret et al., 2020). Thus, until a scientifically validated cure or vaccine is developed, countries have to rely on preventive measures to contain the spread. This, in turn, depends on epidemiological studies that can predict spreading patterns so that policymakers can take appropriate protective measures. Several viruses including SARS have been reported to be vulnerable to hot temperatures, which results in differences in spreading patterns across geographic locations (Zhang et al., 2020). However, such geographic variations have not yet been analysed for COVID-19. Other factors like government policies and interventions, infrastructure availability, and the severity of the infection itself can affect the ability of a country to contain epidemics and pandemics. This research seeks to explore all the above factors.

Climatic conditions such as temperature and humidity play an important role in both airborne and aerosol virus transmission. The 30-year human relationship with the Influenza virus has shown that the mortality rate is directly related to temperature and humidity (Lowen and Steel, 2014). Hence, in order to minimize transmission of diseases, isolation wards in hospitals generally tend to have optimized pressure, temperature, and humidity (WHO, 2014).
Research on the virus on the Diamond Princess cruise ship off the coast of Japan showed that a one-degree rise in temperature and a one percent increase in pressure could reduce the reproduction number R0 by 0.0224 to 0.0383. It must be mentioned that the generalizability of the study is questionable because the ship was a contained environment and the results may not apply to the real world (Sheng et al., 2020). Certain studies in China and Indonesia have investigated the relationship between temperature and the spread of infection and resultant deaths, and have reported a low to medium level of correlation (Tosepu et al., 2020; Yueling Ma et al., 2020). Relative humidity was found to have low to no correlation with infection spread or deaths. Global warming has also been a reason for recent temperature increases, and certain studies indicate that this can reduce flu-based viral infections (National Research Council, 2001; Actuaries, 2010; Dincer et al., 2010). However, these statements need to be further validated. While the spread of the virus may be affected by climatic conditions, once the virus enters the human body, it is independent of the outside environment. However, since the virus lives outside the human body for at least 12 hours under normal conditions (Richard, 2020), it is necessary to study the effects of the environment on the spreading patterns themselves.

Social distancing, although a new term for the 21st century, is not a new approach to epidemic control. It was used by the United Kingdom in 1918 to control the Influenza virus outbreak that caused about 100 million deaths. Social distancing involves the avoidance of mass gatherings and keeping a distance of at least six feet between people. Such measures are combined with enhanced personal hygiene through regular hand washing and wearing a protective mask during flu-like outbreaks (Yu et al., 2017; Leung et al., 2018).
This is done primarily because flu-causing viruses are spread through aerosols generated from saliva and nasal fluid, which can be transmitted across distances of as much as three feet. The average lifetime of COVID-19 viruses in the outer environment is believed to be about 12 hours, which increases transmissivity because aerosols from infected people can settle on doorknobs and in lifts, transport, hotels, malls, etc. and stay active for a long time, thus increasing the window of transmission. Direct physical contacts, such as hand-shaking, are also avenues of transmission of the virus. The reduction of social contact has been proven to significantly reduce flu-like diseases (Maharaj and Kleczkowski, 2012). The closure of schools and malls flattened the spread curve during the Influenza pandemic in 2009 (Rashid et al., n.d.; MOH, 2014). Thus, governments worldwide have stressed social distancing and quarantining measures for at least 14 days, the typical incubation period of the COVID-19 virus, to contain its spread (Prem et al., 2020).

Lockdown is a preventive strategy taken by local, central or global administrations during the spread of epidemic or pandemic diseases and involves stopping transportation between cities, provinces or counties. The world has so far seen four major pandemics, viz., plague in the 14th century, Influenza in 1918, SARS in 2009, and the current COVID-19 in 2019 as reported by WHO (Porta, 2008; East et al., 2020; Pi, 2020). In all these cases, lockdowns were implemented by various countries to control the outbreaks. China announced a lockdown as early as January 2020 to flatten the curve of COVID-19 infections over time. In March, most countries around the globe announced lockdowns of local transport, offices, industries, and city and national borders to contain the virus (Callaway et al., 2020).
Although quarantine centres for the infected are available in hospitals, large-scale infections necessitate self-quarantines and lockdown measures in addition to the hospital-based quarantines (Wuhan, 2020).

During epidemic and pandemic viral outbreaks, the availability of and access to health care infrastructure such as hospitals, beds, healthcare workers, clinical equipment, first aid kits, ventilators, and protective equipment are vital to pandemic management (Bambas and Drayton, 2000; Persoff et al., 2018). During the massive Influenza outbreak of 1918, even developed countries had inadequate health care infrastructure, which further expanded the outbreak (George, 2008). The Ebola outbreak in West Africa became uncontrollable due to a lack of infrastructure facilities (Paweska et al., 2017). After the outbreak, WHO in South Africa asked hospitals to report their available facilities in order to plan optimally for future infections (Murrin, 2018). Innovative measures have been recommended to create the necessary healthcare infrastructure during pandemic and epidemic situations by converting schools, colleges, theatres, and stadiums into hospitals and quarantine centres (Wimberly, 2018; Nuzzo et al., 2019). Health care workers supported by NGOs, youth, and volunteers also play a significant role in containing outbreaks (Itzwerth, 2013). Hence, studying health care infrastructure availability across countries can help predict COVID-19 containment at an early stage.

Predictive modelling using machine learning and growth models can provide actionable insights to policy makers and governments to contain epidemic and pandemic infections (Thompson et al., 2019).
During the onset of an epidemic, it is crucial to use exponential growth models to understand the infection rates; with proper policy implementation and behavioural changes among the susceptible group of the population, the slope reduces and the curve flattens over time (Keeling and Danon, 2009). For other outbreaks like smallpox, Ebola, SARS, and Influenza, various studies have used mathematical and statistical modelling to understand the growth of infections (Dietz, 2002; Nishiura, 2011; Kerkhove and Ferguson, 2020). In fact, the Centers for Disease Control and Prevention has a dedicated book with established procedures for analysing disease outbreaks, stressing the importance of such modelling studies (Dicker, 2006). Epidemiologists generally use the exponential growth model at the onset of an outbreak and proceed with prediction and classification techniques like regression, decision trees, neural networks, deep learning, etc. to forecast outbreaks (Sameni, 2020; Victor, 2020). There are few studies on modelling and predicting containment of COVID-19 so far (Lin et al., 2020; Prem et al., 2020). The research work reported in this paper sought to integrate crucial variables concerning infrastructure, environment, policies, and severity of the disease to predict initial signs of containment.

The study used a machine learning and exponential growth model. The variables used as part of the predictive model were doctors per 1000 population, beds per 1000 population, average temperature, average humidity, days since official lockdown, percentage of lockdown days, total cases per million population, deaths per million population, days since the first contact, and percentage of serious cases among the infected. Data associated with the variables were collected from different official sources for a total of 42 countries with respect to COVID-19 infections as of 26th March 2020.
This accounts for 448,989 COVID-19 cases, comprising 84.78% of the total infections worldwide. The daily numbers of infections, recoveries, and deaths were collected from the website of the WHO. The data for infrastructure-centred variables like the number of hospitals and the number of doctors were taken from the World Bank (World Bank, 2020). Environment-based variables like average temperature and humidity since the onset of COVID-19 were taken from Weather Underground (Weather Underground, 2020). Day-wise COVID-19 case distributions extracted from WHO were used to identify countries that showed signs of containment of the virus based on a novel exponential growth modelling approach. Raw data from the sources were also consolidated, and the variables physicians per thousand individuals, hospitals per thousand individuals, percentage of lockdown days since the first contact, cases per million population, deaths per million population, days since the first case, serious cases per thousand infections, average temperature since the first infection, and average humidity since the first infection were calculated to train the machine learning models.

Most epidemic and pandemic diseases grow exponentially in the initial stages of the outbreak in a country (Ma, 2020). A popular modelling technique that demonstrates this is the Susceptible-Infectious-Recovered (SIR) model (Kermack et al., 1927). Let $S$ denote the fraction of individuals susceptible to a pandemic, $I$ the fraction of infectious people, $R$ the fraction of recovered patients, $\beta$ the transmission rate per infectious individual, and $\gamma$ the recovery rate, so that the infectious period is exponentially distributed with a mean of $1/\gamma$. The model is then:

$$\frac{dS}{dt} = -\beta S I, \qquad \frac{dI}{dt} = \beta S I - \gamma I, \qquad \frac{dR}{dt} = \gamma I$$

Linearizing this about the disease-free equilibrium $(S, I, R) = (1, 0, 0)$, we get the following.

$$\frac{dI}{dt} \approx (\beta - \gamma)\, I$$

Hence, from the above expression, if $\beta - \gamma > 0$, then the infection function $I(t)$ grows exponentially about the disease-free equilibrium point.
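The early exponential phase of the SIR model can be checked numerically. The following is a minimal sketch, not the authors' code: it integrates the standard SIR equations with a simple Euler step and estimates the early growth rate of the infectious fraction, which should be close to the transmission rate minus the recovery rate. The function names, step size, initial infectious fraction, and the rate values in the usage example are all illustrative assumptions.

```python
import math


def simulate_sir(beta, gamma, i0=1e-6, dt=0.01, days=30):
    """Euler integration of the SIR model; returns the infectious fraction per day.

    beta and gamma are the transmission and recovery rates; i0, dt, and days
    are illustrative assumptions, not values taken from the paper.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    steps_per_day = int(round(1.0 / dt))
    infected = [i]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        infected.append(i)
    return infected


def early_growth_rate(infected, days=10):
    """Average exponential growth rate of I(t) over the first `days` days."""
    return math.log(infected[days] / infected[0]) / days


if __name__ == "__main__":
    # Hypothetical rates: beta - gamma = 0.25, so early growth is about 0.25 per day.
    trajectory = simulate_sir(beta=0.5, gamma=0.25)
    print(round(early_growth_rate(trajectory), 3))
```

With a very small initial infectious fraction, the susceptible fraction stays close to 1 during the first days, so the estimated rate stays close to the linearized value; it falls away only once a noticeable share of the population has been infected.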
In addition to this, at the onset of the infection, $S \approx 1$ and hence the incidence rate $C = \beta S I$ also grows exponentially. Hence, modelling the initial stages of a pandemic like COVID-19 is both relevant and crucial in understanding the growth of the infection. Although sub-exponential and polynomial modelling have worked in cases of outbreaks like Ebola, HIV, and foot and mouth disease (Chowell et al., 2016), they generally work well with subsequent generations. For pandemics like COVID-19, the exponential growth model is relevant, and the use of least squares at the initial stages can afford precise insights. Figure 1 shows the analysis plan to achieve the objectives of the research.

The exponential growth model assumes that the onset of any outbreak follows an exponential distribution. However, due to government interventions, medical innovations, behavioural changes, etc., at a later stage the growth curve flattens and the rate of infections gradually reduces (Kermack et al., 1927). To identify such signs, we looked at the infections in the last seven-day period, and the deviation of the data points from the modelled exponential curve was captured using the mean absolute percentage error (MAPE) metric. Based on the errors and the direction in which the actual data points lay relative to the modelled growth curve, the countries were classified according to whether they showed initial signs of containment or not.

In line with the objectives of the study, classifiers were built based on a set of independent variables to predict whether a country with COVID-19 infections showed early signs of infection containment as a reflection of policy implementations and behaviour changes. Logistic regression was used to identify the independent variables that significantly affect infection containment and their corresponding importance in the model.
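The containment check described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the exponential curve is fitted by least squares on the log of the case counts from the days before the final seven-day window, the curve is extrapolated over that window, and a country is flagged when the MAPE is large and most observations fall below the curve. The paper does not state which days were used for fitting or what error cut-off was applied, so the fitting window, the `mape_threshold` value, and the function names are hypothetical.

```python
import math


def fit_exponential(counts):
    """Least-squares fit of log(count) = log(a) + r * t; returns (a, r)."""
    n = len(counts)
    t = list(range(n))
    y = [math.log(c) for c in counts]
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    r = sxy / sxx
    a = math.exp(y_mean - r * t_mean)
    return a, r


def containment_signal(counts, window=7, mape_threshold=0.10):
    """Return (mape, flag) for the last `window` days of cumulative counts.

    The flag is True when the observed counts deviate from the extrapolated
    exponential curve by more than `mape_threshold` (an assumed cut-off) and
    the majority of observations lie below the curve.
    """
    head, tail = counts[:-window], counts[-window:]
    a, r = fit_exponential(head)
    start = len(head)
    predicted = [a * math.exp(r * (start + k)) for k in range(window)]
    mape = sum(abs(obs - pred) / obs for obs, pred in zip(tail, predicted)) / window
    mostly_below = sum(obs < pred for obs, pred in zip(tail, predicted)) > window / 2
    return mape, bool(mape > mape_threshold and mostly_below)


if __name__ == "__main__":
    # Synthetic series: 13 days of exponential growth, then a flat final week.
    growing = [10 * math.exp(0.2 * t) for t in range(13)]
    flattened = growing + [growing[-1]] * 7
    print(containment_signal(flattened))
```

Fitting on the log scale keeps the example dependency-free; a nonlinear least-squares fit on the raw counts would weight the most recent, largest counts more heavily and could give a somewhat different curve.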
Then, to predict signs of early containment, machine learning algorithms such as logistic regression, decision trees, random forest and support vector machines were used and their corresponding accuracies were compared. For all models, cross-validation was done in 5 folds to address overfitting.

Logistic regression (le Cessie and van Houwelingen, 1992) is a Generalized Linear Model (GLM) and is one of the most widely used classifiers. According to Kondofersky and Theis (2018), when there is a binary response, as in this research, logistic regression typically aims at estimating the conditional probability $P(Y = 1 \mid X = x)$. Whereas simple linear regression estimates the dependent variable $y$ directly as a linear function of $x$, logistic regression estimates $p(Y = 1)$ by passing the linear combination of the predictors through the logistic function. As with multiple linear regression, logistic regression can also handle multiple independent variables, and its probability estimate can be represented as follows.

$$p = \frac{1}{1 + e^{-(\beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_n x_n)}}$$

The conditional probability $P(Y = 1 \mid X = x)$ can be calculated using the odds ratio $p/(1 - p)$. The significance of the beta coefficient values $(\beta_1, \beta_2, \beta_3, \dots, \beta_n)$ in the above equation can be tested to see if their corresponding independent variables $(x_1, x_2, x_3, \dots, x_n)$ are influencers of the dependent variable. A Wald test is generally conducted to evaluate the statistical significance of the coefficients in the model. Since logistic regression falls under the category of GLM, the significance of each independent variable in predicting the outcome of the dependent variable, sign of early containment, can be studied.

A decision tree is a decision support model that illustrates the consequences, chance, and event outcomes of certain decisions. Decision trees are used as a predictive model to make statistical conclusions about an item's target value, based on observations.
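As a brief aside on the logistic regression estimate described earlier, the mapping from a linear combination of predictors to a probability, and from that probability to odds, can be sketched in a few lines. This is a minimal sketch with hypothetical names and coefficient values; none of the numbers are fitted values from the study.

```python
import math


def logistic_probability(coefficients, features):
    """Probability of the positive class given beta coefficients and feature values."""
    z = sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds p / (1 - p) corresponding to a probability p."""
    return p / (1.0 - p)


if __name__ == "__main__":
    # Hypothetical coefficients for two predictors, e.g. beds per 1000
    # population and percentage of lockdown days (values are made up).
    betas = [0.8, -0.3]
    country = [2.0, 1.0]
    p = logistic_probability(betas, country)
    # The log of the odds recovers the linear combination of the predictors.
    print(p, math.log(odds(p)))
```

The log-odds being linear in the predictors is what allows each coefficient to be tested individually, as the Wald test does.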
In this tree structure, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. There are both classification trees, where the response variable takes on a set of categorical values, and regression trees, where the response variable takes on a set of continuous values. The collective name for such trees is Classification and Regression Trees (CART), first introduced and developed by Breiman et al. (1984) in Classification and Regression Trees. Decision trees use two metrics, namely entropy and information gain, to arrive at the final tree. Entropy is the measure of the total amount of uncertainty in the dataset and is given as follows:

$$H(S) = -\sum_{c \in C} p(c) \log_2 p(c)$$

$S$: the data set for which entropy is to be calculated
$C$: the set of classes in the data set $S$
$p(c)$: the ratio of the number of elements in class $c$ to the number of elements in set $S$

When the entropy value is equal to zero, the dataset $S$ is perfectly classified. The information gain metric is defined as the difference in entropy from before to after the dataset $S$ is split based on an attribute $A$ and is given as follows.

$$IG(S, A) = H(S) - \sum_{t \in T} p(t)\, H(t)$$

Here $T$ is the set of subsets created by splitting $S$ on attribute $A$, and $p(t)$ is the ratio of the number of elements in subset $t$ to the number of elements in $S$. The tree is built with the following steps.

Step 1: Compute the entropy for the dataset
Step 2: For every feature in the dataset, compute the following
  i. Calculate the entropy for all the categorical values
  ii. Find the average information entropy for the current attribute
  iii. Calculate the gain for the current attribute
Step 3: Select the attribute with the highest gain
Step 4: Repeat from step 1 until the desired tree is achieved

Introduced by Breiman (2001), Random Forest is a statistical supervised machine learning technique that is used for both regression and classification. This is an ensemble learning technique that uses an averaged combination of many decision trees for the final prediction. The technique of averaging a statistical machine learning model is called bagging, and it improves stability and avoids overfitting (Hastie et al., 2009).
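Returning briefly to the decision tree steps listed earlier, the entropy and information gain computations behind Steps 1 to 3 can be sketched as below. This is a minimal illustration of the standard definitions, not the J48 implementation used in the study; the attribute names and the toy rows are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the labels minus the weighted entropy after splitting on `attribute`."""
    n = len(labels)
    subsets = {}
    for row, label in zip(rows, labels):
        subsets.setdefault(row[attribute], []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


def best_attribute(rows, labels, attributes):
    """Step 3: pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, labels, a))


if __name__ == "__main__":
    # Toy, made-up data: 1 = early containment sign, 0 = no sign.
    rows = [
        {"lockdown": "long", "beds": "high"},
        {"lockdown": "long", "beds": "low"},
        {"lockdown": "short", "beds": "high"},
        {"lockdown": "short", "beds": "low"},
    ]
    labels = [1, 1, 0, 0]
    print(best_attribute(rows, labels, ["lockdown", "beds"]))
```

In this toy data the lockdown attribute separates the classes perfectly while the beds attribute carries no information, so the former is selected as the first split. Continuous variables, such as the percentage of lockdown days, would first need a threshold to turn them into categorical splits.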
Normally, decision trees are not competitive with the best supervised learning approaches in terms of prediction accuracy, since they tend to have high variance and low bias. This is because building decision trees on two different samples of the data can yield two quite different trees. Bagging is therefore well suited to decision trees since it reduces the variance. The idea behind Random Forests is to draw bootstrap samples from the training data set and then build several different decision trees on the different training samples. This method is called Random Forest because it chooses random input variables before every split when building each tree. By doing this, each tree would have reduced covariance, which, in turn, would lower the overall variance even further (Hastie et al., 2009). The Random Forest algorithm has two stages: random forest creation followed by random forest prediction. The steps involved in the stages are as follows.

Stage I: Random Forest Creation
Step 1: Randomly select 'k' features from the total 'm' features available in the dataset, where k << m
Step 2: Using the best split point, calculate the node 'd' among the selected 'k' features
Step 3: Split the node into daughter nodes using the best split
Step 4: Repeat steps 1 to 3 until 'l' nodes are reached
Step 5: Repeat steps 1 to 4 'n' times to create a forest of 'n' trees

Stage II: Random Forest Prediction
Step 1: Using the features and applying the rules of each randomly created decision tree, predict the outcome and store it as a predicted target
Step 2: Calculate the votes for each predicted target
Step 3: The highest voted predicted target will be the prediction of the random forest algorithm

The objective of Support Vector Machine (SVM) is to find a line that best separates the data into multiple groups. This is achieved by an optimization process supported by the data in the training set.
These instances are called support vectors and they play a crucial role in the classification process (Flake and Lawrence, 2002). Finally, few datasets can be separated with just a straight line. Sometimes a line with curves or even polygonal regions must be marked. This is achieved with SVM by projecting the data into a higher-dimensional space to draw the lines and make predictions. SVMs calculate a maximum margin around the boundary that ultimately results in a homogeneous partition. The ultimate goal is to establish a margin as wide as possible. In order to do so, a Lagrangian has to be constructed and maximized.

Table 3 shows the result for logistic regression with early containment as the dependent variable. Of all the independent variables, the availability of beds in hospitals and the percentage of lockdown days significantly and positively affected the signs of early containment. Other variables did not significantly influence the dependent variable. The model had an accuracy of 78.57% in the classification. The true positive and false negative rates were found to be 78.6% and 21.6% respectively. Precision and recall values were 0.788 and 0.786. The F1 score and ROC values were found to be 0.786 and 0.755 respectively.

A J48 decision tree was constructed for predicting early infection containment with the independent variables listed in Figure 1. The batch size was set to 10 and the confidence factor was set to 0.25. The minimum number of objects on the tree was set to 2. The accuracy of the tree was found to be 80.95%. The variables in the decision tree were percentage of lockdown days, days since official lockdown, and death rate per million population. The decision tree is shown in Figure 4. The true positive and false negative rates were found to be 81% and 25.4% respectively.
Precision and recall values were 0.857 and 0.81. The F1 score and ROC values were found to be 0.796 and 0.852 respectively.

A random forest ensemble algorithm was created with 100 combined trees. The batch size was set to 10 and the depth of the trees was set to unlimited. Other metrics for the random forest algorithm are given in Table 4. This model reported a high accuracy of 92.9% in correctly classifying countries that showed signs of early containment. The true positive and false negative rates were found to be 92.9% and 8.1% respectively. Precision and recall values were 0.929 and 0.929. The F1 score and ROC values were found to be 0.928 and 0.993 respectively.

In order to make predictions for signs of early containment, an SVM was modelled.

On 5-fold cross-validation with the data for all the algorithms and models, it can be inferred that the random forest design produces the minimum error and maximum accuracy, as reported in Table 4. It outperforms all the other machine learning algorithms constructed in the study. The J48 decision tree, logistic regression and SVM produce similar levels of accuracy in predicting the sign of containment of COVID-19.

This research is one of the first of its kind to integrate exponential growth modelling with machine learning techniques for predicting the spread of COVID-19. The research presents machine learning models based on variables such as infrastructure, environment, policies, and the infection itself, to predict early signs of containment in a country. For this purpose, disease data from 42 leading countries in COVID-19 infections were taken and exponential growth modelling was used to see if the countries showed signs of containment.
Then, with the sign of early containment of the infection as the dependent variable, supervised machine learning predictive models including logistic regression, decision tree, random forest, and support vector machine were developed. This research can be of direct use to countries and policymakers in understanding whether their proposed interventions are effective in containing infections even during early stages. (Tosepu et al., 2020; Yueling Ma et al., 2020). However, the long-term effect of environmental factors on the infection rates may prove to be significant. Decision tree analysis also shows that early signs of containment are possible if the number of lockdown days is at least 33.7% of the days since the first contact. If that is not the case, countries show recovery signs if the lockdown has lasted at least 10 days. For countries with a lockdown period of less than 10 days, the variable depicting the number of deaths per million population plays a significant role in containing the infection. This variable is indirectly related to the health care infrastructure of countries, such as beds, physicians, ventilators, ICUs, etc. Hence, in any pandemic situation, governments must be proactive and frame policies even at the onset, thereby reducing the risk of spread, which would ultimately lead to early containment. This also emphasises the need for resilient health care infrastructure to contain infections at an early stage. The machine learning models random forest and support vector machines were able to classify the countries with respect to their signs of early containment with accuracies of 92.9% and 76.2%, respectively, making random forest the best machine learning algorithm for the problem studied. Although this research applies data from only 42 countries, the proposed models with their corresponding hyperparameters can be extended to predict early containment for other countries as well.
Similarly, although these models were built only for the COVID-19 pandemic, they can serve as a base for future pandemics that have similar characteristics and reproduction numbers, thereby giving governments the necessary information to take timely actions to protect both people and the economy.
act called COVID-19 Act which has proven to be effective to contain the infection (Library of Congress, 2020). The number of hospital beds per 1000 population in Austria was also high, which facilitated early recovery. Chile implemented sanitary barriers and intense screening mechanisms to track and quarantine the infected (US Embassy, 2020). In addition to tough quarantine measures, Denmark closed down schools and announced a lockdown in March. Employers were also instructed not to cut the salaries of employees in quarantine, thereby encouraging social distancing and hence containing the infection (Carstensen, 2020a).
Japan, South Korea, and Singapore did not announce any lockdowns. South Korea used processes that led to early detection of COVID-19 and quarantining of the infected, thus stopping the spread. They also predicted the movement of the virus and took tactical interventions to minimize spread (NPR, 2020). Singapore already had infrastructure with isolation wards, put in place during the SARS outbreak, and was well equipped, which led to early containment of COVID-19. Strong community engagement messages and communications from the government also led to better pandemic management in Singapore (Fisher, 2020a). Most other countries that showed early signs of recovery rigorously followed lockdowns, social distancing, travel restrictions, and testing to contain infections. Another reason why countries like Japan, Korea and Austria contained the infection was the availability of strong health care infrastructure to address the infections.
The various actions taken by governments to control the transmission of COVID-19 are shown in Table 5. Countries like Italy, Brazil, India, Malaysia, Pakistan, and the United Kingdom do not have the necessary health care infrastructure to support mass admission of COVID-19 patients and hence need to rely on intense lockdowns to contain the infections. The increase in the number of COVID-19 cases in the US and the inability to contain it are also due to the government's late lockdown decision after the outbreak. The percentage of lockdown days since the first infection remains too low for these countries to be on a recovery path. With time, there is a high probability that the infection will be contained. However, in the long run, these countries must invest in improving health care facilities to reduce casualties during pandemics. Countries must be prepared for epidemics and pandemics; proactive policies and infrastructure, as in the case of Singapore, can save more lives than reactive measures. It is evident that COVID-19, unlike SARS, will not be controlled by environmental factors, and any future outbreaks will still rely on healthcare infrastructure, timely lockdowns, and social distancing for containment.
There is no conflict of interest with the authors.
No funding was received.
The data is openly available in World Health Organisation Report.
The research conformed to the highest level of ethics.
Gesley, 2020)
Brazil: Employees at the airport were asked to wear a mask. Borders were closed for flights from affected countries (CDCP, 2020)
Canada: All travellers were forced to self-isolate for 14 days upon entry to control the outbreak (GC, 2020)
Chile: Screening in the airport was enhanced and people with symptoms were
Iran: Followed strict social distancing and lockdown (Duddu, 2020)
Ireland: Invested in massive testing facilities. Treated all patients equally irrespective of their income strata. All hospitals operated on a not-for-profit basis (BBC, 2020)
Israel: Used technology to track the movement of infected individuals with their mobiles and quarantined the people who came in contact with the individual (Lomas, 2020)
Italy: Though Italy closed borders during the onset, lack of proper testing facilities caused a massive outbreak. This was followed by a strict lockdown (Gary, 2020)
Japan: Managed the outbreak with rules and excellent medical infrastructure (Japan, 2020)
Luxembourg: Quarantined people over 60 years to reduce casualties (Piscitelli, 2020)
Malaysia: Banned entry of people from infected countries, followed by additional screening measures at the airport. Promoted personal hygiene and eventually ended with a lockdown (World, 2020a)
Netherlands: Travellers returning from affected countries were advised to visit doctors and medical facilities if symptoms were felt. Post outbreak, the country went under lockdown (World, 2020b)
Norway: Travel bans and closure of schools and public services like gyms, malls, theatres etc. (Norway Panorama, 2020)
Pakistan: Formed a team to monitor situations and take necessary actions on a daily basis (Pakistan, 2020)
Portugal: Employed strict lockdown (Ivo Oliveira, 2020)
Qatar: Proper tracking, and strict screening and testing of travellers (Health, 2020)
Republic of Korea: Proactively built a centralized testing and quarantine facility before an outbreak in the country. China's reports triggered this action (Beaubien, 2020)
Romania: Lockdown and border closing (Gherasim, 2020)
Singapore: With previous experience from the SARS pandemic, the country had proper infrastructure with negative pressure rooms for pandemic control. Testing was done rigorously and the infected were not let back into society. Migrants from other countries were not allowed to work until the pandemic was controlled (Fisher, 2020b)
Slovenia: Used innovative ways to spread COVID-19 control messages before going into lockdown (Slovenija, 2020)
South Africa: Immediately implemented entry and exit restrictions for affected countries. Declared a national disaster and went for a lockdown to prevent a major outbreak (Fihlani, 2020)
Spain: Local movement controlled by social distancing. Travel to an affected country completely banned. Enhanced medical attention at arrival to control the spread (Kate Mayberry, 2020)
Switzerland: After closing schools, colleges and non-essential businesses, the country used its military and civilian support to enhance infrastructure and healthcare needs to contain the infection (Keystone, 2020)
The United Kingdom: People with symptoms were asked to self-quarantine. Cancelled overseas travel and only tested people who were admitted. Followed social distancing, lockdown, isolation and house quarantine. The country did not force people to be tested (Yong, 2020)
United States of America: Enforced travel restrictions and implemented mandatory quarantine in New York. A level of screening and lockdown was implemented. (Brittany
Potential Impact of Pandemic Influenza On the U.S.
Health Insurance Industry Sponsored By Committee on Life Insurance Research Health Section Joint Risk Management Section Society of Actuaries Prepared By Health & Human Development in the New Global Economy: The Contributions and Perspectives of Civil Society in the Americas Editors BBC, 2020: BBC, Coronavirus: Republic of Ireland introduces stronger measures 2020: Goats and Soda, How South Korea Reined In The Outbreak Without Shutting Everything Down Random Forests Classification and regression trees The Washington Post, Trump says quarantine for New York area 'will not be necessary coronavirus-related deaths double in two days The World, Denmark takes swift action against coronavirus The World, Denmark takes swift action against coronavirus: 'You can't do enough to contain this epidemic 2020: CDCP, Travel Health Notices COVID-19: France calls unemployed to work in fields as borders stay closed 2020: Protein structure and sequence re-analysis of 2019-nCoV genome does not indicate snakes as its intermediate host or the unique similarity between its spike protein insertions and HIV-1.
1-13 Characterizing the reproduction number of epidemics with early subexponential growth dynamics 2020: Preliminary Epidemiological analysis of suspected Cases of Corona virus Infection in Libya Analyzing the Epidemiological Outbreak of COVID-19: A Visual Exploratory Data Analysis (EDA) Approach Preliminary estimation of the basic reproduction number of novel Coronavirus (2019-nCoV) in China, from 2019 to 2020: A data-driven Analysis in the early phase of the outbreak Principles of Epidemiology in Public Health Practice Daniel Bernoulli's epidemiological model revisited Global warming: Engineering solutions 2020: Focus, COVID-19 in Iran: Coronavirus outbreak, measures and impact EClinicalMedicine Emerging zoonoses: A one health challenge Greece on total lockdown all new arrivals will be quarantined 2020: BBC, Coronavirus: African states impose strict restrictions Why Singapore isn't in a coronavirus lockdown -as told by a doctor of the country 2020b: The Print, Why Singapore isn't in a coronavirus lockdown -as told by a doctor of the Country Efficient SVM regression training with SMO 2020: HBR, Lessons from Italy's Response to Coronavirus 2020: Hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label non-randomized clinical trial Prime Minister announces new actions under Canada's COVID-19 response Responding to Simulated Pandemic Influenza 2020: Euobserver, Romania's Orban sworn in again amid corona emergency The Elements of Statistical Learning: Data Mining, Inference, and Prediction 2020: State of Qatar, Coronavirus Disease Critical Infrastructure and Preparedness Perspectives on Pandemic Influenza 2020: Politico, Portugal shuts down to tackle coronavirus
Emergency management, Epidemics, Government, Infectious and parasitic diseases, Public health "Test every suspected case" of COVID-19 -Live updates Mathematical modelling of infectious diseases 2020: CNN, Why is Covid-19 death rate so low in Germany Epidemic and intervention modelling -a of the royal society of london. S.A. McKendrick Containing papers of a mathematical, and physical character, 1927: A contribution to the mathematical theory of epidemics 2020: SwissInfo, Coronavirus: the situation in Switzerland Statistical Learning with Sparsity: The Lasso and Generalizations 2020: Iceland Review, Steps taken to prevent spread of covid-19 in iceland Ridge Estimators in Logistic Regression Individual preventive social distancing during an epidemic may have negative population-level outcomes Austria: Government Tightens Rules to Contain Spread of Coronavirus 2020: International Journal of Infectious Diseases A conceptual model for the coronavirus disease 2019 (COVID-19) outbreak in Wuhan, China with individual reaction and governmental action 2020: The reproductive number of COVID-19 is higher compared to SARS coronavirus 2020: Tech Crunch, Israel passes emergency law to use mobile data for COVID-19 contact tracing Roles of Humidity and Temperature in Shaping Influenza Seasonality 2020: Spike protein recognition of mammalian ACE2 predicts the host range and an optimized ACE2 for SARS-CoV-2 infection Estimating epidemic exponential growth rate and basic reproduction number Controlling epidemic spread by social distancing: Do it well or not at all MOH Pandemic Readiness and Response Plan for Influenza and other Acute Respiratory Diseases Office of Inspector General Hospitals Reported Improved Preparedness for Emerging Infectious Diseases After the Ebola Outbreak Linkages Between Climate, Ecosystems, and Infectious Disease
Real-time forecasting of an epidemic using a discrete time stochastic model: a case study of pandemic influenza 2020: The Nordic Page, Norway Government Takes Radical Decisions against Spread of Coronavirus: First Time Since WW2 How South Korea Reined In The Outbreak Without Shutting Everything Down What makes health systems resilient against infectious disease outbreaks and natural hazards? Results from a scoping review Channel News Asia, COVID-19 self-isolation is punishing the poor in Indonesia Gulf News, 10 steps Pakistan is taking to contain coronavirus 2020: Full-genome evolutionary analysis of the novel corona virus (2019-nCoV) rejects the hypothesis of emergence as a result of a recent recombination event A modular high biosafety field laboratory The Role of Hospital Medicine in Emergency Preparedness: A Framework for Hospitalist Leadership in Disaster Preparedness, Response, and Recovery Life lessons from the history of lockdowns 2020: MSAN, Measures taken by the Government Council on 12 March 2020 in response to the Coronavirus Dictionary of Epidemiology 2020: Articles The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study Systematic literature review on novel corona virus SARS-CoV-2: a threat to human era. VirusDisease1-13 Social distancing Evidence summary Coronavirus Epidemic, Humidity and Temperature Novel Coronavirus (2019-nCoV) Update: Uncoating the Virus Nuovo Coronavirus Covid-19. Minist. della Salut
2020: Mathematical Modeling of Epidemic Diseases; A Case Study of the COVID-19 Coronavirus 2020: Statistica, Precautionary measures to prevent spread of COVID-19 India 2020 2020: Bacterial Pneumonia Homology-Based Identification of a Mutation in the Coronavirus RNA-Dependent RNA Polymerase That Confers Resistance to Multiple Mutagens 2020: Estimation of the reproductive number of novel coronavirus (COVID-19) and the probable outbreak size on the Diamond Princess cruise ship: A data-driven analysis COVID-19 infection: origin, transmission, and characteristics of human coronaviruses Mathematical modeling of infectious disease dynamics 2020: Statnews, Ecuador becomes the latest country to eye compulsory licensing for Covid-19 products Gov of Slovenija 2019: Detection, forecasting and control of infectious disease epidemics: modelling outbreaks in humans, animals and plants 2020: Correlation between weather and Covid-19 pandemic in Jakarta 2020: U.S Embassy in Chile, Global Level 4 Health Advisory 2020: MATHEMATICAL PREDICTIONS FOR COVID-19 AS A GLOBAL PANDEMIC 2020: wunderground Infection prevention and control of epidemic-and pandemic-prone acute respiratory infections in health care COVID-19) outbreak situation Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19).
World Health Organization WHO, 2020c: Report of the WHO-China Joint Mission on Coronavirus Disease 2018: DigitalCommons @ UNMC Pandemic Planning: Estimating Disease Burden of Pandemic Influenza to Guide Preparedness Planning Decisions for Nebraska Medicine Physicians (per 1,000 people) [Online] Available at 2020a: Garda World, Malaysia: New travel restrictions introduced February 6 /update 2020b: Garda World, Netherlands: Government confirms first case of COVID-19 Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study 2020: Locked down Wuhan and why we always overplay the threat of the new Kenan Malik China's reaction to the coronavirus outbreak may have the opposite effect to what's needed 2020: The Atlantic, The U.K.'s Coronavirus 'Herd Immunity' Debacle Effects of reactive social distancing on the 1918 influenza pandemic Effects of temperature variation and humidity on the death of COVID-19 in SARS-CoV-2 turned positive in a discharged patient with COVID-19 arouses concern regarding the present standard for discharge 2020: Estimating the effective reproduction number of the 2019-nCoV in China