key: cord-0851517-rf9e2cox
authors: Nazir, A.; Ulusoy, S.; Lotfi, L.
title: Visual Exploratory Analysis of COVID-19 Pandemic: One Year After the Outbreak
date: 2021-05-08
journal: nan
DOI: 10.1101/2021.05.04.21256635
sha: 6ffb928b38c40b84e7e95ec0c1063fd4990f0d14
doc_id: 851517
cord_uid: rf9e2cox

Background: Since the beginning of 2020, governments across the globe have taken different measures to handle the COVID-19 outbreak. Many different policies and restrictive measures were implemented to prevent the spread of transmission, to reduce the impacts of the outbreak (i.e., individual, social, and economic), and to provide effective control measures. Although more than one year has passed since the outbreak, very few studies have examined the long-term effects and impact of the pandemic, or which government intervention variables are most and least effective. Such analysis is critical to determine the best practices in support of policy decisions.

Methods: Visual exploratory data analysis (V-EDA) is highly recommended to evaluate the impact of the pandemic since it offers a user-friendly data visualization model that allows one to observe visual patterns and trends. The V-EDA was conducted on one year of COVID-19 pandemic data, from 1 January to 31 December 2020. The data were analyzed using Student's t-test to verify whether there was a statistically significant difference between two independent groups, and the Spearman test was used to estimate the correlation coefficient between two quantitative variables, including its positive or negative direction.

Findings: We found that high-testing countries had more cases per million than low-testing countries. However, for low-testing countries, there was a positive correlation between the testing level and the number of cases per million.
This suggests that countries that tested more did so in a preventive manner, while countries with fewer tests may have had more cases than were confirmed. In the poorest developing countries, the reduction in new cases coincided with a reduction in tests conducted, which was not observed in the high-testing countries. Among the country-level factors analyzed, a larger population aged 70 or more and a lower GDP per capita were related to a higher case fatality ratio. Restrictive measures reduced the number of new cases after four weeks, indicating the minimum time required for the measures to have a positive effect. Finally, public event cancellation, international travel control, school closing, contact tracing, and facial coverings were the most important measures to reduce the virus spread. As a result, it was observed that countries with the lowest number of cases had a higher stringency index.

Since the beginning of 2020, governments across the globe have taken different measures to handle the COVID-19 outbreak. Many different policies and restrictive measures were implemented to prevent the spread of transmission, to reduce the impacts of the outbreak (i.e., individual, social, and economic), and to provide effective control measures. Although more than one year has passed since the outbreak, very few studies have examined the long-term effects and impact of the pandemic, or which government intervention variables are most and least effective. Such analysis is critical to determine the best practices in support of policy decisions. Visual exploratory data analysis (V-EDA) offers a user-friendly data visualization model that is very useful for evaluating the most effective government approaches for handling the outbreak. The authors of [1] were the first to conduct V-EDA on COVID-19, for China and worldwide, using a one-month dataset from January-February 2020.
[2] also used V-EDA to analyze the effective measures taken by the Kerala government in India. The most recent work on V-EDA for COVID-19 was conducted by [3], which offers a basic preliminary worldwide analysis of the number of positive, recovered, and death cases, and of mortality and recovery rates, using the Johns Hopkins University dataset for a 6-month period (i.e., 22 January-12 June 2020). Existing COVID-19 studies using the V-EDA approach provide important visualization insights, but they lack statistical evidence. They also focus on the short-term rather than the long-term effects of government intervention measures. Little is currently known about the long-term effectiveness of various government intervention strategies. Specifically, what are the long-term effects of very relaxed restrictive measures versus extreme ones for a particular government intervention? Further, policymakers and governments often need to make a well-informed decision on whether to enforce moderate or extreme measures for a specific intervention. What is the guideline for making such a decision? In our study, we conducted an in-depth V-EDA of the global COVID-19 pandemic one year after the outbreak, covering 1 January to 31 December 2020, in the hope of better understanding the long-term effects and impacts of certain government policies and of measuring the effectiveness of such policies. Our analysis specifically focuses on the effects of very relaxed versus extreme restrictive measures for a particular government intervention or policy implementation. Furthermore, we examine the effects of moderate restrictive measures (at the 75th percentile of the population) versus extreme measures (at the 95th percentile), with the aim of gaining additional insights on the effectiveness of the policy implementation based on the strictness level.
Such insights may be useful for policymakers making evidence-informed decisions on whether to enforce moderate or extreme measures for a specific intervention. Our V-EDA is backed with statistical evidence to validate the visual observations and to prevent biased insights. Our results offer several important insights on the best practices and the effective responses that governments and policymakers can implement for successful COVID-19 control.

medRxiv preprint, this version posted May 8, 2021; doi: https://doi.org/10.1101/2021.05.04.21256635. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

We specifically analyzed the public datasets using the V-EDA approach with statistical reporting to examine the following important questions:
• What is the effect on the number of positive cases when countries conduct high or low numbers of tests?
• Do countries that conduct the highest number of tests have better outcomes in containing the virus? What is the effect of conducting more tests?
• What is the effect of conducting very few tests?
• Can a country reduce the Case Fatality Ratio (CFR) by conducting a higher number of tests as an early precautionary measure to prevent patients from developing health complications that lead to death?
• Does the CFR strongly correlate with countries' hospital bed capacity, countries' economic output, the elderly population, and the median population age?
• Do the strictest measures have a better chance of controlling the outbreak? What is the impact on positive cases if countries enforce very strict or very lenient lockdown policies?
• What is the effect and the impact of implementing the most aggressive lockdown and restrictive measures? Does it greatly help in reducing the cases?
• What will be the effects if countries have very lenient public lockdown measures? Does it have negative consequences?
• For those countries that were most effective in controlling the COVID-19 outbreak, what government intervention measures (i.e., staying at home, internal movement restriction, travel control restrictions, etc.) were the most successful in controlling the outbreak? What are the most important predictor variables that can effectively prevent and minimize the spread of COVID-19?
• What is the effect of containment measures on the COVID-19 outbreak?
• Do countries with the strictest lockdown policies have an advantage in containing the virus when compared to countries with very lenient lockdown policies?

To the best of our knowledge, this is the first work that performs V-EDA on COVID-19 in this comprehensive, evidence-informed manner, aiming to fully answer the above questions using data for a one-year period.

In this study, we conducted an in-depth visual exploratory analysis of the global COVID-19 pandemic based on one year of public datasets, from 1 January to 31 December 2020. The data were assembled from two main publicly available sources: the Our World in Data research group (OWID), a scientific online publication that focuses on large global problems, and the Oxford COVID-19 Government Response Tracker (OxCGRT), the work of an academic team from the University of Oxford. Both datasets are structured in tabular form as Comma-Separated Values (CSV) files. The OWID dataset comprises a total of 52 variables, including daily variables on the number of positive cases, death cases, and tests. It also includes variables on the number of hospital beds, the country's GDP per capita, the country's median age, and the share of the elderly population (aged 70 or older). We selected a total of 6 variables from the OWID dataset, namely the location, date, new daily cases, new daily cases per million, new daily tests per thousand, and new daily deaths, to generate the visual exploratory graphs for Figure 1 to Figure 5. Table 1 provides a summary and brief descriptions of the selected features used from the OWID dataset.

For the OxCGRT dataset, we individually analyzed 2 major categories: government responses to COVID-19 (identified as the stringency index) and the related COVID-19 health containment policies (identified as the containment health index). For the stringency index, 8 variables were considered: school closing, workplace closing, cancel public events, restrictions on gatherings, close public transport, staying at home, internal movement restriction, and international travel control. For the containment health index, 7 variables were considered: public information campaigns, testing policy, contact tracing, emergency investment in health care, investment in vaccines, facial coverings, and vaccination policy.

We observed some irregularities in the number of tests conducted and the number of reported positive cases. We identified 76 countries that reported zero tests despite having positive cases. For example, Venezuela reported zero tests despite having 113,558 positive cases. We also discovered 1 country (i.e., Hong Kong) with zero total cases, which we also removed from our analysis. The original OxCGRT dataset covers a total of 184 countries. We merged the OWID and OxCGRT datasets using the country and the daily date as the primary keys. After merging both datasets, only 163 countries were selected for our analysis.
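The merge step described above can be illustrated with a minimal sketch. It uses only the Python standard library and toy rows; the field names (`location`, `date`, `new_cases`, `stringency_index`) and the helper `inner_join` are illustrative assumptions, not the authors' code or the exact column names of either dataset.

```python
from datetime import date

# Hypothetical toy rows standing in for the two CSV files.
owid_rows = [
    {"location": "A", "date": date(2020, 3, 1), "new_cases": 5},
    {"location": "A", "date": date(2020, 3, 2), "new_cases": 8},
    {"location": "B", "date": date(2020, 3, 1), "new_cases": 2},
    {"location": "C", "date": date(2020, 3, 1), "new_cases": 1},
]
oxcgrt_rows = [
    {"location": "A", "date": date(2020, 3, 1), "stringency_index": 40.0},
    {"location": "A", "date": date(2020, 3, 2), "stringency_index": 45.0},
    {"location": "B", "date": date(2020, 3, 1), "stringency_index": 10.0},
    {"location": "D", "date": date(2020, 3, 1), "stringency_index": 70.0},
]


def inner_join(left, right, keys=("location", "date")):
    """Keep only rows whose (country, date) key appears in both tables."""
    index = {tuple(r[k] for k in keys): r for r in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            merged.append({**row, **match})
    return merged


merged = inner_join(owid_rows, oxcgrt_rows)

# Countries that fail to match by name drop out of the merged table.
kept = {r["location"] for r in merged}
dropped_owid = sorted({r["location"] for r in owid_rows} - kept)
dropped_oxcgrt = sorted({r["location"] for r in oxcgrt_rows} - kept)
```

In this toy example, countries "C" and "D" appear in only one source each and are therefore dropped, which mirrors how name mismatches reduce the country count after merging.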
18 countries were removed from the OWID dataset and 21 countries were removed from the OxCGRT dataset because their country names did not match. Based on our observations, we suspected that for some poorer countries the actual number of cases was likely to be much higher than the number of reported cases, due to limited testing, reporting lags, and under-reporting.

For Figures 1 and 4, countries were divided into two groups: low-testing and high-testing countries. We used the commonly accepted threshold of the 5th percentile of the total tests per thousand to represent low-testing countries and the 95th percentile of the total tests per thousand to represent high-testing countries. To observe the impact of moderate measures (versus extreme ones), we also compared our results using the 25th percentile to represent low-testing countries and the 75th percentile to represent high-testing countries. For Figures 2 and 3, we normalized both the total tests per thousand and the total positive cases based on the monthly percentage change. Typically, the effect of testing only becomes visible after a substantial period. Therefore, we compared these two variables directly on a monthly basis (instead of a daily or weekly basis) in order to clearly identify any visible relationship. The monthly percentage change C is calculated as follows:

$$C = \frac{x_2 - x_1}{x_1} \times 100$$

where $x_1$ is the initial value (i.e., current month) and $x_2$ is the final value (i.e., next month). For Figure 4, the case fatality ratio is computed by dividing the number of daily death cases by the number of daily positive cases. For Figure 5, a similar method was used to separate two different groups. The 5th and 25th percentiles were used to represent the category of countries with the lowest hospital beds per thousand, GDP per capita, population aged 70 or older, and median age.
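As a hedged illustration of the grouping and normalization just described, the sketch below computes a monthly percentage change and splits countries into low and high groups at chosen percentiles. The helper names, the toy values, the linear-interpolation percentile, and the convention that a country at or beyond the cut-off belongs to the group are all assumptions made for this example, not details confirmed by the text.

```python
def monthly_percentage_change(totals):
    """Percentage change between consecutive months: (x2 - x1) / x1 * 100."""
    return [(x2 - x1) / x1 * 100 for x1, x2 in zip(totals, totals[1:])]


def percentile(values, q):
    """Percentile with linear interpolation between ranks, q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def split_groups(per_country, low_q, high_q):
    """Return (low group, high group) of country names at the given percentiles."""
    values = list(per_country.values())
    low_cut = percentile(values, low_q)
    high_cut = percentile(values, high_q)
    low = sorted(c for c, v in per_country.items() if v <= low_cut)
    high = sorted(c for c, v in per_country.items() if v >= high_cut)
    return low, high


# Toy monthly totals for one country (hypothetical numbers).
change = monthly_percentage_change([100.0, 150.0, 300.0])

# Toy total tests per thousand for five hypothetical countries.
tests_per_thousand = {"A": 1.0, "B": 5.0, "C": 20.0, "D": 80.0, "E": 300.0}
low, high = split_groups(tests_per_thousand, 25, 75)
```

Calling `split_groups(tests_per_thousand, 5, 95)` instead would give the extreme groups rather than the moderate ones.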
On the other hand, the 75th and 95th percentiles were used to represent the category of countries with the highest hospital beds per thousand, GDP per capita, population aged 70 or older, and median age.

The stringency index rates the stringency of government measures against COVID-19 from 0 to 100, with 100 assigned to countries with the strictest rules or highest lockdown measures. To see the effects of government interventions on the number of positive cases, we computed the monthly percentage change of total cases for two groups: low and high stringency (Figure 6). The 5th and 25th percentiles represent low-stringency countries and the 75th and 95th percentiles represent high-stringency countries. We also computed the monthly percentage change of the total cases to see the effect of high-stringency policies on the number of cases. To determine the effectiveness of high-stringency policies, the percentage change of total cases was calculated for the current month and the next month (Figure 7). If high-stringency government policies have a direct positive impact on the decline of COVID-19 cases, we would expect the monthly percentage change to decrease in the next month. Moreover, we also computed the monthly percentage change of the total cases for the current month and the next month for the low-stringency countries (Figure 8). This is to examine the effects in countries that take very lenient approaches towards lockdown measures. All variables from the government stringency and containment health indexes were also extracted to draw the boxplots (Figures 9 and 10).
For these plots, we computed the restrictive measure score by normalizing each value of the variable by its maximum value, so that the normalized score ranges between 0 and 1. Finally, we created two groups of high-case and low-case countries based on their monthly cases. We then identified the stringency index of those countries in the previous month. The intuition is to see whether a country's stringency level in the previous month has any direct impact on the number of cases one month later. The same method was used to separate these two groups, with the 75th and 95th percentiles representing the high-case countries and the 5th and 15th percentiles representing the low-case countries (Figure 11).

We used the kernel density estimation (KDE) method to create density plots (Figures 1, 4-8, and 11). KDE is a nonparametric density estimator that estimates the underlying probability density function of the data without assumptions about the population probability distribution. It is particularly useful for exploring patterns in data from a complicated distribution. Given a set of N data points $\{x_1, x_2, \cdots, x_N\}$, the KDE function $\hat{f}(x)$ is defined as follows:

$$\hat{f}(x) = \frac{1}{h \sum_{i=1}^{N} w_i} \sum_{i=1}^{N} w_i \, K\!\left(\frac{x - x_i}{h}\right)$$

where $x_i$ represents each data point, $K$ represents a Gaussian function such that $\int K(x)\,dx = 1$ to provide a smooth estimate (a continuous function without discontinuity), $h$ represents a controllable bandwidth parameter, and each data point $x_i$ carries a weight $w_i$, with $\sum_{i=1}^{N} w_i > 0$ normalizing the estimate to ensure that $\int \hat{f}(x)\,dx = 1$. As suggested by [4], the bandwidth parameter $h$ is defined as a function of N.

For the box plots (Figures 9 and 10), the whiskers extend to 1.5 times the inter-quartile range (IQR) beyond the lower and upper quartiles. Points outside this range were identified as outliers.

To assess the effect of population testing, countries were divided into high-testing countries and low-testing countries. First, the 75th-percentile groups of countries were compared in terms of the total positive cases and the total positive cases per million (Figure 1a and 1b, respectively). With this sample group, the countries that applied more tests also registered a higher number of cases per million (P<0.0001, Student's t-test), but there was no difference in the total number of cases (P=0.13, Student's t-test). In the 95th percentile, the distribution of cases was similar, although the sample group was smaller (Figure 1c and 1d). It was also observed that the countries with the highest testing had more cases per million (P<0.0001, Student's t-test), with no significant difference in the absolute total of cases when compared to the group of countries with less testing (P=0.18, Student's t-test). Moreover, in the high-testing countries, there was no correlation between the number of tests and the number of cases per million, whereas in low-testing countries there was a moderate positive correlation between the number of tests applied and the number of cases per million (r=0.69, P<0.0001, Spearman test). These results demonstrate that high-testing countries have more positive cases per million than countries that test less, but that the number of cases is not directly related to the number of tests performed on the population. On the other hand, in low-testing countries, as more tests are performed, more positive cases are confirmed. These data suggest that in low-testing countries the tests may have been used mainly to confirm cases with symptoms, which have a greater chance of being positive.
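For readers unfamiliar with the two statistics reported in these comparisons, the following is a minimal sketch of a pooled-variance Student's t statistic and a Spearman rank correlation, written with the Python standard library only. The function names and toy numbers are hypothetical, and the sketch deliberately stops at the test statistics: the P values quoted in the text would additionally require the t distribution, which a statistics package normally provides.

```python
import math


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ssa = sum((x - ma) ** 2 for x in a)
    ssb = sum((x - mb) ** 2 for x in b)
    pooled = (ssa + ssb) / (na + nb - 2)
    t = (ma - mb) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical cases per million (in thousands) for two small groups.
high_testing = [30.0, 42.0, 35.0, 50.0]
low_testing = [2.0, 5.0, 3.0, 8.0]
t_stat, dof = student_t(high_testing, low_testing)

# Hypothetical tests and cases for five low-testing countries.
rho = spearman([1, 2, 3, 4, 5], [10, 30, 20, 40, 50])
```

A positive `rho` close to 1 corresponds to the pattern described for low-testing countries, where more tests go together with more confirmed cases.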
Thus, the total number of positive COVID-19 cases in these countries may be higher than the number confirmed, as asymptomatic patients and those who did not have access to tests were not counted. To better understand the outcomes of countries with regard to the level of testing, the number of new cases and the number of tests applied monthly were analyzed in the four countries with the greatest testing (Figure 2). None of these countries showed a significant correlation between the number of new cases and the number of monthly tests used. Notably, in the first months analyzed, there were more positive cases than tests per thousand, illustrating the situation experienced at the beginning of the pandemic, with an explosion of cases and low availability of tests for sale. After that period, Luxembourg, the United Arab Emirates, Denmark, and Cyprus had a few months with more testing than positive cases, indicating that these countries were testing the population in a preventive way in order to control the spread of the virus. On the other hand, countries with low testing tended to show a positive correlation between the number of tests conducted and new cases (Figure 3a-3d). In the Democratic Republic of Congo and Madagascar, the reduction in new cases coincided with the reduction in tests applied (Figure 3a and 3b, respectively). In Madagascar, there was a positive correlation between the number of new cases and the number of tests (Figure 3b, r=0.65, P=0.045, Spearman test). These data reinforce the previous findings, suggesting that the detection of new cases in these countries has been limited by the number of tests available, especially because positive cases decreased in the months in which testing decreased. In addition, countries with high testing were able to provide reliable data on the spread of the virus. Additionally, a difference was observed between the Case Fatality Ratio (CFR) in the two groups of countries.
In the 75th percentile, there was no difference in the CFR between countries with high testing and countries with low testing (Figure 4a). However, when analyzing the 95th percentile, low-testing countries had a higher CFR (mean = 0.0231) than high-testing countries (mean = 0.0062, P = 0.003, Student's t-test, Figure 4b). These results indicate that countries that conducted a greater number of tests in a preventive manner managed to decrease the number of deaths and, consequently, had a lower CFR.

The CFR was also analyzed against the hospital bed capacity, economic output, elderly population, and median population age of the countries (Figure 5). In the 75th percentile, the mean hospital beds per thousand (HBPT) was 8.1 ± 0.6 in the countries with the highest HBPT and 0.64 ± 0.07 in the countries with the lowest HBPT (P<0.0001; 95% confidence interval [CI], -8.8 to -6.2; Figure 5a). In the 95th percentile (Figure 5b), the mean HBPT was 10.6 ± 1.0 in the countries with the highest numbers and 0.4 ± 0.06 in the countries with the lowest numbers (P<0.0001; 95% CI, -12.5 to -8.0). In turn, in the 75th percentile (Figure 5c), the mean GDP per capita was 60738 ± 5329 in the countries with the highest GDP per capita and 2113 ± 243.9 in the countries with the lowest GDP per capita (P<0.0001; 95% CI, -69552 to -47698).
On the other hand, in the 95th percentile, the countries with the highest GDP per capita had a mean of 82275 ± 10182 and the countries with the lowest GDP per capita had a mean of 1177 ± 115.3 (Figure 5d, P<0.0001; 95% confidence interval [CI], -104579 to -57616). Furthermore, the CFR was negatively correlated with GDP per capita in countries from the 95th percentile (r = -0.64, P = 0.054). The same was not observed in countries of the 75th percentile. These data suggest that a country's economic situation may be related to the case fatality ratio, with more patients dying from the disease in poorer countries. However, the P value was borderline (P = 0.05), showing that, with the data analyzed, a strong relationship between fatality and GDP per capita cannot be confirmed. In addition, the mean population aged 70 or older was 14.1 ± 0.4 in the countries with the highest percentage and 1.4 ± 0.1 in the countries with the lowest percentage (P<0.0001; 95% CI, -13.5 to -11.9) in the 75th percentile countries (Figure 5e). In the 95th percentile countries (Figure 5f), the mean population aged 70 or older was 15.7 ± 0.8 in the countries with the highest percentage and 1.0 ± 0.2 in the countries with the lowest percentage (P<0.0001; 95% CI, -16.5 to -12.8). In addition, there was a positive correlation between CFR and age in both the 75th and 95th percentile groups (r = 0.52, P = 0.003 and r = 0.82, P = 0.006, respectively). Notably, the population difference is more evident in the countries of the 95th percentile, where a high positive correlation coefficient between population age and case fatality ratio was confirmed. Meanwhile, the mean population age was 44.7 ± 0.5 in the countries with the highest percentage and was 18.
Another point analyzed was whether restrictive measures were able to control the outbreak. To this end, the monthly percentage change of total cases was compared between countries with high and low restrictions. In the 75th percentile countries, the mean monthly percentage change of total cases was 66.2 ± 0.7 in the high-stringency countries and 31.4 ± 1.8 in the low-stringency countries (P<0.0001; 95% confidence interval [CI], -38.6 to -31.0; Figure 6a). In the 95th percentile countries, the mean monthly percentage change of total cases was 70.3 ± 1.2 in the high-stringency countries and 21.4 ± 2.8 in the low-stringency countries (P<0.0001; 95% CI, -55.4 to -42.5). In this group, there was a marked difference (48.9 ± 3.0) between high- and low-stringency countries. However, when comparing the percentage change of total cases month by month, it was observed that the total cases of one month were negatively correlated with the stringency index of the next month in the 75th percentile countries with the highest monthly jump in stringency (Figure 7a, r = -0.70, P = 0.0001). That is, the more stringent a country was in a given month, the fewer cases there were in the following months. However, the same was not observed in the 95th percentile countries (Figure 7b). There was also no relationship when the indexes and cases of the same month were analyzed. These results suggest that restrictive measures decrease the number of cases only at least one month after they are introduced. On the other hand, in countries that had minor restrictive measures, there was no relationship with the total number of cases. In the 75th percentile, the total number of cases is very similar across months (Figure 8a); in the 95th percentile there is a difference between the months, but it is not statistically significant (Figure 8b).
It is probably not possible to observe a relationship between the factors because the stringency index in these countries remains consistently low throughout the year, while the number of cases varies. Since the countries with the strictest restrictive measures were more successful in reducing the number of cases, we also analyzed which restrictive measures these countries adopted. The restrictive measures observed were: School Closing, Workplace Closing, Cancel Public Events, Restrictions on Gatherings, Close Public Transport, Staying at Home, Internal Movement Restriction, and International Travel Control (Figure 9). First, the public event cancellation index was higher in countries with the lowest number of positive cases in the 75th and 95th percentiles (P=0.0008 and P=0.004, respectively), suggesting that this measure could be important for reducing positive cases. In addition, closing public transportation (P=0.03), restrictions on gatherings (P<0.0001), school closing (P=0.0004), staying at home (P=0.03), and workplace closing (P=0.01) were associated with fewer positive cases in the 75th percentile countries (Figure 9a). On the other hand, international travel control and school closing were the measures associated with fewer positive cases in the 95th percentile countries (P=0.02 and P=0.001, respectively, Figure 9b). These results suggest that public event cancellation, international travel control, and school closing were the most important measures to reduce the virus spread. Likewise, containment measures such as public information campaigns, testing policy, contact tracing, emergency investment in healthcare, investment in vaccines, facial
coverings, and vaccination policy were observed in countries with high and low numbers of positive cases (Figure 10). In the 75th percentile countries, contact tracing (P=0.0006), facial coverings (P<0.0001), public information campaigns (P=0.006), and testing policy (P=0.0002) had an elevated index in the countries with the lowest numbers of positive cases (Figure 10a). In the 95th percentile countries, only contact tracing (P=0.02) and facial coverings (P=0.002) were important measures for reducing the number of positive cases (Figure 10b). This suggests that these two measures were the most important for controlling the disease, although the other measures were significantly important when a larger group of countries is considered. Finally, it was found that, in both the 75th and 95th percentiles, the countries with the lowest number of cases also had a higher stringency index than high-case countries (Figure 11a and 11b, P<0.0001, Student's t-test). The same was observed for the containment health index (Figure 11c and 11d, P<0.0001, Student's t-test), indicating that these measures were essential to control the virus spread and prevent an increase in the number of positive cases.

Since COVID-19 was declared a pandemic by the World Health Organization (WHO) in March 2020, each country has taken different measures to control the spread of the SARS-CoV-2 virus and prevent people from being infected. To understand the effect of the measures taken to control the COVID-19 outbreak, data covering one year of the pandemic were analyzed. First, the relationship between the testing level in different countries and the number of positive cases was examined. Population testing is done through the reverse transcription-polymerase chain reaction (RT-PCR), which is the gold standard for detecting SARS-CoV-2 and the only reliable test for determining positive cases [5].
RT-PCR is an important step in controlling the virus spread, especially because asymptomatic or pre-symptomatic people can transmit the virus and infect other people [6]. Thus, it is essential that testing be carried out preventively so that these cases can be detected. In this study, it was confirmed that high-testing countries had more cases per million than low-testing countries. However, only in the countries that tested less was the number of tests directly related to the number of positive cases per million. In addition, these findings were confirmed when the data were analyzed month by month. Even in Madagascar, a low-testing country, there was a positive correlation between the number of tests and the number of positive cases. Thus, although the high-testing countries had more positive cases during the year, there were months with a low number of new positive cases even though the level of testing remained high. Therefore, high-testing countries were able to control the spread of the virus. On the other hand, in low-testing countries the number of positive cases tracked the testing level, suggesting that the number of cases may be underestimated. Consequently, countries that have had a restrictive testing policy may have total numbers of detected cases that do not correspond to reality. In fact, one study estimated that a large part of the transmission of the virus would occur through people who did not yet have symptoms (35%) and through asymptomatic people, who were infected but never showed symptoms (24%) [7]. In addition, a systematic review with data from England and Spain showed that approximately 33% of people who tested positive for COVID-19 were asymptomatic
[8]. Moreover, it would be important for countries to have established criteria to define whether a death was caused by COVID-19, for example by confirming the patient's infection with a positive test. For this reason, countries that were concerned with correctly diagnosing cases, such as Belgium, had high rates of positive cases [9]. However, understanding the consequences of the pandemic in countries that did not have this concern is difficult [9]. We also found that the case fatality ratio (CFR) was lower in high-testing countries, indicating that frequent or comprehensive preventive testing of the population led to a decrease in Covid-19 deaths. Additionally, the CFR was significantly lower in the 95th percentile high-testing countries than in the 75th percentile, emphasizing that high testing is related to a reduction in the case fatality ratio. In April 2020, for example, Sweden had a higher CFR than countries that established restrictions such as lockdown. Furthermore, the UK and France, both low-testing countries, also had a high CFR [10]. Ergonul and collaborators analyzed the CFR of 34 countries and observed that an elevated CFR was related to diseases such as tuberculosis, to disorders such as obesity in adults older than 18 years, and to elderly people. On the other hand, the CFR was negatively related to rural population and hospital bed density; that is, there were fewer deaths in countries with larger rural populations and/or more hospital beds available [11]. Moreover, the CFR is related to population age and GDP per capita: it is higher in countries with a larger share of people aged 70 or more and in countries with lower GDP per capita. The CFR was also lower in the countries of the 95th percentile with a low proportion of people aged 70 or more than in the countries of the 75th percentile.
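As a small worked illustration of the quantity discussed above, the sketch below computes a case fatality ratio as deaths divided by confirmed cases, expressed as a fraction like the means reported in the Figure 4 caption. The text does not spell out how the CFR was computed, so this standard definition is an assumption, and the function name and all numbers are hypothetical rather than taken from the study's data. It only shows why the ratio is sensitive to how many cases testing detects: with the same number of deaths, a larger confirmed-case count gives a smaller CFR.

```python
def case_fatality_ratio(total_deaths: float, total_cases: float) -> float:
    """Return deaths divided by confirmed cases (assumed definition of CFR)."""
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    return total_deaths / total_cases


# Hypothetical totals, not taken from the study's data set.
deaths = 200
cfr_low_testing = case_fatality_ratio(deaths, 10_000)   # fewer cases confirmed
cfr_high_testing = case_fatality_ratio(deaths, 40_000)  # more cases confirmed by testing

print(f"low-testing CFR:  {cfr_low_testing:.4f}")
print(f"high-testing CFR: {cfr_high_testing:.4f}")
```

This arithmetic does not by itself separate a real reduction in deaths from better case detection; it only makes the dependence on the denominator explicit.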
Notably, in Japan, the CFR was 18.1% for people aged 80, 8.5% for people aged 70, and lower still, at 2.7%, for people aged 60 [9]. Interestingly, the proportion of hospital beds in each country was not related to the CFR.
We also analyzed the effect of restrictive measures on the control of the pandemic. We observed that the monthly percentage change of total cases was higher in high-stringency countries, especially in the 95th percentile countries. Here, cause and effect need to be assessed country by country, because some countries may have become more stringent only after the number of cases increased [9]. However, in the 75th percentile countries, stringency was related to a diminishing number of cases in the following months, demonstrating that restrictive measures are effective in reducing the number of cases, but that the effect can be verified only after four weeks. This should be emphasized in awareness campaigns so that the population knows that it takes at least four weeks for the restrictions to be reflected in the number of new cases. In countries with lenient public lockdown measures, there was no significant change in the number of cases, probably because there were no measures in place to decrease it. Thus, the number of new cases remained stable over the months. Both the 75th and 95th percentile countries with a reduced number of cases had higher stringency indexes. Notably, public event cancellation, international travel control, school closing, contact tracing, and facial coverings were the measures related to the reduction in cases. In any type of event, distance between people is the most efficient way to avoid infection [12]. During outbreaks, it is advisable to prevent the occurrence of public
events, to avoid contact between people who are not from the same family and do not live together; travelling to the event is also an opportunity for infection. In situations where many people are in close proximity, such as public transport, it is difficult to estimate the efficiency of individual protection measures, such as masks, because the exposure is high [12]. In Japan, the prohibition of sports events and concerts, together with school closure, reduced the R0 from 2.5 to 1.1; after the reopening, the index was higher than the initial one [13]. This indicates that prohibiting events is effective in reducing the number of new cases. Our results corroborate these data.
School closing was also a measure applied in countries with a low number of cases. In the USA, from March to May 2020, school closing led to a decrease in the number of cases and in mortality [14]. A systematic review carried out in April 2020 found that closing schools did not reduce the transmission of the SARS-CoV-2 virus [15]. However, our study analyzed data from a year of the pandemic, making it possible to assess the emergence and control of outbreaks in several countries. Either way, schools should operate with restrictions, such as the use of masks and distance between students, to avoid infection [16].
The World Health Organization recommends the use of facial coverings (surgical masks, N95 or cloth face coverings) to prevent the spread of the virus.
There is evidence that facial coverings are effective. For example, in one case in the USA, two hairstylists worked for a few days while infected with the SARS-CoV-2 virus, but because masks were used, none of the 139 clients who had contact with them during that period was infected [17]. Another study looked at data from 200 countries and found that the use of facial coverings, and the time taken to adopt them, was related to a reduction in the number of infections and deaths [18]. However, it is important that masks fit the face well so that all the exchanged air is filtered and their efficiency is preserved [19]. These data agree with our results, which show that the use of facial coverings was an essential measure for the reduction of cases.
Social isolation and case tracking have also been shown to be effective in controlling the spread of the virus if done properly [20], mainly because the risk of infection is higher when there is more exposure to the virus. Ideally, people should keep a distance greater than 2 m from an infected person and limit contact time to 10 minutes [21]. In fact, contact tracing will only be effective if a short contact time between people is defined [22]. An infected person can have no symptoms and still transmit the virus to another person. With case tracking, it is possible to alert someone who had close contact with an infected person, even if the contact took place in a public environment. Some mobile apps were developed to improve contact tracing, with rapid and anonymous identification, allowing efficient control [21]. Apps must prioritize detection efficiency and user privacy [23]. A survey in Belgium showed that only 49% of people would use a tracing app and that their main concern was privacy [24]. However, this strategy will only be successful if screening is comprehensive and if the traced people isolate adequately [25] [22].
Our results showed that contact tracing was important to prevent the virus from spreading during the outbreak. In fact, in China, screening for positive cases was more efficient in containing the spread of the virus than international travel control [26]. On the other hand, a study that evaluated several countries showed that restricting international travel was important to reduce mortality [27]. Our data demonstrate that international travel control was an important measure for the control of the pandemic during 2020, and it should be considered for the following year, mainly to avoid the spread of new variants emerging in some countries.
Finally, both the 75th and 95th percentile countries with a reduced number of cases had higher stringency indexes. In addition, restrictive measures affect the number of new cases after four weeks. The measures related to the reduction in cases were public event cancellation, international travel control, school closing, contact tracing, and facial coverings. These results will be essential to understanding a year of the pandemic, especially for countries that still face the outbreak. Despite scientific advances regarding Covid-19 and the beginning of vaccination, many countries still face a critical situation, and the emergence of new variants is a warning. Overall, restrictive measures are necessary to contain the spread of the virus and, consequently, reduce mortality.
In this paper, we have done a comprehensive examination of the long-term effects and impact of the COVID-19 pandemic using one-year public data.
We have also evaluated several government intervention variables and restrictive measures. We found strong evidence that restrictive measures reduce the number of new cases after four weeks, indicating the minimum time required for the measures to take effect. We also found that public event cancellation, international travel control, school closing, contact tracing, and facial coverings were the most important measures to reduce the virus spread. Countries with the lowest number of cases also had a higher stringency index. Our findings are relevant for decision-makers in implementing appropriate intervention policies. Since COVID-19 vaccination was not widely adopted during the first year of the outbreak, we did not have sufficient data to evaluate the effectiveness of vaccine adoption. Therefore, future research may further examine the short-term and long-term effects of the different vaccination strategies. Finally, publicly available mobility datasets could also be used to assess the joint effects of government interventions, social distancing, and economic policies with respect to population movements.
Author details 1 Department on Information Systems, College of Technological Innovation, Zayed University, Abu Dhabi, United
Figure 4: Can a country reduce its Case Fatality Ratio (CFR) by conducting a higher number of tests as an early precautionary measure to protect patients from health complications that lead to death?
We can see that the concentration of the CFR data is smaller for the high-testing countries than for the low-testing countries. In both the 75th and 95th percentiles, there is also clear evidence that the upper range of CFR values (x-axis) is smaller for high-testing countries, indicating that high-testing countries have a lower CFR than low-testing countries. Using the Student's t-test, the analysis of the CFR in low-testing and high-testing countries gave the following: A) in the 75th percentile, the CFR was not different; B) in the 95th percentile, low-testing countries had a higher CFR (mean = 0.0231) than high-testing countries (mean = 0.0062, P = 0.003).
However, an opposite trend is observed at the 5th and 95th percentile, whereby the area to the left under the blue curve has a significant concentration. This indicates that extreme lockdown measures are essential to achieve a significant reduction in the monthly percentage change of cases. Moderate lockdown measures do not have a significant impact on the monthly reduction, since both high- and low-stringency countries have similar concentrations and peaks; we also observed a few countries at the right side of the range that incurred a very high monthly percentage change of total cases under moderate lockdown. Using the Student's t-test, the monthly percentage change of total cases was compared in A) high- and low-stringency countries in the 75th percentile.
The mean monthly percentage change of total cases was 66.2 ± 0.7 in the high-stringency countries and 31.4 ± 1.8 in the low-stringency countries (P<0.0001; 95% confidence interval [CI], -38.6 to -31.0). B) High- and low-stringency countries in the 95th percentile. The mean monthly percentage change of total cases was 70.3 ± 1.2 in the high-stringency countries and 21.4 ± 2.8 in the low-stringency countries (P<0.0001; 95% confidence interval [CI], -55.4 to -42.5).
Figure 7: What is the effect and the impact of implementing the most aggressive lockdown and restrictive measures? Does it greatly help in reducing the cases? In both the 75th and 95th percentiles, it is clear that the area under the curve for the next month is more concentrated on the left side than for the current month. This clearly indicates that extreme lockdown measures have a direct, significant impact on reducing the number of cases within one month. The monthly percentage change of total cases in the high-stringency countries was analyzed using the Spearman test: A) in the 75th percentile, the total cases of one month were negatively correlated with the stringency index of the next month (Figure 7A, r = -0.70, P=0.0001), and B) in the 95th percentile, there was no significant correlation.
At the 75th percentile, the concentration levels are very similar for both the current and next months. However, at the 95th percentile, the range of values to the right of the curve for the next month is larger. For the current month, there is a concentration of data on the left, meaning that there is a smaller percentage change of total cases than in the next month. This clearly shows that lenient restrictive measures have a higher tendency to be followed by increased cases after one month.
Similarly, the monthly percentage change of total cases in the low-stringency countries was analyzed using the Spearman test A) in the 75th percentile and B) in the 95th percentile. There was no significant correlation.
Figure 9: For those countries that were most effective in controlling the COVID-19 outbreak, what steps/measures (i.e., staying at home, internal movement restriction, travel control restrictions, etc.) have they taken that have successfully contributed to the outbreak prevention? What are the most important predictor variables that can effectively prevent and minimize the spread of COVID-19? Restrictive measures such as school closing, workplace closing, public event cancellation, restrictions on gatherings, closing public transport, staying at home, internal movement restriction, and international travel control in A) 75th percentile countries. There was a difference between the countries with the highest and lowest numbers of positive cases in public event cancellation (P=0.0008), closing public transport (P=0.03), restrictions on gatherings (P<0.0001), school closing (P=0.0004), staying at home (P=0.03), and workplace closing (P=0.01). B) 95th percentile countries. There was a difference between the countries with the highest and lowest numbers of positive cases in public event cancellation (P=0.004), international travel control (P=0.02), and school closing (P=0.001). The data were analyzed using the Student's t-test.
Figure 10: What is the effect of containment measures on COVID-19 outbreak?
Containment measures such as public information campaigns, testing policy, contact tracing, emergency investment in healthcare, investment in vaccines, facial coverings, and vaccination policy in A) 75th percentile countries. There was a difference between the countries with the highest and lowest numbers of positive cases in contact tracing (P=0.0006), facial coverings (P<0.0001), public information campaigns (P=0.006), and testing policy (P=0.0002). In B) 95th percentile countries, there was a difference between the countries with the highest and lowest numbers of positive cases in contact tracing (P=0.02) and facial coverings (P=0.002). The data were analyzed using the Student's t-test.
Figure 11: Do countries with the strictest lockdown policies have an advantage in containing the virus compared with countries with very lenient lockdown policies? There is a peak around index level 10 in each graph for the high-case countries, indicating that the data are concentrated around that level. There is also a peak around index level 10^2 in each graph for the low-case countries, which likewise indicates where the data are concentrated. Analysis of the difference between low- and high-case countries: A) Stringency index in the 75th percentile countries (P<0.0001). B) Containment health index in the 75th percentile countries (P<0.0001). C) Stringency index in the 95th percentile countries (P<0.0001). D) Containment health index in the 95th percentile countries (P<0.0001). The data were analyzed using the Student's t-test.
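The figure captions above rely on two tests: the Student's t-test for comparing two independent groups of countries, and the Spearman test for the correlation between two monthly series. The sketch below is a minimal illustration of both statistics using only the Python standard library; it is not the authors' analysis code. The function names and all input values are hypothetical, and the pairing of one month's stringency with the following month's change in cases is only an assumed reading of the lag described for Figure 7. The sketch returns the t statistic with its degrees of freedom and the rank correlation coefficient; obtaining P values additionally requires the t distribution (for instance through a statistics package such as SciPy, which is an assumption about tooling, not something stated in the text).

```python
from math import sqrt
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two values")
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / std_err, df


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical stringency-index values for two groups of countries.
low_case_stringency = [80, 75, 90, 85, 70]
high_case_stringency = [50, 55, 45, 60, 40]
t_stat, df = student_t(low_case_stringency, high_case_stringency)

# Hypothetical monthly series: stringency in one month paired with the
# percentage change of total cases in the following month.
stringency_this_month = [40, 55, 70, 85, 60]
case_change_next_month = [30, 22, 15, 8, 20]
rho = spearman_rho(stringency_this_month, case_change_next_month)

print(f"t = {t_stat:.2f} (df = {df}), rho = {rho:.2f}")
```

A positive t statistic here means the first group has the higher mean, and a negative rho means that higher values in the first series go with lower values in the second.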
Analyzing the epidemiological outbreak of COVID-19: A visual exploratory data analysis approach
Analysing the COVID-19 cases in Kerala: a visual exploratory data analysis approach
Visual exploratory data analysis of COVID-19 pandemic
Multivariate Density Estimation: Theory, Practice, and Visualization
3004 (2020)
Politics and Policy of COVID-19
Impact of lockdown on COVID-19 case fatality rate and viral mutations spread in 7 countries in Europe and North America
National case fatality rates of the COVID-19 pandemic
Event-specific interventions to minimize COVID-19 transmission
Effects of voluntary event cancellation and school closure as countermeasures against COVID-19 outbreak in Japan
Association between statewide school closure and COVID-19 incidence and mortality in the US
School closure and management practices during coronavirus outbreaks including COVID-19: a rapid systematic review
Comprehensive and safe school strategy during COVID-19 pandemic
Absence of apparent transmission of SARS-CoV-2 from two stylists after exposure at a hair salon with a universal face covering policy - Springfield, Missouri
Physical distancing, face masks, and eye protection to prevent person-to-person transmission of SARS-CoV-2 and COVID-19: a systematic review and meta-analysis
Arab Emirates.. 2