key: cord-0760415-vbw0wbn1 authors: Muthusami, R.; Saritha, K. title: Statistical analysis and visualization of the potential cases of pandemic coronavirus date: 2020-06-10 journal: Virusdisease DOI: 10.1007/s13337-020-00610-1 sha: 07c638d72edcbd6f41a84d30023b0cfd864c3d68 doc_id: 760415 cord_uid: vbw0wbn1
A local outbreak of pneumonia of initially unknown cause was detected in Wuhan (Hubei, China) in December 2019, and a novel coronavirus, the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was quickly found to be causing it. Since then, the epidemic has spread to all of China's mainland provinces as well as 58 other countries and territories, with 87,137 confirmed cases around the globe, including 79,968 from China and 7,169 from other countries as of 1 March 2020, as stated by the World Health Organization in COVID-19 situation report 41. In response to this public health emergency, this study performed a statistical analysis and visualized reported cases of coronavirus disease 2019 (COVID-19) based on the open data collection provided by Johns Hopkins University. Alongside the location and number of confirmed infected cases, it presents deaths, recovered cases and comparisons of growth rates between countries. This is intended to give researchers, public health officials and the general public an overview of the epidemic.
Statistical data analysis and visualization can increase understanding of the situation among the general population in the coming days [13, 14]. The World Health Organization (WHO), Johns Hopkins University researchers, and other agencies all maintain datasets on the number of confirmed infected cases, deaths, and recoveries. All data used in this research work are from Johns Hopkins University and are freely accessible via its GitHub repository. The dataset covers the period from 22 January 2020 to 17 April 2020 and includes time-series and aggregated data [15].
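As a rough illustration of what working with such time-series data involves, the sketch below sums cumulative counts per country from a table in wide layout (one row per location, one column per date). It is written in Python purely for illustration; the study itself used R, and this is not the authors' code. The column names (`Province/State`, `Country/Region`, `Lat`, `Long`, followed by date columns) are an assumption about the repository's layout, the function name `totals_by_country` is hypothetical, and the sample rows are made up rather than real counts.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up sample in the assumed wide layout (NOT real data).
SAMPLE_CSV = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20
,Spain,40.0,-4.0,0,1,3
Hubei,China,30.9,112.2,10,15,20
Beijing,China,40.1,116.4,1,2,4
"""

# Columns assumed to describe the location rather than hold counts.
META_COLUMNS = {"Province/State", "Country/Region", "Lat", "Long"}


def totals_by_country(csv_text):
    """Sum cumulative counts over provinces, giving one series per country."""
    reader = csv.DictReader(io.StringIO(csv_text))
    date_cols = [c for c in reader.fieldnames if c not in META_COLUMNS]
    totals = defaultdict(lambda: [0] * len(date_cols))
    for row in reader:
        series = totals[row["Country/Region"]]
        for i, col in enumerate(date_cols):
            series[i] += int(row[col])
    return date_cols, dict(totals)


dates, totals = totals_by_country(SAMPLE_CSV)
print(dates)
print(totals["China"])
```

Aggregating to country level first keeps later steps (ranking countries, computing shares, plotting growth) independent of how finely a given country is subdivided in the source table.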
We statistically analyzed our dataset with various methods of data analysis and visualized those data to provide a proper understanding of the COVID-19 outbreak worldwide. Our exploratory analysis was carried out on the Johns Hopkins University 2019 coronavirus dataset (January-April 2020). Here we present an effort to visualize and analyze the data between 22 January 2020 and 17 April 2020. COVID-19 has so far spread to nearly 185 countries/regions; 83 cities/provinces have been registered, giving 264 separate geographical locations combined. Using time-series data, we estimated the number of cases in each category (confirmed infected, deaths and recovered) around the globe and in the top 10 countries in the world. As of 17 April 2020, the United States and Spain are among the top ten countries in the world. The different categories of cases in those countries (confirmed infections, deaths and recoveries) are discussed further below and shown in Fig. 1. Worldwide, the total number of confirmed infected cases is 2,152,646, and the global average rate is 0.38, with a standard deviation of 2.15. The average rate of the top 10 countries is 7.73, with a standard deviation of 8.49. The US ranked first with a total of 667,801 (31.02% of the global total), and Spain second with a total of 184,948 (8.59%). The estimated number of deaths worldwide is 143,800, with a global average of 3.61. The US occupied first place with 32,916 deaths, and Italy was second among the top 10 countries with 22,170. The total number of recovered cases in the world is 542,107. Among the top 10 countries, Germany ranked first with a total of 77,000, Spain second with 74,797, and the US fourth with 54,703.
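The percentage shares quoted above follow directly from the counts, and a short sketch makes the arithmetic explicit. It is written in Python for illustration only (the study used R). Reading the "global average rate" as the mean percentage share per location is an assumption on our part; it is consistent with the reported 0.38, because shares that sum to 100% across 264 locations necessarily average 100/264. The standard deviations are not reproduced here since they require the full per-location data. The helper name `share_of_total` is hypothetical.

```python
# Counts quoted in the text for 17 April 2020.
GLOBAL_CONFIRMED = 2_152_646
CONFIRMED = {"US": 667_801, "Spain": 184_948}
N_LOCATIONS = 264  # separate geographical locations in the dataset


def share_of_total(count, total):
    """Return count as a percentage of total."""
    return 100.0 * count / total


for country, count in CONFIRMED.items():
    print(country, round(share_of_total(count, GLOBAL_CONFIRMED), 2))

# Assumption: the "average rate" is the mean per-location share in percent.
# Shares over all locations sum to 100, so their mean is 100 / N_LOCATIONS
# no matter how the cases are distributed among locations.
mean_share = 100.0 / N_LOCATIONS
print(round(mean_share, 2))
```

Under the same reading, a top-10 average of 7.73 would mean that the ten most affected countries together account for roughly 77% of confirmed cases.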
From the statistical data analysis, deaths account for about 5% and recoveries for about 8% of reported cases in the United States. In Spain, deaths account for about 10% and recoveries for about 40% of confirmed cases. We also explore time-series data using visual data analysis to provide a clear and understandable picture of this extreme outbreak of COVID-19. This segment analyzes various time-series data using several visual data analysis approaches with the R programming language. We have created a graph showing how SARS-CoV-2 spread around the globe from 22 January 2020 to 17 April 2020; it allows individuals to grasp the epidemiological essence of COVID-19. Figure 1 indicates that confirmed infected cases have crossed 2,000,000 around the globe. Other categories, such as deaths, recoveries and active cases, are also shown. The number of confirmed infected cases or deaths announced on a single day by any organization, including WHO, ECDC, Johns Hopkins University and others, does not actually represent the number of new cases or deaths that occurred on that day. This is due to the long chain of reporting that occurs between a new case or death and its inclusion in national or international statistics. In the event of an outbreak of an infectious disease, it is necessary to track not only the number of deaths but also the rate of increase in the number of deaths. If the number of deaths increases by a fixed amount over each fixed period of time, we call that "linear" growth. But if the number continues to double within a fixed time span, we call it "exponential" growth. Based on the results, the growth in deaths is linear in the US and Spain. Figure 3 shows the daily changes in confirmed cases between 22 January 2020 and 17 April 2020 in the USA and Spain.
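The distinction between linear and exponential growth drawn above can be made concrete with a small sketch. It is a Python illustration under stated assumptions, not the procedure used in the study (which relied on visual inspection in R): the function names, the tolerance `tol=0.1` and the example series are all hypothetical. The idea is simply that linear growth shows roughly constant period-over-period increments, while exponential growth shows a roughly constant growth factor greater than 1.

```python
import math


def increments(cumulative):
    """Period-over-period changes (e.g. daily new deaths) of a cumulative series."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]


def ratios(cumulative):
    """Period-over-period growth factors of a cumulative series."""
    return [b / a for a, b in zip(cumulative, cumulative[1:])]


def relative_spread(values):
    """(max - min) / |mean| of values; 0.0 means the values are all equal."""
    if max(values) == min(values):
        return 0.0
    avg = sum(values) / len(values)
    return (max(values) - min(values)) / abs(avg) if avg else math.inf


def classify_growth(cumulative, tol=0.1):
    """Label a series 'linear', 'exponential' or 'neither' (tol is arbitrary)."""
    if len(cumulative) < 3 or min(cumulative) <= 0:
        raise ValueError("need at least 3 positive observations")
    if relative_spread(increments(cumulative)) <= tol:
        return "linear"
    factors = ratios(cumulative)
    if min(factors) > 1 and relative_spread(factors) <= tol:
        return "exponential"
    return "neither"


def doubling_time(factor):
    """Number of periods needed to double at a constant growth factor > 1."""
    return math.log(2) / math.log(factor)


print(classify_growth([100, 150, 200, 250, 300]))   # constant increments
print(classify_growth([100, 200, 400, 800, 1600]))  # constant doubling
```

Real case series are noisy and usually start at zero, so in practice one would apply such a check to a smoothed window of positive values rather than to the whole raw series.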
From this we conclude that reported cases in the US accelerated from 20 March 2020, and that the change on the last day is 31,451. In Spain, confirmed cases rose linearly from 03 March 2020 to 15 April 2020, and the change on the last day is 7,304. It is clear that the real-time analysis of these data is extremely useful in documenting the epidemiological behavior of this severe disease. We believe that this method of data analysis will certainly increase understanding of the situation and inform behavior. This study examined three separate categories of data, namely confirmed infected, death and recovered cases across the globe, for the period from 22 January to 17 April 2020. It also includes a comparative overview of the cases reported in the United States and Spain. In addition, we discuss the various categories of cases internationally in order to explain the cases identified over a particular time span. In total, 2,152,646 confirmed cases of COVID-19 had occurred worldwide as of 17 April 2020. The US has the highest count, 667,801, which is 31.02% of the global total. Deaths numbered 143,800 across the globe (6.68% of confirmed cases), with the US having the highest count at 32,916 (4.93% of US confirmed cases). Recovered cases numbered 542,107 around the globe (25.18% of confirmed cases), with Germany at the top of the list with a total of 77,000 cases. The visual analysis of the growth rate of confirmed infected, death and recovered cases in the US and Spain is another part of the investigation. The goal of this article on COVID-19 is to summarize existing research, collect relevant data and make it possible for readers to make sense of the published data and early research on the coronavirus outbreak. Much of our work focuses on known problems that we can link to well-established research and evidence on COVID-19.
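The percentages in this summary use different denominators, which is easy to miss: the global death and recovery percentages are relative to global confirmed cases, whereas the US death percentage is relative to US confirmed cases. The following Python sketch (for illustration only; the study used R, and the helper name `percent` is hypothetical) recomputes them from the counts reported in the text.

```python
def percent(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Counts reported in the text for 17 April 2020.
world = {"confirmed": 2_152_646, "deaths": 143_800, "recovered": 542_107}
us = {"confirmed": 667_801, "deaths": 32_916, "recovered": 54_703}

# Global outcomes relative to global confirmed cases.
world_death_pct = percent(world["deaths"], world["confirmed"])
world_recovered_pct = percent(world["recovered"], world["confirmed"])

# US outcomes relative to US confirmed cases (not the global total).
us_death_pct = percent(us["deaths"], us["confirmed"])
us_recovered_pct = percent(us["recovered"], us["confirmed"])

print(world_death_pct, world_recovered_pct, us_death_pct, us_recovered_pct)
```

These values match the 6.68%, 25.18% and 4.93% quoted in the text, and the US recovery figure is consistent with the "about 8%" stated earlier.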
The research presented here is based on statistical and visual data analysis methods with the aid of a dataset provided by Johns Hopkins University. The research was done with R Studio 1.2.5033 and the R 4.0 beta version on the Windows 10 operating system. The different categories of COVID-19 cases between 22 January 2020 and 17 April 2020 are documented here. We are still observing the harmful outbreak of the SARS-CoV-2 virus, which is extremely troubling for the world. In this analysis, we examined the top 10 most affected countries and, in detail, the reported cases of the United States and Spain. In conclusion, the COVID-19 (2019-nCoV) dataset from the Johns Hopkins CSSE data repository (22 January 2020 to 17 April 2020) was used for our experiment. It has helped us generate and disseminate detailed information to the scientific community and to the public, especially at the peak phase, in order to understand the growth and impact of the novel coronavirus. Nevertheless, knowledge of this novel SARS-CoV-2 virus remains minimal among the general population around the globe. Raw data published by different sources cannot by themselves offer an insightful understanding of COVID-19 as a consequence of SARS-CoV-2. A user-friendly data analysis platform would be more effective in helping people understand the epidemic of this severe disease. The informative graphics of a visualization platform provide an intuitive interface and a simple view of all raw data. In the coming days, we hope to continue tracking the epidemiological data of this outbreak from the source used in this study and from other official sources.
Naming the coronavirus disease (COVID-19) and the virus that causes it. The ICTV's page is here: International Committee on Taxonomy of Viruses (ICTV) World Health Organization.
WHO statement regarding cluster of pneumonia cases in Wuhan
COVID-19) situation reports
Identification of a novel coronavirus causing severe pneumonia in human: a descriptive study
Identification of a novel coronavirus associated with severe acute respiratory syndrome
Emerging coronaviruses: genome structure, replication, and pathogenesis
A novel coronavirus from patients with pneumonia in China
Centers for Disease Control and Prevention. 2019 Novel Coronavirus (2019-nCoV)
The fight against the 2019-nCoV outbreak: an arduous march has just begun
Rolling updates on coronavirus disease (COVID-19)
The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health-the latest 2019 novel coronavirus outbreak in Wuhan, China
A human coronavirus responsible for the common cold massively kills dendritic cells but not monocytes
Analyzing the epidemiological outbreak of COVID-19: a visual exploratory data analysis (EDA) approach
COVID-19 outbreak: Tweet based analysis and visualization towards the influence of coronavirus in the World
Novel CoronaVirus CoViD-19 (2019-nCoV) Data Repository by Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). 2020
Conflict of interest: All authors declare no conflict of interest.
Data availability statement: Data will be available upon request.