key: cord-0748479-6w9kdhod
authors: Huang, Jianping; Zhang, Li; Liu, Xiaoyue; Wei, Yun; Liu, Chuwei; Lian, Xinbo; Huang, Zhongwei; Chou, Jifan; Liu, Xingrong; Li, Xun; Yang, Kehu; Wang, Jinguo; Liang, Hongbing; Gu, Qianqing; Du, Pengyue; Zhang, Tinghan
title: Global prediction system for COVID-19 pandemic
date: 2020-08-02
journal: Sci Bull (Beijing)
DOI: 10.1016/j.scib.2020.08.002
sha: 8ef074402038ee06e124419bce5b1cd25443c9fa
doc_id: 748479
cord_uid: 6w9kdhod

The outbreak of a novel coronavirus (SARS-CoV-2) has resulted in a worldwide pandemic infecting over 5.9 million people [1]. This positive-strand RNA virus can cause severe respiratory distress syndrome in humans, with over 364,000 deaths between December 2019 and May 30, 2020 [1, 2]. To combat this pandemic, the World Health Organization (WHO) is coordinating global efforts on surveillance, epidemiology, mathematical modeling, diagnostics, treatment and control, and has issued interim guidance to countries. Nevertheless, this is a difficult situation and the number of cases is rapidly increasing globally. The temporal evolution and the spatial spread of this virus have also raised serious concerns about the future trajectory of this outbreak.

In the current epidemic period, when the number of reported COVID-19 cases is growing quasi-exponentially, nowcasting and forecasting are essential for public health planning and control [3-6]. Due to high population connectivity, there is a high risk that a localized outbreak may evolve into a global pandemic. Therefore, a system for global pandemic prediction is urgently needed to provide important scientific data to the WHO and local governments to help with public decision-making and allocation of medical resources. An important feature of modern epidemiological responses involves the use of all available data to provide real-time response information [7].
Although it is difficult to establish an accurate epidemiological model describing the spread of a pandemic, real global pandemic data contain particular solutions to the mathematical equations incorporated in epidemiological models. Therefore, it is theoretically possible to improve the credibility of prior epidemiological models by introducing the latest pandemic data. In addition, there is spatiotemporal heterogeneity in the occurrence of COVID-19, which may be related to the meteorological conditions and intervention measures implemented by local governments in different regions of the world [8]. Meteorological factors influence the spread of many diseases, and temperature and relative humidity may influence the incidence of COVID-19. Hence, it is necessary to establish a continental or even global early warning system for epidemics that incorporates weather prediction and climate analysis as independent variables to improve the overall accuracy of the prediction [8, 9]. Many countries also implemented epidemic prevention and control measures during the outbreak, and it is important to consider the role of national interventions in the spread of COVID-19. These factors are important in revising and improving prediction models [5, 6, 10]. Currently, however, no studies have considered the influence of meteorological factors and interventions on global COVID-19 prediction models.

In this article, we integrated the epidemic prediction model with real global pandemic data and considered the influence of environmental factors (temperature and humidity) as well as the implementation of control measures, to establish our own global prediction system, which is the first global COVID-19 prediction system. This system predicts COVID-19 country by country and day by day across the globe. The results of predictions using this system are now available online (http://covid-19.lzu.edu.cn/).
Our prediction system is a modified epidemiological susceptible-infectious-recovered (SIR) model [6, 11] that incorporates real global pandemic data, meteorological factors, and quantified quarantine measures [9]. In this model, it is assumed that the total population (N) in the region remains unchanged during the outbreak, COVID-19 is only spread via human-to-human infection, and there is no difference in immunity among individuals. The total population of each country (N = S + I + R) is divided into three categories, namely S (susceptible), I (infected), and R (recovered + dead). The SIR model is described by Eqs. (1)-(3).

Based on the SIR model, we developed a model that includes the potential influence of temperature, humidity, urban population density, and intensity of control measures on COVID-19 infection. Our model is defined by Eqs. (4) and (5), in which one coefficient is written as a constant term plus weighted contributions from F(T2m) and F(RH2m), and another as a constant term plus c F(NO2). F(T2m) and F(RH2m) are the probability distribution functions (PDFs) of local temperature and relative humidity at 2 m above ground level obtained by Huang et al. [9], who found that 60.0% of confirmed COVID-19 cases occurred in places where the air temperature ranged from 5 to 15°C. The parameters with subscripts 0, 1, and 2 are obtained using non-linear fitting.

F(NO2), the anomaly of the local NO2 concentration, was introduced to quantify the effectiveness of quarantine measures. We chose NO2 because it is a major atmospheric pollutant from vehicle exhaust and is highly correlated with traffic volume. Thus, a large decline in the local NO2 concentration is associated with reduced traffic flows and is correlated with the efficiency of the quarantine measures. With over 5.9 million confirmed cases globally [1], the coefficients in Eqs. (4) and (5) (β, μ, and η) are inverted [12] and calibrated using pandemic, meteorological, and quantified quarantine data that are updated in real time for each country or region.
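As an illustration of the kind of calculation described above, the following Python sketch integrates the textbook SIR system (dS/dt = -βSI/N, dI/dt = βSI/N - γI, dR/dt = γI) with a daily Euler step, lets the transmission rate vary with temperature and humidity terms, and recovers a constant transmission rate from a case curve by grid search. It is not the authors' implementation: the recovery-rate symbol γ, the function names, the weights `w_temp` and `w_rh`, the grid search itself, and every numerical value are assumptions made for the example.

```python
import numpy as np


def simulate_sir(beta, gamma, n_pop, i0, days):
    """Integrate the classical SIR equations with a one-day forward-Euler step.

    beta may be a scalar or an array of length `days` (time-varying rate).
    Returns an array of shape (days + 1, 3) holding S, I, R for each day.
    """
    beta = np.broadcast_to(np.asarray(beta, dtype=float), (days,))
    s, i, r = float(n_pop) - float(i0), float(i0), 0.0
    out = np.empty((days + 1, 3))
    out[0] = (s, i, r)
    for t in range(days):
        new_infections = beta[t] * s * i / n_pop
        new_removed = gamma * i
        s -= new_infections
        i += new_infections - new_removed
        r += new_removed
        out[t + 1] = (s, i, r)
    return out


def modulated_beta(base, w_temp, w_rh, f_temp, f_rh):
    """Hypothetical transmission rate: constant term plus weighted F(T2m), F(RH2m)."""
    return base + w_temp * np.asarray(f_temp, dtype=float) + w_rh * np.asarray(f_rh, dtype=float)


def fit_beta(observed_infected, gamma, n_pop, grid):
    """Pick the constant beta on `grid` that minimises squared error against data.

    A simple stand-in for parameter inversion, not the method of ref. [12].
    """
    observed = np.asarray(observed_infected, dtype=float)
    days = len(observed) - 1
    errors = [
        np.sum((simulate_sir(b, gamma, n_pop, observed[0], days)[:, 1] - observed) ** 2)
        for b in grid
    ]
    return float(grid[int(np.argmin(errors))])


if __name__ == "__main__":
    # Synthetic "observations" generated with a known beta, then recovered.
    truth = simulate_sir(0.3, 0.1, 1e6, 10, 60)
    estimate = fit_beta(truth[:, 1], 0.1, 1e6, np.linspace(0.1, 0.5, 41))
    print("recovered beta:", estimate)
```

In practice the fit would be repeated as new reports arrive, so that the estimated rate tracks the latest data for each country; a gradient-based or variational scheme would replace the grid search when several coefficients are estimated together.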
Thus, this global prediction system incorporates various modules instead of using a simple model. The residual term (Q), which is predicted using the EEMD-ARMA (Ensemble Empirical Mode Decomposition-Autoregressive Moving Average) method [13], was introduced to help correct the bias between the hindcast and the actual reported data. Because the parameters included in this system are inverted from the latest data repository, they self-adapt to the reported cases for each region.

Fig. 1 shows the predicted number of new cases of COVID-19 per day on June 20, 2020. Implementation of the quarantine measures and the rise in temperature are expected to restrain the spread of the virus in the northern hemisphere to some extent, and the number of new cases is expected to decline. However, with gradual lifting of the lockdown in the United States, it is predicted that the number of new cases per day will remain above 20,000. The outbreak in European countries will be contained initially and the number of newly confirmed cases will decline. Countries such as India and Russia will still be in the outbreak stage, in which the number of new cases per day exceeds 10,000. In the southern hemisphere, however, the declining temperature due to the switch in season to autumn and the loose quarantine measures will provide favorable conditions for the virus to spread. It is foreseeable that the epidemics in Brazil and other southern hemisphere countries will deteriorate. Effective quarantine measures are urgently required as a precaution against further outbreaks to save lives before it is too late.

Fig. 2 compares the reported number of confirmed cases between January 22 and May 14, 2020, with the results of our simulation for six countries (US, Italy, UK, Russia, Saudi Arabia, and Brazil). The figure also shows a 10-day hindcast for May 15 to May 24, 2020.
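The bias-correction idea behind the residual term Q can be sketched in a few lines of Python. This example deliberately swaps in a much simpler technique than the one named in the text: it fits a plain autoregressive model to past residuals (reported minus hindcast) by least squares and adds the extrapolated residual to the raw model forecast. It performs no EEMD decomposition and has no moving-average part, and the function names, the model order, and the input values are all assumptions made for illustration.

```python
import numpy as np


def fit_ar(residuals, order=2):
    """Least-squares fit of r[t] = c + a1*r[t-1] + ... + a_order*r[t-order].

    Requires order >= 1. Returns the coefficients as [c, a1, ..., a_order].
    """
    r = np.asarray(residuals, dtype=float)
    # Each row: intercept followed by the lagged values, most recent first.
    rows = [np.concatenate(([1.0], r[t - order:t][::-1])) for t in range(order, len(r))]
    coef, *_ = np.linalg.lstsq(np.array(rows), r[order:], rcond=None)
    return coef


def forecast_ar(residuals, coef, steps):
    """Iterate the fitted recursion forward for `steps` days."""
    order = len(coef) - 1
    history = list(np.asarray(residuals, dtype=float)[-order:])
    predictions = []
    for _ in range(steps):
        lags = history[::-1][:order]  # most recent value first
        nxt = float(coef[0] + np.dot(coef[1:], lags))
        predictions.append(nxt)
        history.append(nxt)
    return np.array(predictions)


def correct_forecast(model_forecast, past_residuals, order=2):
    """Add the extrapolated residual to the raw model forecast."""
    model_forecast = np.asarray(model_forecast, dtype=float)
    coef = fit_ar(past_residuals, order)
    return model_forecast + forecast_ar(past_residuals, coef, len(model_forecast))


if __name__ == "__main__":
    # Hypothetical numbers: past residuals and a three-day raw forecast.
    past = [120.0, 95.0, 80.0, 70.0, 66.0, 60.0, 58.0, 55.0]
    print(correct_forecast([1500.0, 1550.0, 1600.0], past))
```

A hindcast such as the 10-day window described above is a natural place to check a correction of this kind: the residual model is fitted on data up to the start of the window and the corrected forecast is then compared with the cases actually reported inside it.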
Our system captures the systematic variation in the number of new cases per day, and reproduces the epidemic curves for Italy before and after the peaks. The system also successfully emulates the epidemic curves for Saudi Arabia and Brazil, where the epidemic is still in its rapidly increasing phase, and the US, Russia, and UK, where the curve is oscillating. For countries in the post-peak stages, nationwide alerts and quarantine measures are still necessary as a precaution against potential recurrent outbreaks.

The COVID-19 pandemic is one of the most severe global crises since the Second World War, with a devastating impact on human health and the global economy. Over 4 billion people worldwide have been affected by severe restrictions on their movements and social relationships [14]. Governments, social policies, and health systems around the world need to be prepared for the pandemic to transform passive prevention into active prevention. In this digital, globalized world, new data and information on the evolution of the COVID-19 epidemic are rapidly updated. In this article, we describe a more realistic and practical global prediction system that incorporates large amounts of data. Except for Africa and some other regions, our predictions are consistent with the total number of confirmed cases worldwide. Our predictions indicate that countries in the western hemisphere and Africa are high-risk regions, and countries that achieved initial control of the pandemic (e.g., China and South Korea) should remain alert for secondary outbreaks.

Nevertheless, because the prediction system is still at an initial stage of development, it has some flaws that warrant further improvement. The quality control module of input data should be improved to reduce the system's sensitivity to the initial conditions. The uncertainty of epidemic prediction is not only related to the prediction system being used, but also to local medical conditions, the degree of population aging, and other factors.
Our prediction system may be more useful for assessing qualitative trends and evaluating intervention options than for accurately predicting the number of cases. Establishment of a COVID-19 global prediction system will help us to understand and mitigate the impact of this virus and future pandemics, and provide a valuable reference for policymakers. In the face of a global pandemic, we are all responsible for combating the virus, protecting vulnerable members of society, and ensuring that national security systems can respond to the pandemic and provide sufficient care for the population. Joint efforts of all countries and people worldwide will help us to overcome this epidemic and restore normal life as soon as possible.

[1] COVID-19 Data Repository by the Center for Systems Science and Engineering
[2] Structural basis for the recognition of SARS-CoV-2 by full-length human ACE2
[3] Key data for outbreak evaluation: building on the Ebola experience
[4] Early dynamics of transmission and control of COVID-19: a mathematical modelling study
[5] Estimating the number of infections and the impact of non-pharmaceutical interventions on COVID-19 in 11 European countries
[6] Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: a modelling study
[7] Outbreak analytics: a developing data science for informing the response to emerging pathogens
[8] COVID-19 transmission in Mainland China is associated with temperature and humidity: a time-series analysis
[9] Optimal temperature zone for the dispersal of COVID-19
[10] How to make predictions about future infectious disease risks
[11] The mathematical theory of infectious diseases and its applications
[12] Inversion of a nonlinear dynamical model from the observation
[13] Improving forecasting accuracy of annual runoff time series using ARIMA based on EEMD decomposition
[14] Modeling and forecasting of epidemic spreading: the case of Covid-19 and beyond

The numbers of reported confirmed cases for the same period are shown by violet lines, whereas the simulated cases

This work was jointly supported by the National Natural Science Foundation of China (41521004) and the Gansu Provincial Special Fund Project for Guiding Scientific and Technological Innovation and Development (2019ZX-06). The authors acknowledge the Center for Systems Science and Engineering at Johns Hopkins University for providing the COVID-19 data, and we would like to acknowledge NASA and ECMWF for making the OMNO2d and ERA-Interim data publicly accessible.

J. H. designed the study and contributed to the ideas, interpretation and manuscript writing. L. Z., X. L., Y. W., C. L. and X. L. contributed to the data analysis, interpretation and manuscript writing. All of the authors contributed to the discussion and interpretation of the manuscript. All of the authors reviewed the manuscript.

The authors declare that they have no conflict of interest.