key: cord-0781202-a1sk6mka
authors: Li, Sijia; Song, Kun; Yang, Boran; Gao, Yucen; Gao, Xiaofeng
title: Preliminary Assessment of the COVID-19 Outbreak Using 3-Staged Model e-ISHR
date: 2020-04-07
journal: J Shanghai Jiaotong Univ Sci
DOI: 10.1007/s12204-020-2169-0
sha: 49d2420cdc9347be2c92e1ed714bf2bb527f2647
doc_id: 781202
cord_uid: a1sk6mka

The outbreak of coronavirus disease 2019 (COVID-19) in Wuhan has aroused widespread concern and attention all over the world. Many articles have predicted the development of the epidemic, but most of them use only a very basic SEIR model without considering the real situation. In this paper, we build a model called e-ISHR based on the SEIR model. We add a hospital system and a time delay system to the original model to better simulate the spread of COVID-19. In addition, to take the government's control and people's awareness into consideration, we turn the e-ISHR model into a 3-staged model that effectively shows the impact of these factors on the spread of the disease. Using this e-ISHR model, we fit and predict the number of confirmed cases in Wuhan and in China except Hubei. We also vary some of the parameters in our model. The results indicate the importance of isolation and of increasing the number of hospital beds.

As of 24:00 on February 21, there were 53 284 confirmed cases. In total, there have been 2 345 deaths and 76 288 confirmed cases. The impact of this epidemic on China and the world is profound in various fields. Therefore, the analysis and subsequent prediction of the development of the epidemic are particularly critical. In this paper, we start our analysis from Wuhan to predict the epidemic situation and to assess its prevention and control.

There are many academic papers and studies related to epidemic prediction. Most of the earlier studies focus on estimating the basic reproductive number and making preliminary predictions [1-5].
Some studies make predictions without a classic epidemic model, using for example a detailed individual-based mobility model [6], while other papers study the impact of specific factors such as transportation [7], travelling and the incubation period [8]. In the related work on epidemic prediction, we find some shortcomings that can be improved. When predicting the development of the epidemic, most papers simply solve a system of ordinary differential equations. This is not reasonable, because the problem is not continuous in a practical sense. Therefore, we propose a time delay mechanism to simulate the discrete problem. Moreover, this outbreak happened in China, where urgent and aggressive action was taken, so we cannot ignore the government's control and people's awareness. Thus, we propose a 3-staged model to simulate them. Because many patients could not be treated or isolated in hospital in the early stage, it is also important to build a model that estimates the impact of hospitals.

In this paper, we start our analysis from Wuhan to make overall predictions of COVID-19, and then we make predictions of the epidemic in China except Hubei. Because data on medical facilities in the cities of Hubei other than Wuhan are lacking, we do not include these cities in this paper. We only consider Wuhan and the provinces outside Hubei. This may lead to some differences from other papers, but it has no effect on the prediction of the development and end time of the epidemic.

We first focus on Wuhan, which is quarantined from the outside world. We propose the e-ISHR model (each letter represents a status: exposed → incubation → symptomatic → hospitalized → recovered) based on the classic epidemiological model SEIR. Then, we build a nationwide model based on the e-ISHR model with the same infectivity. Using this model, we also predict the development of COVID-19 in China except Hubei.
To improve the authenticity and reliability of the model, we integrate a 3-staged model and a hospital system into it. The 3-staged model aims to simulate the changes in the government's control and people's awareness. We divide the development of the epidemic into 3 stages: the early natural stage (8 January 2020 to 20 January 2020), the intermediate control stage (21 January 2020 to 11 February 2020), and the late control stage (since 12 February 2020). The hospital system aims to simulate the shortage of hospital beds and medical resources in Wuhan. Once the hospitals are full, virus-carrying patients can only be isolated at home, which undoubtedly increases the chance of virus transmission. We then make predictions for both Wuhan and China except Hubei.

There are many academic papers and studies related to epidemic prediction. Some papers make only preliminary predictions based on epidemic data in Wuhan and use a classic epidemic model such as SEIR. Li et al. [1] found, based on information reported by 22 January 2020, that human-to-human transmission had occurred since the middle of December 2019. They estimated the basic reproductive number to be 2.2 (95% confidence interval: 1.4-3.9). Imai et al. [2] estimated that a total of 1 723 cases of COVID-19 in Wuhan (95% confidence interval: 427-4 471) had onset of symptoms by 12 January 2020. Wu et al. [9] used the number of cases exported internationally from Wuhan to estimate the number of infections in Wuhan. They estimated the basic reproductive number to be 2.68 (95% confidence interval: 2.47-2.86). Nishiura et al. [3] estimated the cumulative incidence in China at 5 502 cases (95% confidence interval: 3 027-9 057) based on thirteen exportation events reported by 24 January 2020. Riou and Althaus [4] estimated the basic reproduction number to be around 2.2 (90% high density interval: 1.4-3.8). Zhan et al.
[10] integrated daily intercity migration data with the classic SEIR model to construct a new model suitable for describing the epidemic.

Other studies make predictions of the epidemic in Wuhan without using a classic epidemic model. Majumder and Mandl [5] estimated the basic reproduction number to range from 2 to 3.1 using the IDEA model. Guo et al. [11] applied a deep learning algorithm to predict the potential hosts of the virus.

Some papers study the impact of human factors such as transportation or travelling. Backer et al. [8] studied travellers who had stayed in Wuhan. Wu et al. [12] estimated the risk of fatality among hospitalized cases at 14% (95% confidence interval: 3.9%-32%). Du et al. [7] estimated the risk of the disease being transported from Wuhan to other cities in China before the quarantine. Others study the impact of virus factors such as the incubation period. Backer et al. [8] studied travellers who had stayed in Wuhan and drew conclusions about the incubation period of COVID-19.

As an overview, the model is divided into three parts: the different groups of people, the staged mechanism, and the time delay mechanism.

In this part, we divide the population into five groups: the exposed group, the group in the incubation period, the group in the symptomatic period, the hospital/home isolated group, and the recovered group. If a healthy person has contact with a virus carrier, this person becomes part of the exposed group. If the person actually catches the disease, he or she enters the incubation period and becomes symptomatic after a few days (the incubation time). Once the person decides to go to hospital to see a doctor, he or she is confirmed to have the disease and becomes hospital/home isolated. Finally, after a few days, the person may recover (the recovery rate differs between hospital and home isolation). We assume that recovered people will not contract the disease again because of their antibodies. This is the cycle through the different groups.
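The progression through the five groups described above can be written down as a small state table. The following is a minimal sketch for illustration only: the names `Group`, `NEXT_GROUP` and `progression` are assumptions of this sketch, not identifiers from the paper, and the table ignores the fact that only a fraction of exposed people actually become infected.

```python
from enum import Enum


class Group(Enum):
    """The five groups of the e-ISHR model (names are illustrative)."""
    EXPOSED = "e"       # had contact with a virus carrier
    INCUBATION = "I"    # infected, not yet symptomatic
    SYMPTOMATIC = "S"   # symptomatic, not yet seen by a doctor
    ISOLATED = "H"      # confirmed; isolated in hospital or at home
    RECOVERED = "R"     # assumed immune to reinfection


# Each group has at most one successor; RECOVERED is terminal.
NEXT_GROUP = {
    Group.EXPOSED: Group.INCUBATION,
    Group.INCUBATION: Group.SYMPTOMATIC,
    Group.SYMPTOMATIC: Group.ISOLATED,
    Group.ISOLATED: Group.RECOVERED,
}


def progression(start):
    """Return the list of groups visited from `start` until recovery."""
    path = [start]
    while path[-1] in NEXT_GROUP:
        path.append(NEXT_GROUP[path[-1]])
    return path


path = progression(Group.EXPOSED)
print(" -> ".join(group.value for group in path))
```

Keeping the transitions in a single table makes the one-way nature of the flow explicit: nobody returns to an earlier group, which matches the assumption that recovered people are not reinfected.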
The exposed system calculates the number of people exposed to the virus each day. This number is determined by how many healthy people a virus carrier meets in a day. Because the number of people met by a virus carrier may differ between the incubation period and the symptomatic period, these two groups are treated separately in our model. In the continuous-time formula, x represents the continuous time variable. In the discretized formula, n is the number of days since the beginning of the COVID-19 outbreak. In this part, the values of the parameters β_inc, β_sym and β_hom can be obtained by fitting.

The incubation system calculates the number of people in the incubation period each day. In the formula for the number I, I_new(x − t_inc) denotes the number of patients who newly entered the incubation period at time x − t_inc. This formula is then discretized. In this part, the values of α and δ can be obtained by fitting. According to the data given by the National Health Commission of the People's Republic of China, the average value of t_inc is 7 d.

The symptomatic system calculates the number of people who are symptomatic (before going to hospital) each day. In the formula for the number S, Eq. (5), S_new(x − t_toH) denotes the number of patients who newly entered the symptomatic period at time x − t_toH. Here, the value of t_toH can be obtained by fitting.

The hospital system simulates the existence and role of hospitals, which is not considered in the general SEIR model. When symptomatic patients go to hospital and the hospital is full (no beds for isolation or treatment), the patients are isolated at home, where they have a higher probability of contacting and infecting others. The main purpose of establishing this system is to demonstrate the importance of building hospitals in Wuhan.

This system aims at calculating the number of people who are recovered each day.
The number R is described by a continuous formula and its discrete counterpart. In this part, the value of γ_hom can be obtained by fitting. According to the data given by the National Health Commission of the People's Republic of China, the average value of t_rec is 15 d, and the value of γ_hos can be calculated from the same data as 0.059. Because the exact value of t_die is not reported by the National Health Commission of the People's Republic of China, we set it to 25 d.

Unlike the classical single-staged SEIR model, our model divides the development of the epidemic into three stages for simulation.

Stage 1, the earlier natural stage, runs from the start time to 20 January 2020. At this stage, according to news reports, people in Wuhan were free to move, and almost no health protection measures were taken. The National Health Commission of the People's Republic of China had not yet issued a nationwide notification of the epidemic, and prevention and control were relatively loose.

Stage 2, the intermediate control stage, runs from 21 January 2020 to 11 February 2020. Starting from January 21, a daily-report and zero-report system for pneumonia cases caused by the new coronavirus was implemented nationwide. The National Health Commission of the People's Republic of China placed pneumonia caused by the new coronavirus under the legal management of Class-B infectious diseases and adopted the prevention and control measures for Class-A infectious diseases. After January 23, as Wuhan was locked down and a large number of medical staff went to assist Wuhan, national epidemic prevention entered a stage of strict government control.

Stage 3, the late control stage, runs from 12 February 2020 to date. After February 12, according to the Diagnosis and Treatment of New Coronavirus Pneumonia (Trial Implementation of the Fifth Edition), the diagnostic criteria of the cases were differentiated.
Clinically diagnosed cases were added as a category, and suspected cases with pneumonia imaging characteristics were classified as clinically diagnosed cases, so that patients could receive standardized treatment in accordance with the requirements for confirmed cases as early as possible. Therefore, the distinctive feature of this stage is that, because of the change in diagnostic criteria, the number of confirmed cases rose rapidly by more than 10 000 on February 12. At the same time, this indicates that more patients in the subsequent stage would receive better treatment and that the epidemic was brought under even stricter control.

Unlike the general SEIR epidemic model, our model not only reflects the change over continuous time or between adjacent days, but also takes into account the discrete time delays of the actual situation. For example, we explicitly consider the lengths of t_inc, t_toH, t_rec and t_die. The effects of a change take some time to show up in the data. Therefore, our estimation curve may not be as smooth as that of the general SEIR model, but it is more in line with reality. In particular, to represent the lengths of these time periods in the actual situation, we use a Poisson distribution around the average to simulate the actual duration for each member of the population.

First of all, we fit the e-ISHR model to the data of confirmed cases in Wuhan. We start fitting in Stage 1. As shown in Fig. 1, by fitting the model in Stage 1, we obtain the following parameters: the start time of the outbreak is 20 December 2019, and the initial number of infections is 41. Figure 1 shows the number of infections in the model and in the real situation. Here, the infectious probability is fitted as α = 0.06; the first stage is the earlier natural stage, in which the virus spreads naturally. Because of the time delay model, the number of confirmed cases stays at 0 until the first patient goes to hospital. Therefore, we can recover the initial number of infections and the starting time of the outbreak.
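The effect of the time delay mechanism can be illustrated with a small deterministic sketch. It is not the authors' implementation: instead of sampling an individual delay for each person, it spreads each daily cohort over later days using Poisson probabilities centred on the mean delay, which is an assumption of this sketch. The function names and the 30-day truncation are hypothetical; the mean delay of 7 d and the initial cohort of 41 are the values quoted in the text.

```python
import math


def poisson_pmf(k, mean):
    """Probability that a Poisson variable with the given mean equals k."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def delay_kernel(mean, max_delay):
    """Poisson weights for delays 0..max_delay, renormalized to sum to 1."""
    weights = [poisson_pmf(k, mean) for k in range(max_delay + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def apply_delay(new_cases, kernel):
    """Shift each day's new cases forward in time according to the kernel."""
    delayed = [0.0] * len(new_cases)
    for day, count in enumerate(new_cases):
        for lag, weight in enumerate(kernel):
            if day + lag < len(delayed):
                delayed[day + lag] += count * weight
    return delayed


T_INC = 7                                # mean incubation time in days (from the text)
kernel = delay_kernel(T_INC, max_delay=30)

# 41 people enter the incubation period on day 0, nobody afterwards.
new_incubating = [41.0] + [0.0] * 39
new_symptomatic = apply_delay(new_incubating, kernel)

for day in (0, 3, 7, 14):
    print(day, round(new_symptomatic[day], 2))
```

Almost nobody becomes symptomatic on day 0, and the cohort appears about a week later. Chaining a second delay for t_toH in the same way would keep the confirmed count near zero for the first days of the outbreak, which is the behaviour described for Stage 1.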
In Stage 2, the high false negative rate of nucleic acid test kits resulted in a number of confirmed cases reported by the government that was significantly smaller than the real number [10]. Therefore, we do not fit our model to the data in Stage 2. According to the official hospital data, we fix the total number of hospital beds in Stage 2 as 160 151. We then fit the data in Stage 3, when the reported confirmed cases are roughly equivalent to the real situation. By fitting the model in Stage 3, we obtain the parameters for both Stage 2 and Stage 3. Using the fitted model with the same parameters, we predict the next 40 days.

As shown in Fig. 2, the blue and green dots represent the real numbers of confirmed cases that we use for the two fitting calculations. The red dots are the real data that we do not use for fitting, so the accuracy of our prediction can be seen directly. The mean absolute percentage error (MAPE) of our prediction is 1.13%, which indicates that our prediction is relatively accurate. The mean squared error (MSE) is 381 425 person². We conclude that the epidemic in Wuhan will end gradually after March 8, the existing confirmed cases will decrease to 0 after April 1, and the final number of cumulative confirmed cases may reach 55 000. Figure 2 shows the fitted and predicted confirmed cases in Wuhan. The second stage is the intermediate control stage, with high awareness of protection but insufficient medical resources; the third stage is the late control stage, with the revised diagnostic criteria. Because of the time delay mechanism, the curve contains many small kinks, which reflects the complexity of our model.

After predicting the epidemic for Wuhan, we predict the epidemic in the provinces outside Hubei with the same method. Figure 3 shows the fitted and predicted confirmed cases in China except Hubei; the MAPE and MSE of this prediction are 0.218% and 1 380 person², respectively.
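The two accuracy measures quoted above follow their standard definitions, which can be sketched as below. The function names and the three sample values are hypothetical and chosen only to make the arithmetic easy to follow; they are not data from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(a - p) / abs(a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)


def mse(actual, predicted):
    """Mean squared error, in squared units of the input (here person^2)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squares = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sum(squares) / len(squares)


# Hypothetical cumulative case counts, for illustration only.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(mape(actual, predicted))  # relative errors 10%, 5%, 0% average to 5%
print(mse(actual, predicted))   # squared errors 100, 100, 0 average to about 66.7
```

MAPE is scale-free, so it allows Wuhan and the rest of China to be compared directly, whereas MSE grows with the size of the case counts and is therefore much larger for Wuhan.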
As shown in Fig. 3, we conclude that the epidemic in the provinces outside Hubei will end gradually after February 27, the existing confirmed cases will decrease to 0 after March 23, and the final number of cumulative confirmed cases may reach 13 500. The parameters used in the prediction models for Wuhan and China except Hubei are shown in Table 1 and Table 2, respectively.

[Figure legend: official number of confirmed cases (for fitting 1); official number of confirmed cases (inconsistent with the facts); official number of confirmed cases (for fitting 2); official number of confirmed cases (for obtaining accuracy); earlier natural stage (unknown to 2020-1-20); intermediate control stage (2020-1-21 to 2020-2-11); late control stage (2020-2-12 to 2020-3-7); epidemic ends gradually (after 2020-3-8).]

Next, we estimate the changes in the number of confirmed patients under different circumstances to determine how strongly a given factor influences the epidemic. Firstly, we change the number of people contacted each day by a virus carrier in the incubation period, which is the parameter β_inc. We set it to 0.1, 1, 2, 3, 4 and 5 in turn to simulate the number of contacts under different degrees of control measures. As shown in Fig. 4(a), the number of contacts can have a huge impact, especially when this parameter reaches more than 5.

We then study the impact of the number of hospital beds on the development of the epidemic. We set the number of hospital beds to 1 000, 5 000, 10 000, 15 000 and 20 000 in turn. As shown in Fig. 4(b), there is a positive correlation between the number of hospital beds and the number of confirmed patients. To some extent, this also confirms the necessity and effectiveness of building Leishenshan (Thunder God Mountain) Hospital, Huoshenshan (Fire God Mountain) Hospital and many Fangcang (square cabin) shelter hospitals in Wuhan.

Given that most of the current epidemic prediction papers use the SEIR epidemic model, we compare the e-ISHR model with the traditional SEIR model.
In the traditional SEIR model, the parameters are set as follows: the total population, the probability of illness during the incubation period, and the probability of recovery are consistent with the model in this article. Because strict case diagnosis criteria were lacking before 12 February 2020, we use the data from 12 February 2020 to 21 February 2020 and obtain a basic reproductive number of 0.4154 in Wuhan and 0.4306 in the provinces outside Hubei. It is reasonable that the basic reproductive number of Wuhan is lower because of its stricter control measures. With the derived parameters, we predict the number of cumulative confirmed cases from 22 February 2020 to 2 March 2020. Compared with the real data, the MAPE of the prediction is 23.71% in Wuhan and 28.66% outside Hubei; the MSE is 204 900 693 person² in Wuhan and 21 936 049 person² outside Hubei.

Furthermore, we also divide the SEIR model into the same time periods according to the staged idea. The first stage is from the start time to 20 January 2020, the second stage is from 21 January 2020 to 11 February 2020, and the third stage is from 12 February 2020 to the present. In the parameter settings, the total population, incubation probability and recovery probability are again consistent with the model in this paper. Similarly, the basic reproductive number in the three stages is obtained by fitting the data from February 12 to February 21. In Wuhan, the basic reproductive number is 0.7 in the first stage, 0.3376 in the second stage, and 0.2135 in the third stage. In the provinces outside Hubei, the basic reproductive number is 0.9 in the first stage, 0.25 in the second stage, and 0.16 in the third stage.
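The staged SEIR baseline described above can be sketched as a daily update whose transmission parameter switches at the stage boundaries. This is an illustration under several assumptions rather than the authors' code: the three Wuhan stage values are used here directly as a daily transmission rate `beta`, the rates `sigma = 1/7` and `gamma = 1/15` are derived from the quoted mean incubation and recovery times, and the population of 10 million is a hypothetical placeholder. Only the stage dates, the start date of 20 December 2019 and the initial 41 infections come from the text.

```python
from datetime import date, timedelta

# First day of stage 2 and of stage 3, as given in the text.
STAGE_STARTS = [date(2020, 1, 21), date(2020, 2, 12)]


def stage_of(day):
    """Return 0, 1 or 2 for the early, intermediate and late stage."""
    return sum(day >= start for start in STAGE_STARTS)


def simulate_staged_seir(start, days, population, initial_infectious,
                         betas, sigma, gamma):
    """Daily discrete SEIR update with one transmission rate per stage."""
    s = float(population - initial_infectious)
    e, i, r = 0.0, float(initial_infectious), 0.0
    history = []
    for n in range(days):
        today = start + timedelta(days=n)
        beta = betas[stage_of(today)]
        new_exposed = beta * s * i / population
        new_infectious = sigma * e
        new_recovered = gamma * i
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((today, s, e, i, r))
    return history


history = simulate_staged_seir(
    start=date(2019, 12, 20),        # fitted start of the outbreak (from the text)
    days=90,
    population=10_000_000,           # hypothetical placeholder
    initial_infectious=41,           # fitted initial infections (from the text)
    betas=[0.7, 0.3376, 0.2135],     # Wuhan stage values, interpreted as rates (assumption)
    sigma=1 / 7,                     # assumed from t_inc = 7 d
    gamma=1 / 15,                    # assumed from t_rec = 15 d
)

for today, s, e, i, r in history[::30]:
    print(today, round(e), round(i), round(r))
```

Looking the stage up from the calendar date rather than from a day index keeps the boundaries identical to those quoted in the text and avoids off-by-one errors when the start date changes.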
On the basis of the obtained parameters, we predict the number of cumulative confirmed cases from February 22 to March 2. Compared with the real data, the MAPE of the prediction is 4.18% in Wuhan and 1.42% outside Hubei; the MSE of the prediction is 4 456 621 person² in Wuhan and 40 597 person² outside Hubei. The detailed results are shown in Table 3. Figure 5 shows the cumulative confirmed cases under the different models. The scattered points are the real numbers of confirmed cases in Wuhan, and the three curves represent the different models: the traditional SEIR model, the staged SEIR model and the e-ISHR model. Our model curve clearly fits the data best. The experimental results show that in Wuhan the MAPE of our model is as low as 1.13%, which is much lower than the 23.71% of the traditional SEIR model and also lower than the 4.18% of the staged SEIR model. The MSE of our model is as low as 381 425 person², compared with 204 900 693 person² for the traditional SEIR model and 4 456 621 person² for the staged SEIR model.

In this paper, we build a model called e-ISHR, which combines the SEIR model with a staged mechanism, a time delay mechanism and a hospital system. We first use the number of confirmed cases to fit some of the parameters of our model in the different stages. Then we predict this number in the future and the end time of the epidemic. We also change some of the parameters, such as the contact rate and the number of hospital beds, to simulate the spread of the disease in different situations, which indicates the importance of the government's control measures and of building more hospitals. Finally, we compare our model with the traditional SEIR model and the staged SEIR model using MAPE and MSE as evaluation indicators, and the result reflects the accuracy and superiority of our model.

At present, the epidemic situation and control measures of COVID-19 in some countries outside China are still unclear, and in this article we have not made predictions for the epidemic situation there.
If the epidemic measures and data in some countries outside China become stable, we plan to apply the model proposed in this article to epidemic prediction there as well. At the same time, we will correct and improve the model based on the real-time epidemic situation in China, taking into account factors such as the possible resumption of work, virus mutations, the successful development of specific drugs, and the future impact of cases imported from abroad.

Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia
Estimating the potential total number of novel coronavirus cases in Wuhan City
The extent of transmission of novel coronavirus in Wuhan, China
Pattern of early human-to-human transmission of Wuhan
Preliminary assessment of the international spreading risk associated with the 2019 novel coronavirus (2019-nCoV) outbreak in Wuhan City
Risk for transportation of 2019 novel coronavirus disease from Wuhan to other cities in China
Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: A modelling study
Modeling and prediction of the 2019 coronavirus disease spreading in China incorporating human migration data
Host and infectivity prediction of Wuhan 2019 novel coronavirus using deep learning algorithm
Real-time tentative assessment of the epidemiological characteristics of novel coronavirus infections in Wuhan, China, as at 22