key: cord-0634931-xvmm9trf authors: Gotz, Thomas title: First attempts to model the dynamics of the Coronavirus outbreak 2020 date: 2020-02-10 journal: nan DOI: nan sha: 2fbdb8d81f4cdb09780f1031ef50ca1efc32ec93 doc_id: 634931 cord_uid: xvmm9trf

Since the end of 2019, an outbreak of a new strain of coronavirus, called 2019-nCoV, has been reported from China and later from other parts of the world. Since January 21, WHO has reported daily data on confirmed cases and deaths from both China and other countries. In this work we present some discrete and continuous models to describe the disease dynamics in China and estimate the needed epidemiological parameters. Good agreement with the current dynamics has been found for both a discrete transmission model and a slightly modified SIR-model.

In December 2019, the first cases of a novel pneumonia of unknown cause were reported from Wuhan, the seventh-largest city in China. In the meantime, these cases have been identified as infections with a novel strain of coronavirus, called 2019-nCoV. Its genome sequence turned out to be 75 to 80 percent identical to that of the SARS-coronavirus, which caused a major outbreak in Asia in 2003. At the beginning of January 2020, the virus spread over mainland China and reached other provinces. Increased travel activity due to the Chinese New Year festivities supported the expansion of the infection. By mid-January, China reported a sharp rise in cases with about 150 new patients. From January 21 onwards, WHO's daily situation reports contain the latest figures on confirmed cases and deaths, see [1]. Our work is based on these data for mainland China. The National Health Commission of the People's Republic of China also provides daily reports [2]; however, this website is only available in Chinese and hence we did not use it for our analysis.

By $Y_k$ we denote the cumulated Corona cases in mainland China on day $k$, where $k = 0$ corresponds to the beginning of the data recordings on January 21.
In Figure 1 we show the data until February 9 in a semi-logarithmic plot. The new cases $N_k := Y_k - Y_{k-1}$ are also given. The data for countries other than China are not included in our first modeling approach.

Assuming that the number of daily new infections is directly proportional to the total number of currently infected leads to a discrete exponential model

(Exp)  $y_{k+1} - y_k = r\, y_k$.

To estimate the infection growth rate $r$ and the initial infected population $y_0$ at day $k = 0$ (i.e. January 21), we use a least-squares fit to the logarithm of the observed data $(k, Y_k)$ for $k = 0, \dots, n$ with $n = 15$. We obtain the estimate $\hat r \simeq 0.304$. The coefficient of determination (in statistics usually called the $R^2$-value) of this fit is quite large. In Figure 2 we visualize the exponential model (Exp) in comparison to the data. Despite the large $R^2$-value, the observed behavior is far from being purely exponential. Especially in the last days, the disease dynamics have been slowing down. This is also reflected in the decreasing number of new cases, see Fig. 1. The exponential model over-estimates the number of infected cases.

This can also be seen when plotting the infected cases on the next day $Y_{k+1}$ vs. the infected cases today $Y_k$ in a double logarithmic plot, see Fig. 3. The exponential model (Exp) corresponds to a straight line of slope 1 in the double log-plot. However, the real data reveal a slightly smaller increase. Therefore, we may generalize the model (Gen) such that, in the logarithmic version, $\ln y^{gen}_{k+1}$ depends linearly on $\ln y^{gen}_k$ with slope $\beta$ and an offset governed by a second parameter $c$. The parameters $\beta$ and $c$ can again be estimated from a least-squares fit and we obtain $\hat\beta = 0.904$ and $\hat c = 2.02$. In this fit, we have excluded the first data point $(\ln Y_1, \ln Y_2)$ as an outlier, cf. Fig. 3. A value of $\beta < 1$ models an increase in the number of infected cases that is below that of the standard exponential model (Exp) and that is in accordance with the reported data.

Another possible extension of the exponential model takes into account the effect of awareness in the population.
From a purely heuristic point of view, one can describe this by a nonlinear relation between the number of new infections $y_{k+1} - y_k$ and the current cases $y_k$, e.g.

(NonLin)  $y^{nl}_{k+1} - y^{nl}_k = \rho\, (y^{nl}_k)^{\alpha}$.

For $\alpha = 1$, we recover the simple exponential model (Exp). For exponents $\alpha < 1$, the number of new infections is reduced for high infection numbers to account for possible awareness in the population. For the given data (again excluding the first data point as an outlier), we perform a least-squares fit to the double logarithmic plot (see Fig. 4) of the current cases $Y_k$ vs. the new infections $N_k$. This yields the parameter estimates $\hat\rho = 10.1321$ and $\hat\alpha = 0.5794$ with an $R^2$-value of 0.9287. Including the first data point (see dashed line in Figure 4) yields a fit with a smaller $R^2$-value of 0.8011, indicating a lower quality of the fit. Moreover, in this case, we would again drastically over-estimate the current behavior. Figure 4 again shows the reduction of new infections in the last days. The last three data points deviate significantly from the previous behavior, which can be described rather well by the nonlinear model.

In Figure 5 we show the predictions based on the generalized model (Gen) (green curve) and the nonlinear model (NonLin) (blue curve) compared to the observed data. The two models are extended 10 days beyond the current date to show their predictions. On the time interval where data is available, both models yield very similar estimates, but for future predictions the generalized exponential model shows significantly lower case numbers. Based on the reduced number of new infections in the last days, it seems that the generalized model (Gen) is better suited to describe the future disease dynamics.

In contrast to the above discrete models, we now briefly analyze a standard SIR-model. The total population is taken to be that of mainland China, with $N = 1.4 \cdot 10^9$ inhabitants.
The export of the disease to other countries is not taken into account. Due to the short time horizon of not more than 1 month, effects of birth and natural death are excluded from the model. Time is measured in days; the recovery rate is assumed to be $\sigma = 14^{-1}$, implying a recovery period of 14 days. Let $I$ and $R$ denote the currently infected and recovered individuals. The susceptible individuals are given by $S = N - I - R$. The standard SIR-model without demographic effects reads

$I' = \theta\, \frac{S}{N}\, I - (\sigma + \mu)\, I$,
$R' = \sigma I$,

where $\theta$ denotes the force of infection and $\mu$ the disease-induced mortality rate. In addition, we introduce the cumulated number of infected $C_k = \int_0^{t_k} I(t)\, dt$. The cumulated number of deaths equals $\mu C_k$.

Using the data provided by WHO, we estimate the mortality rate $\hat\mu$ from $Z_k$ and $Y_k$, the cumulated observed deaths and infected cases at day $k$, see Figure 6. Based on the given data we obtain an estimated mortality rate $\hat\mu = 2.09\%$. As can be seen from the graph, the assumption of a constant disease-induced death rate shows rather good agreement with the given data.

Using the above estimated death rate of approx. 2%, we solve the SIR-model and compare its results for the cumulated infection cases $C$ to the reported data. In a first simulation, we use a constant force of infection $\theta$. Using a least-squares fit, we can estimate its value $\hat\theta$ from the observed data. However, during the course of the disease, quarantine measures have been taken to reduce the spread of the disease. In the SIR-model, such quarantine measures lead to a reduction in the force of infection. Therefore, we have also simulated a second model, where the force of infection is assumed to be piecewise constant,

$\theta(t) = \theta_1$ for $t < t_s$ and $\theta(t) = \theta_2$ for $t \ge t_s$,

where $t_s$ denotes the switching time. Again, we have performed a least-squares fit to estimate $\theta_1$, $\theta_2$ and $t_s$ based on the reported data. The fit is based on minimizing the $L^2$-difference between the reported data $Y_k$ and the simulated result $C_k$. Figure 7 compares the simulation results of the two SIR-models to the given data.
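The simulation and fitting procedure described above can be illustrated with a minimal Python sketch. It is not the code used for the reported results. It assumes the usual normalized SIR form $I' = \theta (S/N) I - (\sigma + \mu) I$, $R' = \sigma I$, integrates it with simple explicit Euler steps, takes $C_k$ as the integral of $I$ as defined in the text, and replaces the least-squares optimizer by a plain grid search over $\theta$. All function names (`piecewise_theta`, `simulate_cumulated`, `l2_error`, `fit_constant_theta`) are hypothetical, and the "data" in the usage part are synthetic values generated by the model itself, not the WHO figures.

```python
import math

N = 1.4e9           # total population of mainland China (from the text)
SIGMA = 1.0 / 14.0  # recovery rate per day (from the text)
MU = 0.0209         # mortality rate, using the estimate reported in the text


def piecewise_theta(theta1, theta2, t_switch):
    """Force of infection: theta1 before t_switch, theta2 from t_switch on."""
    return lambda t: theta1 if t < t_switch else theta2


def simulate_cumulated(i0, theta_of_t, days, steps_per_day=20):
    """Integrate the assumed SIR equations with explicit Euler steps.

    Returns [C_0, ..., C_days], where C_k is the integral of I(t)
    from 0 to day k (the definition of C_k used in the text).
    """
    dt = 1.0 / steps_per_day
    infected, recovered, cumulated = float(i0), 0.0, 0.0
    result = [cumulated]
    for day in range(days):
        for step in range(steps_per_day):
            t = day + step * dt
            susceptible = N - infected - recovered
            d_inf = (theta_of_t(t) * susceptible / N * infected
                     - (SIGMA + MU) * infected)
            d_rec = SIGMA * infected
            cumulated += infected * dt
            infected += d_inf * dt
            recovered += d_rec * dt
        result.append(cumulated)
    return result


def l2_error(data, model):
    """Euclidean (L2) distance between data and simulated values."""
    return math.sqrt(sum((y - c) ** 2 for y, c in zip(data, model)))


def fit_constant_theta(data, i0, candidates):
    """Grid search (an assumption, not the paper's optimizer) for theta."""
    days = len(data) - 1
    return min(
        candidates,
        key=lambda th: l2_error(data, simulate_cumulated(i0, lambda t: th, days)),
    )


if __name__ == "__main__":
    # Synthetic "observations": the model output for a known theta.
    synthetic = simulate_cumulated(778, lambda t: 0.58, 15)
    grid = [j / 100 for j in range(10, 101)]
    print("recovered theta:", fit_constant_theta(synthetic, 778, grid))

    # Second model: force of infection switches at day 9.
    theta = piecewise_theta(2.12, 0.31, 9.0)
    print("C after 15 days:", simulate_cumulated(256, theta, 15)[-1])
```

For the second model, the same grid-search idea would have to run over the three parameters $\theta_1$, $\theta_2$ and $t_s$; a gradient-based or derivative-free optimizer is the more practical choice there.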
In Table 1 we summarize the results of the two SIR-models along with their respective basic reproductive numbers $R_0 = \frac{\theta}{\mu + \sigma}$. For our modified SIR-model 2 with two different forces of infection $\theta$, the basic reproductive number based on the later value for $\theta$ equals $\hat R_0 = 3.31$. Other sources, see [3, 4], report a range from 2.1 to 3.1.

Table 1. Parameters for the SIR-simulations as shown in Figure 7.

Parameter                      SIR-Model 1    SIR-Model 2
Initial value I(0)             778            256
Force of infection θ           0.58           2.12 (t < 9), 0.31 (t ≥ 9)
Switching time t_s             -              30.01.2020
Basic repro. number R_0        6.12           22.4 (t < 9), 3.31 (t ≥ 9)
Least squares error L2-diff    8.2 · 10^3     2.6 · 10^3

[1] Novel Coronavirus (2019-nCoV) situation reports
[2] National Health Commission of the People's Republic of China
[3] Real-time nowcast and forecast on the extent of the Wuhan CoV outbreak, domestic and international spread
[4] Novel coronavirus 2019-nCoV: early estimation of epidemiological parameters and epidemic predictions