key: cord-0888554-faz9b9fx
authors: Jeong, Darae; Lee, Chang Hyeong; Choi, Yongho; Kim, Junseok
title: The daily computed weighted averaging basic reproduction number [Formula: see text] for MERS-CoV in South Korea
date: 2016-06-01
journal: Physica A
DOI: 10.1016/j.physa.2016.01.072
sha: 49229fa2777b24bee9d0c85a03ad88cce19254e6
doc_id: 888554
cord_uid: faz9b9fx

In this paper, we propose the daily computed weighted averaging basic reproduction number [Formula: see text] for Middle East respiratory syndrome coronavirus (MERS-CoV) outbreak in South Korea, May to July 2015. We use an SIR model with piecewise constant parameters [Formula: see text] (contact rate) and [Formula: see text] (removed rate). We use the explicit Euler’s method for the solution of the SIR model and a nonlinear least-square fitting procedure for finding the best parameters. In [Formula: see text] , the parameters [Formula: see text] , [Formula: see text] , and [Formula: see text] denote days from a reference date, the number of days in averaging, and a weighting factor, respectively. We perform a series of numerical experiments and compare the results with the real-world data. In particular, using the predicted reproduction number based on the previous two consecutive reproduction numbers, we can predict the future behavior of the reproduction number.

Middle East respiratory syndrome coronavirus (MERS-CoV), first identified in Saudi Arabia in June 2012, is a viral respiratory disease caused by a novel coronavirus [1] . South Korea experienced the largest outbreak of MERS-CoV infections outside the Arabian peninsula [2] . There have been 186 confirmed infective cases in South Korea within two months after the first infective person who returned from a trip to the Arabian peninsula was diagnosed on 11 May, 2015, and 38 people died and more than 16,000 people were quarantined due to the spread of the disease. Fig. 1 represents the epidemic curve of MERS-CoV, South Korea, 20 May-17 July 2015 [3, 4] .

In order to implement appropriate surveillance and control measures, it is very important to predict the future trend of the epidemic. Therefore, it is the purpose of the present paper to show daily computed reproduction numbers for epidemics, in particular, MERS-CoV, so that we can predict the future behavior of the reproduction number day-by-day. In the mathematical modeling of the spread of an epidemic disease, it is crucial to estimate the parameters (e.g., contact rate and recovery rate in an SIR model), but it is difficult to do so due to the lack of data available for the estimation. In case of the MERS-CoV spread in South Korea, there are almost complete daily data of the spread of the disease, and we choose the MERS-CoV spread case in South Korea as an epidemic model to verify the validity and accuracy of the method proposed in this paper. To the authors' knowledge, there have been no previous works that present computational methods for estimation of the values of the epidemic parameters based on actual daily epidemic data.

The rest of the paper is organized as follows. In Section 2, we describe the mathematical model. In Section 3, we provide a numerical algorithm for the estimation of the parameters. We perform several numerical experiments in Section 4. In Section 5, we provide a summary and present our conclusions.

In this paper, we consider the SIR model that was introduced in 1927 by A.G. McKendrick and W.O. Kermack [5] . The model is simple and has been widely used so far, for instance, in multigroup epidemic modeling [6] , online social network dynamics [7] , the model adopting the delay term [8] , stochastic model [6, 9, 10] , vaccination strategy [11, 12] . In this model, the population is divided into susceptible S(t), infected I(t), and removed R(t) individuals at time t. The governing ordinary differential equations for the SIR model are as follows:

with initial condition S(0) = S 0 , I(0) = I 0 , and R(0) = R 0 . Here, the parameters β and γ denote the contact (susceptibility to disease) and removed (either dead or recovered) rates from disease, respectively [13] . We assume that a removed individual can never be infected again. Let N be the constant total population size. Therefore, it is sufficient to solve only two Eqs. (1) and (2), i.e., R(t) = N − S(t) − I(t). By assuming that β and γ are time-dependent parameters, we generalize Eqs. 

For example, the transmission of the vector-borne diseases such as Dengue fever and Malaria has strong seasonal patterns and thus the parameters are estimated as the time-dependent functions [14, 15] . Another example is a seasonal SIR model for the spread of the whooping cough in which the contact rate β is given as a time-dependent periodic function which accounts seasonal changes [16] . 

Let us assume that we have daily statistics about S, I, and R. Let t n = t ref +n be time, where t ref is a reference time. Let DS n , DI n , and DR n be observed susceptible, infected, and removed data at time t = t n , respectively. Let S n , I n , and R n be numerical approximations of S(t n ), I(t n ), and R(t n ), respectively. In this paper, we propose a computation of β(t) and γ (t) on a daily basis, which fit best to the real data. To be more specific, for given data (DS n , DI n , DR n ) and (DS n+1 , DI n+1 , DR n+1 ), we want to find piecewise constant parameters β n+1 and γ n+1 which minimize the following

where S n+1 , I n+1 , and R n+1 are the numerical solutions of Eqs. (4)-(6) with initial condition (S n , I n , R n ) = (DS n , DI n , DR n ). We divide one day between t n and t n+1 into m subintervals, then t = 1/m is the time step size. Let S n+1,1 = S n and I n+1,1 = I n , then by applying the explicit Euler's method to Eqs. (4) and (5), we have

where we have used the condition R(t) = N −S(t)−I(t). Unless otherwise stated, we use t = 1/100 in all numerical tests.

If the solutions of Eqs. (8)-(10) are obtained for all k, then we define S n+1

Note that while we can use higher-order numerical methods such as Runge-Kutta schemes to solve Eqs. (4) and (5), we use the explicit Euler's method for the sake of simplicity. Next, to find the best β n+1 and γ n+1 , we use a MATLAB routine called lsqcurvefit, based on the least-squares sense, which finds coefficients β n+1 and γ n+1 that solve the problem (7). We use β n+1 = 1.0e−6 and γ n+1 = 1.0e−1 as an initial guess in all numerical simulations. The MATLAB code is given in Appendix for the interested readers.

In all computations, the initial conditions are taken to be N = 500,000, S 0 = N − 1, I 0 = 1, and R 0 = 0 with t ref = 20

May 2015. Fig. 2 shows the time series of observed MERS-CoV epidemiological data and numerical solutions S, I, and R, respectively. The basic reproduction number R 0 is the average number of secondary infections caused by one infectious individual in a completely susceptible population. It is an important measurement in that one can determine the stability of the diseasefree equilibrium by computing the value of R 0 ; R 0 < 1 implies that the disease-free equilibrium is locally asymptotically stable [17] , and if R 0 > 1, it is unstable [18, 19] . Finding basic reduction number R 0 can be done through the next generation matrix [20, 21] whose spectral radius is defined as R 0 . In the SIR model, the basic reproduction number is computed as [22, 23] , which will be used for our method in the next section.

In this section, we propose the weighted average values for infected and removed rates, and then define weighted averaging basic reproduction number. Let β n 1,1 = β n for n = 1, . . . , m. We define k-days (k ≥ 2) weighted average β n k,ω as

where 0 < ω ≤ 1 and the superscript i − 1 for ω i−1 represents an exponent. Also, γ n k,ω is defined similarly as Eq. (11). Finally, we define the daily computed weighted averaging basic reproduction number as R n 0,k,ω = β n k,ω N/γ n k,ω . Fig. 3 (a)-(c) represent β n k,ω , γ n k,ω , and R n 0,k,ω for k = 5, respectively, with w = 0, 0.3, and 1. Fig. 4 shows the results of the parameters for k = 10. Note that when ω = 0, we have R n 0,k,0 = R n 0,1,0 for all k. If k is large, then we have more smooth profile and the transition point (R n 0,k,ω = 1) moves to the right. If ω is close to zero, then we have a similar profile to R n 0,1,0 . Note that we have extraordinarily large values of R n 0,k,ω from n = 1 to approximately n = 20 since the recovered individuals were almost zero (γ n k,ω ≈ 0) in early times and the infected ones increased rapidly.

In this section, we investigate the predictability of a linear extrapolation of R n 0,1,0 . Let us define the linear extrapolation

where R n linear means the predicted reproduction number based on the previous two consecutive reproduction numbers and the error is defined as Fig. 5(a) shows R n linear and R n 0,1,0 . Up to about three weeks, there are large differences between two reproduction numbers.

However, we can observe a good agreement interval from n = 30 to n = 59, see Fig. 5 (b). Fig. 5(c) and (d) are corresponding errors E n of (a) and (b). The errors E n from n = 30 to n = 59 are listed in Table 1 . Except some days, the absolute errors |E n | are less than one. We note that we have a spike in Fig. 5(b) and (d). That is due to no population data changes for two days, i.e., 11 July and 12 July as shown in Fig. 1 .

In this section, we consider the modified SIR model with disease-related death. Let the total population N be divided into susceptible S(t), infected I(t), recovered R(t), and dead D(t) individuals at time t, then the SIR model with fatality rate is given by

with initial condition S(0) = S 0 , I(0) = I 0 , R(0) = R 0 , and D(0) = D 0 . Here, the time-dependent parameters β(t), γ (t), and d(t) denote the contact, recovered, and dead rates from disease, respectively [22] . Now, by using the proposed method, we evaluate the numerical solution of the modified SIR model (14)- (17) . We take the same initial conditions: N = 500,000, S 0 = N − 1, I 0 = 1, R 0 = 0, and D 0 = 0. The initial guesses for finding coefficients β, γ , and d are 1.0e−6, 1.0e−1, and 1.0e−1, respectively. Using this model, we can predict the ratio of individuals who died by disease to the infected population at a given time. To investigate the ratio, we introduce the fatality percentage rate µ n which is defined as µ n = D n I n + R n + D n × 100 (%). In Fig. 6 , we show the fatality percentage rate µ n from the real observation data and the numerical approximations. Here, the fatality percentage rate µ n from the numerical approximation is obtained by the numerical solutions S n , I n , R n , and D n . The numerical solutions S n , I n , R n , and D n are evaluated by using the best-fitting parameters β n−1 , γ n−1 , and d n−1 at time t n−1 . As shown in Fig. 6 , the both results are in good agreement in later times.

We have proposed the daily computed weighted averaging basic reproduction number R n 0,k,ω for MERS-CoV outbreak in South Korea, May to July 2015. The linearly extrapolated reproduction number, R n linear , has important implications for the future prediction of the trend of the reproduction numbers. We showed that the results computed by our method match very well with the actual data of the MERS-CoV spread in South Korea. Using the method proposed in this paper, one can predict the progress of the spread of infections on the use of real-time epidemic data, so that proper control policies can be performed in an appropriate time for reducing the spread of the infections. As future research, the methodology introduced in this paper can be applied to various diseases with other epidemic spreading models. In this section, we present the parameter fitting algorithm using the MATLAB function lsqcurvefit, which finds the optimal parameters in the least-squares sense. The following command returns the optimal parameters. 

Epidemic spreading characteristics and immunity measures based on complex network with contact strength and community structure

Preliminary epidemiologic assessment of MERS-CoV outbreak in South Korea

Korea Ministry of Health and Welfare (KMOHW) and Korean Centers for Disease Control and Prevention (KCDC). MERS portal (Daily report

Korea Ministry of Health and Welfare (KMOHW) and Korean Centers for Disease Control and Prevention (KCDC). Daily report for number of infected, recovered, and dead people

A contribution to the mathematical theory of epidemics

Multigroup SIR epidemic model with stochastic perturbation

Study on the growth and decline of SNSs by using the infectious recovery SIR model

SIR model with general distribution function in the infectious period

Stability of a stochastic SIR system

The behavior of an SIR epidemic model with stochastic perturbation

Pulse vaccination strategy in the SIR epidemic model

Theoretical examination of the pulse vaccination policy in the SIR epidemic model

Simulation of emotional contagion using modified SIR model: A cellular automaton approach

Mathematical models of malaria-a review

The seasonal reproduction number of dengue fever impacts of climate on transmission

Whooping cough epidemics in London, 1701-1812: infection dynamics, seasonal forcing and the effects of malnutrition

Revisiting node-based SIR models in complex networks with degree correlations

Reproduction numbers and sub-threshold endemic equilibria for compartmental models of disease transmission

Transmission characteristics of MERS and SARS in the healthcare setting: a comparative study

Mathematical Epidemiology of Infectious Diseases: Model Building

A brief history of R 0 and a recipe for its calculation

Synthesizing data and models for the spread of MERS-CoV, 2013: Key role of index cases and hospital transmission

The author (D. Jeong) was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education, Science and Technology (2014R1A6A3A01009812). C.H. Lee was supported