key: cord-0049980-3s52o333 authors: Lio, Waichon; Liu, Baoding title: Initial value estimation of uncertain differential equations and zero-day of COVID-19 spread in China date: 2020-09-15 journal: Fuzzy Optim Decis Making DOI: 10.1007/s10700-020-09337-6 sha: f8182c1627f63611b4e835f26b8716a772168164 doc_id: 49980 cord_uid: 3s52o333

Assume an uncertain process follows an uncertain differential equation, and some realizations of this process are observed. Estimating the parameters of the uncertain differential equation so that it fits the observed data as closely as possible is a core problem in practice. This paper first presents the problem of initial value estimation for uncertain differential equations and proposes an estimation method. In addition, the method of moments is recast for estimating the time-varying parameters in uncertain differential equations. Using those techniques, a COVID-19 spread model based on an uncertain differential equation is derived, and the zero-day of COVID-19 spread in China is inferred.

Based on uncertainty theory (Liu 2007), Liu (2008) introduced uncertain differential equations as a type of differential equation involving uncertain processes. Under linear growth and Lipschitz conditions, Chen and Liu (2010) proved an existence and uniqueness theorem for the solution of an uncertain differential equation. Following that, Gao (2012) proved the theorem again under local linear growth and Lipschitz conditions. Furthermore, an analytic solution to linear uncertain differential equations was derived by Chen and Liu (2010), and some analytic methods for nonlinear uncertain differential equations were presented by Liu (2012) and Yao (2013b).
Yao and Chen (2013) made an important contribution by verifying that the solution of an uncertain differential equation can be represented by a family of solutions of ordinary differential equations (this result was later named the Yao-Chen formula), and methods for calculating the extreme value, first hitting time and time integral of the solution of an uncertain differential equation were then provided by Yao (2013a). To estimate the unknown parameters of an uncertain differential equation so that it fits the observed data as closely as possible, several methods have been proposed, for example, the method of moments (Yao and Liu 2020), least squares estimation (Sheng et al. 2019), generalized moment estimation (Liu 2020b), uncertain maximum likelihood estimation, and minimum cover estimation.

Recently, many scholars have applied uncertain statistics to modelling the COVID-19 pandemic. For instance, Liu (2020a) used uncertain regression analysis to forecast the cumulative numbers of COVID-19 infections in China, while Ye and Yang (2020) used uncertain time series. Following that, Chen et al. (2020) presented an uncertain SIR model, and Jia and Chen (2020) proposed an uncertain SEIAR model by employing high-dimensional uncertain differential equations.

However, two challenges remain in this topic. The first is how to estimate the zero-day of COVID-19 spread in China. This is the problem of initial value estimation for uncertain differential equations. The second is how to estimate the parameters of uncertain differential equations from observed data when the parameters are time-varying. This is the problem of time-varying parameter estimation.

The rest of this paper is organized as follows. Section 2 will define the concept of the α-region of solution for uncertain differential equations, and Sect. 3 will present the problem of initial value estimation for uncertain differential equations and propose an estimation method.
The cumulative numbers of COVID-19 infections in China will be surveyed in Sect. 4, and a COVID-19 spread model based on an uncertain differential equation will be derived in Sect. 5. In Sect. 6, the method of moments will be recast for estimating the time-varying parameters of the COVID-19 spread model. Section 7 will infer the zero-day of COVID-19 spread in China. Section 8 will show that a stochastic COVID-19 spread model is not suitable. Finally, Sect. 9 will provide a brief conclusion.

The α-region of solution of an uncertain differential equation is defined as the set in which the solutions may fall.

Definition 1 Let α be given with α ≥ 0.5. Suppose $X_t^{\alpha}$ and $X_t^{1-\alpha}$ are the α-path and (1 − α)-path of an uncertain differential equation

$$dX_t = f(t, X_t)\,dt + g(t, X_t)\,dC_t \tag{1}$$

with initial value $x_{t_0}$, respectively. Then the set

$$S_\alpha(t_0, x_{t_0}) = \left\{ (t, x) \mid t \ge t_0,\ X_t^{1-\alpha} \le x \le X_t^{\alpha} \right\}$$

is said to be the α-region of solution with respect to $x_{t_0}$ for the uncertain differential equation (1).

Example 1 Let α be given with α ≥ 0.5. For the uncertain differential equation (3) with initial value $x_0 = 0$, the α-region of solution with respect to $x_0 = 0$ is the set of points lying between its (1 − α)-path and its α-path.

Example 2 Let α be given with α ≥ 0.5. For the uncertain differential equation (4) with initial value $x_0 = 1$, the α-region of solution with respect to $x_0 = 1$ is likewise the set of points lying between its (1 − α)-path and its α-path.

To obtain the α-region of solution, the core problem is to compute the α-path of the uncertain differential equation. For this purpose, several numerical methods have been designed, for example, the Euler method (Yao and Chen 2013), the Runge-Kutta method (Yang and Shen 2015) and the Adams method (Yang and Ralescu 2015).

Assume an uncertain process follows an uncertain differential equation and some realizations of this process are observed. How to estimate the initial value of the process from the uncertain differential equation and the observed data is an interesting problem in practice.
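As an illustration of the α-path computation discussed above, the following is a minimal sketch of an Euler scheme, not the authors' implementation. It assumes the Yao-Chen formula, under which the α-path of $dX_t = f(t, X_t)\,dt + g(t, X_t)\,dC_t$ solves the ordinary differential equation $dX_t^{\alpha} = f(t, X_t^{\alpha})\,dt + |g(t, X_t^{\alpha})|\,\Phi^{-1}(\alpha)\,dt$ with $\Phi^{-1}(\alpha) = \frac{\sqrt{3}}{\pi}\ln\frac{\alpha}{1-\alpha}$. The function names, the fixed step count, and the closed (non-strict) region test are assumptions made for the example.

```python
import math


def inv_std_normal_uncertainty(alpha):
    """Inverse uncertainty distribution of N(0, 1): (sqrt(3)/pi) * ln(alpha / (1 - alpha))."""
    return math.sqrt(3.0) / math.pi * math.log(alpha / (1.0 - alpha))


def alpha_path_euler(f, g, t0, x0, t_end, alpha, n_steps=1000):
    """Euler scheme for dx/dt = f(t, x) + |g(t, x)| * Phi^{-1}(alpha), x(t0) = x0.

    Returns the list of (t, x) points on the approximate alpha-path.
    """
    h = (t_end - t0) / n_steps
    c = inv_std_normal_uncertainty(alpha)
    t, x = t0, x0
    path = [(t, x)]
    for _ in range(n_steps):
        x = x + (f(t, x) + abs(g(t, x)) * c) * h
        t = t + h
        path.append((t, x))
    return path


def in_alpha_region(f, g, t0, x0, t, x, alpha, n_steps=1000):
    """Check whether (t, x) lies between the (1 - alpha)-path and the alpha-path.

    Assumption: the region is taken as closed, i.e. boundary points count as inside.
    """
    upper = alpha_path_euler(f, g, t0, x0, t, alpha, n_steps)[-1][1]
    lower = alpha_path_euler(f, g, t0, x0, t, 1.0 - alpha, n_steps)[-1][1]
    return lower <= x <= upper
```

For a hypothetical toy equation with constant coefficients, `f = lambda t, x: 1.0` and `g = lambda t, x: 2.0`, the Euler update has a constant right-hand side, so the 0.5-path started from `x0 = 0` reduces to the straight line $x = t$. This gives a quick sanity check before applying the routine to a real model.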
Definition 2 Suppose an uncertain process $X_t$ follows an uncertain differential equation

$$dX_t = f(t, X_t)\,dt + g(t, X_t)\,dC_t \tag{5}$$

and $x_{t_1}, x_{t_2}, \ldots, x_{t_n}$ are the observed data of $X_t$ at the times $t_1, t_2, \ldots, t_n$, respectively. For any given confidence level α ≥ 0.5, the set

$$O_\alpha = \left\{ (t_0, x_{t_0}) \mid (t_i, x_{t_i}) \in S_\alpha(t_0, x_{t_0}),\ i = 1, 2, \ldots, n \right\}$$

is said to be the α-region of initial value with respect to the observed data $x_{t_1}, x_{t_2}, \ldots, x_{t_n}$ for the uncertain differential equation (5), where $S_\alpha(t_0, x_{t_0})$ is the α-region of solution with respect to $x_{t_0}$.

The following algorithm provides a way to judge whether $(t_0, x_{t_0}) \in O_\alpha$ or not.

Algorithm 1
Step 1: Compute the α-path $X_t^{\alpha}$ and the (1 − α)-path $X_t^{1-\alpha}$ of the uncertain differential equation (5) with initial value $x_{t_0}$ by the Euler method, Runge-Kutta method or Adams method.
Step 2: Set i = 0.
Step 3: Set i ← i + 1.
Step 4: If $x_{t_i} < X_{t_i}^{1-\alpha}$ or $x_{t_i} > X_{t_i}^{\alpha}$, then $(t_0, x_{t_0}) \notin O_\alpha$ and stop.
Step 5: If i < n, then go to Step 3.
Step 6: Conclude that $(t_0, x_{t_0}) \in O_\alpha$.

The cumulative numbers of COVID-19 infections in China, excluding imported cases, from January 20 to March 15, 2020 were reported by the National Health Commission of China and summarized by Liu (2020a) and Ye and Yang (2020). See Table 1. Let 1, 2, ..., 56 represent the dates (t) from January 20 to March 15. For example, t = 1 and t = 56 represent January 20 and March 15, respectively. Also let $x_1, x_2, \ldots, x_{56}$ represent the cumulative numbers on dates 1, 2, ..., 56, respectively.

Based on the observed data of cumulative numbers of COVID-19 infections in China, Liu (2020a) obtained the fitted logistic growth model (7), where $x_t$ is the cumulative number of COVID-19 infections in China on date t.

The effective reproductive rate refers to the rate of change of the cumulative number per unit of time. Let $R_t$ denote the effective reproductive rate and $X_t$ denote the cumulative number of COVID-19 infections in China at time t. During a small time interval [t, t + Δt], we should have

$$X_{t+\Delta t} - X_t = R_t X_t \Delta t. \tag{8}$$

Now we assume

$$R_t = \mu_t + \sigma_t \cdot \text{``Noise''}$$

where $\mu_t$ and $\sigma_t$ are real-valued functions of time t, and "Noise" is a standard normal uncertain variable N(0, 1).
Based on uncertainty theory, let us represent the "Noise" by

$$\frac{C_{t+\Delta t} - C_t}{\Delta t}$$

where $C_t$ is a Liu process (Liu 2009). Then we have

$$R_t = \mu_t + \sigma_t \frac{C_{t+\Delta t} - C_t}{\Delta t}. \tag{10}$$

It follows from (8) and (10) that

$$X_{t+\Delta t} - X_t = \mu_t X_t \Delta t + \sigma_t X_t \left( C_{t+\Delta t} - C_t \right).$$

Generally, during a time interval [0, t] with a partition $0 = t_0 < t_1 < \cdots < t_n = t$, we have

$$X_t - X_0 \approx \sum_{i=0}^{n-1} \mu_{t_i} X_{t_i} (t_{i+1} - t_i) + \sum_{i=0}^{n-1} \sigma_{t_i} X_{t_i} \left( C_{t_{i+1}} - C_{t_i} \right).$$

That is,

$$X_t = X_0 + \int_0^t \mu_s X_s \, ds + \int_0^t \sigma_s X_s \, dC_s.$$

Thus we obtain a COVID-19 spread model based on an uncertain differential equation,

$$dX_t = \mu_t X_t \, dt + \sigma_t X_t \, dC_t \tag{13}$$

where $X_t$ is the cumulative number of COVID-19 infections in China at time t, $C_t$ is a Liu process, and $\mu_t$ and $\sigma_t$ are time-varying parameters that are unknown at this point.

The cumulative numbers of COVID-19 infections in China before t = 25 (February 13, 2020) are not real-time data because of the limited capacity of nucleic acid testing. However, the observed cumulative numbers after February 13, 2020, $x_{25}, x_{26}, \ldots, x_{56}$, are not sufficient on their own to estimate the time-varying parameters in the COVID-19 spread model. Therefore, we have to add data from the date when the isolation policy of the Chinese government became effective, i.e., from January 30 to February 12, 2020. According to the fitted logistic growth model (7), we reassign 19369, 22127, 25117, 28316, 31691, 35199, 38789, 42405, 45990, 49487, 52847, 56029, 58998, 61733 to $x_{11}, x_{12}, \ldots, x_{24}$, respectively. By using the data $x_{11}, x_{12}, \ldots, x_{56}$, we will estimate the time-varying parameters $\mu_t$ and $\sigma_t$ in the COVID-19 spread model (13). For this purpose, the method of moments (Yao and Liu 2020) will be recast as follows.

First, let us estimate $\mu_{11}$ and $\sigma_{11}$ on January 30, 2020 (t = 11) by applying the 10 observed data $x_{11}, x_{12}, \ldots, x_{20}$. The COVID-19 spread model (13) has a difference form

$$X_{t_{i+1}} = X_{t_i} + \mu_{11} X_{t_i} (t_{i+1} - t_i) + \sigma_{11} X_{t_i} \left( C_{t_{i+1}} - C_{t_i} \right).$$

Since the quotients $(C_{t_{i+1}} - C_{t_i}) / (t_{i+1} - t_i)$ identically follow a standard normal uncertainty distribution N(0, 1), we get

$$\frac{X_{t_{i+1}} - X_{t_i} - \mu_{11} X_{t_i} (t_{i+1} - t_i)}{\sigma_{11} X_{t_i} (t_{i+1} - t_i)} \sim N(0, 1)$$

for i = 11, 12, ..., 19. Substitute $X_{t_i}$ and $X_{t_{i+1}}$ with the observed data $x_{t_i}$ and $x_{t_{i+1}}$ in the above expression, and write

$$h_i(\mu_{11}, \sigma_{11}) = \frac{x_{t_{i+1}} - x_{t_i} - \mu_{11} x_{t_i} (t_{i+1} - t_i)}{\sigma_{11} x_{t_i} (t_{i+1} - t_i)}$$

for i = 11, 12, ..., 19, where $t_i = i$. Then $h_i(\mu_{11}, \sigma_{11})$, i = 11, 12, ..., 19 can be regarded as 9 samples of the standard normal uncertainty distribution N(0, 1). The first two sample moments of these samples are

$$\frac{1}{9} \sum_{i=11}^{19} h_i(\mu_{11}, \sigma_{11}) \quad \text{and} \quad \frac{1}{9} \sum_{i=11}^{19} h_i^2(\mu_{11}, \sigma_{11}),$$

and the first two population moments of the standard normal uncertainty distribution N(0, 1) are 0 and 1. Since the number of unknown parameters is 2, the moment estimate is obtained by equating the first two sample moments to the corresponding population moments. In other words, the estimate $(\mu_{11}, \sigma_{11})$ should solve the system of equations

$$\frac{1}{9} \sum_{i=11}^{19} h_i(\mu_{11}, \sigma_{11}) = 0, \qquad \frac{1}{9} \sum_{i=11}^{19} h_i^2(\mu_{11}, \sigma_{11}) = 1,$$

whose root is $(\mu_{11}, \sigma_{11}) = (0.1101, 0.0216)$.

Next, let us estimate $\mu_{12}$ and $\sigma_{12}$ on the date t = 12 by applying the 10 observed data $x_{12}, x_{13}, \ldots, x_{21}$. Since $h_i(\mu_{12}, \sigma_{12}) \sim N(0, 1)$ for i = 12, 13, ..., 20, by the method of moments, we have

$$\frac{1}{9} \sum_{i=12}^{20} h_i(\mu_{12}, \sigma_{12}) = 0, \qquad \frac{1}{9} \sum_{i=12}^{20} h_i^2(\mu_{12}, \sigma_{12}) = 1,$$

whose root is $(\mu_{12}, \sigma_{12}) = (0.1018, 0.0219)$. In the same way, we can get the estimated values $(\mu_{13}, \sigma_{13}), (\mu_{14}, \sigma_{14}), \ldots, (\mu_{47}, \sigma_{47})$ shown in Table 2.

The basic reproductive rate refers to the effective reproductive rate when COVID-19 started spreading naturally in a completely susceptible population. Since COVID-19 can be considered to have spread naturally in China before January 30, 2020 (t = 11), we regard

$$R_{11} = \mu_{11} + \sigma_{11} \dot{C}_{11} = 0.1101 + 0.0216\, \dot{C}_{11}$$

as the basic reproductive rate of COVID-19 spread in China.

In order to fit $\mu_t$ and $\sigma_t$, we may employ logistic decay models,

$$\mu_t = \frac{0.1101}{1 + \beta_1 \exp(\beta_2 t)}, \qquad \sigma_t = \frac{0.0216}{1 + \beta_3 \exp(\beta_4 t)},$$

where $\beta_1$, $\beta_2$, $\beta_3$ and $\beta_4$ are unknown parameters. By applying least squares estimation to the samples $(\mu_i, \sigma_i)$, i = 11, 12, ..., 47 in Table 2, we get the time-varying parameters,

$$\mu_t = \frac{0.1101}{1 + 0.0083 \exp(0.2567t)}, \qquad \sigma_t = \frac{0.0216}{1 + 0.0034 \exp(0.2312t)}. \tag{17}$$

It follows from (13) and (17) that the COVID-19 spread model based on an uncertain differential equation is

$$dX_t = \frac{0.1101 X_t}{1 + 0.0083 \exp(0.2567t)} \, dt + \frac{0.0216 X_t}{1 + 0.0034 \exp(0.2312t)} \, dC_t \tag{18}$$

where $X_t$ is the cumulative number of COVID-19 infections in China at time t, and $C_t$ is a Liu process.
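The moment equations used in the estimation step above can be solved in closed form when consecutive observations are one day apart, because each sample then reduces to $h_i = (r_i - \mu)/\sigma$ with the relative daily increment $r_i = (x_{i+1} - x_i)/x_i$. Setting the first sample moment to 0 gives $\mu$ as the mean of the $r_i$, and setting the second to 1 gives $\sigma$ as the square root of the mean squared deviation. The sketch below illustrates this reduction under the unit-step assumption; it is not the authors' code, and the names `observed` and `moment_estimate` are introduced only for the example.

```python
import math

# Reassigned cumulative numbers x_11, ..., x_21 quoted in the text.
observed = [19369, 22127, 25117, 28316, 31691, 35199,
            38789, 42405, 45990, 49487, 52847]


def moment_estimate(window):
    """Moment estimate (mu, sigma) from consecutive daily observations.

    Assumes a unit time step, so h_i = (r_i - mu) / sigma with
    r_i = (x_{i+1} - x_i) / x_i. Matching the first two sample moments
    of h_i to 0 and 1 yields the closed-form root returned here.
    """
    rates = [(b - a) / a for a, b in zip(window, window[1:])]
    mu = sum(rates) / len(rates)
    sigma = math.sqrt(sum((r - mu) ** 2 for r in rates) / len(rates))
    return mu, sigma


# Window x_11..x_20 for t = 11 and window x_12..x_21 for t = 12.
mu_11, sigma_11 = moment_estimate(observed[0:10])
mu_12, sigma_12 = moment_estimate(observed[1:11])
print(f"t=11: mu={mu_11:.4f}, sigma={sigma_11:.4f}")
print(f"t=12: mu={mu_12:.4f}, sigma={sigma_12:.4f}")
```

Run on the two windows, this should reproduce the roots (0.1101, 0.0216) and (0.1018, 0.0219) reported in the text to four decimal places. Sliding the same 10-point window forward one day at a time would give the remaining pairs up to t = 47.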
Note that the cumulative number $X_t$ of COVID-19 infections in China follows the COVID-19 spread model (18), and $x_{25}, x_{26}, \ldots, x_{56}$ in Table 1 are observed data of $X_t$ at the times 25, 26, ..., 56, respectively. Taking α = 0.95 and applying Algorithm 1, we obtain the α-region of initial value with respect to these observed data. That means the zero-day of COVID-19 spread in China is October 17, 2019 ± 36 days. It is concluded that, roughly speaking, COVID-19 started spreading in China from October 17, 2019.

If the Liu process $C_t$ in the COVID-19 spread model (18) is replaced with a Wiener process $W_t$, then we obtain a stochastic differential equation

$$dX_t = \frac{0.1101 X_t}{1 + 0.0083 \exp(0.2567t)} \, dt + \frac{0.0216 X_t}{1 + 0.0034 \exp(0.2312t)} \, dW_t.$$

Suppose there was only one infectious case on October 17, 2019 (i.e., $t_0 = -94$ and $x_{t_0} = 1$). Taking a date, e.g., t = 30 (February 18, 2020), we have

$$\Pr\{X_{30+\Delta t} < X_{30}\} \ge 46.22\%$$

when $\Delta t = 10^{-6}$. That means the cumulative number of COVID-19 infections in China decreases over this interval with a probability of at least 46.22%. However, the cumulative number $X_t$ is always increasing with respect to t. Hence the stochastic COVID-19 spread model is not acceptable.

This paper presented the problem of initial value estimation for uncertain differential equations and proposed an estimation method. Furthermore, the method of moments was recast for estimating the time-varying parameters in uncertain differential equations. Using those techniques, a COVID-19 spread model based on an uncertain differential equation was derived, and the zero-day of COVID-19 spread in China was inferred.

Numerical solution and parameter estimation for uncertain SIR model with application to COVID-19. Fuzzy Optimization and Decision Making
Existence and uniqueness theorem for uncertain differential equations. Fuzzy Optimization and Decision Making
Existence and uniqueness theorem on uncertain differential equations with local Lipschitz condition
Uncertain SEIAR model for COVID-19 cases in China. Fuzzy Optimization and Decision Making
Uncertainty Theory
Fuzzy process, hybrid process and uncertain process
Some research problems in uncertainty theory
An analytic method for solving uncertain differential equations
Estimating unknown parameters in uncertain differential equation by maximum likelihood estimation
Uncertain growth model for the cumulative number of COVID-19 infections in China. Fuzzy Optimization and Decision Making
Generalized moment estimation for uncertain differential equations
Least squares estimation in uncertain differential equations
Parameter estimation of uncertain differential equation with application to financial market
Adams method for solving uncertain differential equations
Runge-Kutta method for solving uncertain differential equations
Extreme values and integral of solution of uncertain differential equation
A type of nonlinear differential equations with analytic solution
A numerical method for solving uncertain differential equations
Parameter estimation in uncertain differential equations. Fuzzy Optimization and Decision Making
Analysis and prediction of confirmed COVID-19 cases in China by uncertain time series. Fuzzy Optimization and Decision Making