key: cord-0194776-so0pekkf authors: Ding, Yujia; Schellhorn, Henry title: Optimal Control of the SIR Model with Constrained Policy, with an Application to COVID-19 date: 2021-05-18 journal: nan DOI: nan sha: 42efe2a48eb8d648e40336713767ac530088f6c1 doc_id: 194776 cord_uid: so0pekkf This article considers the optimal control of the SIR model with both transmission and treatment uncertainty. It follows the model presented in Gatto and Schellhorn (2021). We make four significant improvements on the latter paper. First, we prove the existence of a solution to the model. Second, our interpretation of the control is more realistic: while in Gatto and Schellhorn the control $alpha$ is the proportion of the population that takes a basic dose of treatment, so that $alpha>1$ occurs only if some patients take more than a basic dose, in our paper, $alpha$ is constrained between zero and one, and represents thus the proportion of the population undergoing treatment. Third, we provide a complete solution for the moderate infection regime (with constant treatment). Finally, we give a thorough interpretation of the control in the moderate infection regime, while Gatto and Schellhorn focussed on the interpretation of the low infection regime. Finally, we compare the efficiency of our control to curb the COVID-19 epidemic to other types of control. This article extends the analysis of the model presented in Gatto and Schellhorn (2021) .We make four significant improvements on the latter paper. First, we prove existence of a solution. Second, In Gatto and Schellhorn (2021) the optimal control α has the interpretation of the proportion of the population that takes a basic dose of treatment, so that α > 1 occurs only if a proportion of the population takes more than a basic dose of treatment. In the low infection regime part of our paper, α is constrained to be between zero and one, and represents thus the proportion of the population undergoing treatment. The latter interpretation is much more realistic, as it is uncommon to ration treatment. Third, we provide a complete solution for the moderate infection regime (with constant treatment). The final improvement is a thorough numerical analysis and sensitivity analysis of the moderate infection regime, while Gatto and Schellhorn (2021) focused exclusively on the interpretation of the control in the low infection regime. This enables us to discover some errors in the secondorder term of the solution in Gatto and Schellhorn (2021) , which we correct here. Finally, we compare the efficiency of our control to curb the COVID-19 pandemic to other types of control. While our solution is complex, it allows to satisfy the objective better. Also, our analytical solutions allow for an intuitive understanding of the optimal control compared to a purely numerical solution. The structure of the article is as follows. In section 2 we briefly introduce the model in Gatto and Schellhorn (2021) , and provide a proof of existence of the solution. In section 3, we show our results for the low infection regime. In section 4, we extend and analyze the solution in the moderate infection regime. Section 5 shows our experimental results when applying our methodology to the COVID-19 in the US in 2020. We draw the conclusion in Section 6. We refer the reader to our earlier paper, Gatto and Schellhorn (2021) for a literature review. Let S, I, R be the proportions of susceptible, infected, and out of infection (recovered, and dead), respectively. Let β be the transmission rate and µ be the death rate. In the SIR model, the rate of decrease dS dt of the proportion of susceptible is equal to the constant transmission rate β time SI. As in Gatto and Schellhorn (2021), we add a term σ S √ SI dB1 dt , where dB1 dt is white noise, in order to model the error in the transmission rate: The optimal policy α is the proportion of the infected population that received treatement, thus α(t) ∈ [0, 1]. The presence of this constraint is an important addition to the model in Gatto and Schellhorn (2021) . Depending whether the individual is treated or not, there are then four different ways for an infected individual to exit the pool of infected: • not treated and recover • not treated and die • treated and recover • treated and died. Thus, the "out of infection rate" will be: For simplicity, we assume that the Brownian motion driving transmission uncertainty (B 1 ) is independent from the Brownian motion driving treatment uncertainty (B 2 ). We suppose that µ 0 ≥ µ 1 (people die faster without treatment than with treatment), but the reader will not lose any intuition by supposing that µ 0 = µ 1 . Most of the time K 1 (t) > K 0 (treatment is better than no treatment), but not necessarily. We relax this requirement somewhat by requiring: We model the treatment rate as an Ornstein-Uhlenbeck process: with the mean-reversion rate λ k > 0 and the long run value of the treatment ratek 1 . It is well-known that K 1 is Gaussian, with variance equal to: Thus, if mean-reversion is large compared to volatility σ k , constraint (2) is satisfied. We simplify (1) by: Putting everything together, the dynamics of the infected is: We try to minimize a measure of the infected over our horizon T . To model risk-aversion to unfavorable treatment decisions, the decision-maker is supposed to minimize the expected value of a convex and increasing function of I(T ). Alternately, one can maximize the negative thereof, i.e., maximize the expected value of a concave and decreasing function of I(T ). Such a function U is called a utility function in financial economics. The policy obtained in maximizing the expected value of a concave utility function can be showed, under certain conditions, to maximize the expected value of the outcome (here −I) under a constraint on the dispersion of the outcome. Out of the universe of concave decreasing utility functions, we choose the power utility function: The coefficient γ is often called the risk-aversion parameter. When γ = 0 the decision-maker is risk-neutral, meaning that the uncertainty does not have an influence on her decisions. It is straightforward to check that this power utility function is concave in I when γ < 0,which we will assume. Taking for instance γ = −1, we see that the objective is to max E − I 2 2 which returns the same policy as: The importance of analytic formulations is that other figures of interest in this model, like the expected number of deaths from treatment can be analytically calculated, and depend on γ. Thus, a decision-maker can calibrate its risk-aversion parameter γ on other goals. Expected number of deaths is only one type of goal and economic factors that can be easily added. We define Our controlled SIR model is thus: The relative sign of our volatilities σ and σ k is important. We will assume without loss of generality that σ < 0. The sign of σ k is the sign of covariance between the measured value of today's treatment rate and the change in value of the treatment rate between today and a future date. An example may help illustrate the difference. Suppose that over a week one performs daily measurements of the treatment recovery rate as well as daily forecasts of the evolution of the treatment recovery rate over the next day. The two quantities measured each day t are proportional to the same white noise B 2 (t+1 day) − B 2 (t). One then calculates weekly estimatesσ of σ andσ k of σ k over these 7 daily observations. Since we arbitrarily choose σ > 0, a positiveσ k shows a correlation of +1 between the measurement (of today's treatment rate) and the forecast. Figure 1 is a depiction of our model. Theorem 1. For any given intial value (S(0), I(0), R(0)) and any X(0) there exists a unique solution of (3)(4)(5) up to time τ . The proof of Theorem 1, included in Appendix A, follows the proof of a theorem of Yamada and Watanabe (1971), as exposed in the book by Karatzas and Shreve (2014, Prop. 2.13, Sec. 5.2) . We assume S(t) close to one and σ S = 0. Thus the term: Figure 1 : A stochastic SIR model is assumed constant. With this simplification, we give an analytical solution to the constrained problem, i.e., the case where 0 ≤ α(t) ≤ 1, a significant improvement over Gatto and Schellhorn, who considered the unconstrained case. We define the impact of treatment risk X: as well as the long run impact of the treatment riskX: We define λ x = λ k and σ x = σ k /σ. For simplicity we write µ = K 0 + µ 0 . We consider first the case where the treatment rate is constant, and then the case where it follows an Ornstein Uhlenbeck process. Theorem 2. The optimal control is constant, and satisfies The proof is the Appendix B, and follows closely Cvitanić and Karatzas (1992) . The problem is In the low infection regime our solution will depend on a kernel H 0 (X t , τ ) with τ = T − t, while in the moderate infection regime it will also depend on two other kernels H 1 (X t , τ ) and H 2 (X t , τ ) that are closely related. In order to unify notation we define the kernels. Define where We provide an explicit formula for A 3 (τ, γ) in Appendix C. The solution to (7) is shown in Gatto and Schellhorn (2021, Prop. 1), but with some typos in the expression of H 0 (X t , τ ), hence we correct the mistake by providing (8) in this paper. We first handle the Ornstein-Uhlenbeck treatment rate case, which was presented in Gatto and Schellhorn (2021, Prop. 2) . In this work, we aim to correct the typos and provide more detials for the Proposition 2 in Gatto and Schellhorn (2021). The problem is defined in Section 2. We rewrite here for convenience, To express the solution of (13), we further definẽ From this, we can calculate: (13) has a solution such that where Z(t) satisfies: The proof is in Appendix D. We refer to Gatto and Schellhorn (2021) for a discussion of α 0 . The sign of α 1 is determined by the signs of σ and More specifically, α 1 is positive if σ and (16) are both positive or negative. α 1 is negative if one of them is positive and the other one is negative. It is obvious that the magnitude of both g(X(t), t) and ∂g ∂X decrease with time and are equal to zero when t = T . Therefore, the importance of α 1 decreases as time increases. To further discuss the sign of (16), we rewrite it by Thus, suppose ∂g ∂X , X(t), andX are all positive, (16) is positive, and vice versa. In the following cases, we provide two simple cases that we can easily discuss the sign of α 1 : , then α 1 is negative. In the following, we discuss the full expansion of the solution in Theorem 3. Consider equation (57) in Gatto and Schellhorn (2021): This time we use full asymptotic expansion: and obtain: The terms of our asymptotic expansion are thus determined by: We use the Ansatz: We have showed that g 1 = H 1 and g 2 = g in the proof of Theorem 3. By the same process, we can also calculate the expressions for g 3 , g 4 , . . . in the sequel. The problem is: Let τ = T − t, the solution kernels h i (τ ) for i = 1, 2, . . . are given by: where Theorem 4. Let I(0) = ε, then the problem (19) has a solution such that (44), and Z(t) satisfies: The proof is in Appendix E, where we also provide a formula for g 3 . Observe that We use the same data set and parameters (see Table 1 ) as in Gatto and Schellhorn (2021), but this time we show the optimal control (result of Theorem 2) of problem (6). We compare in Figure 2 three types of treatment: • no treatment • full control, i.e., α(t) = 1 • optimal control, given in Theorem 2 We can see that, for all risk-aversion parameters γ considered (between −1 and −5), our control is better. We showed that a stochastic optimal control approach enables to fight the COVID-19 epidemic better. Many interesting problems remain to be solved. For instance, we could analytic constrained policies in the multiple treatment case or the Ornstein-Uhlenbeck case. Optimal vaccination is another area where we believe a similar asymptotic approach can be used. Finally, Bertozzi et al. (0) and In our case, we take h(x) = x. Because of (23), there exists a strictly decreasing sequence {a n } ⊂ (0, 1] with a 0 and lim n→∞ a n = 0 such that an−1 an h −2 (u)du = n. For every n there exists continuous function ρ n on R with support on (a n , a n−1 ) so that , x > 0 and an an−1 ρ n (u)du = 1. Then the function Ψ n (x) = |x| 0 y 0 ρ n (u)dudy is even and twice continuously differentiable, with |Ψ n (x)| ≤ 1 and lim n→∞ Ψ n (x) = |x|. Suppose there are two strong solutions (I (1) , S (1) ) and (I (2) , S (2) ), so that (d(I (1) − I (2) )) 2 < σ 2 S (S (1) I (1) − S (2) I (2) )dt + (ασ) 2 (I (1) − I (2) ) 2 dt (d(S (1) − S (2) )) 2 < σ 2 S (S (1) I (1) − S (2) I (2) )dt d((I (1) − I (2) )(S (1) − S (2) )) < −σ 2 S (S (1) I (1) − S (2) I (2) )dt Thus, since |Ψ n | < 1, Observe that: Since Ψ n < 2/nh and h is positive, But, Since |I (1) and local uniqueness follows by Gronwall's inequality. We refer to the problem treated by Gatto and Schellhorn (2021) as the unconstrained problem. Indeed, in that problem α was not constrained. We refer to our problem as the constrained problem. We follow the method of proof in Cvitanić and Karatzas (1992) , referred to hereafter as CK. They introduce auxiliar y problems, which are unconstrained. They show that there exists an auxiliary problem which solution can be used to construct the solution of the original constrained problem. We follow the numbering of the sections in CK in order to ease understanding. CK Section 2. The Model. To ease the correspondence with the CK paper, we define b − r = K 0 + µ 0 − µ 1 −k 1 , θ := (b − r)/σ, and CK Section 3. Portfolio and consumption processes. Define: Denote by I i,α the infected process subject to I(0) = i and control α. It is admissible if The set of admissible α is denoted A 0 (i). Note that (See (3.5) in CK) CK Section 4. Convex sets and their support functions. The difference between CK and this paper is that our objective is to minimize. This means that the key relation between our auxiliary infected and infected is reversed compared to the first equation in CK. Indeed if α ν solves the auxiliary problem and α the original problem, we must have: It is subadditive: CK Section 5. Utility functions. The main difference between our utility functions and the utility functions in financial economics is that our utility functions are decreasing for positive arguments. Recall indeed that our utility function is, for γ < 0: We have lim i→∞ U (i) = −∞ and lim i→0 U (i) = 0, again for γ < 0. This is unlike CK and Wachter (2002) who consider the case 0 < γ < 1 with utility of wealth U 2 (x) = x 1−γ 1−γ . In their case, lim x→∞ U 2 (x) = 0 and lim x→0 U 2 (x) = ∞. We define I 2 to be the inverse of U , with I 2 (y) on y ≤ 0. By straighforward calculations: I 2 (y) = (−y) −1/γ We also define the Legendre-Fenchel dual This function satisfies:Ũ (y) = −I 2 (y) y ≤ 0 CK Section 6. The constrained and unconstrained optimization problems. We define: The supremum of the unconstrained problem is denoted by V 0 , while the supremum of the constrained problem is denoted by V , namely: CK Section 7. Solution of the unconstrained problem. We note that the expectation is finite for every y ∈ (−∞, 0]. We define its inverse Y 0 : The solution of the unconstrained problem is well-known, and equal to: CK Section 8. Auxiliary unconstrained optimization problems. Recall δ(ν) in (24). It is easily seen that: We introduce a new process I (ν) by: We denote by A ν (i) the class of α for which Since the solution of our dual problem will have α(t)ν(t) − δ(ν(t)) ≤ 0, clearly A (i) ⊂ A ν (i). We define: We define a class of progressively measurable processes ν in R by: Proposition 8.3. in CK shows that, if for some λ ∈ D the corresponding control α λ is optimal for the auxiliary optimization problem and if −δ(λ) + α λ (t)λ(t) = 0 then α ∈ A (i) and is optimal for the constrained problem. The solution of the unconstrained problem is: CK Section 9. Contingent claims attainable by constrained portfolios. We sketch the proof of theorem 9.1 in CK, as the signs are different, and the structure of the control is slightly different. CK 9.1 Theorem. Let B be a positive F T -measurable random variable and suppose there is a process λ ∈ D such that, for all ν ∈ D Then there exists a control α ∈ A (i) such that I i,α = B. Sketch of Proof. See CK p.782 for a definition of the stopping time τ n . By (27) and subadditivity of δ (25): It is easy to see that, for any ρ ∈ D , take ν = λ + ρ: and, taking ν(t) = 0, we obtain: which together with (28) for ρ = λ yields: −δ(λ(t)) + α(t)λ(t) = 0 CK Section 10. Equivalent optimality conditions. The most important implication to prove is (D)⇒(B)⇒(A) in CK. It shows that the solution of the dual problem solves the auxiliary problem, and that, moreover, it is feasible and optimal for the original constrained problem. We make it more explicit here. (Part of ) CK 10.1 Theorem. Suppose that for every ν ∈ D , then there exists a control α λ ∈ [0, 1] that is optimal for the constrained problem V λ (i) = E[U (I i,α λ (T ))] and such that SinceŨ (y) = −I 2 (y), By theorem 9.1 there exists a control α λ ∈ A λ (i) such that: Clearly α λ is optimal for the constrained problem, and Thus by proposition 8.3, α λ is optimal for the constrained problem. CK Section 12. A dual problem. Define: In our case,Ũ Typically, γ = −1, so that:Ũ (y) = y 2 /2 The main problem in condition (29) is to find the optimal process H (λ) (across all H (ν) ) but it depends on y which depends on λ. Thus the dual must be fixed for a fixed but arbitrary real number y. The objective has the form E[Ũ (yH (ν) (T ))] = E[U (I 2 (yH (ν) (T ))) − yH (ν) (T )I 2 (yH (ν) (T ))] The right handside of the equation (see Korn (1997, p.134) ) is the maximum of the function h(B, y) := L(B, y) for all non-negative F T measurable B with E[H (ν) (T )B] ≤ i. Thus a minimization over all positive numbers y of h(B, y) would yield the optimal utility of the unconstrained problem. We could thus first minimize E[Ũ (yH (ν) (T ))] in y, and then minimize over ν. However, the main idea is to first minimize over µ, and then minimize over y, hoping that the two can be interchanged. CK 12.1 Proposition. Suppose that for any y there exists λ y such thatV (y) = E[Ũ (yH (λy ) (T ))]. Then there exist an α ∈ A (i) with i = X λy(y) which is optimal for the primal problem, and we have: and we conclude by CK Theorem 10.1. Define: The HJB equation is: min ν 1 2 y 2 (θ + ν/σ) 2 Q yy + y(−r + δ(ν))Q y + Q t = 0 Dividing by (−y) −ρ v(t), the problem becomes: Recall that if ν is positive, then δ(ν) = ν thus we solve (30) and obtain since 1 + ρ = 1/γ and γ is negative. If ν is negative, then δ(ν) = 0, thus ν = r − b. From (26), the solution is Suppose µ 0 = µ 1 and treatment is better than no treatmentk 1 > K 0 . Thus We following the proof of Proposition 2 in Gatto and Schellhorn (2021). Recall the equations (58) (59) in Gatto and Schellhorn (2021) and following same notations: where L 1 , L 2 are equations (53) (54) in Gatto and Schellhorn (2021). Solution of (31) We postulate that: Substitution in (31) shows that H 1 solves: where the operator L γ is defined by: Using the Ansatz (9), we can rewrite the LHS of (33) into: (C 1 (t)X 2 + C 2 (t)X + C 3 (t))H 1 /γ = 0 Clearly all terms C 1 , C 2 , C 3 must be identically zero. Thus: which admit the solutions (10),(11),(12). Solution of (32) The second equation can be rewritten We try the Ansatz: f 2 (Z(t), X(t), t) = Z(t) 2/γ S(t)g(X(t), t) Thus We use Lemma to obtain the g(X, t) in (15). The optimal policy is: g(x, T ) = 0 is in (15). Sketch of Proof. The solution g(x, t) is the price of a variable-coupon bond in an affine model. The building block is the solution of a zero-coupon bond in the similar model. Define m(x) and r(x) to be such that: Let f (x, t) be the solution of: ∂f (x, t) ∂t + 1 2 σ 2 x ∂ 2 f (x, t) ∂x 2 + m(x) ∂f (x, t) ∂x = r(x)f (x, t) Defining: we see that: ∂f (x, t) ∂t Thus (37) Define P (t, T ) = f (X(t), t; T ) = H 2 (X(t), T − t) to be the price of a discount bond with a maturity of T . Clearly: dP (t, T ) P (t, T ) = r(X(t))dt + v(t, T )dW (t) where: v(t, T ) = σ x ∂f ∂x f By Ito's lemma, and for the exact same reason as (39): ∂g(x, t) ∂t + 1 2 σ 2 x ∂ 2 g(x, t) ∂x 2 + m(x) ∂g(x, t) ∂x = E[dg(X, t)|X(t) = x]/dt The stochastic equivalent of (36) is: The solution is: g(X(t), t) = Clearly, for some volatility σ Q (t, τ ) dQ(t, τ ) Q(t, τ ) = r(X(t))dt + σ Q (t, τ )dW (t) We are now ready to define a change of numeraire. Let dW τ = dW − v(t, τ )dt By Theorem 9.2.2. in Shreve (2004), Q(t, τ )/P (t, τ ) is a P τ -martingale, i.e., where dX(t) = m(X)dt + σ x dW (t) = m(X)dt + σ x (dW τ (t) + v(t, τ )dt) From (9), u(X(t), t) = 1 2 β 2 σ S 2 γ e 2 γ A 1 (T −t,γ) 2 X(t) 2 +A2(T −t,γ)X(t)+A3(T −t,γ) Let us now take: P (t, T ) = exp 2 γ A 1 (T − t, γ/2) 2 X 2 (t) + A 2 (T − t, γ/2)X(t) + A 3 (T − t, γ/2) Thus: v(t, τ ) = σ x γ (A 1 (τ − t, γ/2)X(t) + A 2 (τ − t, γ/2)) dX(t) = γ 2 − 1 γ/2 σ x − λ x + σ 2 x γ A 1 (τ − t, γ/2) X(t) + λ xX + σ 2 g 3 (t) = − β 6 16σ 6 S 1 b−r σ 2 (a 1,1 − a 2,1 ) + 2(a 1,2 − a 2,2 ) h 2 2 (t) a3,1−a2,1 2 b−r σ 2 + a 3,2 − a 2,2 +h 4 1 (T ) h 4 1 (t) a3,1−a1,1 2 b−r σ 2 + a 3,2 − a 1,2 a3,1− a 2,1 −a 1,1 2 2 b−r σ 2 + a 3,2 − a2,2−a1,2 2   Suppose we use the first two expansions, the optimal policy is given by: The challenges of modeling and forecasting the spread of covid-19 Convex duality in constrained portfolio optimization Optimal control of the SIR model in the presence of transmission and treatment uncertainty Brownian Motion and Stochastic Calculus Optimal Portfolios: Stochastic Models for Optimal Investment and Risk Management in Continuous Time Stochastic calculus for finance II: Continuous-time models Portfolio and consumption decisions under mean-reverting returns: An exact solution for complete markets When X(t) is a constant, equations (53) and (54) in Gatto and Schellhorn (2021) become Use the Ansatz f 1 (Z(t), t) = Z 1/γ (t)h 1 (T − t) and insert in (17) shows that h 1 solves:where the operator L γ is defined by:Using the Ansatz (20), we can rewrite (42) into:Clearly all terms C 1 , C 2 must be identically zero. Thuswhich admit the solutions (21), (22) . We can rewrite (18) byLet u(t) = 1 2 β 2 2 i−1 σ 2 S γ g 2 i (t) andand the stochastic equivalent of (43) is:We have showed that g 1 = h 1 (T − t). Here we also provide the g 2 and g 3 in the following: