key: cord-0819842-xk6ee3wf authors: Federico, Salvatore; Ferrari, Giorgio title: Taming the spread of an epidemic by lockdown policies date: 2020-12-08 journal: J Math Econ DOI: 10.1016/j.jmateco.2020.102453 sha: 93c9c1b25d36f4492f1f26946592a397de72396c doc_id: 819842 cord_uid: xk6ee3wf

We study the problem of a policymaker who aims at taming the spread of an epidemic while minimizing its associated social costs. The main feature of our model is that the disease’s transmission rate is a diffusive stochastic process whose trend can be adjusted via costly confinement policies. We provide a complete theoretical analysis, as well as numerical experiments illustrating the structure of the optimal lockdown policy. In all our experiments the latter is characterized by three distinct periods: the epidemic is first left to evolve freely, then vigorously tamed, and finally subjected to a less stringent containment. Moreover, the optimal containment policy is such that the product “reproduction number × percentage of susceptible” is kept, after a certain date, strictly below the critical level of one, although the reproduction number is allowed to oscillate above one in the last, more relaxed phase of lockdown. Finally, an increase in the fluctuations of the transmission rate is shown to give rise to an earlier beginning of the optimal lockdown policy, which is also diluted over a longer period of time.

During the current Covid-19 pandemic, policymakers are dealing with the trade-off between safeguarding public health and limiting the negative economic impact of severe lockdowns. The fight against the virus is made especially hard by the absence of a vaccine and the consequent random horizon of any policy, as well as by the extraordinary nature of the event.
In particular, the lack of data from the past, the difficulty of rapidly and accurately tracking infected individuals, and super-spreading events such as mass gatherings give rise to a random behavior of the transmission rate/reproduction number of the virus (see, e.g., [15]). In this paper we propose and study a model for the optimal containment of infections due to an epidemic in which both the time horizon and the transmission rate of the disease are stochastic.

In recent months, the scientific literature has seen an explosion in the number of works on the statistical analysis and mathematical modeling of epidemics, and on the economic and social impact of lockdown policies. A large number of papers provide numerical studies related to the Covid-19 epidemic in the setting of classical epidemic models or of generalizations thereof. Among many others, we refer to [3], which numerically studies optimal containment policies in the context of a Susceptible-Infected-Recovered (SIR) model (cf.
[20]); [18], which also allows for seasonal effects; [28], which estimates the transmission rate in various countries for a SIR model with given and fixed transmission rate; [4], which combines a careful numerical study with an elegant theoretical study of optimal lockdown policies in the SEAIRD (susceptible (S), exposed (E), asymptomatic (A), infected (I), recovered (R), deceased (D)) model; [6], where a detailed numerical analysis is developed for a SIR model of the Covid-19 pandemic in which herd immunity, behavior-dependent transmission rates, remote workers, and indirect externalities of lockdown are explicitly considered; [1], which investigates, in the context of a multi-group SIR model, the effect of lockdown policies targeted at different social groups (especially the "young", the "middle-aged" and the "old"); [14], in which a multi-risk SIR model with heterogeneous citizens is calibrated on the Covid-19 pandemic in order to study the impact on incomes and mortality of age-specific confinements and polymerase chain reaction (PCR) tests; [11], which calibrates and tests a SEIRD model (susceptible (S), exposed (E), infected (I), recovered (R), deceased (D)) of the spread of Covid-19 in a heterogeneous economy where different ages and sectors are associated with distinct risks.

A theoretical study of the optimal confinement policies in epidemic models is usually challenging because of the nonlinear structure of the underlying dynamical system. The first results on a control-theoretic approach to confinement policies are perhaps those presented in Chapter 4 of [7], where it is shown that the optimal policy depends only on the shadow price difference between infected and susceptible.
In the context of an optimal timing problem, [16] uses a continuous-time Markov chain model to study the value and optimal exercise decision of two (sequential) options: the option to intervene on the epidemic and, after intervention has started, the option to end the containment policies. Control-theoretic analyses are also presented in the recent [22] and [23]. In [23] the authors study a deterministic SIR model in which the social planner acts in order to keep the transmission rate below its natural level, with the ultimate aim of not overwhelming the national health-care system. The minimization of a social cost functional is instead considered in [22], in the context of a deterministic SIR model over a finite time horizon. The resulting control problem is tackled via the Pontryagin maximum principle, and a thorough numerical illustration is also provided.

Inspired by the deterministic problems of [22] and [23] (see also [1, 3], among others), and motivated by the need to incorporate random fluctuations in the disease's transmission rate, in this paper we consider a stochastic control-theoretic version of the classical SIR model of Kermack and McKendrick [20]. A population of finite size is divided into three different groups: healthy people who are susceptible to the disease, infected individuals, and people who have recovered (and are no longer susceptible) or died. However, differently from the classical SIR model, we suppose that the disease's transmission rate is time-dependent and stochastic. In particular, it evolves as a general diffusion process whose trend can be adjusted by a social planner through policies like social restrictions and lockdowns. The randomness in the transmission rate is modeled by a Wiener process representing all those factors that affect the transmission rate and are not under the direct control of the regulator.
The social planner faces the trade-off between the expected social and economic costs (e.g., drops in the gross domestic product) arising from severe restrictions and the expected costs induced by the number of infections that -if uncontrolled -might strongly impact on the national health-care system and, more in general, on the social well-being. The social planner aims at minimizing those total expected costs up to the time at which a vaccination against the disease is discovered. In our model, such a time is also random and independent of the Wiener process. We provide a complete theoretical study of our model by showing that the minimal cost function (value function) is a classical twice-continuously differentiable solution to its corresponding Hamilton-Jacobi-Bellman (HJB) equation, and by identifying an optimal control in feedback form 2 . From a technical point of view, the main difference between the models in [1, 3, 6, 22, 23] and ours, is that we deal with a stochastic version of the SIR model, instead of a deterministic one. As a matter of fact, in the aforementioned works the transmission rate is a deterministic control variable, while it is a controlled stochastic state variable in our paper. Moreover, our formulation is also different from that of other stochastic SIR models where the random transmission rate is chosen in such a way that only the levels of infected and susceptible people become affected by noise, with the transmission rate itself not being a state variable (see, e.g., [17, 29] and references therein). To the best of our knowledge, ours is the first work considering the transmission rate as a diffusive stochastic state variable and providing the complete theoretical analysis of the resulting control problem. In addition to its theoretical value, the determination of an optimal control in feedback form allows us to perform numerical experiments aiming at showing some implications of our model. 
For the numerical analysis we specialize the dynamics of the transmission rate, that we take to be mean-reverting and bounded between 0 and some γ > 0 (cf. (4.1)). In this case study, the containment policies employed by the social planner have the effect of modifying the long-run mean of the transmission rate, towards which the process converges at an exponential rate. Moreover, we take a separable social cost function (cf. (4.2) ). This is quadratic both in the regulator's effort and in the percentage of infected people. An interesting effect which is in fact common to all our numerical experiments is that the optimal lockdown policy is characterized by three distinct periods. In the first phase it is optimal to let the epidemic freely evolve, then the social restrictions should be stringent, and finally should be gradually relaxed in a third period. We also investigate which is the effect of the maximal level L of allowed containment measures (i.e., the lockdown policy can take values in [0, L]) on the final percentage of recovered, which in fact turns out to be decreasing with respect to L. This then suggests that the case L = 1 -which leads in a shorter period to the definitive containment of the disease with the smallest percentage of final recovered -might be thought of as optimal in the trade-off between social costs and final number of recovered. We observe that if the epidemic spread is left uncontrolled, then its reproduction number (R t ) t fluctuates around 1.8 and the final percentage of recovered (i.e. the total percentage of infected during the disease) is approximately 72% of the society after circa 7 months (in all our simulations the initial infected were 1% of the population). On the other hand, when L = 1, under the optimal policy we have a relative reduction of circa 30% of the total percentage of recovered individuals, and the reproduction number drops below 0.6 in the period of severe lockdown (circa 60 days). 
Moreover, the optimal containment is such that the so-called "herd immunity" is reached as the product $R_t S_t$ (reproduction number × percentage of susceptible) becomes strictly smaller than the critical level of one, even if $R_t$ oscillates around 1.7 in the last, more relaxed phase of lockdown. Finally, we observe that an increase in the fluctuations of the transmission rate $\beta$ has the effect of anticipating the beginning of the lockdown policies, of diluting the actions over a longer period, and of keeping a larger level of containment in the long run. This can be explained by noting that a higher uncertainty in the transmission rate induces the policymaker to act earlier and over a longer period in order to prevent larger positive shocks of $\beta$.

The rest of the paper is organized as follows. In Section 2.1 we set up the model and the social planner problem. In Section 3 we develop the control-theoretic analysis and provide the regularity of the minimal cost function and an optimal control in feedback form. In Section 4 we present our numerical examples, while concluding remarks are made in Section 5. Finally, Appendix A collects the proofs of some technical results needed in Section 3.

2.1. The Stochastic Controlled SIR Model. We model the spread of the infection by relying on a generalization of the classical SIR model that dates back to the work by Kermack and McKendrick [20]. The society has population $N$ and consists of three different groups. The first group is formed by those people who are healthy, but susceptible to the disease; the second group contains those who are infected, while the last cohort consists of those who are recovered or dead. In line with the classical SIR model, we assume that, once recovered, an individual stays healthy forever. We denote by $S_t$ the percentage of individuals who are susceptible at time $t \ge 0$, by $I_t$ the percentage of infected, and by $R_t$ the fraction of recovered or dead.
Clearly, $S_t + I_t + R_t = 1$ for all $t \ge 0$. The fraction of infected people grows at a rate which is proportional to the fraction of society that is still susceptible to the disease. In particular, letting $\beta_t$ be the instantaneous transmission rate of the disease, during an infinitesimal interval of time $dt$, each infected individual generates $\beta_t S_t\,dt$ new infected individuals. It thus follows that the percentage of healthy individuals that get infected within $dt$ units of time is $I_t \beta_t S_t\,dt$. Notice that the instantaneous transmission rate $\beta_t$ measures the disease's rate of infection, as well as the average number of contacts per person per time. In this regard, $\beta_t$ can thus be influenced by a social planner via policies that effectively cap social interaction, like social distancing and lockdown. During an infinitesimal interval of time $dt$, the fraction of infected is reduced by $\alpha I_t\,dt$, since infected individuals either recover from the disease or die because of it at a rate $\alpha > 0$. According to the previous considerations, the dynamics of $S_t$ and $I_t$ can thus be written as

$$ dS_t = -\beta_t S_t I_t\,dt, \quad t > 0, \qquad S_0 = x, \tag{2.1} $$

$$ dI_t = \big(\beta_t S_t I_t - \alpha I_t\big)\,dt, \quad t > 0, \qquad I_0 = y, \tag{2.2} $$

where $(x, y) \in (0,1)^2$ are given initial values such that $x + y \in (0,1)$.³ Notice that for any $t \ge 0$, and for any choice of $(\beta_t)_t$, we can write

$$ S_t = x \exp\Big(-\int_0^t \beta_s I_s\,ds\Big), \qquad I_t = y \exp\Big(\int_0^t (\beta_s S_s - \alpha)\,ds\Big), \tag{2.3} $$

and therefore $S_t > 0$ and $I_t > 0$ for all $t \ge 0$. Moreover, summing up (2.1) and (2.2) we have $d(S_t + I_t) = -\alpha I_t\,dt < 0$ for all $t > 0$, which then implies that $S_t + I_t < 1$ for all $t \ge 0$.

We depart from the classical SIR model by assuming that the transmission rate $\beta_t$ is time-varying, stochastic, and may be controlled. More precisely, we let $(\Omega, \mathcal{F}, \mathbb{F} := (\mathcal{F}_t)_t, \mathbb{P})$ be a complete filtered probability space with filtration $\mathbb{F}$ satisfying the usual conditions, and we define on it a one-dimensional Brownian motion $(W_t)_t$.
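The deterministic part of the dynamics (2.1)-(2.2) can be illustrated with a minimal numerical sketch. It assumes a constant transmission rate and an explicit Euler discretization; the function name `simulate_sir`, the step size, and the initial values are illustrative choices rather than part of the model, while the values of $\alpha$ and $\beta$ are those used later in the numerical section.

```python
import numpy as np

def simulate_sir(x, y, beta, alpha, days, dt=0.05):
    """Explicit Euler scheme for dS = -beta*S*I dt, dI = (beta*S - alpha)*I dt.

    Assumption: beta is held constant here; in the paper it is a controlled diffusion.
    """
    n_steps = int(round(days / dt))
    S = np.empty(n_steps + 1)
    I = np.empty(n_steps + 1)
    S[0], I[0] = x, y
    for k in range(n_steps):
        new_infections = beta * S[k] * I[k] * dt  # mass moving from S to I
        S[k + 1] = S[k] - new_infections
        I[k + 1] = I[k] + new_infections - alpha * I[k] * dt
    return S, I

# Illustrative initial values: 1% infected, 1% already recovered (x + y < 1).
S, I = simulate_sir(x=0.98, y=0.01, beta=0.1, alpha=1 / 18, days=400)
recovered = 1.0 - S - I
print(f"final recovered fraction: {recovered[-1]:.3f}")
```

In this sketch both $S_t$ and $I_t$ stay strictly positive and $S_t + I_t$ is strictly decreasing, in line with the two observations made above.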
For a given and fixed $L \ge 0$, and for any $(\xi_t)_t$ belonging to the class $\mathcal{A}$ of admissible controls with values in $[0, L]$, we assume that the transmission rate evolves according to the stochastic differential equation

$$ d\beta_t = b(\beta_t, \xi_t)\,dt + \sigma(\beta_t)\,dW_t, \qquad \beta_0 = z. \tag{2.4} $$

The process $(\xi_t)_t$ influences the trend of the transmission rate and it should be interpreted as any effort devoted by the social planner to the decrease of the transmission rate. In this sense, $\xi = 0$ corresponds to the case of no effort made to curb the disease, whereas the case $\xi = L$ corresponds to the maximal effort. To fix ideas, $\xi_t$ may represent a percentage of social/working lockdown at time $t$, and $L$ corresponds to the maximal implementable value of such lockdown (e.g. 60%, etc.). On the other hand, the Brownian motion $(W_t)_t$ models any shock affecting the transmission rate which is not under the control of the social planner.

Regarding the dynamics of $(\beta_t)_t$ we make the following standing assumption: (ii) the drift $b$ is bounded, infinitely many times continuously differentiable with respect to its first argument, and has bounded derivatives of any order; (iii) $\sigma : I \to (0, \infty)$ is bounded, infinitely many times continuously differentiable, and has bounded derivatives of any order, with a uniform bound $K_\sigma > 0$.

Footnote 3: The choice of considering $x + y < 1$, i.e. of having an initial strictly positive percentage of recovered, is only made in order to deal with an open set in the subsequent mathematical formulation of the problem. As a matter of fact, such a condition is not restrictive from the technical point of view, as our results still apply if $x + y$ is only required to lie below some level strictly larger than 1, thus covering the case $x + y = 1$ as well.

A reasonable dynamics of the transmission rate $(\beta_t)_t$ is the mean-reverting model (2.5), specified by parameters $\vartheta, \gamma, \sigma > 0$ and $\beta \in (0, \gamma)$. In this case, it can be shown that 0 and $\gamma$ are unattainable by the diffusion $(\beta_t)_t$, which then takes values in the interval $I = (0, \gamma)$ for any $t \ge 0$.
The level $\beta$ can be seen as the natural transmission rate of the disease, towards which the transmission rate reverts at rate $\vartheta$ when $\xi \equiv 0$. Finally, the level $\gamma$ is the maximal possible transmission rate of the disease, and $\sigma$ is a measure of the fluctuations of $(\beta_t)_t$ around $\beta$. We will employ this dynamics in our numerical illustrations (cf. Section 4 below). Notice that dynamics (2.5) fulfills all the requirements of Assumption 2.1; this is shown, for the sake of completeness, in Proposition A.3 in Appendix A. Moreover, if $\xi_t \equiv L$, then the transmission rate defined through (2.5) reaches 0 asymptotically, as its drift is negative and its diffusion coefficient stays bounded. Hence, under the maximal lockdown policy, the disease is asymptotically eradicated.

A modeling feature that needs some discussion regards the nature of the control rule in (2.5). In our formulation, the policymaker adjusts $\xi$ continuously over time with the aim of decreasing the trend of the transmission rate. However, motivated by the real-world strategies employed during the Covid-19 crisis, one can very well imagine a model where regulatory constraints are introduced once the reproduction number $R_t = \beta_t/\alpha$ becomes larger than a certain threshold value. Within this setting, a natural question would be: what are the optimal threshold and the optimal size of interventions? A possible answer to this question could be found by proposing a model where the policymaker instantaneously reduces the level of $\beta$ via lockdown policies and faces proportional and/or fixed costs for its actions. This would give rise to a singular or impulsive stochastic control problem; see [2], [13] and [8] and references therein. Given the underlying multi-dimensional setting, we expect that the optimal trigger level would be a function of the current values of $(S_t, I_t)$.
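To make the behaviour of a mean-reverting, bounded transmission rate concrete, the following is a minimal Euler-Maruyama sketch. The exact form of (2.5) is not reproduced in the text above, so the coefficients used here are assumptions: a drift $\vartheta(\beta(1 - \xi) - \beta_t)$ with a constant control $\xi \in [0, 1]$, and a diffusion coefficient $\sigma \beta_t (\gamma - \beta_t)$ that vanishes at 0 and $\gamma$. The names `simulate_beta` and `beta_nat` (the natural rate) are hypothetical, and the clipping step only guards against discretization overshoot.

```python
import numpy as np

def simulate_beta(z, xi, theta=0.1, beta_nat=0.1, gamma=0.16, sigma=1.0,
                  days=200, dt=0.1, seed=0):
    """Euler-Maruyama path of an ASSUMED mean-reverting transmission rate.

    Assumed drift:     theta * (beta_nat * (1 - xi) - beta)
    Assumed diffusion: sigma * beta * (gamma - beta)
    """
    rng = np.random.default_rng(seed)
    n_steps = int(round(days / dt))
    beta = np.empty(n_steps + 1)
    beta[0] = z
    for k in range(n_steps):
        drift = theta * (beta_nat * (1.0 - xi) - beta[k])
        diffusion = sigma * beta[k] * (gamma - beta[k])
        dW = rng.normal(0.0, np.sqrt(dt))
        step = beta[k] + drift * dt + diffusion * dW
        # Keep the discretized path inside the open interval (0, gamma).
        beta[k + 1] = np.clip(step, 1e-12, gamma - 1e-12)
    return beta

free = simulate_beta(z=0.1, xi=0.0)    # no containment effort
locked = simulate_beta(z=0.1, xi=1.0)  # maximal effort, under the assumed drift
print(free[-1], locked[-1])
```

Under these assumptions the uncontrolled path hovers around the natural rate, while the path under maximal effort decays towards zero, which mirrors the asymptotic eradication described above.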
However, the proof of such a conjecture would require the thorough study of a complex (non convex) three-dimensional degenerate singular/impulse stochastic control problem that clearly requires techniques different from those employed in this work. The Social Planner Problem. The epidemic generates social costs, that we assume to be increasing with respect to the fraction of the population that is infected. These costs might arise because of lost gross domestic product (GDP) due to inability of working, because of an overstress of the national health-care system etc. The social planner thus employs policies (ξ t ) t in the form, e.g., of social distancing or lockdown in order to adjust the growth rate of the transmission rate β, with the aim of effectively flattening the curve of the infected percentage of the society. Such actions however come with a cost, which increases with the amplitude of the effort. Assuming that a vaccination against the disease is discovered at a random time τ exponentially distributed with parameter λ o > 0 and independent of (W t ) t 4 (see also Remark 2.4 below), the social planner aims at solving Here, δ ≥ 0 measures the social planner's time preferences, and C : is a running cost function measuring the negative impact of the disease on the public health as well as the economic/social costs induced by lockdown policies. The following requirements are satisfied by C. Without loss of generality, we also take C(0, 0) = 0. Convexity of y → C(y, ξ) captures the fact that the social costs from the disease might be higher if a large share of the population is infected since, for example, the social health-care system is overwhelmed. The fact that ξ → C(y, ξ) is convex describes that marginal costs of actions are increasing because, e.g., an additional lockdown policy might have a larger impact on an already stressed society. 
Finally, the Lipschitz and semiconcavity property of C(·, ξ) are technical requirements that will be important in the next section. An application of Fubini's theorem, employing the independence of τ and (W t ) t , allows to rewrite the problem defined in (2.6) as The assumption that a vaccination against the disease is discovered at an exponential random time τ , independent of (W t ) t , has the technical important effect of leading to a timehomogeneous social planner problem (cf. (2.7) above). From a modeling point of view, such a requirement is clearly debatable, as it presupposes that the decision maker does not take into account the scientific progress in the epidemic's treatment. In order to take care of this, we now propose an alternative more realistic formulation which, however, comes at the cost of substantially increasing the mathematical complexity of the social planner problem. Suppose that the social planner has full information about the current technological level Q t achieved in the disease's treatment and assume, for example, that this evolves according to the SDE: for suitable µ and η, and for a standard Brownian motion (B t ) t independent of (W t ) t . The process (B t ) t models all the exogenous shocks affecting the technological achievements (e.g., new scientific discoveries in related fields), while µ measures the instantaneous trend of the research. Define then a continuous-time Markov chain (M t ) t with two states, 0 and 1, where 0 means that the vaccination is not available and 1 that a treatment has been instead found. We assume that 1 is an absorbing state and that the Markov chain has transition rate from state 0 to state 1 given by (λ(t, Q t )) t . Here, λ : R + × R + → R + is such that Λ t := t 0 λ(s, Q s )ds < ∞, a.s. for any t ≥ 0, and it is nondecreasing in its second argument. This latter condition clearly means that the larger the technological level is, the faster the disease is treated. 
Within this setting, the problem can then still be written as The independence of $Q$ with respect to $W$ then leads to the equivalent formulation which defines a four-dimensional stochastic control problem. Clearly, this problem is much more challenging than (2.7) and its analysis, requiring different techniques and results, is left for future research. Another interesting direction for future work might concern an extension of the previous model in which the social planner can also increase the technological level $Q$ by supporting the research for a vaccine. Assuming that such an investment comes at proportional cost, this problem can be modeled in terms of an intricate stochastic control problem where the transition rate $\lambda$ of the Markov chain is controlled through a singular control.

In order to tackle Problem (2.7) with techniques from dynamic programming, it is convenient to keep track of the initial values of $(S_t, I_t, \beta_t)_t$. We therefore set

$$ O := \{(x, y, z) \in \mathbb{R}^3 : (x, y) \in (0,1)^2,\ x + y < 1,\ z \in I\}, $$

and, when needed, we stress the dependency of $(S_t, I_t, \beta_t)$ on $(x, y, z) \in O$ and $\xi \in \mathcal{A}$ by writing $(S^{x,y,z;\xi}_t, I^{x,y,z;\xi}_t, \beta^{z;\xi}_t)$. Indeed, due to (2.3) and the autonomous nature of (2.4), we have that $S_t$ and $I_t$ depend on $(x, y, z)$ and on $\xi$ through $\beta_t$, while $\beta_t$ depends only on $z$ and directly on $\xi$. We shall also simply set (S The latter is well defined given that $C$ is nonnegative. In the next section we will show that $V$ solves the corresponding dynamic programming equation in the classical sense, and we also provide an optimal control in feedback form.

We introduce the differential operator $\mathcal{L}$ acting on functions belonging to the class $C^{1,1,2}(\mathbb{R}^3)$:

$$ (\mathcal{L}\varphi)(x, y, z) := xyz\,(\varphi_y - \varphi_x)(x, y, z) - \alpha y\,\varphi_y(x, y, z) + \tfrac{1}{2}\sigma^2(z)\,\varphi_{zz}(x, y, z). $$

Next, for any $(y, z, p) \in (0, 1) \times I \times \mathbb{R}$, define which is continuous on $[0, 1] \times I \times \mathbb{R}$.
Indeed, by Assumptions 2.1-(ii) and 2.3-(iv), there exists a constant $K > 0$ such that By the dynamic programming principle, we expect that $V$ should solve (in a suitable sense) the Hamilton-Jacobi-Bellman (HJB) equation

$$ \lambda v(x, y, z) = (\mathcal{L}v)(x, y, z) + C(y, z, v_z(x, y, z)), \qquad (x, y, z) \in O. \tag{3.3} $$

In order to show that $V$ indeed solves (3.3) in the classical sense, we start with the following important preliminary results. Their proofs are standard in the literature of stochastic control (see, e.g., [25, 31]), upon employing Assumptions 2.1 and 2.3.

Proposition 3.1. There exists $K > 0$ such that, for each q := (x, y, z), q : i.e., $V$ is bounded and Lipschitz continuous on $O$; (ii) for any $\mu \in [0, 1]$ and for some $K > 0$ i.e., $V$ is semiconcave on $O$. Moreover, $V$ is a viscosity solution to the HJB equation (3.3).

Proof. The first claim of (i) above follows from the fact that $C$ is nonnegative and bounded on $[0, 1]^2$; the second claim of (i) is due to Proposition 3.1 in [31], whose proof can be easily adapted to our stationary setting. Analogously, the semiconcavity property of (ii) can be obtained by arguing as in Proposition 4.5 of [31]. Finally, Theorem 5.2 of [31] (again, easily adapted to our stationary setting) or Proposition 4.3.2-(2) of [25] lead to the viscosity property.

The semiconcavity of $V$, together with the fact that $V$ solves the HJB equation (3.3) in the viscosity sense, yields the following directional regularity result.

Proof. Let $(x, y, z) \in O$. By semiconcavity of $V$, there exist the left and right derivatives of $V$ along the direction $z$ at $(x, y, z)$, which we denote, respectively, by $V^-_z(x, y, z)$ and $V^+_z(x, y, z)$. Moreover, again by semiconcavity, we have the inequality $V^-_z(x, y, z) \ge V^+_z(x, y, z)$. Assuming, by contradiction, that $V$ is not differentiable with respect to $z$ at $(x, y, z)$ means assuming that $V^-_z(x, y, z) > V^+_z(x, y, z)$.
Hence, we can apply Lemma A.1 in the appendix and find a sequence of functions (φ n ) n ⊂ C 2 (O) such that Then, the viscosity subsolution property of V (cf. Proposition 3.1) yields λV (x,ȳ,z) ≤ (Lφ n )(x,ȳ,z) + C (ȳ,z,φ n z (x,ȳ,z)). Taking the limit as n → ∞ and using (3.4) we get a contradiction. We have thus proved that V z exists at each arbitrary (x, y, z) ∈ O. Now we show that V z is continuous. Take a sequence (q n ) n ⊂ O such that q n → q ∈ O, and let η n = (η n x , η n y , η n z ) ∈ D + V (q n ), the latter being nonempty due to the semiconcavity of V . Since V z exists at each point of O, we have η n z = V z (q n ). Since V is semiconcave, the supergradient D + V is locally bounded as a set-valued map, and therefore there exists a subsequence (q n k ) k such that η n k → η = (η x , η y , η z ). By [9, Prop. 3.3.4-(a)], we have η ∈ D + V (q), and again, since V z exists, we have η z = V z (q). Hence, we have proved that from any sequence (q n ) n ⊂ O converging to q, we can extract a subsequence (q n k ) k ⊂ O such that V z (q n k ) → V z (q). By usual arguments on subsequences, the claim follows. We can now prove the main theoretical result of our paper, which ensures that V is actually a classical solution to the HJB equation (3.3) . In turn, this provides a way to construct an optimal control in feedback form. 6 If the system of equations 7 admits a unique strong solution (S t , I t , β t ) t , then the control is optimal for (2.8) and (β t ) t is the optimally controlled transmission rate; that is, Proof. Proof of (i) -Step 1. Recall (3.2) and define F (x, y, z) := C (y, z, V z (x, y, z)). Due to Propo- Although not uniformly elliptic, the differential operator L defined in (3.1) is hypoelliptic, meaning that the so-called Hörmander's condition is satisfied (cf. the proof of Proposition A.2 in the appendix and equation (A.4) therein). 
In fact, by Proposition A.2 in the appendix, for any $q := (x, y, z) \in O$ the (uncontrolled) process $(Q^q_t)_t := (S^{x,y,z}_t, I^{x,y,z}_t, \beta^z_t)_t$ admits a transition density $p(t, q, \cdot)$, $t > 0$, which is absolutely continuous with respect to the Lebesgue measure in $\mathbb{R}^3$, infinitely many times differentiable, and satisfies the Gaussian estimates (A.2) and (A.3). As a consequence, by Fubini's theorem we can write

$$ v(x, y, z) = \int_0^\infty e^{-\lambda t} \Big( \int_O F(x', y', z')\, p(t, x, y, z; x', y', z')\, dx'\, dy'\, dz' \Big)\, dt, $$

and recalling (3.8), and applying the dominated convergence theorem, one shows that $v \in C^2(O)$.

Footnote 6: Notice that, in order to define a candidate optimal control in feedback form, one actually only needs the existence of the derivative $V_z$. For instance, in the deterministic problem tackled in [12] only the regularity of the directional derivative is exploited to prove a verification theorem in the context of viscosity solutions. However, here we can improve the regularity of $V$ due to the stochastic nature of our problem, and therefore prove a classical verification theorem.

Footnote 7: Since $b$ is bounded, by the method of Girsanov's transformation, the system has a weak solution, which is also unique in law (see [19, Ch. 5, Propositions 3.6 and 3.10] and also [19, Ch. 5, Remark 3.7]). For the sake of brevity, we do not investigate further existence and uniqueness of strong solutions, even if this might be done by employing finer results (e.g., see the seminal paper [30]).
For $(x, y, z) \in O$, let now $\tau_n := \inf\{t \ge 0 : |(S^{x,y,z}_t, I^{x,y,z}_t, \beta^z_t)| \ge n\}$, $n \in \mathbb{N}$, and notice that the strong Markov property yields Since $v \in C^2(O)$, we can apply Itô's formula to the first addend on the left-hand side of the latter, take expectations, observe that the stochastic integral has zero mean (by definition of $\tau_n$ and the fact that $v_x$ is continuous), and finally find (3.10) that is, by (3.9), Dividing now both left and right-hand sides of the latter by $t$, invoking the (integral) mean-value theorem, letting $t \downarrow 0$, and using that $t \mapsto (S^{x,y,z}_t, I^{x,y,z}_t, \beta^z_t)$ is continuous, we find that $v$ is a classical solution to

$$ \lambda\varphi = \mathcal{L}\varphi + F \quad \text{on } O. \tag{3.11} $$

Proof of (i) - Step 2. If $(x, y, z) \notin K_n$, then $v_n(x, y, z) = V(x, y, z)$ as $\rho_n = 0$ a.s. Take then $(x, y, z) \in K_n$. By the same arguments as in Step 1 and considering that $V$ is continuous on $K_n$, the function $v_n$ is a solution to

$$ \lambda\varphi = \mathcal{L}\varphi + F \ \text{on } K_n, \qquad \varphi = V \ \text{on } \partial K_n. \tag{3.13} $$

Since $V$ is also a viscosity solution to the same equation and since uniqueness of viscosity solutions holds for such a problem (cf., e.g., [10]), we have $v_n = V$ on $K_n$. Because $\rho_n \uparrow \infty$ for $n \uparrow \infty$ (as the boundary of $O$ is unattainable for $(S^{x,y,z}_t, I^{x,y,z}_t, \beta^z_t)$), by taking limits as $n \uparrow \infty$ in (3.12) we find that where the last equality follows by dominated convergence upon recalling that $V$ is bounded. But then $V = v$ on $O$, and therefore $V \in C^2(O)$ and solves (3.11) by Step 1. That is, $V$ is a classical solution to the HJB equation (3.3).

Proof of (ii). The optimality of (3.7) follows by a standard verification theorem based on an application of Itô's formula and the proved regularity of $V$ (see, e.g., Chapter 3.5 in [25]).

In this section we illustrate numerically the results of our model, with the aim of providing qualitative properties of the optimal containment policies in a case study. We use the mean-reverting model for the dynamics of $\beta$, i.e. for some $L, \vartheta, \gamma, \sigma > 0$, $\beta \in (0, \gamma)$.
Notice that such a choice of the dynamics of $\beta$ fulfills all the requirements of Assumption 2.1 (see Proposition A.3 in Appendix A). Moreover, we assume that the social planner has a quadratic cost function of the form The latter can be interpreted as a Taylor approximation of any smooth, convex, separable cost function with global minimum at $(0, 0)$. In (4.2), $\bar{y} \in (0, 1)$ represents, e.g., the maximal percentage of infected people that the health-care system can handle. Notice that in this case for any $(x, y, z) \in O$ one has (cf. (3.5)) Our numerical scheme is based on a recursion on the nonlinear equation which is solved by the value function in the classical sense (cf. Theorem 3.3). Namely, starting from $v^{[0]} \equiv 0$ we use the recursive algorithm: for $n \ge 1$, and those equations are solved by Monte Carlo methods based on the Feynman-Kac formula Such an approach is needed because of the lack of appropriate boundary conditions on the HJB equation, as the boundary $\partial O$ is unattainable for the underlying controlled dynamical system.

We take a day as the unit of time. In our experiments we assume that the average length of an infection equals 18 days, so that $\alpha = 1/18$ (see also [3], [6], and [22]); the level of the maximal possible transmission rate of the disease is $\gamma = 0.16$; the natural transmission rate of the disease is $\beta = 0.1$, towards which the transmission rate $(\beta_t)_t$ reverts at rate $\vartheta = 0.1$ when $\xi \equiv 0$; and $\sigma = 1$, so that the fluctuations of $(\beta_t)_t$ are (at most) of order $10^{-2}$. Furthermore, we set $\lambda = 1/365$ (see footnote 8), and we fix $\bar{y} = 0.1$ in (4.2). Finally, in all simulations we assume that at day zero about 1% of the population is infected. In all the subsequent pictures we show the mean paths of the considered quantities, with their 95% confidence interval. The Monte Carlo average has been performed by employing 6000 independent simulations.
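The kind of Monte Carlo summary described above (mean paths with a 95% confidence interval) can be sketched for the uncontrolled epidemic as follows. This is an illustration under stated assumptions, not the scheme used for the figures: the drift $\vartheta(\beta(1 - \xi) - \beta_t)$ and the diffusion coefficient $\sigma\beta_t(\gamma - \beta_t)$ are assumed forms of the mean-reverting dynamics, the confidence interval is a normal approximation for the mean, the initial susceptible fraction of 98% is a guess, and only 500 paths are used to keep the run short. The names `simulate_recovered`, `mean_with_ci` and `beta_nat` are hypothetical.

```python
import numpy as np

def simulate_recovered(n_paths=500, days=300, dt=0.25, xi=0.0, alpha=1 / 18,
                       theta=0.1, beta_nat=0.1, gamma=0.16, sigma=1.0,
                       x=0.98, y=0.01, seed=0):
    """Return an array of shape (n_steps + 1, n_paths) with the fractions 1 - S - I."""
    rng = np.random.default_rng(seed)
    n_steps = int(round(days / dt))
    S = np.full(n_paths, x)
    I = np.full(n_paths, y)
    beta = np.full(n_paths, beta_nat)
    recovered = np.empty((n_steps + 1, n_paths))
    recovered[0] = 1.0 - S - I
    for k in range(n_steps):
        new_infections = beta * S * I * dt
        # Right-hand sides use the old S and I (explicit Euler step).
        S, I = S - new_infections, I + new_infections - alpha * I * dt
        dW = rng.normal(0.0, np.sqrt(dt), size=n_paths)
        # ASSUMED mean-reverting dynamics of the transmission rate.
        beta = (beta + theta * (beta_nat * (1.0 - xi) - beta) * dt
                + sigma * beta * (gamma - beta) * dW)
        beta = np.clip(beta, 1e-12, gamma - 1e-12)
        recovered[k + 1] = 1.0 - S - I
    return recovered

def mean_with_ci(samples, z=1.96):
    """Pointwise mean and normal-approximation 95% interval across paths."""
    mean = samples.mean(axis=1)
    half_width = z * samples.std(axis=1, ddof=1) / np.sqrt(samples.shape[1])
    return mean, mean - half_width, mean + half_width

recovered = simulate_recovered()
mean, lower, upper = mean_with_ci(recovered)
print(f"final recovered: {mean[-1]:.3f} [{lower[-1]:.3f}, {upper[-1]:.3f}]")
```

Raising `n_paths` towards the 6000 simulations mentioned above narrows the interval; replacing the constant `xi` by a feedback rule would require the value function, which this sketch does not compute.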
In Section 4.1, we compare the optimal social planner policy with the case of no restrictions; in Section 4.2 we consider strategies in which the containment measures are limited to a fixed percentage L ∈ [0, 1] and provide a comparison between them; in Section 4.3 we study the effect of the fluctuations of the transmission rate on the problem's solution.

4.1. The Optimal Social Planner Policy. We compare the optimal social planner policy with the case of no restrictions (see Figure 1). In the optimal social planner policy severe lockdown measures (larger than 40%) are imposed for a period of circa 63 days, starting on day 79; a gradual reopening phase then follows. The final percentage of recovered individuals is about 50%, in contrast to 72%, which is the total percentage of recovered individuals in the case of no restrictions.

(Footnote 8) Our choice of the value of λ = λ_o + δ can be justified by assuming that it takes at least a year to develop a vaccine (i.e. 1/λ_o ≥ 365) and that the intertemporal discount rate of the social planner δ is negligible with respect to the vaccination discovery rate. Indeed, a typical value for the annual discount rate δ is 5%, which is clearly such that [...].

Furthermore, the cases of optimal lockdown and no lockdown show a substantial difference in the evolution of the reproduction number R_t := β_t/α: when lockdown policies are at work, in the most restrictive period, the latter is significantly decreased to values around 0.6. Another relevant quantity to analyze is R_t S_t. Indeed, recalling (2.2), it is easy to see that the percentage of infected naturally decreases at exponential rate α(R_t S_t − 1) if R_t S_t is maintained strictly below 1. We observe that, under the suboptimal action "no lockdown", R_t S_t lies below one from day 85 on. On the other hand, the optimal containment policy is such that R_t S_t < 1 from day 75 on.
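The role of the threshold R_t S_t = 1 can be made explicit with a short computation. The sketch below assumes that the infected share follows the standard SIR dynamics dI_t = (β_t S_t − α) I_t dt, which is how we read the reference to (2.2); the constants ε ∈ (0, 1) and t_0 ≥ 0 are introduced here only for illustration.

```latex
\begin{align*}
dI_t &= (\beta_t S_t - \alpha)\, I_t \, dt
      = \alpha\,(R_t S_t - 1)\, I_t \, dt,
      \qquad R_t := \beta_t/\alpha, \\
I_t &= I_{t_0} \exp\Big( \alpha \int_{t_0}^{t} (R_s S_s - 1)\, ds \Big),
      \qquad t \ge t_0, \\
R_s S_s \le 1 - \varepsilon \ \text{ for all } s \ge t_0
  \quad &\Longrightarrow \quad
  I_t \le I_{t_0}\, e^{-\alpha \varepsilon (t - t_0)}, \qquad t \ge t_0 .
\end{align*}
```

Under this reading, it is the product R_t S_t, and not R_t alone, that determines whether the share of infected shrinks.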
As a consequence, R_t can be let oscillate strictly above one (actually, around 1.7) during the final phase of partial reopening so that the negative impact of lockdowns on economic growth can be partially dammed.

4.2. The Optimal Social Planner Policy with Limited Containment. In many countries, a vigorous lockdown might not always be feasible, especially for long periods. Further, as pointed out by recent literature (see, for instance, [4]), gradual policies of longer duration but more moderate containment exhibit large welfare benefits comparable to the ones obtained by a drastic lockdown. For this reason, we consider a strategy in which the containment measures are limited to a fixed percentage L ∈ [0, 1]. Notice that L = 0.7 in [3], and L ∈ {0.7, 1} in [1] and [6]. A comparison of the optimal social planner policy with limited containment L ∈ {0.2, 0.4, 0.6, 0.8} is shown in Figure 2 and a summary is contained in Table 1. [...] amount of infected) on average ranges from 52% (case L = 0.8) up to 68% (case L = 0.2). In all the cases, the optimal containment starts at the maximal rate and the first day of containment is substantially the same (around day 54). Different ceilings L on the containment strategies also affect the values and the size of the fluctuations of the reproduction number R_t = β_t/α: smaller values of L correspond to milder variations of R_t, of size 0.3, whereas larger values of L lead to rapid changes of R_t, which reaches levels smaller than 1 (less than 0.8 for L = 0.8 and less than 0.6 for L = 1). In all the cases, R_t S_t lies strictly below 1 after a certain date, which is decreasing with respect to L (see the last column in Figures 1 and 2). Notice that without any containment policies, R_t S_t decreases over time due to a natural "herd-immunity" effect. On the other hand, when lockdowns are in place, we observe a faster decrease of R_t S_t, which is forced by the policymaker's initial vigorous actions.
The final relaxation of the latter then allows for an increase of R_t S_t, which is however constrained below the critical level of 1. Such an effect is monotone decreasing with respect to L.

4.3. The Role of Uncertainty. The main new feature of our model is to consider a (controlled) stochastic transmission rate in the framework of the classical SIR model. In this section we study numerically how an increase of the fluctuations of the transmission rate affects the optimal solution. In particular, in Figure 3 the volatility σ takes values 1, 5 and 10, thus leading to fluctuations of β of order 10^{-2}, 5 × 10^{-2}, and 10^{-1}, respectively (indeed, recall that σ(β) = σβ(γ − β) attains its maximum at γ/2 and γ = 0.16). We observe from Figure 3 that larger fluctuations of β have the effect of anticipating the beginning of the lockdown policies, and of diluting the actions over a longer period. Indeed, when σ = 5 and σ = 10, the optimal lockdown policy starts around days 46 and 42, respectively, in contrast to day 54 in the case σ = 1. Moreover, when σ increases, the maximal employed lockdown intensity reduces and the level of containment stabilizes at a larger value in the long run. This can be explained by thinking that an increase in the fluctuations of the transmission rate induces the policymaker to act earlier and over a longer period in order to prevent large positive shocks of β. However, in order to dam the social costs resulting from a longer period of restrictions, the maximal intensity of the lockdown policy should be reduced. Moreover, such a spread of the optimal lockdown policy gives rise to an increase of the final percentage of recovered (which is circa 58% and 60% when σ = 5 and σ = 10, respectively, and circa 50% when σ = 1).
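The orders of magnitude quoted in this section for the fluctuations of β can be checked directly from the diffusion coefficient σ(β) = σβ(γ − β) with γ = 0.16. The computation below is only a worked verification of that remark; it uses no quantity beyond those given in the text.

```latex
\begin{align*}
\frac{d}{d\beta}\,\big[\sigma \beta (\gamma - \beta)\big]
  &= \sigma\,(\gamma - 2\beta) = 0
  \iff \beta = \gamma/2, \\
\max_{\beta \in (0,\gamma)} \sigma \beta (\gamma - \beta)
  &= \frac{\sigma \gamma^{2}}{4}
   = \sigma \times \frac{0.16^{2}}{4}
   = \sigma \times 0.0064 .
\end{align*}
% sigma = 1  gives 0.0064,
% sigma = 5  gives 0.032,
% sigma = 10 gives 0.064.
```

These maximal values, 0.0064, 0.032 and 0.064, are consistent in order of magnitude with the levels 10^{-2}, 5 × 10^{-2} and 10^{-1} stated above.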
5. Conclusions. We have studied the problem of a policymaker who, during an epidemic, is challenged to optimally balance the safeguard of public health and the negative economic impact of severe lockdowns. The policymaker can implement containment policies in order to reduce the trend of the disease's transmission rate, which evolves stochastically in continuous time. In the context of the SIR model, our theoretical analysis allows us to identify the minimal social cost function as a classical solution to the corresponding dynamic programming equation, as well as to provide an optimal control in feedback form. In a case study in which the transmission rate is a (controlled) mean-reverting diffusion process, numerical experiments show that the optimal lockdown policy is characterized by three distinct phases: the epidemic is first let freely evolve, then vigorously tamed, and finally a less stringent containment should be adopted. Interestingly, in the last period the epidemic's reproduction number is let oscillate strictly above one, although the product "reproduction number × percentage of susceptible" is kept strictly below the critical level of one. Hence, under the optimal containment policy, the percentage of infected decreases naturally at an exponential rate and the social planner is then allowed to substantially relax the lockdown in order not to incur too heavy economic costs. Moreover, we show that an increase in the fluctuations of the transmission rate gives rise to an earlier beginning of the optimal lockdown policy, which is also diluted over a longer period of time. We believe that our work is only a first step in enriching the SIR model with a stochastic controlled component and in understanding the policymaker's problem of optimally balancing the safeguard of public health and social wealth.
There is still much to be done in order to incorporate other features like the partial detectability of the transmission rate or the role of public investment in the discovery of a vaccine (see Remark 2.4 on this). We leave the analysis of the resulting challenging problems for future work.

[...] and denote by Π : R^3 → A the orthogonal projection on A. Given q ∈ R^3 we then have the decomposition [...]. Define, for q ∈ O, ϕ^n(q) := g(Πq) + ψ_n(s), where ψ_n : R → R, ψ_n(s) = − n 2 [...]. This sequence realizes (A.1). Indeed, the first two properties hold by construction; in particular the second one is due to the fact that we have [...] |η − ζ| s if s < 0. As for the last two properties, we notice that

Dϕ^n(q) = Πη (= Πζ) + ((η − ζ)/|η − ζ|) (dψ_n/ds)(s),

so

ϕ^n_z(q) = ⟨Πη, (0, 0, 1)⟩ + ⟨(η − ζ)/|η − ζ|, (0, 0, 1)⟩ (dψ_n/ds)(s) = ⟨Πη, (0, 0, 1)⟩ + ((η_z − ζ_z)/|η − ζ|) (dψ_n/ds)(s),

ϕ^n_zz(q) = ((η_z − ζ_z)/|η − ζ|)^2 (d^2ψ_n/ds^2)(s),

which then imply them.

Denote by q = (q_1, q_2, q_3) := (x, y, z) an arbitrary point of O. For any multi-index α := (α_1, α_2, α_3) ∈ N^3 we denote |α| = Σ_{i=1}^{3} α_i and D^α_q = ∂^{|α|}/(∂q_1^{α_1} ... ∂q_3^{α_3}), with the convention that ∂^0 is the identity.

Proposition A.2. For any q ∈ O the (uncontrolled) process (Q^q_t)_t := (S^{x,y,z}_t, I^{x,y,z}_t, β^z_t)_t admits a transition density p which is absolutely continuous with respect to the Lebesgue measure in R^3, infinitely many times differentiable, and satisfies the Gaussian estimates (A.2) [...] and (A.3) [...]. Here, C_0, D_0, C_α, and D_α are increasing functions of time.

Proof. Given f, g ∈ C^1(R^3; R^3), define the Lie bracket [...] xy(2b(z, 0)σ_z(z) − ασ(z) − σ(z)b_z(z, 0)), xy(σ(z)b_z(z, 0) − 2b(z, 0)σ_z(z)), b(z, 0) 2 [...]. Hence, the matrix associated to (L_0 ∪ L_1 ∪ L_2)(q) has the sub-matrix formed by all its rows and its first three columns with determinant −αx^2 y^2 σ^3(z) < 0. Hence, (A.4) holds true on O given the arbitrariness of q.
Therefore, by Theorem 2.3.3 in [24], for any t > 0 the uncontrolled process (S^{x,y,z}_t, I^{x,y,z}_t, β^z_t)_t admits a transition density p which is absolutely continuous with respect to the Lebesgue measure in R^3, and infinitely many times differentiable. Moreover, Theorem 9 and Remark 11 in [5] (see also [21]) show that p satisfies the Gaussian estimates (A.2) and (A.3). This completes the proof.

[...] Then, for (ξ_t)_t ∈ A, introduce the stochastic differential equation (A.6) [...]. Because b and σ are Lipschitz-continuous and have sublinear growth, uniformly with respect to ξ, for any (ξ_t)_t ∈ A there exists a unique strong solution to (A.6) starting at z ∈ R (see, e.g., Theorem 7 in Chapter V of [26]). We denote such a solution by (β^{ξ,z}_t)_t. Since ξ → b(z, ξ) is decreasing, by Theorem 54 in Chapter V of [26], we have

(A.7) β^{L,z}_t ≤ β^{ξ,z}_t ≤ β^{0,z}_t for all t ≥ 0 a.s.

Moreover, it can be checked that β^{L,z}_t > 0 for all t ≥ 0 a.s. and β^{0,z}_t < γ for all t ≥ 0 a.s. Hence, by (A.7), we get β^{ξ,z}_t ∈ (0, γ) for all t ≥ 0 a.s. This proves that the SDE (4.1) admits a unique strong solution which lies within the open interval I = (0, γ) (cf. Assumption 2.1-(i)). Given the boundedness of the interval I = (0, γ), using the expression of b and (A.5) it is straightforward to verify that (ii) and (iii) of Assumption 2.1 are satisfied as well.

References
[1] Optimal Targeted Lockdowns in a Multi-Group SIR Model
[2] Singular Stochastic Control, Linear Diffusions, and Optimal Stopping: A Class of Solvable Problems
[3] A Simple Planning Problem for COVID-19 Lockdown
[4] Mortality Containment vs. Economic Opening: Optimal Policies in a SEIARD Model
[5] An Elementary Introduction to Malliavin Calculus. INRIA Rapport de Recherche n
[6] A Macroeconomic SIR Model for Covid-19
[7] Optimal Control of Deterministic Epidemics
[8] A General Verification Result for Stochastic Impulse Control Problems
[9] Semiconcave Functions, Hamilton-Jacobi Equations, and Optimal Control
[10] User's Guide to Viscosity Solutions of Second Order Partial Differential Equations
[11] Restarting the Economy while Saving Lives under Covid-19
[12] Dynamic Programming for Optimal Control Problems with Delays in the Control Variable
[13] On the Optimal Management of Public Debt: A Singular Stochastic Control Problem
[14] Cost-Benefit Analysis of Age-Specific Deconfinement Strategies
[15] Monitoring the Spread of Covid-19 by Estimating Reproduction Numbers over Time
[16] Optimal Timing of Interventions during an Epidemic
[17] Asymptotic Behavior of Global Positive Solution to a Stochastic SIR Model
[18] "Flattening the Curve": Optimal Control of Epidemics with Purely Nonpharmaceutical Interventions
[19] Brownian Motion and Stochastic Calculus
[20] Contributions to the Mathematical Theory of Epidemics, Part I
[21] Applications of the Malliavin Calculus, Part II
[22] Optimal Control of an Epidemic through Social Distancing
[23] Optimal Epidemic Suppression under an ICU Constraint
[24] The Malliavin Calculus and Related Topics
[25] Continuous-time Stochastic Control and Optimization with Financial Applications
[26] Stochastic Integration and Differential Equations
[27] Convex Analysis
[28] Susceptible-Infected-Recovered (SIR) Dynamics of Covid-19 and Economic Impact
[29] Stability of a Stochastic SIR System
[30] On the Strong Solutions of Stochastic Differential Equations
[31] Stochastic Control - Hamiltonian Systems and HJB Equations

Acknowledgments. The authors thank Frank Riedel, Mauro Rosestolato, three anonymous Referees and the Guest Editors for interesting comments and suggestions.

Proof. Since W is semiconcave, there exists C_0 ≥ 0 such that [...], and it is clear that it is equivalent to show the claim for W.
By [27, Theorem 23.4], it follows that there exist [...]. Set g(q) := ⟨η, q⟩ ∧ ⟨ζ, q⟩ and notice that W(0) = 0 = g(0) and that, by concavity,