key: cord-0576951-sn5n4p9p authors: Lauro, Francesco Di; Kiss, Istvan; Rus, Daniela; Santina, Cosimo Della title: Trajectory Tracking of Optimal Social Distancing Strategies with Application to a CoVid-19 Scenario date: 2020-08-12 journal: nan DOI: nan sha: 2543016a26fd1269e799a98f914a8af02d9c5cb9 doc_id: 576951 cord_uid: sn5n4p9p
This letter proposes the use of nonlinear feedback control to produce robust and reactive social distancing policies that can be adapted in response to an epidemic outbreak. A trajectory tracking algorithm is proposed, and its effectiveness is analytically proven when acting on a low-dimensional approximation of the epidemics. Means of mapping the inputs and outputs of this controller to the real network dynamics of the epidemics are introduced. The strategy is tested with extensive simulations in a Covid-19 inspired scenario, with particular focus on the case of Codogno, a small city in Northern Italy that has been among the most harshly hit by the pandemic. The proposed algorithm generates dramatic reductions of epidemic levels, while maintaining a total level of social distancing close to the nominal optimum.
Designing policies to control pandemics is a long-standing research challenge, which has recently been brought into focus again by the Covid-19 pandemic. Until a vaccine is available, the most effective way [1], [2] of limiting the spread of Covid-19 has proven to be social distancing, from soft measures to full lockdown (referred to as control policies hereinafter). At the same time, it is also widely accepted that extreme levels of lockdown are unsustainable in the long run, due to the vast range of pernicious secondary effects that they may provoke [3], [4]. Finding the right policy is clearly a matter involving healthcare, sociological, and economic considerations that we do not aim to address here.
In this context, the role of modeling has been to provide a range of possible alternatives, and establish their effectiveness. Control theory is a natural candidate to test the usefulness of various control scenarios [5]. Usually, the evolution of the epidemic is described by low dimensional compartmental models [6], which can be adapted to act on network models [7] to take into account social contact structures. Indeed, a critical challenge in modeling epidemic spreading arises from the complexity of the social contact structure of the population experiencing the outbreak. Although complex, data-driven models of epidemics that include the high dimensional network-like structure are available [8], [9], developing controllers directly based on such models is a burdensome task, and results require extensive simulations of limited interpretability. Thus, most of the attention has been devoted to open loop optimal control applied to simpler models, to shed light on some important tasks, such as selecting optimal timing when there are constraints on the length of the action [10], solving a linear quadratic problem integrating number of deaths and economic effects [11], and discussing the problem of optimal peak reduction [12].
Fig. 1. Block diagram of the strategy proposed in this paper. The input and output maps reduce the high-dimensional dynamics of the outbreak to a simpler evolution of a few salient characteristics, namely the prevalence of infected and susceptible ı, s, which are sensitive to changes in the level of social distancing, modelled here as different values of the transmission rate of infection β. A nonlinear feedback controller acts within this representation, implementing trajectory tracking of an optimal control policy. The goal is to ensure that the number of infected people is kept below the health care system capacity, i.e. ı ≤ ı_th, while enforcing as few social distancing measures as possible.
However, open loop strategies have been shown to be quite sensitive to the many uncertainties affecting the controlled system, many of which arise from the difficulties in modeling the underlying network structure [12]-[14]. Looking at the problem through the lens of control theory, it appears clear that these robustness issues call for the implementation of feedback actions. Linear feedback controllers are proposed in [15], [16]. In [17], the loop is closed by periodically replanning the optimal action, in a model-predictive-control fashion. In [18], a similar strategy is proposed, and robustified by means of interval arithmetic. Finally, [19] introduces an open-loop fast switching strategy with duty cycle selected through a slow feedback of the total infected. To the best of the authors' knowledge, no work has been done to design a feedback control policy that can be directly applied to the high dimensional network-structured dynamics of the epidemics. In this work, we aim to take a very first step in this direction. We propose to design a control policy as a trajectory tracking problem, where the reference curve is devised by means of optimal control. The latter is built on the constraint that, in nominal conditions, the social distancing level is the minimum necessary to ensure that the healthcare system capacity is never exceeded. Feedback is used to enforce the robustness of this action, without spoiling its effectiveness. The stability of the proposed feedback controller is discussed analytically when applied to a compartmental model. We then introduce a method based on arguments widely accepted in epidemiology, which interfaces the low dimensional controller with a full network-based model, which will be our benchmark. We perform extensive simulations of epidemics on synthetic networks, with conditions inspired by real Covid-19 scenarios, showing the effectiveness of the control architecture across a wide range of different settings.
To succinctly summarise, our work contributes with
• a feedforward action expressed in closed form which optimally flattens the epidemic curve,
• a provably convergent trajectory tracking controller,
• two simple but effective strategies to interface the controller with a realistic network system, and
• extensive simulations showing the effectiveness of the proposed reactive policy in a wide range of network simulations.
Consider a fixed population of $N$ individuals, and a disease spreading among them through direct contacts. Each individual can be in one of three states: (i) susceptible, meaning that they can be infected by the pathogen; (ii) infected, meaning that they contracted the pathogen and can now infect other susceptible people; (iii) recovered (and therefore immune), or removed. We denote with $S(t)$, $I(t)$, $R(t)$ the number of people at time $t$ who are susceptible, infected or recovered, respectively. We have that $S(t) + I(t) + R(t) = N$. We can therefore neglect the study of $R$, as its value can always be recovered from $S$, $I$ and $N$. If the population is well mixed, the evolution of the disease can be well described by the so-called SIR model [6]
$$\dot{s}(t) = -\beta\,\imath(t)\,s(t), \qquad \dot{\imath}(t) = \beta\,\imath(t)\,s(t) - \gamma\,\imath(t), \tag{1}$$
where $s(t)$ and $\imath(t)$ are the system state, indicating respectively the number of susceptible $S(t)$ and infectious $I(t)$, divided by the total population $N$. Without loss of generality, we will consider in the following $t = 0$ as the time at which $s + \imath = 1$, meaning that no individual has yet recovered from the disease. The constant $\gamma \ge 0$ defines the transition rate from the pool of infected to the compartment of recovered/removed. $\beta$ is the rate at which an infected individual makes disease-transmitting contacts with other individuals in the population. From the analysis of this system it follows that the basic reproduction number is $R_0 = \beta/\gamma$, meaning that an outbreak happens if $\beta > \gamma$ [6].
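To make the behaviour of model (1) concrete, the following minimal sketch integrates it numerically. It is an illustration rather than the code used for the results in this letter: the fixed-step Runge-Kutta scheme, the step size and the function names `sir_rhs` and `simulate_sir` are assumptions made here.

```python
def sir_rhs(s, i, beta, gamma):
    """Right-hand side of the normalized SIR model (1)."""
    ds = -beta * i * s
    di = beta * i * s - gamma * i
    return ds, di


def simulate_sir(s0, i0, beta, gamma, t_end, dt=0.05):
    """Integrate (1) with a fixed-step RK4 scheme (an assumed choice).

    Returns three lists: time instants, s(t) and i(t).
    """
    t, s, i = [0.0], [s0], [i0]
    n_steps = int(round(t_end / dt))
    for k in range(n_steps):
        sk, ik = s[-1], i[-1]
        k1s, k1i = sir_rhs(sk, ik, beta, gamma)
        k2s, k2i = sir_rhs(sk + 0.5 * dt * k1s, ik + 0.5 * dt * k1i, beta, gamma)
        k3s, k3i = sir_rhs(sk + 0.5 * dt * k2s, ik + 0.5 * dt * k2i, beta, gamma)
        k4s, k4i = sir_rhs(sk + dt * k3s, ik + dt * k3i, beta, gamma)
        s.append(sk + dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0)
        i.append(ik + dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0)
        t.append((k + 1) * dt)
    return t, s, i


if __name__ == "__main__":
    gamma = 1.0 / 9.0
    # beta > gamma: the prevalence grows before decaying (outbreak).
    _, _, i_out = simulate_sir(0.999, 0.001, 2.2 * gamma, gamma, 180.0)
    # beta < gamma: the prevalence decays from the start (no outbreak).
    _, _, i_dec = simulate_sir(0.999, 0.001, 0.5 * gamma, gamma, 180.0)
    print("peak prevalence with beta = 2.2*gamma:", max(i_out))
    print("peak prevalence with beta = 0.5*gamma:", max(i_dec))
```

Running the two cases side by side illustrates the threshold role of $R_0 = \beta/\gamma$: the prevalence only rises above its initial value when $\beta s(0) > \gamma$.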
When social distancing policies are imposed, the value of β changes from a maximum of β_max > 0 (no policy put into place) to a minimum of 0 (total lockdown). Therefore β is the control input of (1). We propose here a control strategy acting on system (1). As shown by Fig. 1, this architecture is made of two components: (i) an optimal open loop action, and (ii) a feedback controller implementing trajectory tracking. The first block takes as input the maximum capacity of intensive care units in the local hospitals ı_th and an initial condition ı(0), and it generates as output a reference curve which is flattened enough to reach the threshold at its peak. The second block is a nonlinear feedback controller which robustly tracks this reference while minimally relying on model cancellations, i.e. full lockdown. Our aim here is to introduce a nominal strategy ("Optimal Solution" in Fig. 1) for optimally flattening the epidemic curve ı(t), so as to keep the number of infected people ı within the maximum capacity of the health-care system, which we call ı_th > 0. Enforcing this constraint is very important, since exceeding it may provoke a critical failure of the healthcare system, leading to a substantial increase in the number of deaths not only from the disease, but also from unrelated health issues. On the other hand, we want to keep the level of restriction on the population as low as possible, to minimise possible secondary negative effects (see the Introduction for more details). We are interested in the case of a constant β, which will however later become variable through the action of the feedback controller. This simplification is instrumental in making the optimal control problem more manageable. Although it is beyond the scope of this paper to relax this assumption, it is worth noting that in the case of piecewise constant β, only minimal changes are required for our arguments to hold.
We summarize the above considerations through the optimization problem
$$\max_{\beta \in \mathbb{R}} \beta, \quad \text{s.t.} \quad 0 < \imath(t) \le \imath_{th} \;\; \forall t \;\; \text{and (1)}. \tag{2}$$
We now propose a Lemma introducing a general solution to this optimal control problem.
Lemma 1. The solution of (2) exists in closed form, and it is equal to
$$\beta = -\frac{\gamma}{1 - \imath_{th}}\, W_{-1}\!\left(-\frac{1 - \imath_{th}}{e\,(1 - \imath(0))}\right), \tag{3}$$
where $W_{-1}$ is the branch $-1$ of the Lambert W function [20].
Proof. Since the cost function is linear in the optimization parameter, the optimal value is to be found on the boundary of the feasible set. We want to set $\beta$ in such a way that $\max_t \imath(t) = \imath_{th}$. The maximum value of $\imath$ is given by the non-trivial solution of $\dot{\imath}(t) = 0$. Combining this condition with the second equation in (1) yields $s^{+} = \gamma/\beta$. Further, we can combine the two equations of (1) into $d\imath/ds = \gamma/(\beta s) - 1$. This nonlinear ordinary differential equation can be solved together with the initial condition $s(0) + \imath(0) = 1$, giving
$$\imath(s) = 1 - s + \frac{\gamma}{\beta} \ln\frac{s}{1 - \imath(0)}. \tag{4}$$
By inverting $\imath(s^{+}) = \imath_{th}$ for $\beta$, we get the desired optimal value such that $\max_t \imath(t) = \imath_{th}$. The following is a solution for all integer values of $j$,
$$\beta_j = -\frac{\gamma}{1 - \imath_{th}}\, W_{j}\!\left(-\frac{1 - \imath_{th}}{e\,(1 - \imath(0))}\right). \tag{5}$$
However, only $W_{-1}$ and $W_0$ take values on the real line [20]. Moreover, it is always the case that $W_0 > W_{-1}$, which in turn assures that the larger value of $\beta$ among the two possible solutions is always reached for $j = -1$, concluding the proof.
It is worth noting that the argument of $W_{-1}$ is always between $-1/e$ and $0$, since $0 < \imath(0) \le \imath_{th}$. This is exactly the range of arguments for which the $-1$ branch of the Lambert function is well defined [20]. Fig. 2 shows a few instances of solutions for different values of $\imath(0)$ and $\imath_{th}$. Note that the corresponding evolution of the state (which we will call $\bar{\beta}, \bar{\imath}, \bar{s}$ hereinafter) can be obtained either by direct integration of (1) or by using approximated closed form solutions, as for example in [21]. The following Lemma introduces the tracking controller ("Trajectory Tracking" in Fig. 1) implementing the reactive change of the social distancing level $\beta$.
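The closed-form solution of Lemma 1 can be checked numerically. The sketch below follows the steps of the proof: the peak is reached at $s^{+} = \gamma/\beta$, the orbit is $\imath(s) = 1 - s + (\gamma/\beta)\ln(s/(1-\imath(0)))$, and imposing $\imath(s^{+}) = \imath_{th}$ gives $\beta = -\gamma\, W_{-1}(a)/(1-\imath_{th})$ with $a = -(1-\imath_{th})/(e\,(1-\imath(0)))$. This expression is our reading of the derivation; the bisection routine used for $W_{-1}$ and all function names are assumptions made for illustration.

```python
import math


def lambert_w_minus1(a, iterations=200):
    """Branch -1 of the Lambert W function for -1/e <= a < 0, by bisection.

    On (-inf, -1] the map w -> w*exp(w) is decreasing in w and takes
    values in [-1/e, 0), so the root of w*exp(w) = a is unique there.
    """
    if not (-1.0 / math.e <= a < 0.0):
        raise ValueError("W_{-1} is real only for -1/e <= a < 0")
    right = -1.0                       # here w*exp(w) = -1/e <= a
    left = -2.0
    while left * math.exp(left) < a:   # move left until w*exp(w) >= a
        left *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (left + right)
        if mid * math.exp(mid) < a:
            right = mid                # root lies to the left of mid
        else:
            left = mid
    return 0.5 * (left + right)


def optimal_beta(i0, i_th, gamma):
    """Largest constant beta whose epidemic peak equals i_th (assumed form)."""
    if not (0.0 < i0 <= i_th < 1.0):
        raise ValueError("requires 0 < i0 <= i_th < 1")
    a = -(1.0 - i_th) / (math.e * (1.0 - i0))
    a = max(a, -1.0 / math.e)          # guard against round-off when i0 == i_th
    return -gamma * lambert_w_minus1(a) / (1.0 - i_th)


def peak_prevalence(beta, gamma, i0):
    """Peak of i along the SIR orbit, reached at s = gamma/beta.

    Meaningful when beta * (1 - i0) > gamma, i.e. when the peak lies ahead.
    """
    u = gamma / beta
    return 1.0 - u + u * math.log(u / (1.0 - i0))


if __name__ == "__main__":
    gamma = 1.0 / 9.0
    beta_opt = optimal_beta(i0=0.001, i_th=0.025, gamma=gamma)
    print("beta_opt / gamma =", beta_opt / gamma)
    print("peak prevalence  =", peak_prevalence(beta_opt, gamma, 0.001))
```

For $\imath(0) = 0.001$ and $\imath_{th} = 0.025$ the resulting $\beta$ is roughly $1.27\,\gamma$, and substituting it back into the orbit returns a peak equal to $\imath_{th}$ up to round-off.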
Note that, although we consider here its use in conjunction with the optimal strategy introduced in the previous section, this controller is agnostic to the choice of the reference to be tracked, and it is introduced as such.
Lemma 2. The feedback loop composed of the control action
$$\beta(s, \imath) = \psi_i(\imath - \bar{\imath}) + \psi_s(s - \bar{s}) + \bar{\beta}\,\frac{\bar{s}\,\bar{\imath}}{s\,\imath} \tag{6}$$
and the SIR model (1) is such that $\lim_{t\to\infty}(s, \imath) = (\bar{s}, \bar{\imath})$, $\forall s, \imath$ and $\forall \psi_i, \psi_s \in \mathbb{R}$, $\psi_s > 0$, if $\bar{s}, \bar{\imath}, \bar{\beta}$ is a solution of (1).
Proof. Consider the linear change of coordinates Adding up the two equations in (1) yields $\dot{\imath} + \dot{s} = -\gamma\imath$. We can therefore establish the inverse change of coordinates Combining the latter with the second equation in (1) allows writing the following equivalent formulation of the SIR dynamics $\ddot{x}$ which is in normal form. We take the following control action with $\alpha_p > \gamma\alpha_d \ge 0$ being the gains of the PD-like action. This produces the closed loop dynamics $\ddot{e} = (\gamma + \alpha_d(\gamma x + \dot{x})\dot{x})\dot{e} + \alpha_p(\gamma x + \dot{x})\dot{x}\,e$, where $e = \bar{x} - x$. By hypothesis $\gamma + \alpha_d(\gamma x + \dot{x})\dot{x} > 0$ and $\alpha_p(\gamma x + \dot{x})\dot{x} > 0$. Therefore, both $e$ and $\dot{e}$ converge exponentially to zero (see for example [22]), which combined with (9) yields (7). Concluding the proof requires showing the equivalence of (11) and (6). Substituting the reference $\ddot{\bar{x}}$ calculated from (10) into (11) yields $\gamma\dot{x} + \ddot{x} = (\gamma x + \dot{x})\dot{x}$. Eq. (10) is obtained by considering that $\dot{x} = \imath$ and (8), and by taking $\psi_i = \alpha_p/\gamma - \alpha_d$ and $\psi_s = \alpha_p/\gamma$. Note that the latter are always strictly positive by construction.
We want our control action to remain limited when acting on a neighborhood of $s\imath = 0$. Also, it is not meaningful to act on the system by changing $\beta$ to negative values. Similarly, it is not acceptable to get $\beta$ greater than $\beta_{max}$. The control action is therefore regularized and saturated, using a small positive constant and the saturation operator $[a]_l^u$, which is equal to $l$ or $u$ if $a < l$ or $a > u$ respectively, and equal to $a$ otherwise. Fig. 3 reports two examples of application of the algorithm to the SIR model (1).
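A minimal sketch of how a saturated version of the tracking law could be evaluated is given below. It reads (6) as a feedforward term $\bar{\beta}\,\bar{s}\,\bar{\imath}/(s\,\imath)$ plus a linear feedback on the tracking errors. Bounding the denominator from below with a small constant `eps` and clipping the result to $[0, \beta_{max}]$ are assumptions made here to mirror the requirements stated above, and the names `saturate` and `tracking_beta` are hypothetical.

```python
def saturate(a, lower, upper):
    """Return lower if a < lower, upper if a > upper, and a otherwise."""
    return max(lower, min(upper, a))


def tracking_beta(s, i, s_ref, i_ref, beta_ref, psi_i, psi_s, beta_max, eps=1e-6):
    """Saturated trajectory tracking action (illustrative reading of (6)).

    s, i          : current susceptible and infected fractions
    s_ref, i_ref  : reference fractions at the same time instant
    beta_ref      : reference (open loop) transmission rate
    psi_i, psi_s  : feedback gains
    eps           : assumed lower bound on s*i, keeps the action bounded
    """
    # Feedforward: reproduces the reference infectious pressure.
    feedforward = beta_ref * s_ref * i_ref / max(s * i, eps)
    # Feedback: linear in the tracking errors.
    feedback = psi_i * (i - i_ref) + psi_s * (s - s_ref)
    # beta cannot be negative, nor exceed the no-policy value beta_max.
    return saturate(feedforward + feedback, 0.0, beta_max)


if __name__ == "__main__":
    # On the reference itself the action reduces to the open loop value.
    print(tracking_beta(0.9, 0.02, 0.9, 0.02, 0.15, 1.0, 2.0, 0.25))
    # Close to s*i = 0 the action stays bounded thanks to the saturation.
    print(tracking_beta(0.5, 0.0, 0.9, 0.02, 0.15, 1.0, 2.0, 0.25))
```

When the state coincides with the reference, the feedback term vanishes and the function returns $\bar{\beta}$, so the controller leaves the nominal policy untouched in the absence of disturbances.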
We implement two important features in a refined model, namely that (i) people interact through heterogeneous contact structures, i.e. the population is not well-mixed, and (ii) real epidemics have an intrinsic degree of stochasticity, therefore they cannot be exactly described by (1). We therefore consider stochastic epidemics on networks [7], in which the contact structure is modeled by a network, where individuals are modeled as nodes interacting only with their neighbors through links, which carry the disease from infected nodes to their susceptible neighbors at a constant rate $\beta_n$. Infected nodes recover independently at a constant rate $\gamma$, after which they do not participate further in the epidemic. The epidemic is initialized by infecting $I_0 = N\imath(0)$ randomly chosen nodes, the others being susceptible. The resulting process is therefore a continuous time Markov chain, with a state space of dimension $3^N$ (arrangements of length $N$ with entries S, I or R). In this paper, we make use of the Gillespie algorithm [23] adapted to networks (see [7, Appendix]) to simulate epidemics on different instances of Erdős-Rényi networks [24]. The generative process of an Erdős-Rényi network can be concisely described as follows: start with $N$ isolated nodes; then, for each pair of different nodes, a link connecting them is placed with probability $0 < p < 1$, with no multi-edges allowed. Hence, the number of neighbors $k$ of a node follows a binomial distribution $B(N-1, p)$, with $E[k] = p(N-1)$ being the average. Since the recovery rate on the network is independent of the contact structure, we set it to be equal to $\gamma$ in (1). To connect the controller proposed above to the network structure we need to introduce two maps, as shown in Fig. 1. The output map extracts $s$ and $\imath$ from the full state of the network by counting the number of susceptible and infected subjects, and normalizing it by the total population $N$.
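The network model and the output map described above can be sketched as follows. This is a deliberately simple, unoptimized illustration (each event costs a full pass over the nodes) rather than the simulation code behind our results; the function names, the data layout and the assumption $\gamma > 0$ are choices made here.

```python
import random


def erdos_renyi(n, p, rng):
    """Adjacency lists of an Erdos-Renyi graph: each pair linked with prob. p."""
    neighbors = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                neighbors[u].append(v)
                neighbors[v].append(u)
    return neighbors


def gillespie_sir(neighbors, beta_n, gamma, n_infected0, t_end, rng):
    """Stochastic SIR on a network with per-link rate beta_n (gamma > 0 assumed).

    Returns a list of (t, s, i): the output map evaluated after every event.
    """
    n = len(neighbors)
    state = ["S"] * n
    infected = rng.sample(range(n), n_infected0)
    for u in infected:
        state[u] = "I"
    inf_nbrs = [0] * n                 # number of infected neighbors per node
    for u in infected:
        for v in neighbors[u]:
            inf_nbrs[v] += 1
    n_s = n - n_infected0
    t = 0.0
    history = [(t, n_s / n, len(infected) / n)]
    while infected:
        at_risk = [v for v in range(n) if state[v] == "S" and inf_nbrs[v] > 0]
        weights = [inf_nbrs[v] for v in at_risk]
        rate_inf = beta_n * sum(weights)   # beta_n times the number of S-I links
        rate_rec = gamma * len(infected)
        total = rate_inf + rate_rec
        t += rng.expovariate(total)        # exponential waiting time
        if t > t_end:
            break
        if rng.random() * total < rate_rec:
            u = infected.pop(rng.randrange(len(infected)))   # recovery
            state[u] = "R"
            delta = -1
        else:
            u = rng.choices(at_risk, weights=weights)[0]     # infection
            state[u] = "I"
            infected.append(u)
            n_s -= 1
            delta = 1
        for v in neighbors[u]:
            inf_nbrs[v] += delta
        history.append((t, n_s / n, len(infected) / n))
    return history


if __name__ == "__main__":
    rng = random.Random(0)
    net = erdos_renyi(500, 19 / 499, rng)  # average degree close to 19
    hist = gillespie_sir(net, beta_n=0.0129, gamma=1 / 9, n_infected0=5,
                         t_end=180.0, rng=rng)
    print("events:", len(hist) - 1, "peak prevalence:", max(h[2] for h in hist))
```

Each entry of the returned history is exactly what the controller would observe through the output map: the counts of susceptible and infected nodes divided by $N$.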
The input map instead provides expressions for the control input on the network level $\beta_n$ given the output of the controller $\beta(s, \imath)$. To this end, we manipulate the first equation of (1) as follows:
$$\dot{S} = -\beta I \frac{S}{N}. \tag{13}$$
The term $\beta I \frac{S}{N}$ represents the total infectious pressure in the ODE model, i.e. the rate at which infections happen. This quantity drives the whole infectious process, and it is crucial that the map preserves it. Unfortunately, on the network, the infectious pressure is given by $\beta_n$ times the number of links between infected and susceptible nodes, which is a random variable that depends on which nodes are infected/recovered and on the topology of the network. To overcome this issue, we introduce the so-called mean-field approximation [7]: on average, an infected node is connected to $E[k]$ neighbors, of which we assume that a proportion $\frac{S}{N}$ is susceptible; hence, we set the number of $S-I$ links as $E[k]\, I \frac{S}{N}$. This allows us to derive an approximated expression for $\beta_n$ as a function of $\beta$:
$$\beta_n = \frac{\beta}{E[k]}. \tag{14}$$
The mean-field approach can also be seen as an approximation where infected nodes are placed on the network uniformly at random. This often gives an upper estimate of the true $S-I$ link count [7], which should result in a more conservative control.
Codogno was the first city in Lombardy with a diagnosed case of Covid-19, on February 14, 2020. We therefore take it as a prototypical example on which to apply our controller. We set $\gamma = \frac{1}{9}\ \text{days}^{-1}$, in line with the 9 days recovery time from symptoms to first negative RT-PCR results, as reported by [25]. We also set $\beta = R_0\gamma$ (in $\text{days}^{-1}$) with $R_0 = 2.2$, in line with [26]. Codogno has around $N = 16000$ inhabitants and its hospital had 4 available ICU beds at the beginning of the outbreak. Combining WHO guidelines [27] with [28], we assume that at any time 1% of the infected are in need of intensive care units. The reference policy $\bar{\beta}$ is therefore set to be the one that at its peak uses up the available capacity of the hospital, i.e. $\imath_{th} = 0.025$.
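The numbers of the Codogno scenario and the mean-field input map can be reproduced with a few lines of arithmetic. In the sketch below the function names are illustrative assumptions, and the average degree of 19 is the value adopted for the contact network in this scenario.

```python
def icu_threshold(icu_beds, population, icu_fraction):
    """Prevalence at which the infected needing intensive care fill all beds."""
    return icu_beds / (population * icu_fraction)


def input_map(beta, mean_degree):
    """Mean-field input map: per-link rate beta_n preserving the pressure beta*I*S/N."""
    return beta / mean_degree


if __name__ == "__main__":
    gamma = 1.0 / 9.0              # recovery rate, 1/days
    r0 = 2.2                       # basic reproduction number
    beta = r0 * gamma              # transmission rate without any policy
    i_th = icu_threshold(icu_beds=4, population=16000, icu_fraction=0.01)
    beta_n = input_map(beta, mean_degree=19)
    print("i_th   =", i_th)        # 4 beds / (16000 * 1%) = 0.025
    print("beta   =", beta)
    print("beta_n =", beta_n)
```

The threshold follows from requiring that 1% of the infected, i.e. $0.01\,\imath_{th} N$ people, equals the 4 available beds. The input map simply rescales the controller output by the average degree, so that $\beta_n E[k]\, I \frac{S}{N} = \beta I \frac{S}{N}$.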
The average number of daily contacts at risk in Italy is estimated to be approximately 19 [29]. This is used as the average degree $E[k]$ of a random network to represent the social contact structure of people in Codogno. We consider a time window of 180 days, with an initial condition for the simulation of I(0) = 800 of the population, to mimic a delayed recognition of the presence of an outbreak. On top of the uncertainties already introduced by the network itself, we consider three levels of further real-world non-ideal behaviors. Level (i) has no further changes. Level (ii) has a delay in the knowledge of the state equal to 2 days, the change of control action happens only at the beginning of each day, the control action is quantized into a reduced set of possible levels $\beta \in \{0, 0.1, \ldots, 0.9, 1\}\beta_{max}$, and a Gaussian measurement noise with variance $10^{-3}$ is added. Level (iii) increases the delay to 3 days, the quantization levels reduce in number to $\beta \in \{0, 0.2, 0.4, 0.6, 0.8, 1\}\beta_{max}$, and the noise increases to 0.01 (i.e. about 50% of the actual signal). Note that the delay in the measurements is consistent with the amount of time necessary to collect the results of daily swab tests and to perform statistical analysis to get an estimate of the prevalence. The time discretization reflects the practical impossibility of changing the level of social distancing instant by instant. Similarly, the quantization of the control action mimics the limited set of interventions that can be realistically implemented by policy makers. Finally, the noise on data represents the uncertainty in obtaining a precise estimate of the true prevalence from daily tests. Figs. 4 and 5 show the evolution of infected and of prescribed social distancing respectively. Together with the result obtained when using the proposed feedback action $\beta(s, \imath)$, we report as comparison the evolution of the uncontrolled epidemics ($\beta = \beta_{max}$) and the evolution under the open loop action.
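The non-ideal behaviors listed above can be emulated with two small helpers, sketched here under explicit assumptions: quantization rounds to the nearest admissible level, noisy measurements are clipped to $[0, 1]$, and the true state is stored once per day. None of these details is prescribed by the text, and the function names are hypothetical.

```python
import random


def quantize(beta, beta_max, n_levels):
    """Round beta to the nearest of n_levels equally spaced values in [0, beta_max].

    Nearest-level rounding is an assumption; a policy maker could instead
    round towards the more restrictive level.
    """
    step = beta_max / (n_levels - 1)
    k = round(beta / step)
    k = max(0, min(n_levels - 1, k))
    return k * step


def delayed_noisy_measurement(history, day, delay_days, noise_std, rng):
    """Measurement of (s, i) available on a given day.

    history[d] holds the true (s, i) at the beginning of day d. The reading
    refers to the state delay_days earlier and is corrupted by zero-mean
    Gaussian noise; clipping to [0, 1] is an assumption.
    """
    d = max(0, day - delay_days)
    s, i = history[d]

    def clip(x):
        return min(1.0, max(0.0, x))

    return clip(s + rng.gauss(0.0, noise_std)), clip(i + rng.gauss(0.0, noise_std))


if __name__ == "__main__":
    rng = random.Random(1)
    true_state = [(1.0 - 0.01 * d, 0.01 * d) for d in range(10)]
    # Level (ii): 2 days of delay, 11 levels, noise variance 1e-3.
    s_meas, i_meas = delayed_noisy_measurement(true_state, 5, 2, 1e-3 ** 0.5, rng)
    print("measured state:", s_meas, i_meas)
    print("quantized action:", quantize(0.37, 1.0, 11))
```

In a daily loop, the controller would be evaluated once per day on the delayed, noisy measurement, and its output passed through `quantize` before being applied for the following 24 hours.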
Susceptible percentages s are not shown for the sake of space, since it is not the core goal of our controller to regulate them. All the simulations are repeated 100 times. Every time a new network is generated. Average behaviors are shown as dashed lines, together with the corresponding σ-band as a translucent area of the same color. Fig. 6 reports some representative examples of realizations. Note that the higher the uncertainties, the higher the chances of observing brief periods where ı ≥ ı_th. The improvement of performance with respect to the uncontrolled epidemics will be quantified in the following subsection. We perform extensive simulations for various choices of system parameters. We consider a population of N = As for the Codogno validation, each simulation is repeated 100 times, always randomly re-generating the network. For the sake of space, we cannot report here the complete results of our simulations. We report instead some relevant performance indices. The use of the controller consistently induces a reduction of over 99% in the number of people infected while ı > ı_th. This is obtained by flattening the epidemic curve, and as a consequence the total duration of the outbreak doubles. The average level of control measures imposed depends on the parameters. More specifically, the average value of β increases with E[k] and decreases as γ increases, from a minimum of 0.17 β_max to a maximum of 0.95 β_max. Note that varying E[k] affects both the network and the control map (14). In none of the simulations did the controller present critical failures or unstable behaviors. This preliminary work showed that a simple feedback action can dramatically improve the robustness and the effectiveness of an optimal policy for epidemic control.
A relevant outcome observed in all our simulations is that, when control acts on an outbreak that has already reached a significant proportion of the population, the advisable strategy is to go into full lockdown until the epidemic curve is brought down to acceptable levels, and then gradually relax control measures while keeping the epidemic on a manageable trajectory. Ideally, when implementing control policies based on daily testing data, policy makers should have access to the exact state of the system. Clearly, this is not the case. At the very best, daily tests on a fraction of the population can be considered a (noisy) proxy of the state, and robust statistical analysis needs to be carried out to get an estimate of it. Together with several further uncertainties, we tested how the controller behaves when the input data are not exact, but instead normally distributed around the true value. We observe that the controller can maintain almost equal average performance, but the variance between individual realizations increases. This is however far from being the ultimate solution to model-based reactive quarantine design. On the contrary, we see it as a first step. Future work will be devoted to using more accurate mean-field models, improving the control design, using more realistic network models and possibly implementing learning strategies, the ultimate goal being to have a model that can be applied to real contact networks, which are often not fully specified a priori.
[1] Modelling strategies for controlling SARS outbreaks
[2] Mathematical models to guide pandemic response
[3] Fear, lockdown, and diversion: comparing drivers of pandemic economic decline 2020
[4] Covid-19-related suicides in Bangladesh due to lockdown and economic factors: case study evidence from media reports
[5] How control theory can help us control Covid-19
[6] Containing papers of a mathematical and physical character
[7] Mathematics of Epidemics on Networks: from exact to approximate models
[8] Modelling COVID-19
[9] The effect of human mobility and control measures on the COVID-19 epidemic in China
[10] The timing of one-shot interventions for epidemic control
[11] Optimal Covid-19 epidemic control until vaccine deployment
[12] Optimal, near-optimal, and robust epidemic control
[13] Beyond R0: Heterogeneity in secondary infections and probabilistic epidemic forecasting
[14] The impact of network properties and mixing on control measures and disease-induced herd immunity in epidemic models: a mean-field model perspective
[15] Can the Covid-19 epidemic be controlled on the basis of daily test reports
[16] Modelling the Covid-19 epidemic and implementation of population-wide interventions in Italy
[17] A parametrized nonlinear predictive control strategy for relaxing Covid-19 social distancing measures in Brazil
[18] Robust and optimal predictive control of the Covid-19 outbreak
[19] On fast multi-shot epidemic interventions for post lock-down mitigation: Implications for simple Covid-19 models
[20] On the Lambert W function
[21] Accurate closed-form solution of the SIR epidemic model
[22] Exponential convergence rates of nonlinear mechanical systems: The 1-DoF case with configuration-dependent inertia
[23] Exact stochastic simulation of coupled chemical reactions
[24] Random Graphs. Cambridge Studies in Advanced Mathematics
[25] Persistence and clearance of viral RNA in 2019 novel coronavirus disease rehabilitation patients
[26] High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2
[27] Media statement: Knowing the risks for Covid-19
[28] Critical Care Utilization for the COVID-19 Outbreak in Lombardy, Italy: Early Experience and Forecast During an Emergency Response
[29] What types of contacts are important for the spread of infections? Using contact survey data to explore European mixing patterns