key: cord-0990299-1yri67si authors: Köhler, Johannes; Schwenkel, Lukas; Koch, Anne; Berberich, Julian; Pauli, Patricia; Allgöwer, Frank title: Robust and optimal predictive control of the COVID-19 outbreak date: 2020-12-23 journal: Annu Rev Control DOI: 10.1016/j.arcontrol.2020.11.002 sha: a31f9e56f71a61e0ef7333e3eef8195560b8c2db doc_id: 990299 cord_uid: 1yri67si

We investigate adaptive strategies to robustly and optimally control the COVID-19 pandemic via social distancing measures based on the example of Germany. Our goal is to minimize the number of fatalities over the course of two years without inducing excessive social costs. We consider a tailored model of the German COVID-19 outbreak with different parameter sets to design and validate our approach. Our analysis reveals that an open-loop optimal control policy can significantly decrease the number of fatalities when compared to simpler policies under the assumption of exact model knowledge. In a more realistic scenario with uncertain data and model mismatch, a feedback strategy that updates the policy weekly using model predictive control (MPC) leads to a reliable performance, even when applied to a validation model with deviant parameters. On top of that, we propose a robust MPC-based feedback policy using interval arithmetic that adapts the social distancing measures cautiously and safely, thus leading to a minimum number of fatalities even if measurements are inaccurate and the infection rates cannot be precisely specified by social distancing. 
Our theoretical findings support various recent studies by showing that (1) adaptive feedback strategies are required to reliably contain the COVID-19 outbreak, (2) well-designed policies can significantly reduce the number of fatalities compared to simpler ones while keeping the amount of social distancing measures on the same level, and (3) imposing stronger social distancing measures early on is more effective and cheaper in the long run than opening up too soon and restoring stricter measures at a later time. Social distancing is an effective way to contain the spread of a contagious disease, particularly when little is known about the virus and no vaccines or other pharmaceutical interventions are available. Social distancing and isolation (together with other non-pharmaceutical measures such as hygiene and face masks) have a direct influence on the infection rates and hence on the spread of the virus [1, 2, 3] . While this combination has proven effective during the last weeks, e.g. in the German outbreak of COVID-19, strict social distancing is also very costly in terms of economical and psychological damage, which naturally leads to a multi-objective decision problem. There have been numerous approaches to model the COVID-19 outbreak and to predict future behavior for different distancing policies in simulation studies. The most commonly used modeling approaches are different extensions of the SIR (susceptible-infected-removed) model formulated either as system dynamics or as agent-based simulations (e.g. for Germany [4, 5, 6] ). In many such studies, different policies are simulated and compared with respect to the goals that both the health care system is not overwhelmed such that every patient in need receives treatment and the mortality rate is kept low, and also such that the majority of people can resume social interaction as soon as possible. 
However, in line with [7] and others, we advocate to go from mere model predictions to (model predictive) control, since control generally offers the theory to develop and apply optimal or robust decision making under uncertainties. While mathematical modeling and control of epidemics is a topic with rich history (see, e.g., the survey in [8] and the references therein), there have also been numerous approaches to apply control theory to the COVID-19 spread. In [9] , the author applies control theoretic principles and insights to a simple model of the outbreak to point out the difficulties of the system at hand: fast unstable dynamics with significant delays. In more recent literature, multiple works have addressed the problem of open-loop optimal control for the COVID-19 pandemic. In [10, 11] , for example, the authors argue in favor of 'on and off'-policies of the social distancing measures, yielding a bang-bang like optimal control strategy. Such 'on and off'-policies, where the control input switches between two states, however, could pose great challenges, amongst others, for the society, but also for production lines, supply chains and the economy in general. In this paper, we propose optimal open-loop and feedback control strategies to handle the German COVID-19 outbreak. We employ the recently developed SIDARTHE model [12] in order to design control policies which minimize the number of fatalities within a time horizon of two years, without using excessive social distancing measures. We also address robustness of our policies w.r.t. model and measurement uncertainties via a (robust) model predictive control (MPC) feedback strategy. Note that the following discussion and results are all based on the information and data available prior to the initial submission of this paper (May 2020). Similar to the setup in this paper, the authors in [13] explore the best policy to implement while waiting for the availability of a vaccine. 
In their paper, they also distinguish between varying severity of symptoms ('mild' or 'severe') and seek a solution to the multi-objective optimization problem of minimizing fatalities and costs due to the implementation of the control strategy itself. The main outcome of their open-loop input strategy is qualitatively similar to our results in Section 3.2: start with a loose strategy, soon increase all distancing measures such that the health care system capacities are never exhausted, and then relax the social distancing measures gradually and slowly. Another example of an open-loop optimal policy applied to the COVID-19 pandemic is presented in [14], where the authors consider optimal control of the German outbreak using a slightly simpler model than the one chosen in the present paper (without distinguishing between detected and undetected individuals), which also includes an increased mortality rate if the ICU capacity is exceeded. Therein, the objective is to minimize not only the number of fatalities but also the number of susceptible individuals at the end of the time horizon, thus aiming for herd immunity. Our investigations in Section 3, however, indicate that herd immunity cannot be reached in a reasonable amount of time without overwhelming the hospital capacities. Therefore, our approach minimizes the number of fatalities after two years, with the underlying assumption that a vaccine will be available thereafter. However, an open-loop optimal policy cannot suffice to control the COVID-19 pandemic given all the uncertainties in the spreading of the virus and the disease progression, as we will see in the numerical results. We argue, similar to [7], that an MPC-based feedback strategy is the right tool to develop optimal and robust social distancing policies, especially in the presence of model inaccuracies. 
By using online measurements of the current outbreak, feedback inherently introduces robustness with respect to uncertainties and disturbances to the policy. We also robustify the feedback mechanism by introducing a robust MPC-based feedback strategy for uncertain state measurements which is crucial in a situation where only a limited amount of data is available and, for example, the number of the currently infected persons can only be estimated roughly by applying different studies. Our results are also in accordance with a very recently published joint strategy paper for Germany by authors from different German research institutions (Fraunhofer-Gesellschaft, Helmholtz Association, Leibniz Association and Max Planck Society) [15] . Firstly, they also state that reaching herd immunity without the availability of a vaccine would either exceed the health care capacities (with a resulting high mortality rate) or take several years (cf. our results in Section 3). Secondly, they state that the goal of wiping out the virus can only be a robust solution if this eradication would be a worldwide effort with very high social and economic costs (cf. our results in Section 3.1.1), which seems impossible to realize. Finally, they suggest an adaptive strategy for all policies influencing the infection rates with the goal to keep the spread of COVID-19 at bay while requiring the least possible restraints on the society and economy. With exactly this reasoning, we develop suitable control approaches in Section 3.2, Section 4, and Section 5 for such an 'adaptive' strategy. To summarize, our key contributions are the following: We extend the model in [12] by a mortality rate dependent on the state of the health care capacity and fit the parameters with data from Germany (Section 2). 
We develop an optimization problem for finding the optimal input (in terms of setting infection rates) that minimizes the number of fatalities while keeping the costs occurring due to distancing measures low (Section 3). Moreover, we show that such an optimal input has significant advantages compared to simpler baseline policies. We show that simply applying a precomputed (optimized) input is dangerous if the model is uncertain and explain why feedback is of utmost importance when dealing with such an unstable and uncertain system. Further, we demonstrate how such feedback can be incorporated via MPC, and we showcase the advantages of this control policy (Section 4). We develop a robust MPC-based feedback strategy, which takes model inaccuracies, uncertain state measurements, and inexact inputs into account and can thus handle the COVID-19 outbreak cautiously and safely (Section 5). Although based on a simple model fitted with limited data, we hope that these high-level insights inspire further investigations, possibly on more complex epidemiological models, and can ultimately help decision makers to improve and optimize their policies to mitigate the spread of epidemics while keeping the toll on the society and economy low. In this section, we describe the model of the COVID-19 epidemic that we use for our subsequent control approach. Our model is adapted from the SIDARTHE model proposed in [12] with the key differences that (i) we use more recent data to estimate new parameters to model the German COVID-19 outbreak (in [12] , the Italian outbreak was considered) and (ii) we model the fact that the mortality rate increases if the number of critically ill patients exceeds the capacity of the German health care system. In Section 2.1, we describe the model of [12] and explain its ingredients. Thereafter, in Section 2.2, we provide details on our parameter estimation algorithm which fits the model to the German COVID-19 outbreak. 
Finally, we propose an extension of the model by increasing the mortality rate when the health care system is overwhelmed in Section 2.3. The model distinguishes the following states: S - Susceptible, I - Infected (asymptomatic, undetected), D - Diagnosed (asymptomatic, detected), A - Ailing (symptomatic, undetected), R - Recognized (symptomatic, detected), T - Threatened (symptomatic with life-threatening symptoms, detected), H - Healed (immune after prior infection, detected or undetected), E - Extinct (dead, detected). In the equations (1), capital letters describe fractions of the whole population that are currently in the respective state. Since the model represents the whole population, the states sum up to 1, i.e., they must satisfy S + I + D + A + R + T + H + E = 1 at all times. Therefore, one equation in (1) is redundant and hence, e.g., the state H can be expressed via the algebraic relation H = 1 − (S + I + D + A + R + T + E) instead of Equation (1g), as is common in the field of differential algebraic equations. In most parts of this section, we omit time arguments for simplicity. Further, Greek letters denote the model parameters, which are briefly summarized in the following: α, β, γ describe the infection rates for susceptible individuals, i.e., the rates at which susceptible individuals are infected by individuals in the states I, D or R, and A, respectively, and hence join the state I. ε, θ describe the testing rates, i.e., the rates at which (asymptomatic or symptomatic) infected individuals go from undetected to detected. ζ describes the rate at which asymptomatic (detected or undetected) infected individuals develop symptoms, i.e., go from states I or D to A or R, respectively. µ is the rate at which infected individuals in A or R develop life-threatening symptoms, i.e., join the state T. λ, κ, σ(T) are recovery rates for individuals affected by COVID-19. The recovery rate for threatened individuals σ(T) depends on T, compare Section 2.3. τ(T) is the mortality rate, i.e., the rate at which individuals with life-threatening symptoms die, and it depends on T, compare Section 2.3. 
Key features of the considered model for the COVID-19 pandemic compared to simpler ones (e.g., SIR models, compare [16] ) are that it distinguishes between detected and undetected cases, symptomatic and asymptomatic individuals, and it includes a separate state T for patients with life-threatening symptoms (compare [12] for a more detailed explanation of the key ingredients). The present model, i.e., Equations (1) as well as Figure 1 , is a mild modification of the model suggested in [12] . First, we reduce the number of parameters by including the following assumptions. We assume that the rate for developing (severe) symptoms is the same for detected and undetected cases, since (to this day) no effective medication of COVID-19 is known. More precisely, the transitions from I to A and D to R have the same dynamics with rate ζ, and similarly for the respective recovery rates as well as for the transitions from A to T and R to T . Moreover, we assume that the rate β at which susceptible individuals are infected is the same from states D and R, since the state D is neglected for the parameter identification step (compare Section 2.2). Finally, as a key difference to [12] , we consider T -dependent rates τ (T ) and σ(T ) for threatened patients, i.e., the mortality and recovery rates depend on the current number of threatened patients. Essentially, τ (T ) increases and σ(T ) decreases if T exceeds the capacity of the German health care system (see Section 2.3 for a detailed description of this model refinement). In this section, we adjust the model parameters and the initial condition given in [12] to the COVID-19 outbreak in Germany. This is necessary, because the outbreaks in Germany and Italy evolved differently due to differences in the testing policy, the testing capacity, the health care system, the reaction of the governmental authorities, and the underlying counting method of confirmed cases. 
In order to compute realistic parameters for Germany, we use a pragmatic approach that enables us to easily include prior knowledge about relations between parameters. The approach is a least squares fit to the available data, where prior knowledge is incorporated via hard constraints in the optimization problem. The available data is marked by a tilde and consists of two sets. The first set comprises the confirmed COVID-19 cases C̃, deaths Ẽ, and recoveries H̃_c from [17], [18] for the days t ∈ [0, 49] from February 28, 2020 (t = 0) to April 21, 2020 (t = 53). We filtered this data set using the Matlab function kaiser(7,3) with window length 7 and shape factor 3 to reduce the effect of noise and of fewer cases being confirmed on weekends. Further, we divide the data set by the total German population N_total = 8.3 · 10^7 so that all values are normalized and lie in the range [0, 1]. The second set comprises the COVID-19 patients in ICU T̃_2 and how many of them died Ẽ_2 or recovered H̃_2 from [19] for t ∈ [24, 53] from March 23 (t = 24) to April 21 (t = 53). This data, however, is rather small compared to the complexity of the model (1) consisting of eight states and 13 parameters. Therefore, we need to leverage additional prior knowledge in order to avoid over-fitting and to ensure realistic parameters and initial conditions. Based on other studies and the interpretation of our model states and parameters, we enforce the following assumptions. The detection rate of asymptomatic cases is negligible, as the current German policy is to test only people showing symptoms [20], i.e., ε = 0. On February 28, the initial date for our fit, there were 48 confirmed cases; hence, we assume R(0) = 48/N_total, T(0) = D(0) = H(0) = E(0) = 0, and I(0), A(0), S(0) appear as decision variables. The test rate θ is approximately constant. Please note that this does not mean that the absolute number of tests is constant per day, as this value is proportional to θA rather than to θ. 
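The smoothing step mentioned above can be illustrated with a minimal sketch. The text only states that the Matlab window kaiser(7,3) was used; how the window is applied to the data (here: a normalized weighted moving average that renormalizes at the boundaries) and the helper names `bessel_i0`, `kaiser_window`, and `smooth` are assumptions made for illustration.

```python
import math


def bessel_i0(x, terms=30):
    """Modified Bessel function of the first kind, order zero (power series)."""
    return sum(((x / 2.0) ** k / math.factorial(k)) ** 2 for k in range(terms))


def kaiser_window(length, beta):
    """Kaiser window with the given length and shape factor beta."""
    if length == 1:
        return [1.0]
    n_max = length - 1
    return [
        bessel_i0(beta * math.sqrt(1.0 - (2.0 * n / n_max - 1.0) ** 2))
        / bessel_i0(beta)
        for n in range(length)
    ]


def smooth(series, window):
    """Centered weighted moving average (assumed way of applying the window).

    Near the boundaries only the available samples are used and the
    weights are renormalized, so a constant series stays constant.
    """
    half = len(window) // 2
    smoothed = []
    for t in range(len(series)):
        weighted_sum = 0.0
        weight_total = 0.0
        for j, w in enumerate(window):
            idx = t + j - half
            if 0 <= idx < len(series):
                weighted_sum += w * series[idx]
                weight_total += w
        smoothed.append(weighted_sum / weight_total)
    return smoothed


# Hypothetical daily counts with a weekend reporting dip (not real data).
daily_cases = [10.0, 10.0, 10.0, 10.0, 10.0, 4.0, 4.0] * 4
window = kaiser_window(7, 3.0)
smoothed_cases = smooth(daily_cases, window)
```

With a window length of 7, every interior average covers exactly one full week, which is why such a filter damps the weekend reporting dip.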
The infection rates α and γ were influenced by the countermeasures that the German authorities put in place to fight the spread of the pandemic. According to [4], three main events changed the spreading rates: (1) March 9 - canceling large events, (2) March 16 - closing schools and non-essential stores, and (3) March 23 - contact ban (Kontaktsperre) that prohibits groups of more than two people and requires people to maintain a distance of at least 1.5 m in public. Hence, there are four different policies u_i, i = 1, 2, 3, 4, monotonically increasing from no countermeasures u_1 = 0 to full lockdown u_4 = 1, resulting in α_i = α_max + u_i (α_min − α_max) and γ_i = γ_max + u_i (γ_min − γ_max). This yields the six decision variables α_min, α_max, γ_min, γ_max, u_2, and u_3. One of the main reasons why the COVID-19 pandemic is spreading so fast is that infectiousness peaks even before the onset of symptoms [21]. Moreover, as asymptomatic individuals have no indication of their infection, they are not as cautious as people with symptoms. Therefore, we require α ≥ γ when searching for realistic parameters. Further, we want to ensure that people who tested positive are significantly less contagious while in quarantine, so we require γ ≥ 5β. The share of COVID-19 cases that are confirmed is estimated in the study [22] as 27.32% in Germany. In our model, this value approaches the constant φ = ζ/(λ+ζ) · (θ+µ)/(κ+θ+µ), which is the proportion of people that develop symptoms (I to A; I to D can be ignored as ε = 0) and get detected (A to R or T); that is, φ is the percentage of confirmed accumulated cases in a steady state (I = D = A = R = T = 0). To make sure our model is consistent with the findings of [22], we expect φ to be slightly above the estimated 27.32%, as a steady state is not reached yet and the proportion of detected cases increases over time, i.e., we constrain φ ∈ [0.3, 0.45]. 
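The two relations in this paragraph, the linear dependence of the infection rates on the policy level u and the steady-state share φ of confirmed cases, can be sketched as follows. The function names and all numerical values are hypothetical and chosen only for illustration; they are not the fitted parameters of Table 1.

```python
def infection_rate(u, rate_min, rate_max):
    """Interpolate between rate_max (u = 0, no measures) and rate_min (u = 1, lockdown)."""
    if not 0.0 <= u <= 1.0:
        raise ValueError("policy level u must lie in [0, 1]")
    return rate_max + u * (rate_min - rate_max)


def confirmed_fraction(zeta, lam, theta, mu, kappa):
    """Steady-state share of confirmed cases for a zero asymptomatic test rate.

    First factor: share of infected individuals that develop symptoms (I to A).
    Second factor: share of ailing individuals that get detected (A to R or T).
    """
    return zeta / (lam + zeta) * (theta + mu) / (kappa + theta + mu)


# Hypothetical rates per day (illustrative only).
alpha_min, alpha_max = 0.1, 0.4
for u in (0.0, 0.5, 1.0):
    print(f"u = {u:.1f}: alpha = {infection_rate(u, alpha_min, alpha_max):.3f}")

phi = confirmed_fraction(zeta=0.086, lam=0.04, theta=0.06, mu=0.006, kappa=0.06)
print(f"phi = {phi:.3f}, inside [0.3, 0.45]: {0.3 <= phi <= 0.45}")
```

In a constrained fit, the interval for φ would enter as an inequality constraint on the decision variables rather than as a check after the fact.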
The percentage of asymptomatic disease progressions was estimated at 43% in a population screening study in Iceland [23], at 43.2% in a comprehensive testing of the whole municipality of Vo', Italy [24], and at 17.9% [25] in a study regarding the cruise ship Diamond Princess. To ensure that our model has a comparable ratio, we add the constraint λ/(λ+ζ) ∈ [0.18, 0.43] to the optimization problem. The (base) reproduction rate in the beginning of March was estimated as approximately 3 [26]. Thus, for the parameters α_max, γ_max with no active countermeasures, we require R_0(α_max, γ_max) ∈ [2.5, 3.5], where R_0(α, γ) is given by the expression derived in [12]. The median of the incubation time is 5-6 days [27], [28], [29], which we identify with the half-life of the state I, i.e., log(2)/(λ+ζ) ∈ [5, 6]. Further, the median time from onset of symptoms until intensive care is 10-11 days [30], [31]. Hence, we constrain the half-life of an ailing or recognized individual to log(2)/(κ+µ) ∈ [10, 11].
Figure 2: The threatened state T is split up into ICU cases T_2 and non-ICU cases T_1, which is later approximated using a lumped model.
In the state H of (1), the confirmed recovered cases are not distinguished from the undetected ones; thus, we define the number of confirmed recovered cases as H_c, with H_c(0) = 0 and Ḣ_c = λD + κR + σ_1 T_1 + σ_2 T_2, and further the number of confirmed accumulated cases as C = D + R + T + H_c + E in order to match the data H̃_c and C̃. Considering the COVID-19 patients in intensive care T̃_2, a natural choice would be to identify them with the threatened state T. However, all deaths in the model have been in T before, whereas in reality only a third to a half of the deaths occur in an ICU [18], [19]. Hence, as the patients in ICU are only a part of T, we split T into T_1 and T_2, where T_2 represents the number of people in intensive care and T_1 are all otherwise threatened COVID-19 cases. 
We assume that there are no transitions from T_1 to T_2 and vice versa, such that T = T_1 + T_2 can be modeled via separate dynamics for T_1 and T_2 with µ_1 + µ_2 = µ. This more complex model with T_1 and T_2 will be approximated with a model of the form described in Section 2.1, as sketched in Figure 2. Further, we define H_2 (E_2) to be the number of people that recovered (died) from T_2. Finally, we perform the parameter optimization by solving a least squares problem via CasADi [32] to fit C, E, H_c, T_2, E_2, H_2 to the data C̃, Ẽ, H̃_c, T̃_2, Ẽ_2, H̃_2. The best fitting parameters are given in Table 1 and the resulting fit is shown in Figure 3. Many of the constraints listed above are active at the optimal set of parameters, e.g., α = γ, which is not surprising since we use the constraints to keep the parameters in a realistic range without further regularization. This fit further enables us to specify the full model state of today x(53) =: x_0, which will be used in the following sections as the initial condition, where t = 53 corresponds to April 21.
Figure 3: Fit obtained with the parameters of Table 1, scaled by N_total, compared to the actual data [18], [19]. The horizontal axis represents the time in days, where t = 1 is February 28 and t = 53 is April 21.
[...] unknown cases or the percentage of asymptomatic cases deviate from the assumptions.
It has been recognized as a key difficulty in handling the COVID-19 pandemic that the virus is highly contagious, thus infecting large numbers of individuals. In addition, since many elderly and ill people require hospitalization and/or intensive care [33], large waves of infections can quickly exceed the capacities of local health care systems [34]. Hence, ensuring that health care resources are sufficient is a key issue in handling the outbreak [35], given that an overwhelmed health care system correlates positively with the mortality rate [36]. In this section, we describe how the mortality and recovery rates τ(T) and σ(T) in (1) depend on the number of threatened patients T. 
The basic idea is that they are constant as long as the health care system's capacity is not at its limit, and that the mortality rate τ increases (and the recovery rate σ decreases) significantly if it is overwhelmed. According to [19], there are (on April 21) 2 908 COVID-19 patients in an ICU and 12 623 ICU spots are available. Hence, the overall ICU capacity currently available for COVID-19 patients is 2 908 + 12 623 = 15 531, and we define the relative ICU capacity as T_ICU = 15 531/N_total, where N_total = 8.3 · 10^7. We consider a constant value of T_ICU for simplicity, although it is likely that it will further increase in the future. We assume that the mortality rate increases if the number of individuals requiring treatment in an ICU exceeds T_ICU, i.e., if T_2 > T_ICU, with T_2 as in (2). More precisely, we assume that if a patient requiring intensive care does not receive it, then the patient dies (i.e., for such patients, the mortality rate increases and the recovery rate is zero). According to data on deceased individuals from Italy, those who were not admitted to an ICU died within a median of 4 days [37]. Hence, we model those individuals in T_2 who are not admitted to an ICU via decaying first-order dynamics with a half-life of 4 days, i.e., the corresponding rate constant τ_crit satisfies e^(−4 τ_crit) = 0.5, thus leading to τ_crit = 0.173. As a first approximation, T_2 ≈ (µ_2/µ) T, and hence we only modify the mortality rate τ in case that (µ_2/µ) T > T_ICU. In the model (1), τ(T) and σ(T) only occur jointly with T, which leads us to the formula (5) for τ(T)T and σ(T)T. In (5), a simple lumped model is recovered as long as the ICU capacity is not exceeded. If, however, (µ_2/µ) T > T_ICU, then the mortality rate increases to τ_crit for those (µ_2/µ) T − T_ICU patients who require intensive care but do not receive it. Similarly, for this fraction, the recovery rate is set to zero implicitly in (5b). 
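The numbers in this paragraph and the saturation mechanism it describes can be reproduced with a short sketch. The values of T_ICU and τ_crit follow directly from the text. The function `threatened_outflows` is only one possible way to encode the described mechanism, under the assumption that patients beyond the ICU capacity die at rate τ_crit instead of τ_2 and no longer recover at rate σ_2; it is not claimed to be the exact formula (5), and all rate values in the example are hypothetical.

```python
import math

N_TOTAL = 8.3e7                       # German population, from the text
T_ICU = (2908 + 12623) / N_TOTAL      # relative ICU capacity
TAU_CRIT = math.log(2.0) / 4.0        # half-life of 4 days, about 0.173 per day


def threatened_outflows(T, tau, sigma, tau2, sigma2, icu_share):
    """Return (deaths, recoveries) per day as fractions of the population.

    Assumed form: with T_2 approximated by icu_share * T, the patients
    exceeding T_ICU die at rate TAU_CRIT instead of tau2 and do not recover.
    tau and sigma are the lumped rates that apply below capacity.
    """
    excess = max(0.0, icu_share * T - T_ICU)
    deaths = tau * T + (TAU_CRIT - tau2) * excess
    recoveries = sigma * T - sigma2 * excess
    return deaths, recoveries


# Hypothetical rates per day (illustrative only, not the values of Table 1).
rates = dict(tau=0.02, sigma=0.05, tau2=0.03, sigma2=0.06, icu_share=0.5)
below = threatened_outflows(1e-4, **rates)   # ICU demand below capacity
above = threatened_outflows(1e-3, **rates)   # ICU demand above capacity
```

Below capacity the function reduces to the lumped terms τT and σT, which matches the statement that a simple lumped model is recovered in that case.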
The individuals T_1 = (µ_1/µ) T not receiving intensive care are not affected by this mechanism. Clearly, the modified rates in (5) are just a simple approximation of the effect that the mortality rate increases if hospitals are overwhelmed. This modification of the model is crucial when studying the effect of loosening quarantine measures and corresponding optimal policies, as done in the remainder of this paper. Since (fortunately) the German health care system has not been overwhelmed to this date, there are no quantitative data to validate the above modification and, in particular, the exact value of τ_crit. Nevertheless, the refinement is confirmed qualitatively by experiences in other countries [34, 35, 36]. Moreover, even a substantial change of τ_crit has little effect on the overall dynamics since it only affects the exact number of fatalities. In particular, changing τ_crit does not lead to a qualitative change in an optimal policy to control the outbreak as long as τ_crit is sufficiently larger than τ_2 and it is possible not to exceed the ICU capacity. In this section, we discuss different policies that can be considered to keep the number of fatalities due to COVID-19 low while at the same time imposing as few constraints as possible on the public. The most significant degree of freedom currently is certainly influencing the infection rates α and γ. Measures for influencing the infection rates include hygienic measures, face masks, and different nuances of distancing policies, up to a mandated lockdown. Therefore, we define u as introduced in Section 2.2 as our input, representing distancing policies or other measures that have a direct influence on the infection rates α and γ. We model this influence via α(u) = α_max + u (α_min − α_max) and γ(u) = γ_max + u (γ_min − γ_max), where a value of u = 1 represents the policies in Germany as of mid April (lockdown) and u = 0 represents no social distancing or other measures (i.e., corresponding to infection rates as in the beginning of March). 
Furthermore, we assume that the policies affecting the infection rates α, γ (i.e., u) stay constant for at least one week and can only be changed every seven days. In the first subsection, we will introduce different baseline policies which give insights into the effects of different inputs u and which will serve as a comparison to the optimal controller in the following subsection. More specifically, these baseline strategies will be used to define an upper bound on the social distancing measures that the optimal control in Section 3.2 and later on the feedback strategies in Sections 4 and 5 can employ to minimize the fatalities. In addition to varying the infection rates α and γ, another degree of freedom to influence the model (1) lies in adapting the testing policy. Testing individuals for COVID-19 is represented in the current model by the parameters θ and ε for symptomatic and asymptomatic individuals, respectively. In the following, we assume that only a fixed number of tests can be carried out every day. If we wish to only test symptomatic individuals, this includes both symptomatic individuals infected with COVID-19 (i.e., members of the state A) and individuals suffering from other illnesses with similar symptoms. In [38], the Robert Koch Institute estimates numbers on influenza-like illnesses (ILI) in Germany. While the numbers show clear seasonal differences, approximately 1.3% of the population become newly infected with ILI on average per week, and approximately 37% of them see a doctor (an indication for more severe symptoms). Moreover, influenza symptoms usually last 4-5 days, leaving us with an approximate average of p_sick = 0.3% of the population showing significant influenza-like symptoms at an arbitrary time of the year. When testing asymptomatic individuals, this includes infected persons without symptoms but also any other individual not known to be infected or healed and not showing any symptoms (i.e. 
S + I + H − H_c − p_sick, where H_c are the confirmed healed cases, compare Section 2.2). The total amount of resources used for testing is then captured by a cost c_test. Denoting the parameter θ from Table 1 as θ_n, we assume a fixed bound c > 0 on the amount of resources for testing c_test and that θ_n corresponds to the nominal value for A = 0, i.e., c := c_test(0, θ_n, 0). In the following, we assume that the current policy with respect to testing stays in place: all available tests are used on a daily basis for as many symptomatic people as tests are available. Then the testing policy used throughout this section reads ε(t) = 0 (as is current practice), with θ(t) chosen such that the available testing resources c are fully used. Note that this also implies that throughout this paper the state D ≡ 0. The allocation of tests (with the possibility of also saving test resources for later) can also be modeled as control inputs. However, in the present model the effects of temporarily saving tests (under the current resource constraints) are negligible compared to the effects of changes in the infection rates. Increasing the overall test capacity or improving the choice of test subjects (e.g., with tracing of cases), which corresponds to increasing values of θ_n/c, on the other hand, can potentially improve the evolution of the pandemic significantly, since detected individuals are less contagious than undetected ones. However, increasing test capacities or better allocated testing (especially with regard to ε, i.e., tracing of also asymptomatic infections) is at the current stage not included in our consideration but could be addressed in future work with the presented model by choosing ε ≠ 0 and making θ_n/c an increasing and time-varying variable. Given the introduced control input u, different control goals can be formulated. One such goal could be to obtain herd immunity. Herd immunity corresponds to the only stable equilibrium given no social distancing measures (i.e. 
with α_max, γ_max) and requires a large part of the society to be immune. More precisely, herd immunity is reached if S falls below a threshold S̄, i.e., S < S̄, where [12] provides a formula for calculating S̄ (see Section 3.1.1 for more details). Given our model, we can now calculate the minimum time that is needed to reach herd immunity. For this, we assume that we can choose a policy that utilizes the full health care capacity at all times. The resulting time t_herd gives a lower bound on the time required to reach herd immunity without exceeding the health care capacity given the introduced model. The herein identified model parameters yield a time span t_herd of more than six years. A stable steady state in the absence of a vaccine (i.e., herd immunity) can hence only be obtained either after many years or by overstraining the health care system, with a corresponding significant rise in the number of fatalities. Therefore, our ongoing assumption throughout this section is that a vaccine will be available prior to herd immunity, and we assume the availability of the vaccine in approximately two years. Our goal is thus to find an optimal policy minimizing the number of fatalities for the next two years while imposing as few constraints as possible on the public and the economy. In the next subsection, we simulate and discuss the following policies:
1. Keeping the social distancing measures in place (or even increasing the measures) until the virus is eradicated in Germany.
2. Slowly (or more aggressively) loosening the distancing measures without overwhelming the health care capacities (while possibly risking a second wave).
In fact, the presented baseline policies are similar to the policies suggested by the German "Helmholtz-Initiative" in [39]. We will discuss our conclusions in comparison to theirs at the end of the section. 
In Section 3.2, we will then improve these baseline policies by applying optimal control techniques and we will discuss the importance and significance of the improvements.

In the following, we argue that a consistent lockdown strategy necessitates strong lockdown measures over a long time horizon to fully eradicate the virus, since dropping the social distancing measures too early leads to a second outbreak wave. Based on the SIDARTHE model fitted to the German outbreak, described in Section 2.1, we simulate how long we would need to remain in lockdown and simply wait for the virus to disappear. We define the disappearance of the virus as follows: the virus is eradicated if, most probably, there is less than one active contagious case, i.e., I + D + A + R + T < 0.5/N_total. It takes 305 days, which is almost one year, until this condition is fulfilled. Clearly, the economic and psychological damage caused by a lockdown period this long is excessive, such that staying and waiting in lockdown is not an option. With even stricter measures, such as α = 0.8 α_3, γ = 0.8 γ_3, we could only marginally accelerate this process to 288 days, while increasing social distancing is costly, cf. the cost function in Section 3.2. Note that the equilibrium attained under the above lockdown policy is an unstable one that is not robust to uncertainties. In particular, if only one person remains infected when the measures are suspended, they could cause a new outbreak. Also, the virus may be reimported from other countries or humans might be reinfected by an interim host.

Next, we simulate the following three scenarios, in all of which the German population is kept in lockdown for a predefined period of time, followed by no measures at all. The only difference is the length of the lockdown period. In the first scenario, the measures are abolished immediately (April 21). The second one keeps the current strict measures for an additional 50 days.
The third variant simulates an even longer lockdown period, ending after 150 days counting from April 21. In Figure 4, we compare the three scenarios. We clearly see that in all three cases the number of currently infected people I + D + A + R + T rises drastically a few days after the measures are removed, independent of how long the lockdown persisted before. In any case, we experience a second outbreak wave. Staying longer in lockdown slightly delays the following peak of the share of active cases I + D + A + R + T, yet the peak amplitude is almost the same in all three scenarios.

This behavior can be explained as follows. If there is no one who currently has the virus, i.e., I_eq = D_eq = A_eq = R_eq = T_eq = 0, such that S_eq + H_eq + E_eq = 1, an equilibrium point is attained. The stability of the equilibrium point depends on the value of S_eq and the model parameters. In [12], the authors show that the IDART subsystem is asymptotically stable if and only if S_eq < S̄, where S_eq is the susceptible state at equilibrium for a given initial condition x_0 and the corresponding parameters, especially α and γ, which are actively adjusted according to an underlying policy. The value of S̄ follows from the stability analysis of the linearized IDART subsystem and depends on α and γ through constants a_i, i = 1, ..., 4; see [12] for details and the definition of S̄. Note further that the commonly stated base reproduction rate (2) is directly linked to the value S̄ via R_0 = 1/S̄. The stability of an equilibrium that depends on the parameters changes once we adjust α and γ. For strict measures (α_min, γ_min), the value S̄ is high (S̄ = 2.242), such that a stable equilibrium is attained for any S. This means that only a small number of people is infected by the virus before the equilibrium is attained.
With no measures (α_max, γ_max), a stable equilibrium requires the share of susceptible people to be smaller than S̄ = 0.292, i.e., herd immunity. We can hence conclude that if S_eq(x_0, α_3, γ_3) > S̄(α_0, γ_0), the equilibrium attained during lockdown is unstable with no measures. This means that there inevitably is a second outbreak wave once the lockdown ends. For the fitted model of the German outbreak, S_eq(x_0, α_3, γ_3) = 0.9956 is attained after the first wave. Hence, at least another 70.4% of the German population (0.9956 − 0.292 ≈ 0.704) gets infected in the second wave before a stable equilibrium is attained.

Altogether, this leaves us with the following conclusion of two possible outcomes when choosing a consistent lockdown strategy: Strong lockdown measures over long time horizons have to be taken to eradicate the virus in Germany. However, this takes a big toll on the public, and any infected person, e.g. from abroad, could spark a second wave at any point. Any lockdown strategy that does not fully wipe out the virus inevitably yields a second outbreak wave once all measures are suspended.

As argued in the previous subsection, keeping the lockdown policy strict can never lead to a stable equilibrium when ending the lockdown, no matter how long it lasted before all measures were suspended. Hence, many countries are now discussing or have even already started to loosen the lockdown in very small steps. Indeed, experts consulting the German government ("Nationale Akademie der Wissenschaften Leopoldina") have recently published their recommendations concerning a possible strategy for loosening the lockdown gradually in small steps [40]. They name the following conditions for loosening the lockdown in small steps:
a) The number of new infections remains at a low value.
b) The capacity of the health care system must not be exceeded.
c) Precautions (such as hygienic measures, face masks, distancing) remain in force.
In the following, we try to translate these recommendations into a policy for our simplified model, first to analyze the results and second to use this as a baseline policy for the optimizer in the following subsection. We implement the conditions presented above via the following policy strategy:
a) u can only be decreased if, over the last n_stab days, the number of newly infected persons (i.e. S(t − 1) − S(t)) is decreasing and u has not been increased.
b) u can only be decreased if less than X_lower of the ICUs are occupied.
c) The decrease in u can only be a small decrease at a time and therefore, the interval between u_max = 1 (lockdown) and u_min = 0 (no measures) is divided into n_steps equidistant steps.

Additionally, we add that u will be increased again (with the same step size as the decrease) if more than X_upper of the ICUs are occupied and no decrease in the number of required ICU beds is observed. This policy results in four 'tuning parameters': n_stab, X_lower, X_upper and n_steps. In fact, it turns out that the outcome of the simulation is not very sensitive to the tuning parameters of the policy, but the policy can be tuned to be slightly more careful or more aggressive. In the following subsection, we choose two different sets of parameters as baseline policies for the optimal control approach.

In this section, we contrast the baseline policy from Section 3.1.2 with an optimal control policy, under idealized assumptions (exact model and state measurement). The purpose of this section is twofold: a) Understand how an optimal policy differs qualitatively from the baseline policies. b) Quantify the loss of performance (in terms of increased fatalities and/or unnecessary social policy u) resulting from using a suboptimal baseline policy. The degree of freedom is the input u ∈ [0, 1] affecting the social policy, and we consider the fact that the policy can only be changed every T_s = 7 days.
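The rule-based baseline policy of Section 3.1.2 can be sketched as a small update function. This is only an illustration under stated assumptions: the function name `baseline_policy_step`, its argument layout, and the reading of "is decreasing" as a strict day-to-day decrease over the last n_stab days are ours, not taken from the paper.

```python
def baseline_policy_step(u_hist, new_cases, icu_share,
                         n_stab=14, x_lower=0.4, x_upper=0.7, n_steps=14):
    """Return the next distancing level u in [0, 1] (1 = lockdown, 0 = no measures).

    u_hist, new_cases, icu_share are daily histories, most recent value last.
    All names here are hypothetical; the rules follow a)-c) of the text.
    """
    step = 1.0 / n_steps                      # rule c): equidistant steps
    u = u_hist[-1]

    recent_cases = new_cases[-(n_stab + 1):]
    recent_u = u_hist[-(n_stab + 1):]
    # rule a), ASSUMPTION: "decreasing" means strictly decreasing day by day
    cases_decreasing = len(recent_cases) > n_stab and all(
        later < earlier for earlier, later in zip(recent_cases, recent_cases[1:]))
    u_not_increased = all(
        later <= earlier for earlier, later in zip(recent_u, recent_u[1:]))

    icu_now = icu_share[-1]
    icu_not_decreasing = len(icu_share) < 2 or icu_share[-1] >= icu_share[-2]

    # tighten again if ICU occupancy is high and not going down
    if icu_now > x_upper and icu_not_decreasing:
        return min(1.0, u + step)
    # rules a) and b): loosen by one step only if all conditions hold
    if cases_decreasing and u_not_increased and icu_now < x_lower:
        return max(0.0, u - step)
    return u
```

Calling this function once per day with the simulated histories would produce a staircase input between u_min = 0 and u_max = 1; the default arguments correspond to the first, more cautious parameter set used later in the text.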
In the following, we consider the problem only for the next N = 100 weeks, assuming that thereafter a vaccine might be developed that would ideally prevent (almost all) further fatalities in the future. The overall control problem can be seen as a multi-objective optimal control problem, where we wish to simultaneously minimize the number of fatalities E and the societal and economic cost of the social policy measures, which is measured by the function c_policy(u) = 1/α(u). We point out that, due to the parametrization (6), this cost also inherently considers the infection rate γ. This cost function is such that the social cost of achieving an arbitrarily small infection rate α grows unbounded, while for large values of u incremental differences are less relevant. In order to suitably characterize an optimal solution to this multi-objective problem, we use the baseline policies in Section 3.1. The resulting optimal control problem is given by (11), which is explained in the following. In particular, our goal is to find an input policy that minimizes the number of accumulated fatalities, while using fewer resources than the baseline policy in terms of accumulated social impact of c_policy (cf. (11d)). We point out that similar "stabilization" problems subject to resource constraints for the control of epidemic outbreaks can be found in the literature, also using a fractional cost c_policy, compare e.g. [41, 42].

When minimizing the number of accumulated fatalities, it is important to consider not only the extinct individuals E(N · T_s) at the end of the two year horizon, but also the part of the already infected individuals that will die after the two year horizon. The reason for this is that, while the availability of a vaccine at the end of the horizon might prevent future infections, it cannot cure already infected people.
Hence, if we do not account for the inevitable fatalities among the individuals infected at the end of the prediction horizon, the optimal controller makes no effort to keep them low and, as a result, a lot of people would die shortly after the two year horizon. Therefore, we propose an optimization objective that includes all past and inevitable future fatalities. Based on the model (1), we know that a total of ζ/(ζ+λ) · (I + D) of the infected people I + D will develop symptoms in the future and further that a total of μ/(μ+κ) · (ζ/(ζ+λ) · (I + D) + A + R) will become threatened. Thus, assuming the capacity T_ICU is not exceeded afterwards, i.e., setting constant values τ = τ(0) = (μ_1/μ) τ_1 + (μ_2/μ) τ_2 and σ = σ(0) = (μ_1/μ) σ_1 + (μ_2/μ) σ_2, the amount of inevitable fatalities is exactly given by the objective F in (10).

Hence, given a baseline solution u_b, x_b from Section 3.1.2, the corresponding optimal control problem is stated as (11). Since we only change the policy every week, the index k in (11) corresponds to weeks and F(N · T_s) corresponds to the objective function in (10), where the states result from simulating the system (1) with the parameters and the initial condition from Section 2 and the input u(·) over k weeks. Condition (11d) ensures that the encountered social cost is no larger than the cost of the baseline policy. This optimal control problem is such that the baseline policy u = u_b is a feasible solution, and thus the resulting fatalities F(N · T_s) will never be higher than those of the baseline policy. We point out that it is possible to consider a more restrictive transient constraint on the policy cost instead of (11d), which is discussed in detail in Appendix B. The optimal control problem (11) can be formulated as a nonlinear program (NLP) and is subsequently solved using CasADi [32].
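The bookkeeping of past and inevitable future fatalities described above can be sketched as follows. The two fractions ζ/(ζ+λ) and μ/(μ+κ) are taken from the text. The last step, that a share τ/(τ+σ) of all current and future threatened individuals dies, is our assumption about how the threatened state is resolved in model (1), which is not restated here; the function name and signature are hypothetical as well.

```python
def inevitable_fatalities(E, I, D, A, R, T, zeta, lam, mu, kappa, tau, sigma):
    """Past fatalities plus fatalities that can no longer be prevented.

    States are population shares; tau and sigma are the constant rates
    tau(0), sigma(0) used when the ICU capacity is not exceeded.
    """
    # share of I + D that will still develop symptoms (from the text)
    symptomatic_later = zeta / (zeta + lam) * (I + D)
    # share that will become threatened (from the text)
    threatened_later = mu / (mu + kappa) * (symptomatic_later + A + R)
    # ASSUMPTION: threatened individuals die with rate tau and heal with rate sigma,
    # so a share tau / (tau + sigma) of them eventually dies
    death_share = tau / (tau + sigma)
    return E + death_share * (T + threatened_later)
```

With no active cases the function returns E, so the objective only differs from the plain death count when infections are still present at the end of the horizon, which is exactly the situation the modified objective is meant to penalize.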
For comparison and to implement the constraint (11d), we use the baseline policy from Section 3.1.2 with X lower = 0.4, X upper = 0.7, n steps = 14, n stab = 14, which overall is rather cautious and does not exceed the ICU capacity. The corresponding results for the baseline policy and the optimal control strategy can be seen in Figure 5 . Although the optimal control input yields initially (first 100 days) a slightly larger number of infected individuals and thus slightly more fatalities in the first 200 days, the number of infected individuals is subsequently significantly lower and the overall number of fatalities is reduced to only 26%. The optimal controller allows for a smooth increase of the infection rate α, while keeping the number of threatened individuals (T ) consistently below the corresponding value of the baseline policy after the first 200 days, thus yielding a small number of fatalities. The rising number of infected individuals (I) at the end results from the finite-horizon and will be considered later in more detail. We also consider a second baseline policy using X lower = 0.6, X upper = 0.85, n steps = 12, n stab = 14, which slightly relaxes the social policy, but also exceeds the ICU capacity. The result is shown in Figure 6 . We can see that in comparison to this second baseline policy the optimal policy significantly reduces the number of fatalities to only 39%. The optimal strategy is a lot more cautious in reducing the social policy, while the baseline is more aggressive and goes back and forth between increasing and decreasing α, resulting in significant violations of the ICU capacity. Furthermore, the simple baseline policy results in a second wave as the restrictions are loosened too quickly, while the optimal strategy slowly but steadily increases α after the first 200 days, and thereby avoids a second wave. In both examples, a further observation should be highlighted. 
After an initial phase of containing the outbreak, the measures are slowly but steadily relaxed until a larger release at the end of the horizon. Similar behavior can be observed for many optimal control problems with finite horizons and is commonly referred to as "turnpike" behavior, which goes back to [43]. An explanation for this is that the consequences of decisions taken at later points in time mainly occur after the end of the horizon, such that a more aggressive policy towards the end is optimal when only considering the finite two year horizon. Of course, one would not implement the "leaving arc" if the development of a vaccine were not finalized after two years, since implementing such a policy may lead to an uncontrollable increase of infections towards the end of the time horizon in case the model is inaccurate, and it should thus be avoided in practice. In Appendix A, we therefore discuss how adding "terminal constraints" to the optimization problem can prevent this turnpike behavior of the optimal solution, at the price of an increasing number of fatalities.

If we compare the results in Section 3.2 with the consistent full lockdown from Section 3.1.1, we can see that it is possible to appropriately increase α without exceeding the ICU capacities, while the consistent full lockdown strategy would require a lockdown that takes approximately a year to be effective. Hence, while a consistent lockdown can effectively minimize the number of deaths, this strategy is only viable in case this lockdown can be prolonged over the corresponding time horizon, unless a vaccine is developed earlier. On the other hand, both the optimal controller and the baseline controller allow for a significant relaxation of the lockdown (on average a doubling of α), without significantly increasing the number of fatalities.
In comparison to the baseline policy suggested in Section 3.1.2, the optimal control policy results in a slower but smooth loosening of the distancing policies. Without increasing the social cost over the full time horizon, this optimal policy avoids any violation of the maximum ICU capacity and hence results in a significantly smaller fatality rate. We point out that the resulting optimal policy of slowly increasing α is qualitatively similar to the resulting optimal policy in [14] , albeit for a different control goal. It can be seen that an initially "stronger" lockdown (i.e., a smaller value of α) over a longer time period with subsequent loosening leads to a better handling of the pandemic, compared to repeated tightening and loosening of distancing measures. Moreover, a smooth and monotone loosening of distancing policies is also desirable from an economic aspect since repeated lockdowns after interim-periods of relaxed distancing guidelines may be even more damaging to the economy, compared to an initially longer lockdown. Comparing our results with the proposed scenarios by the Helmholtz Association [39] , we find that we agree that the goal of herd immunity without overwhelming the health care capacity would require years. Our results further agree with [39] that the contact restrictions can only be loosened slowly if the health care capacity must not be overwhelmed. However, since the authors in [39] do not consider the availability of a vaccine, their conclusion is keeping or even increasing the lockdown until the number of infected persons is small and all infections can be traced efficiently and effectively via strategically allocated (and increased) testing. With the assumption of a vaccine within the next two years, we argue that a slow and smooth loosening of the lockdown does not lead to many more fatalities while decreasing the social and economic cost significantly according to our model. 
Concurrently, increased and strategically better allocated testing, e.g., via a contact tracing mobile app [44], is of course highly beneficial and greatly advisable to improve the performance (even though this is not accounted for in our model). To summarize, it seems possible to reduce the current restrictions and thus allow α to increase without exceeding the ICU capacity. Furthermore, optimized policies can significantly improve the outcome (in terms of fewer deaths and/or fewer social restrictions). However, the result is highly sensitive w.r.t. the change in the infection rate, while an accurate control of the infection rate α (e.g. ±5%) through governmental policies seems difficult/unrealistic. In the next section, we will therefore deal with these issues by formulating a robust control strategy that takes uncertainty in our COVID-19 model into account and uses feedback based on uncertain state information.

Section 3.2 shows that an optimal control policy can significantly reduce the number of fatalities compared to a baseline policy that allows for iterative loosening of social distancing measures. This optimal control policy is computed by optimizing over all possible policies to find the one minimizing the number of fatalities predicted by the model equations (1) without using stronger shutdown measures than the baseline. Hence, the policy proposed in Section 3.2 strongly relies on the accuracy of the model identified in Section 2 and thus may fail to effectively control the outbreak in case of a model mismatch. However, such a model mismatch is inevitable in practice, especially since the model itself is a simplification of a much more complex reality and the identification outlined in Section 2.2 strongly depends on the (sparse) available data and on additional prior knowledge, based e.g. on further studies concerning COVID-19, which also provide only estimates.
In addition, the optimal control policy relies on exact knowledge of all states and on the assumption that values for α and γ can be exactly imposed up to arbitrary precision via social distancing measures, both of which are unrealistic assumptions when applying the policy in practice. In this section, we show how online measurements can be utilized via feedback to effectively and robustly control the German COVID-19 outbreak in the presence of uncertain parameters. More precisely, we illustrate that the optimal open-loop policy of Section 3.2 may lead to poor performance when applied to validation models with a different set of parameters, although these validation models result from adjusting only one prior assumption in the identification and still fit the past data well. On the other hand, we show that a model predictive control (MPC) feedback strategy, based on repeatedly computing an open-loop policy for the nominal model from Section 2, is inherently robust w.r.t. model inaccuracies and successfully handles the outbreak. At each time step k = 1, . . . , N , where k corresponds to weeks and N = 100 as in Section 3.2, we solve the optimization problem (11) over the time horizon k, . . . , N using the current measurements as initial condition at week k. Then, we apply the computed optimal policy over one week before solving the problem for the new measurements again. In this way, since the initial conditions in the optimal control problem are updated, a feedback mechanism is included as is standard in MPC [45] . As a result, the prediction horizon N − k of the optimization problem is shrinking with each time step k, such that it never exceeds the considered total time horizon of N = 100 weeks, after which we assume the availability of a vaccine. 
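The shrinking-horizon feedback loop described above can be sketched generically. This is a structural sketch only: `solve_ocp` and `plant_step` are hypothetical callables standing in for the NLP (11) and for the real outbreak (or a validation model), and the loop index runs over k = 0, ..., N − 1 so that N − k weeks remain at step k.

```python
def shrinking_horizon_mpc(x0, N, solve_ocp, plant_step):
    """Apply the first week of each re-optimized plan; the horizon shrinks by one each week.

    solve_ocp(x, horizon) -> list of `horizon` weekly inputs (hypothetical solver)
    plant_step(x, u)      -> state measured one week later (hypothetical plant)
    """
    x = x0
    applied, horizons = [], []
    for k in range(N):
        plan = solve_ocp(x, N - k)   # re-plan from the current measurement
        u = plan[0]                  # only the first week is implemented
        x = plant_step(x, u)         # feedback: the next plan starts from the new state
        applied.append(u)
        horizons.append(len(plan))
    return x, applied, horizons


if __name__ == "__main__":
    # Toy stand-ins, NOT the paper's model: scalar "infection level" x,
    # a proportional rule instead of an optimizer, and a plant whose growth
    # factor differs from what the planner might assume.
    toy_solver = lambda x, horizon: [min(1.0, x)] * horizon
    toy_plant = lambda x, u: 1.3 * (1.0 - u) * x
    x_end, inputs, horizons = shrinking_horizon_mpc(0.1, 5, toy_solver, toy_plant)
    print(horizons)  # remaining horizon at each step
```

The point of the structure is that the plan is recomputed from measured data every week, so a mismatch between planner and plant only affects one week before it is observed.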
Hence, since the constraint (11d) needs to hold over the whole time horizon N, we replace it by a constraint on the cost accumulated over the whole horizon, with c^1_policy = Σ_{k=0}^{N−1} c_policy(u_b(k · T_s)) being the cost of the first baseline policy in Section 3.1.2. As a second modification, we adapt the bound on the social distancing cost online, depending on the predicted states, as is detailed in the following.

In Section 3.2, we proposed an open-loop optimal control strategy, where the inputs were the infection rates α and γ. Loosely speaking, the control goal was to achieve a minimum number of fatalities without imposing stronger social distancing measures than a simple baseline policy (compare (11d)). Since this constraint heavily depends on the model to which the baseline is applied, a realistic setting with imperfect model knowledge should allow adapting the constraints on the policy online in case the nominal model is overly optimistic or pessimistic. Instead of simply requiring that the cost of the MPC-based feedback cannot exceed c^1_policy, we increase the maximum cost in case the predicted number of patients requiring intensive care lies above 90% of the maximum capacity T_ICU at least once during the horizon, and we decrease it in case the number consistently lies below 10% of T_ICU. This adaptation is natural, as in reality one would increase the efforts to contain the outbreak if the current measures are insufficient, and on the other hand, the population cannot be expected to accept strict measures when there are only few (severe) cases across the country. Therefore, the maximum cost in week k, denoted by c_b(k), varies online with k and is initialized as c_b(0) = c^1_policy. The amount by which we change c_b online is ±Δ_u (N − k)/N, where Δ_u = 1/α_min − 1/α_max with α_min and α_max as in Section 2.2.
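The online adaptation of the cost bound can be written as a short update rule. The thresholds (90% and 10% of T_ICU), the step Δ_u = 1/α_min − 1/α_max and the scaling (N − k)/N are taken from the text; the function name, the argument names and the representation of the prediction as a list of ICU occupancies are assumptions made for this sketch.

```python
def update_cost_bound(c_b, k, N, icu_pred, T_icu, alpha_min, alpha_max):
    """Return the cost bound for the next week.

    icu_pred: predicted ICU occupancy over the remaining horizon (hypothetical input).
    """
    delta_u = 1.0 / alpha_min - 1.0 / alpha_max   # cost gap between lockdown and no measures
    scale = (N - k) / N                           # shrinks as the horizon runs out
    if max(icu_pred) > 0.9 * T_icu:               # above 90% at least once: allow more effort
        return c_b + delta_u * scale
    if max(icu_pred) < 0.1 * T_icu:               # consistently below 10%: allow less effort
        return c_b - delta_u * scale
    return c_b
```

For example, with α_min = 0.2 and α_max = 0.5 the step is Δ_u = 3, so halfway through the horizon (k = N/2) the bound moves by 1.5 in either direction.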
If, for instance, the model predicts large numbers of future ICU patients, then the cost bound c_b is increased by the difference between the minimum and the maximum social distancing cost, scaled by the remaining time horizon. This increase corresponds to the social cost of an additional week in full lockdown, scaled by the remaining time horizon via the factor (N − k)/N. The proposed MPC-based feedback strategy is summarized in Algorithm 1. In the algorithm, T(j · T_s | k · T_s) denotes the number of threatened individuals at time j · T_s, predicted by the optimal solution of (11) at time k · T_s. Essentially, the algorithm repeatedly applies the open-loop optimal control policy of Section 3.2 with the key difference that, at time k, all past measurements j = 1, ..., k are used in the optimization problem, thus including an online feedback. In addition, in Step 3 of the algorithm, the social policy constraint is adapted as described above.

Algorithm 1. MPC-based feedback strategy
1. Solve the optimal control problem (11), with (11d) replaced as described above and with the state-dependent cost F as in (10), based on simulating the model (1) over the remaining horizon N − k subject to the input u, starting at the current measured state at time k.
2. Apply the optimal policy u*(k · T_s) for the next T_s = 7 days.
3. Update the social policy cost bound c_b by ±Δ_u (N − k)/N according to the predicted ICU occupancy, as described above.

To assess the improved robustness of Algorithm 1 compared to open-loop optimal control, we produce two validation models. More precisely, we identify two new sets of parameters A and B by proceeding exactly as in Section 2.2, with the only difference that we change the prior assumption that the stationary ratio of confirmed COVID-19 cases is in the interval [0.3, 0.45]. Instead, we assume that this value is in [0.3, 0.6] for set A and in [0.3, 0.4] for set B. When performing parameter identification based on these modified prior assumptions, we also obtain sets of parameters that can accurately explain the existing past data on COVID-19 cases in Germany.
However, the resulting models have different dynamics and different reproduction rates for the same lockdown policy. Increasing the above ratio as in parameter set A decreases the number of infected and undetected individuals resulting in a higher reproduction rate to explain the same amount of confirmed cases. Hence, if an open-loop policy based on the nominal model (i.e., with parameters described in Section 2.2) is applied to the validation model with parameters A, then the number of infections and thus the number of fatalities increases significantly. To illustrate this effect, we apply the open-loop optimal control policy based on the model with parameters as in Section 2.2 to the new models with parameter sets A and B. The control effort of this policy, i.e., the amount of social distancing, is constrained as in (11d) by the cost of the baseline policy when applied to the nominal model identified in Section 2.2. The results for the model with parameters A can be seen in Figure 7 . Since this validation model has a higher reproduction rate for similar inputs as explained above, the number of fatalities after N = 100 weeks increases significantly compared to the simulations in Section 3. This is due to the fact that the open-loop policy is only computed once, at the beginning of the time horizon, and is then applied over the whole time span of two years without any online adaption based on new measurements. Therefore, it cannot handle the model mismatch and thus has a significantly worse performance. In addition, Figure 7 shows the evolution under the proposed MPC-based feedback, which leads to a significantly lower number of fatalities compared to the open-loop policy. We point out that the feedback (partially) compensates the fact that the control action is computed based on the nominal model parameters from Section 2.2, which differ significantly from parameter set A. 
Due to the larger number of infected individuals, the maximum social cost c b is increased at multiple time steps, which is indicated by the step-like increases of the input. Finally, an open-loop optimal control policy is computed which is allowed to use the same amount of resources as the MPC-based feedback (in hindsight), i.e., the adapted social and economical cost c b (N ). While this policy performs better than the initial open-loop policy with fewer resources, it leads to a similar number of fatalities at the end of the horizon compared to the feedback controller. However, the number of threatened patients is very large at time k = N , which would lead to a significant increase in fatalities after the considered time period, even if a vaccine is available. To conclude, the above discussion reveals that a combination of open-loop optimal control with feedback is inherently robust in the sense that it effectively controls the German COVID-19 outbreak even if the employed model is inaccurate. When comparing the result to an open-loop strategy, then the MPC-based feedback strategy can dramatically decrease the number of fatalities or the necessary amount of social distancing, respectively. Such robustness is an important property for applying any control strategy in a real-world scenario, where accurate model knowledge is rarely available. In the next section, we propose a more systematic robust MPC approach which explicitly takes model inaccuracies as well as uncertain state measurements and control inputs into account in order to safely and cautiously control the COVID-19 pandemic. While the MPC-based feedback policy proposed in Section 4 is significantly more successful in handling the outbreak compared to a simple open-loop policy, it relies on the assumption that exact measurements of the state in (1) are available at each time step. 
In this section, we consider a more realistic scenario of uncertain measurements in terms of biased state estimates, and we analyze the impact on the closed-loop operation. In particular, we develop a robust MPC-based feedback strategy using interval arithmetic that takes the uncertainty into account during the predictions and thus leads to a safer policy minimizing the number of fatalities.

In the following, we consider the case where at each day k, instead of the true state x(k), we only obtain an estimated state x̂(k), which is subject to an additional bias. In Table 2, we summarize the uncertainties in the states. For individuals in states D and R, the disease COVID-19 was detected by tests. Hence, their values are well known; nevertheless, we assume that they can slightly differ from the true states by ±1%, as there might be cases on the borderline between D and R that are hard to assign to either of the states. The number of people in ICUs is well documented. However, the state T contains not only patients in ICUs (T_ICU ± 1%) but also other infected members of the risk group (T_2 ± 5%), cf. Section 2.1, such that the uncertainty we use is ±((μ_1/μ) · 1% + (μ_2/μ) · 5%). We assume that the number of deaths is certain up to ±1%, as it includes some people that died of different causes. As the undetected cases can by definition not be measured, they must be estimated using random sampling or strategies like [22]. Therefore, the states I and A are much less certain, especially without symptoms (I ± 50%, A ± 20%). The healed state H results from both rather certain states, D and R, and highly uncertain states, I and A, such that overall it is uncertain itself (H ± 50%). The uncertainty of the state of susceptible persons S results from the other states, since all states sum to one. It is possible to directly use this biased state estimate x̂(k) in Algorithm 1 and compensate the bias through the inherent robustness in the feedback implementation.
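The uncertainty levels listed above can be turned into state intervals with a few lines of code. The percentages are those stated in the text; everything else is an assumption of this sketch: the bias is taken to be relative to the true value (x̂ = x · (1 + δ) with |δ| ≤ b), μ = μ_1 + μ_2 is assumed, the states are assumed to be population shares summing to one, and all function and variable names are hypothetical.

```python
# relative bias bounds from the text (T is computed separately below)
REL_BIAS = {"I": 0.50, "D": 0.01, "A": 0.20, "R": 0.01, "H": 0.50, "E": 0.01}


def threatened_bias(mu1, mu2):
    """Weighted bias of T: 1% for the ICU part, 5% for the rest (ASSUMES mu = mu1 + mu2)."""
    mu = mu1 + mu2
    return mu1 / mu * 0.01 + mu2 / mu * 0.05


def interval_bounds(x_hat, rel_bias):
    """Bounds containing the true state if x_hat = x * (1 + d) with |d| <= rel_bias.

    x_hat holds all states except S; S is bounded via S = 1 - (sum of the others).
    """
    lower = {s: v / (1.0 + rel_bias[s]) for s, v in x_hat.items()}
    upper = {s: v / (1.0 - rel_bias[s]) for s, v in x_hat.items()}
    s_lower = 1.0 - sum(upper.values())   # many infected  -> few susceptible
    s_upper = 1.0 - sum(lower.values())   # few infected   -> many susceptible
    lower["S"], upper["S"] = s_lower, s_upper
    return lower, upper
```

Note that under a multiplicative bias the interval is not symmetric around the estimate: a state measured with ±50% uncertainty may in truth be up to twice the estimate, which is why underestimation of I and H is the critical case.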
In the following, we derive an alternative robust formulation that explicitly considers the uncertainty in the prediction. First, given a biased state estimate x̂(k) and known bounds on the bias (Tab. 2), it is possible to compute interval bounds x̲(k), x̄(k) such that the true state is guaranteed to lie in that interval, i.e., x̲(k) ≤ x(k) ≤ x̄(k). The following formulation predicts the interval bounds x̲_i and x̄_i instead of using some nominal prediction. This method is similar to the interval arithmetic employed in robust MPC [46] and the robust moment enclosure for an SEIV epidemic model in [47]. Using the fact that the system is positive (x_i and all the parameters are positive), it is possible to derive an interval prediction of the form (14), in which the bounds evolve according to dynamics f̃(x̲, x̄, u) and enclose the true state for all t ≥ 0, given suitable bounds on the uncertain parameters in the system model (1). The detailed derivation of the interval prediction model (14) can be found in Appendix C (more precisely, Equations (C.4)). Since deriving reliable bounds on all parameters in the model (1) is rather difficult or unnecessarily complex, we only focus on the uncertainty associated with the infection rate α. In particular, we consider an uncertainty of ±5% on the infection rate α. Thereby, we explicitly consider the problem that the infection rate cannot be precisely specified via social distancing measures. We will see later in the simulations that, although we do not account for all possible mismatches in the prediction model, we nevertheless obtain the desired properties in closed loop. Given this interval prediction model, the proposed robust formulation now predicts an interval for the different state variables and minimizes the worst-case number of fatalities F̄ based on x̄(N · T_s). The overall procedure is summarized in Algorithm 2.

Algorithm 2. Robust MPC strategy using interval arithmetic
1. Given biased state estimate x̂(k · T_s), compute set [x̲(k · T_s), x̄(k · T_s)].
2.
Solve the following problem min u(·) with F based on (10) using x, which results from the interval predictions of the model (14) over the emaining horizon N − k subject to the input u, starting at the current set estimate [x(k · T s ), x(k · T s )]. 3. Apply the optimal policy u * (k · T s ) for the next T s = 7 days. 4. Update the social policy cost as in Algorithm 1 using T instead of T . 5. Set k = k + 1. In the following simulations, we consider the extreme case where the number of estimated infected or previously infected individuals (I, D, A, R, T , H, E) is underestimated. The results for the robust MPC and the nominal MPC in comparison to open-loop optimal control strategies for the two validation parameter sets A and B (compare Section 4) can be seen in Figure 8 . Due to the worst-case prediction in the robust MPC, at t = 0 the robust MPC already increases the resources two times, for both parameter sets, such that at t = T s the predictions satisfy µ1 µ2 T (k · T s ) ≤ 0.9T ICU . In the simulation with the model based on parameter set A, we can directly see that both the nominal MPC and the robust MPC reduce the number of fatalities compared to an open-loop optimal control strategy. Furthermore, if we compare the robust MPC and the nominal MPC, we can see that after t = 140 days the nominal MPC implementation realizes that the spread is worse than initially assumed. This leads to a strong increase in social measures u at t = 140. With the robust formulation, u decreases almost monotonically. Furthermore, the nominal implementation results in twice the number of fatalities, while the applied resources c policy over the two year horizon differ by less than ∆ u , which corresponds to one week of lockdown. This indicates that the robust MPC, planning cautiously from the beginning, exploits its resources more efficiently by imposing stricter social distancing measures early on, which results to be beneficial in the long run. 
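The receding-horizon procedure of Algorithm 2 can be sketched on a toy example as follows. This is only a minimal sketch under strong assumptions, not the implementation used for the simulations: it uses a plain SIR model with a fatality compartment instead of model (1), a constant input over the remaining horizon chosen from a small grid instead of a full optimal control problem, and a weighted penalty on u instead of the budget constraint on the policy cost. All names and numerical values (`BETA0`, `LEVELS`, `REL_MEAS`, `propagate`, `robust_input`, `closed_loop`, and so on) are hypothetical.

```python
BETA0, GAMMA, MU = 0.30, 0.10, 0.01   # hypothetical toy rates (per day)
REL_ALPHA = 0.05                      # +/-5% uncertainty on the infection rate
TS, H = 7, 0.1                        # weekly update, Euler step (days)
LEVELS = (0.0, 0.3, 0.6)              # hypothetical distancing levels u
REL_MEAS = (0.01, 0.50, 0.01)         # assumed relative bias bounds for (S, I, F)


def propagate(lo, hi, b_lo, b_hi, days):
    """Euler-propagate lower/upper bounds on (S, I, F) of a toy SIR model."""
    (sl, il, fl), (sh, ih, fh) = lo, hi
    for _ in range(round(days / H)):
        # Simultaneous update: each bound uses the worst case of the others.
        sl, sh, il, ih, fl, fh = (
            sl - H * b_hi * sl * ih,
            sh - H * b_lo * sh * il,
            il + H * (b_lo * sl * il - (GAMMA + MU) * il),
            min(1.0, ih + H * (b_hi * sh * ih - (GAMMA + MU) * ih)),
            fl + H * MU * il,
            fh + H * MU * ih,
        )
    return (sl, il, fl), (sh, ih, fh)


def beta_bounds(u):
    """Interval for the infection rate under distancing level u."""
    b = BETA0 * (1.0 - u)
    return b * (1.0 - REL_ALPHA), b * (1.0 + REL_ALPHA)


def robust_input(lo, hi, weeks_left, weight=1e-3):
    """Pick the level u minimizing worst-case fatalities plus an input penalty."""
    def cost(u):
        _, (_, _, f_hi) = propagate(lo, hi, *beta_bounds(u), weeks_left * TS)
        return f_hi + weight * u * weeks_left
    return min(LEVELS, key=cost)


def closed_loop(weeks=20, true_scale=1.03, bias_i=0.85):
    """Run the weekly loop; the true rate and bias are unknown to the controller."""
    x = (0.99, 0.01, 0.0)                  # true (S, I, F)
    inputs, enclosed = [], True
    for k in range(weeks):
        est = (x[0], bias_i * x[1], x[2])  # infected are underestimated
        lo = tuple(e / (1.0 + r) for e, r in zip(est, REL_MEAS))
        hi = tuple(e / (1.0 - r) for e, r in zip(est, REL_MEAS))
        u = robust_input(lo, hi, weeks - k)
        b_true = BETA0 * (1.0 - u) * true_scale
        x, _ = propagate(x, x, b_true, b_true, TS)        # true system, one week
        lo, hi = propagate(lo, hi, *beta_bounds(u), TS)   # predicted interval
        enclosed &= all(a <= v <= b for a, v, b in zip(lo, x, hi))
        inputs.append(u)
    return inputs, x, enclosed


if __name__ == "__main__":
    print(closed_loop()[0])
```

The flag `enclosed` checks that, in this toy setting, the true state stays inside the predicted interval over each week, which is the property the worst-case cost relies on.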
For the second parameter set B, we can see that the worst-case robust formulation initially uses an unnecessarily high control effort. Nevertheless, overall the applied resources $c_{\mathrm{policy}}$ differ by less than $\Delta_u$ compared to the nominal MPC, while the number of fatalities is still reduced by 33%. In comparison to the open-loop policies, both MPC policies require less or equally restrictive policy measures u, while the number of fatalities is significantly reduced.

In the following, we summarize our findings at a high level and highlight the main take-away messages:

Our results in Section 3 confirm the conclusions in [15] that neither eradication of the virus nor herd immunity without the availability of a vaccine is a viable solution to handle the current COVID-19 outbreak.

By applying an optimizer to the mathematical model describing the outbreak, one can significantly reduce the number of fatalities without increasing the costs associated with decreasing the infection rate (social distancing policies, closing schools, etc.), compare Section 3.2.

Since the proposed model can never exactly predict the COVID-19 pandemic, applying a nominal optimal policy at best introduces unnecessary conservatism and at worst poses a great danger (i.e., overwhelming the health care capacities and risking high mortality rates). Therefore, our findings in Section 4 support [15] by showing that any policy to control the COVID-19 outbreak successfully has to be an adaptive strategy. This means we need to constantly measure, monitor, and estimate the current numbers and adapt our policy accordingly, i.e., feedback is necessary for reliably handling the outbreak.

If we take into account a priori that our model includes mismatches and that all measured and estimated numbers are inexact and can be biased, we can further improve the outcome, as shown in Section 5. More specifically, we developed a robust MPC-based feedback strategy using interval arithmetic.
The application of feedback without the robust description of the considered model can lead to intermediate increases in the number of new infections, necessitating another period of lockdown. In contrast, a robust feedback strategy can take these model mismatches and other uncertainties into account and is hence able to avoid such behavior, thus significantly reducing the number of fatalities.

Looking at the qualitative results of the robust MPC-based feedback, one can see that, accounting for the instability and uncertainty of the spread of the virus, the controller suggests a rather strict policy at the beginning and only then allows for a gradual increase in the infection rate. Keeping this loosening slow at the beginning shows a beneficial effect in the long run.

There are also influences on the course of the outbreak that were not taken into account in the present paper but which are important in an overall strategy against the spread of COVID-19 (e.g., increasing testing capacities, tracking of infections, as well as investigating which measures lead to the desired infection rate). However, controlling the infection rate is certainly one of the key factors and hence, this paper contributes towards mitigating the spread of COVID-19 under manageable societal and economic costs. We hope that the proposed feedback strategies inspire further investigations in this direction and offer qualitative and high-level insights that underpin the current policies or strategy papers.

absolutely crucial, since the two policies are essentially equivalent in the period from 110 to 150 days, i.e., including the critical time period where the ICU capacity is exceeded, but differ significantly over 30 weeks prior to the violation of the ICU capacity.

In the following, we derive the dynamics of the interval predictions $\underline{f}$, $\overline{f}$ used in (14). Note that the following property holds for any scalars a ∈ x i .
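To illustrate how positivity can yield interval dynamics of the form (14), consider a plain SIR model rather than model (1). The model, the symbols β and γ, and the bounds below are assumptions introduced for this sketch and are not Equations (C.4). The construction rests on the elementary fact that products of nonnegative intervals are bounded by the products of their endpoints.

```latex
% Sketch for a plain SIR model (an assumption, not the paper's model (1)).
% For nonnegative scalars a \in [\underline{a}, \overline{a}] and b \in [\underline{b}, \overline{b}]:
\[
  \underline{a}\,\underline{b} \;\le\; a\,b \;\le\; \overline{a}\,\overline{b}.
\]
% With \dot{S} = -\beta S I, \dot{I} = \beta S I - \gamma I and
% \beta \in [\underline{\beta}, \overline{\beta}], one possible interval prediction is
\begin{align*}
  \dot{\underline{S}} &= -\overline{\beta}\,\underline{S}\,\overline{I}, &
  \dot{\overline{S}}  &= -\underline{\beta}\,\overline{S}\,\underline{I}, \\
  \dot{\underline{I}} &= \underline{\beta}\,\underline{S}\,\underline{I} - \gamma\,\underline{I}, &
  \dot{\overline{I}}  &= \overline{\beta}\,\overline{S}\,\overline{I} - \gamma\,\overline{I}.
\end{align*}
```

In each equation, the bounded state itself enters with its own bound, while every other quantity enters with the bound that makes the right-hand side smallest (for a lower bound) or largest (for an upper bound). If the true state and β start inside their intervals, a standard comparison argument then keeps the true trajectory between the two bounds.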
In principle it would also be possible to directly set $\underline{S}$, $\overline{S}$ using the other states $\underline{x}_i$, $\overline{x}_i$ instead of simulating (C.4a)-(C.4b), but this may not necessarily ensure $\overline{S} \le 1$ and $\underline{S} \ge 0$.
Controlling epidemic spread by social distancing: Do it well or not at all
Social distancing strategies for curbing the COVID-19 epidemic, medRxiv preprint
Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China
Inferring COVID-19 spreading rates and potential change points for case number forecasts
A first study on the impact of current and future control measures on the spread of COVID-19 in Germany, medRxiv preprint
Modeling exit strategies from COVID-19 lockdown with a focus on antibody tests, medRxiv preprint
COVID-19: from model prediction to model predictive control
Analysis and control of epidemics
Can the COVID-19 epidemic be controlled on the basis of daily test reports?
On fast multi-shot epidemic interventions for post lock-down mitigation: Implications for simple COVID-19 models
Modeling, state estimation, and optimal control for the US COVID-19 outbreak
Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy
Optimal COVID-19 epidemic control until vaccine deployment, medRxiv preprint
Beyond just "flattening the curve": Optimal control of epidemics with purely non-pharmaceutical interventions
Adaptive Strategien zur Eindämmung der COVID-19-Epidemie
A contribution to the mathematical theory of epidemics
An interactive web-based dashboard to track COVID-19 in real time
novel coronavirus COVID-19 (2019-nCoV) data repository by Johns Hopkins CSSE
Temporal dynamics in viral shedding and transmissibility of COVID-19
Average detection rate of SARS-CoV-2 infections is estimated around nine percent
Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo
Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship
Schätzung der aktuellen Entwicklung der SARS-CoV-2-Epidemie in Deutschland - Nowcasting
Report of the WHO-China joint mission on coronavirus disease 2019 (COVID-19)
Incubation period and other epidemiological characteristics of 2019 novel coronavirus infections with right truncation: A statistical analysis of publicly available case data
Clinical course and outcomes of critically ill patients with SARS-CoV-2 pneumonia in Wuhan, China: a single-centered, retrospective, observational study
Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in
CasADi: a software framework for nonlinear optimization and optimal control
Estimates of the severity of coronavirus disease 2019: a model-based analysis
Critical care utilization for the COVID-19 outbreak in Lombardy, Italy: early experience and forecast during an emergency response
Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator-days and deaths by US state in the next 4 months, medRxiv preprint
Potential association between COVID-19 mortality and health-care resource availability
COVID-19 Surveillance Group, Characteristics of COVID-19 patients dying in Italy, report based on available data on
Systemische Epidemiologische Analyse der Covid-19-Epidemie', Stellungnahme der Helmholtz-Initiative 'Systemische Epidemiologische Analyse der COVID-19-Epidemie
Coronavirus-Pandemie-Die_Krise_nachhaltig_%C3%BCberwinden_ final.pdf
Optimal resource allocation for network protection against spreading processes
Dynamic resource allocation to control epidemic outbreaks: a model predictive control approach
Linear programming and economic analysis, Courier Corporation
Mobile phone data and COVID-19: Missing an opportunity?, (2020)
Model Predictive Control: Theory, Computation, and Design
Robust MPC of constrained nonlinear systems based on interval arithmetic
Robust economic model predictive control of continuous-time epidemic processes
In order to avoid artefacts of considering a
finite-horizon problem (e.g., a large number of infected people at the end of the horizon), an alternative to considering the modified cost function F from (10) is the inclusion of additional terminal constraints for the contagious population IDART = (I, D, A, R, T) ∈ R^5. In particular, we require that at the end of the control horizon N, the number of contagious individuals in each category (I, D, A, R, T) is smaller than the corresponding number from the baseline policy (cf. (A.1a)). In addition, at the end of the horizon the number of contagious individuals should be nonincreasing, which is implemented as (A.1b). Hence, in the following, we replace the cost F by the number of fatalities E, and we add the constraints (A.1a)-(A.1b) to the optimal control problem (11) from Section 3.2. Again, the index k in (11) corresponds to weeks and the states IDART(k · T_s) correspond to the result of simulating the system (1) with the parameters and initial condition from Section 2. These terminal conditions (A.1a)-(A.1b) (which should be interpreted element-wise) ensure that the final state after the finite horizon N is "better" than the baseline solution (cf. (A.1a)) and that the outbreak can be contained (cf. (A.1b)). The simulation results with the two baseline policies shown in Figures A.9 and A.10 demonstrate that the terminal constraints indeed effectively prevent the turnpike behavior. However, the additional constraints also lead to a slight increase in the number of fatalities.

We wish to briefly mention a stronger restriction on the societal cost of the optimal control strategy. In particular, instead of only restricting the cost over the considered horizon of N = 100 weeks, a stronger property is to ensure that at any time t, the previously accumulated policy cost is smaller than the corresponding cost of the baseline policy.
This can be done by replacing condition (11d) with a transient constraint on the accumulated policy cost. The corresponding results for both baselines considered in Section 3.2 can be seen in Figure B.11. In this case, the number of fatalities is reduced by 33% and 37%, respectively. Thus, also for this more restrictive setting, the optimal controller can significantly reduce the number of fatalities. In addition, in the comparison to the more aggressive baseline we also see that early measures are