key: cord-0807958-ycfav7ae
authors: Hayhoe, Mikhail; Barreras, Francisco; Preciado, Victor M.
title: Multitask learning and nonlinear optimal control of the COVID-19 outbreak: A geometric programming approach()
date: 2021-05-19
journal: Annu Rev Control
DOI: 10.1016/j.arcontrol.2021.04.014
sha: d95aa5e9d5004a0bda1fa9008ed8e46ebc0186ae
doc_id: 807958
cord_uid: ycfav7ae

We propose a multitask learning approach to learn the parameters of a compartmental discrete-time epidemic model from various data sources and use it to design optimal control strategies of human-mobility restrictions that both curb the epidemic and minimize the economic costs associated with implementing non-pharmaceutical interventions. We develop an extension of the SEIR epidemic model that captures the effects of changes in human mobility on the spread of the disease. The parameters of the model are learned using a multitask learning approach that leverages both data on the number of deaths across a set of regions, and cellphone data on individuals’ mobility patterns specific to each region. Using this model, we propose a nonlinear optimal control problem aiming to find the optimal mobility-based intervention strategy that curbs the spread of the epidemic while obeying a budget on the economic cost incurred. We also show that the solution to this nonlinear optimal control problem can be efficiently found, in polynomial time, using tools from geometric programming. Furthermore, in the absence of a straightforward mapping from human mobility data to economic costs, we propose a practical method by which a budget on economic losses incurred may be chosen to eliminate excess deaths due to over-utilization of hospital resources. Our results are demonstrated with numerical simulations using real data from the COVID-19 pandemic in the Philadelphia metropolitan area.

Ever since the first COVID-19 case was reported on December 31st 2019 [1] , the SARS-CoV-2 pandemic has spread world-wide, reaching alarming levels of spread and severity [2] . The response to the first wave of COVID-19 by governments was the implementation of large scale non-pharmaceutical interventions (NPIs) ranging from contact tracing, quarantines and mask usage, to more aggressive measures like city wide shelter-in-place orders, air-travel restrictions and closures of non-essential businesses [3] . In the absence of pharmaceutical treatment, prevention, or herd immunity, NPIs remain the only tool to curb the spread of an epidemic in its earlier stages. In the case of COVID-19, governments across the world have implemented strategies to relax mobility restriction measures and reactivate the economy [4] while, at the same time, preventing the collapse of healthcare systems. However, relaxing mobility restrictions too fast or carelessly can result in resounding waves, as we are currently observing for the case of COVID-19. In fact, as long as enough people in the population are susceptible, the danger of recurrent waves is not only real, but probable. In this situation, it is of utmost societal importance to develop reopening strategies in a principled manner utilizing the wealth of data readily available.

Several epidemic models have been proposed in the recent literature to simulate the effects of social distancing on the evolution of the pandemic; see, e.g., [5] [6] [7] . Although the majority of epidemic models in recent years are variations of the seminal mathematical models on theoretical epidemiology (see [8] and references therein), the availability of rich datasets describing human mobility and behavior is rapidly changing the field of mathematical modelling of epidemics. Companies like Google, Foursquare, Safegraph, Baidu and others, have provided public access to massive datasets describing human mobility, enabling the development of data-driven epidemic models capturing the effects of mobility restrictions. Indeed, the choices faced by decision makers regarding disease management involve the use of multiple control actuations such as vaccination, quarantine, treatment or, as is the case for COVID-19, non-pharmaceutical interventions such as social distancing. These decisions must face the trade-off of minimizing the impact of the disease and the economic cost associated with the implementation of non-pharmaceutical interventions.

In order to increase predictive power and utility for policy decision-makers, epidemic models have gradually increased their complexity to account for a multitude of features of real epidemics such as disease-specific compartmental models [9] , resurgence [10] , multi-scale effects [11] , seasonality [12] , differential risk structure in the population [13] , healthcare system capacity [14] , and uncertainty in testing data [15] , among others. This increased sophistication in the modeling often comes at the cost of mathematical intractability, a limitation often circumvented by formulating policies based on heuristics and/or simulations [5, 7, 12, 13, 16, 17] . Although informative for certain scenarios, these proposed interventions are not the result of rigorously formulated optimal control problems and, thus, lack the guarantees and flexibility of mathematical optimization frameworks.

2 J o u r n a l P r e -p r o o f Conversely, the control of epidemics does not usually admit straightforward solutions from optimal control theory due to the presence of nonlinearities and/or the lack of convexity [8] . There are important theoretical results in optimal control of epidemics which achieve mathematical tractability, for example, optimal resource allocation aiming to asymptotically drive the epidemic to extinction [18] [19] [20] [21] [22] [23] [24] . Other theoretical results are concerned with applications of Pontryagin's maximum principle (PMP), and find exact solutions to resource allocation problems under some variations of the SIS and SEIR, for example in [25] [26] [27] . In contrast, data-driven control frameworks in real epidemics have found limited applicability due to the difficulty in incorporating real data and the challenges in solving non-convex programs exactly. Recent work [18] has proposed a solution to the problem of optimal social distancing with economic constraints using a static convex program which seeks to ensure the decrease of the epidemic at all times; a condition that is too stringent and fails to consider more efficient dynamic strategies. In [14] , an optimal quarantine and testing policy is derived in an optimal control framework, but the control actions and objective functions are restrictive and chosen ad-hoc for tractability. Other recent applications have tackled the non-convex optimal control of social distancing policies using model predictive control (MPC), an approach which has been applied to a plethora of non-linear control problems in industry [28] . For example, in [29] the authors propose an optimal predictive control problem where a control input representing social distancing affects the infectivity rates directly and the number of fatalities is approximately minimized using a nonlinear program solver. A related application of MPC for optimal social distancing policies is found in [15] , in which, instead of controling the infectivity rates directly, the control input represents a binary lockdown policy enacted by the government which has a delayed effect in the population's level of isolation, in turn affecting the infectivity rate.

A practical concern is whether it may be possible to design optimal control strategies based on mobility restrictions that are fully data-driven, in the sense that human mobility is measured and explicitly incorporated in the epidemic model. In this paper, we propose a model of the spread of COVID-19 and a data-driven optimal control problem that directly minimizes the number of predicted cumulative deaths by implementing mobility restrictions in the population. We use real mobility data from Google [30] to learn a nonlinear mapping representing the impact of human mobility on the parameters of a dynamical epidemic model and propose a novel nonlinear optimal control problem that can be solved efficiently using tools from geometric programming [31] .

Our model consists of an extension of the classic SEIR model, augmented with compartments for asymptomatic and hospitalized agents. We assume that the rate at which agents become infected in any given day is a function of the mobility trends in the population for that same day, reflecting the fact that an increase in mobility leads to more infections. We rely on data regarding case counts and deaths from The New York Times [32] and from Google's COVID-19 Community Mobility Reports [30] to capture the changes in visitation patterns to different Places of Interest (POIs). The mobility data consists of several 3 J o u r n a l P r e -p r o o f time series measuring visits to various categories of places such as Retail & Recreation, Grocery & Pharmacy, and Workplaces. The dataset is organized into separate time series for all counties in the United States, and measures visits to multiple categories of places against a benchmark established in January and February of 2020. The key observation is that a decision maker can enforce restrictions on visits to each of these categories to reduce the spread of the epidemic while incurring a cost to the economy. Hence, our objective is to design optimal strategies for mobility restrictions to contain the spread of COVID-19 while taking in to account the associated economic cost.

Our approach is similar to that in [15, 29] in that we solve the problem of optimal social distancing policies to curb an epidemic under state constraints, which can be implemented with a receding horizon. However, we stress two important differences: first, we explicitly model and learn the impact of human mobility on the evolution of the disease spread using real mobility data and can formulate granular continuous mobility restriction policies that vary across economic sectors; second, unlike [15, 29] our optimization problem is reduced to a convex program which can be solved with great efficiency, has global optimality guarantees, and can accommodate a large number of variables and longer optimization horizons.

The structure of the paper is as follows. In Section 2 we introduce the notation used as well as some necessary background in geometric programming. In Section 3 we discuss the specifics of our data-driven model, consisting of a mobility layer and an epidemic layer. In Section 4 we discuss the details of our learning strategy to identify the parameters of the model. In Section 5 we present our optimal control framework and present simulations showing the effectiveness of our method. In Section 6 we conclude and discuss possibilities for further research. Appendix A includes additional results of practical interest, and Appendix B contains proofs of the results herein.

Throughout this paper bold characters are used to denote vectors and uppercase characters denote either matrices or compartments of the epidemic model. For the following definitions let x 1 , . . . , x n ≥ 0 denote n non-negative variables, and let x = (x 1 , . . . , x n ). When considering our epidemic model we use tildes for the states of the true (nonlinear) model, e.g.S(t), and omit the tildes for the linearized model, e.g., S(t).

Definition 2 (Posynomial). A sum of one or more monomials is called a posynomial, that is, a function of the form

Since posynomials admit negative exponents but do not admit negative coefficients they are not necessarily polynomials, and vice versa. We remark that posynomials are closed under addition, multiplication, and positive scalar multiplication. This implies that if the entries of two matrices A ∈ R m×k and B ∈ R k×n are posynomials of the same variables, then so are the entries of their product AB, since [AB] i,j = k t=1 A i,t B t,j , which is a sum of products of posynomials. This result extends trivially to the product of an arbitrary number of matrices with posynomial entries.

A careful application of Hölder's inequality shows that posynomials are convex in log-scale.

We solve the epidemic control problems presented herein using a quasiconvex optimization framework called geometric programming [31, 33] , which has found wide applicability in fields such as communication systems [34] , epidemiology [22] , and control [35] , among others. A geometric program (GP) is a mathematical optimization program of the form

where f (x), q 1 (x), . . . , q m (x) are posynomials and h 1 (x), . . . , h p (x) are monomials. Due to the convexity in log-scale, one can exactly transform 1 a GP to a convex program by means of the logarithmic change of variables y i = log(x i ) and transforming the objective and constraints with the logarithmic transformations F (y) = log f (exp(y)), Q i (y) = log q i (exp(y)) and H(y) = log h(exp(y)) to obtain minimize y F (y)

which is a convex program that can be efficiently solved using, for example, primal-dual interior-point methods; see [36] for more details. In practice, geometric programs with tens of thousands of decision variables and constraints can be solved to find the global optimum in a matter of seconds on a standard laptop computer.

We now describe the epidemiological model under analysis. We consider a region with N individuals and propose a population model with homogeneous mixing, i.e., every pair of individuals come in contact with a probability that depends on aggregated mobility variables. Population models are commonly used in the absence of granular data on the network of social contacts in a region [8] . We assume that each individual can visit POIs belonging to different categories 1, 2, . . . , K; let m(t) = (m 1 (t), m 2 (t), . . . , m K (t)) denote a vector of human mobility variables capturing the percentage change in volume of visits to each of those categories at a particular discrete time t (e.g., days) relative to a pre-established baseline; hence, m k (t) ∈ [0, ∞). In particular, we use publicly available mobility data from Google's COVID-19 Community Mobility Reports [30] which capture daily changes in visitation patterns to public places, for example, Retail & recreation, Grocery & pharmacy, and Parks, as well as time spent at work. For each category, Google reports the relative change in visits compared to a baseline of mobility measured before the lockdown measures took place. This baseline corresponds to the median daily visits to each category over the period comprising January 3 through February 6, 2020.

Our model consists of two layers: a mobility layer and an epidemic layer. The mobility layer captures the effect of human mobility on the spread of the disease and influences the dynamics taking place on the epidemic layer. In the epidemic layer, we consider an extension of the classic SEIR epidemic model [37, 38] which explicitly accounts for asymptomatic hosts and hospitalizations, as in [39] . Each of the individuals in a region belongs to one of seven possible compartments described below. In this model,S(t) represents the number of individuals susceptible to becoming infected at a discrete time t. In our optimal control problems, we will consider a finite horizon T c over which we can assume an almost constant number of susceptible individuals; hence, in these problems we assumeS(t) ≈ S 0 for all 0 ≤ t ≤ T c , which linearizes the model. The variableẼ(t) represents the number of individuals who have contracted the virus (exposed) but are in an incubation period at time t. After the incubation period, agents can move to one of the infectious compartments;Ĩ(t) represents the number of symptomatic individuals andÃ(t) represents the number of asymptomatic individuals. The asymptomatic compartment is included since asymptomatic individuals play a crucial role in the spread of COVID-19, with transmission rates that are different from symptomatic individuals [40, 41] . Asymptomatic individuals eventually recover on their own and move on to the recovered compartment, represented by the variable R(t). Symptomatic individuals can recover on their own, or their symptoms can worsen and they 6 J o u r n a l P r e -p r o o f subsequently require hospitalization, in which case they are moved into a hospitalized compartment, represented by the variableH(t). Since hospital capacity is a principal concern with the treatment of COVID-19, we include this compartment to constrain the control problems described in Section 5. In particular, our mobility-based control input will be constrained to prevent a hospital capacity overflow. Finally, agents that are hospitalized may either recover and transition to the compartment represented byR(t) or may die, and subsequently transition to the compartment represented byD(t). With explicit data on the number of deaths in every region, we may train our model to predict the population of this compartment. We make the simplifying assumption that only individuals with severe symptoms are at a risk of dying and, hence, all of them have been previously hospitalized. All parameters related to the dynamics of this model are summarized in Table 1 . Using these parameters, the discrete-time evolution of the number of individuals in each compartment, illustrated in Figure 1 , is given by:

In our model, we assume that susceptible individuals can transition into the exposed compartment when in contact with either Infected (symptomatic) or Asymptomatic individuals. We assume that the rate at which asymptomatic individuals infect others is weighted by a constant (unknown) parameter γ A . We also assume that the portion of hospitalized individuals who die, relative to those that recover, is equal to an unknown constant α D . In Section 4, we will introduce a methodology to learn these (and other) unknown parameters in our model. As mentioned above, this model is intended to solve optimal control 7 J o u r n a l P r e -p r o o f problems over a finite time horizon T c over which the number of new infected individuals is small compared to the entire population, so we can assume that the number of susceptible individuals at any time 0 ≤ t ≤ T c , S(t), is well approximated with a constant; hence, in our control problems we setS(t) = S 0 . Moreover, this assumption linearizes 2 the dynamics of the states; for notational clarity, we will omit the tildes when considering the states of the linear model. As we will see in Section 5, this linearization ensures that the entries of the state vector at any given time are posynomials on the parameter β(t). However, we incorporate a non-linear dependency of the parameter β(t) on the mobility restriction variables, which we will use as our external control variable, rendering the resulting model non-linear and multiplicative in the control input. Fortunately, the states of this linearized model upper-bound the states of the true model, as shown in the lemma below. Appendix A.2 contains simulation results to examine the tightness of these bounds in practice. Hence, by reducing the number of deaths in the linearized model, D(t), we are guaranteed to reduce the number of deaths in the true nonlinear model,D(t). 7). Thus, for all t > 0,

The mobility layer of our model incorporates the effects of non-pharmaceutical interventions, such as social distancing and other forms of mobility restrictions, which a decision maker may employ to curb the spread of the epidemic. By reducing human mobility, a decision maker induces fewer contacts between susceptible and infected individuals and, thus, reduces the risk of infection. In particular, we relate the infection rate β(t) to a time series m(t) of human mobility variables by means of an unknown function f (·). We choose f to be a parametric function whose parameters will be learned from data (described in detail in Section 4.1).

In order to employ non-pharmaceutical interventions, a decision maker designs a mobility control strategy to set the human mobility variables m(t) for some finite horizon t ∈ {0, . . . , T }. In mathematical terms, the decision maker designs an input {u(t)} Tc t=0 which affects future values of the mobility variables. For simplicity, we assume an identity mapping between mobility and the input, so that m(t) = u(t), and u(t) is in some set of admissible actions U. In other words, we assume that mobility patterns may be directly actuated to a desired level by a decision maker, subject to feasibility constraints.

Rate at which susceptible individuals become infected due to contacts with infectious individuals at time t; β(t) is a function of the mobility variables m(t) at time t γ A Weight representing lower risk of infection when infectious individuals are asymptomatic ρ EI Rate at which exposed individuals become symptomatic ρ EA Rate at which exposed individuals become asymptomatic ρ IR Rate at which symptomatic individuals recover on their own ρ IH Rate at which symptomatic individuals develop severe symptoms and become hospitalized ρ AR Rate at which asymptomatic individuals recover on their own ρ HR Rate at which hospitalized individuals recover α D Proportion of hospitalized individuals that die, relative to those that recover Table 1 : Summary of parameters in epidemic model. We assume ρ IR + ρ IH < 1, ρ EI < 1, ρ AR < 1, and ρ HR < 1, since these parameters capture the fraction of individuals in each compartment that transition to other compartments, which must be less than one for the model to be meaningful. We also assume γ A ∈ [0, 1], reflecting the fact that asymptomatic individuals are less likely to spread the disease.

Intuitively, lower values of u(t) correspond to more restrictions on human mobility. The decision maker may have fine-grained control over their control strategy, for example by closing individual establishments, imposing occupancy limits, or restricting hours of operation, and as such we treat the individual components of the control action u k (t) as continuous variables. Moreover, some categories may have different admissible actions, e.g., it may not be possible to close down all pharmacies but closing all gyms is reasonable. In mathematical terms, we will consider a set of allowable control actions U that is described using posynomial inequalities and monomial equalities.

Furthermore, implementing mobility restrictions in this manner cannot be done without incurring a financial loss. Closure of businesses causes economic losses, which need to be taken into consideration when selecting an appropriate control strategy. In particular, applying the temporal control strategy u(t) of mobility restrictions incurs a cost C t (u(t)) which we assume to be monotonically decreasing with u and convex in log-scale, reflecting that the costs on society of restricting mobility are marginally increasing.

Several recent epidemic prediction methods opt to set some (or all) of the parameters in their models to estimations from the medical and virology literature (e.g. [5-7, 16, 18] ). However, these parameters often have wide confidence intervals and are commonly inferred from statistical models that do not take into account the effects of social distancing and hospital capacity [42] . Other recent 9 J o u r n a l P r e -p r o o f . . .

Initial conditions works learn these parameters from data [6, 7, 9, 19 ], but do not explicitly model the infectivity of the epidemic using mobility patterns, making the design of related control strategies difficult. In contrast to these approaches, our model is entirely data-driven in that all parameters used (including initial conditions) are learned directly from data, and we learn an explicit mapping between mobility patterns and the spread of the epidemic. In particular, we employ a multitask learning approach [43] by leveraging both data on the number of deaths across a set of regions, and mobility data describing how often individuals in a region visit different points of interest. The key idea is to consider each region of interest as a separate prediction subtask, while many parameters of the models describing the evolution of the epidemic are shared across regions. Indeed, certain parameters of the epidemic model are intrinsic to the disease; hence, they do not depend on the geographical location from which data is collected. As such, when calibrating this model to a given region, we can benefit from the data from other regions by employing a multitask learning framework, providing better parameter estimates and avoiding overfitting. Towards this goal, we pool data from multiple regions (e.g., US counties) and minimize a global cost function in which the global parameters are shared across regions but the mobility parameters and initial conditions are specific to each region. This provides statistical data amplification, as the prediction subtask for each county is given additional data with which to learn the global epidemic parameters. Another relevant facet of multitask learning is representation bias; the learning procedure will avoid local minima intrinsic to only a few (or just one) of the counties. The learning pipeline from data to predictions for a single region is shown in Figure 2 . We fit our models using publicly available mobility data from Google J o u r n a l P r e -p r o o f measured before the lockdown measures took place. This baseline corresponds to the median daily visits to each category over the period comprising January 3 through February 6, 2020. Furthermore, we use public data from The New York Times, based on reports from state and local health agencies [32] , consisting of daily and cumulative caseloads and deaths attributed to COVID-19 in The United States. To account for inconsistencies and lags in reporting, we compute a 7-day rolling average on the original time series for the calibration of our model. As mentioned above, there are two layers to our model, namely (1) the mobility layer, which is a mapping from mobility data to the infection rate β(t) (described in Section 4.1) and (2) the epidemic layer, which describes the dynamics of the disease itself (discussed in Section 4.2). Since parameters such as the latency period, ratio of infected individuals who develop symptoms, and case fatality ratio are intrinsic to the disease and should not vary greatly based on the geographical area being studied, we group these together across regions as global parameters and learn them jointly with all the available data. However, the mapping from mobility data to the infection rate and initial conditions of the regional epidemic are dependent on the locality, and thus they are learned using only the data from their region.

To learn the function f : m → β from mobility data to the infection rate, we must first select an appropriate class of functions for such a mapping. Although we could use any parametric family of functions, such as neural networks, to estimate f , not all choices are tractable. In particular, neural networks may provide great prediction performance but would render an intractable control problem. In order to obtain a tractable control problem, we chose to model the function f using a parametric posynomial function [31] . As we will show in Section 5, this choice allows us to use geometric programming to efficiently solve several nonlinear optimal control problems of interest. Thus, we model f in a parametric way as

where we recall that m k ∈ [0, ∞). From a practical standpoint, this posynomial approximation is justified because β can be viewed as the product of the contact rate (the expected number of contacts an individual has with others) and the transmission risk, which is constant over time. Moreover, the number of contacts within a category should exponentially increase with the number of visits to points of interest in that category. Since the mobility data is stratified across K different categories, we allow the parameters to be different across the categories. Thus, in the parametric function f (m; θ, α, b) the probability of transmission is captured by θ = (θ 1 , . . . , θ K ) ∈ R K ≥0 , the exponential growth of infectivity is captured by α = (α 1 , . . . , α K ) ∈ R K , and the bias term b accounts for potentially unmodeled infections.

J o u r n a l P r e -p r o o f Since susceptible individuals may become infected by either symptomatic or asymptomatic individuals, the mobility mapping f is incorporated into the epidemic layer in two terms, as seen in (2). Firstly, it is used to model new infections from symptomatic individuals via the term β(m(t))S 0 I(t); secondly, we include a weighting term γ A in the term γ A β(m(t))S 0 A(t) to model the rate of new infections from asymptomatic infectious individuals. Thus, altogether we denote the set of parameters corresponding to the mobility mapping function for region i as Ψ i mobility := {θ, α, b, γ A }.

Our model is a latent-space model; hence, the states are not fully observable. Following common practice with these models, we treat the unobserved initial conditions as unknown parameters to be identified from the data. Since the dynamical trajectories of the epidemic are different across regions (e.g., US counties), for each region i we learn a set of initial conditions Ψ i 0 := {S i 0 , . . . , R i 0 }. As mentioned previously, we follow a multitask learning approach and, thus, the remaining global parameters, which we assume to be intrinsic to the disease, are shared across all regions and learned collectively using data available from all regions. The set of global parameters is denoted by Ψ global := {ρ EA , ρ EI , ρ AR , ρ IH , ρ IR , ρ HR , α D }, which includes all clinical parameters that depend on the nature of the virus alone (i.e., they are not influenced by social mobility).

In order to validate the predictive accuracy of our data-driven model, we conducted a case study using counties from the Delaware Valley metropolitan statistical area (commonly known as the Philadelphia metropolitan area), which includes Philadelphia County and nine surrounding regions, including counties in Maryland, New Jersey, and Delaware. Using county-level data, we learned both the local mobility mapping functions and initial conditions, as well as the global clinical parameters of the epidemic. Due to the known inconsistencies and lags in reporting of cases and deaths, we use the rolling seven-day average of cumulative deaths for both training and prediction. Formally, if we have training data for M counties over T d days, the training loss for the set of parameters Ψ := Ψ global ∪ M i=1 Ψ i mobility ∪ M i=1 Ψ i 0 is the following mean-squared error loss function:

where N i is the population of region i, X i (t) is the measured rolling 7-day average of cumulative deaths on day t for region i, andX i (t; Ψ) is the predicted value for the same region for a set of parameters Ψ. We normalize the number of deaths by the population in each county to avoid biasing our predictions towards counties with a larger population. Notably, any differentiable loss function may be substituted in this approach; mean-squared error was chosen for simplicity. Our model was trained by computing gradients of the loss function train with respect to all parameters in Ψ via the automatic differentiation package autograd [44] and running stochastic gradient descent using adam [45] over several independent trials using different initial guesses to account for the nonconvexity of the training problem. In particular, autograd performs reversemode differentiation (i.e., backpropagation) to compute numerical gradients; hence, differentiability of the loss function train and the dynamical equations governing our system help to ensure the existence of such gradients. Global clinical parameters were initialized by using plausible values from the medical literature [6, [46] [47] [48] [49] [50] [51] [52] , while local parameters are initialized randomly in each trial. Different counties are split across random batches, allowing the global parameters (trained using all data in our multitask approach) to converge more quickly, which in turn allows the local parameters to converge to values which fit more closely with the global parameters. From these trials, we select the set of parameters with lowest testing error, and present some examples of predictions using these chosen parameters in Figure 3 .

Traditional optimal control techniques are not directly applicable to compartmental epidemic models for a number of reasons. First, epidemic models are 13 J o u r n a l P r e -p r o o f typically nonlinear and the effect of non-pharmaceutical interventions is not additive, but multiplicative. Therefore, standard techniques such as LQR cannot be readily employed. Furthermore, a direct application of Pontryagin's Maximum Principle results in a high-dimensional two-point boundary value problem for which numerical methods have no convergence guarantees and present scalability issues. Due to the modeling choices proposed in this paper, we obtain a mobility-driven epidemic dynamics amenable to solve certain optimal control problems using tools from geometric programming [31] . In particular, we can solve the data-driven optimal control problems aiming to minimize the final number of deaths while respecting budget constraints on the economic costs associated with implementing mobility restrictions, as well as avoiding hospital overflows, with guarantees of global optimality. In practice, all people may not exactly follow the desired mobility restrictions; thus, such a strategy may be implemented in receding horizon. For example, if a decision maker sets mobility restrictions on a weekly basis, they may plan ahead for several months, implement a week's restrictions, then re-plan the following week and implement the resulting strategy, and repeat.

In order to employ geometric programming in this setting, it is necessary that the states can be expressed as posynomial functions of the control input. The negative term in (1) bars us from directly using our model; however, the horizon 0 ≤ t ≤ T c over which our control problem takes place is small enough that the number of new infected individuals is small relative to the total population, and so we may assume a nearly constant number of susceptible individuals exist, i.e.,S(t) ≈ S 0 . This assumption linearizes our model, and allows us to express the states as posynomial functions of the control input; hence, we will consider this linearized model for the remainder of this section. Fortunately, this linearization has a modest effect on our predictions; indeed, this linearized model upper-bounds the states of the true model, as illustrated in Lemma 1.

We start our analysis by considering the minimization of the final number of deaths. Since we will minimize deaths in the linear model, D(t), we remark by Lemma 1 that we induce conservatism in our control framework by minimizing an upper bound on the number of deaths in the nonlinear model,D(t). For completeness in our analysis, we allow the inclusion of a daily discount factor γ D and a terminal cost γ ∞ on the number of deaths at the end of the horizon T c . Hence, the decision maker aims to minimize the following objective function:

We remark from (7) that the incident deaths on day t, i.e., the number of new deaths D(t)−D(t−1), is exactly α D ρ HR H(t). The discount factor γ D ∈ (0, 1] is motivated by the uncertainty of deaths in the future which might be prevented by interventions not available in the present day. For example, the probability that a vaccine is widely available in the future increases as time passes. In particular, 1 − γ D can be considered the probability of a vaccine being widely available on each day in the future, so the probability of no vaccine being widely 14 J o u r n a l P r e -p r o o f available by day t is γ t D and, hence, deaths predicted at day t are only accounted for if there is not a vaccine that could prevent them. The terminal cost γ ∞ illustrates the desire to keep the number of deaths low beyond the time horizon in consideration. For example, let us assume that beyond the time horizon T c the epidemic has been curbed and the number of new daily deaths falls below α D ρ HR H(T c ). In the worst case scenario we would have

Defining γ ∞ := γ Tc D /(1 − γ D ), and γ ∞ = 0 if γ D = 1, the discounted number of deaths beyond T c is given by the terminal cost γ ∞ D(T c ).

Given that the infection rate β(t) depends on mobility, we assume that a decision maker can restrict mobility dynamically to curb the number of deaths by designing a mobility strategy u(t) so that β(t) = f (u(t)). Furthermore, we assume that u(t) is constrained to be within a set U reflecting that essential businesses cannot be severely restricted and that some mobility restrictions are only partially effective.

These mobility restrictions incur a cost which could be measured in terms of a pecuniary cost to the economy, absolute number of visits lost by businesses, or impact on the utility of citizens. In our framework, we quantify the economic cost of imposing a mobility control strategy u(t) using a cost function C t (u(t)) which, in general, can be time-varying; hence, for example, we can use different costs for mobility restrictions on workdays and weekends. Moreover, such a cost function may incorporate terminal costs to account for economic losses beyond the time horizon T c . We choose to model C t (u(t)) as a posynomial on u(t), since this is amenable to a geometric programming approach. We investigate the problem of choosing an optimal mobility control strategy u (t) that minimizes the number of cumulative deaths while keeping the total cost of the intervention, given by Tc−1 t=0 C t (u(t)), below a pre-specified budget B, while obeying a limit on hospitalizations τ H .

As we will show, this problem can be expressed as a geometric program; this stems from the fact that the states H(t) and D(t) can be expressed as posynomial functions of the mobility control variables u(t), as shown in the following lemma. Theorem 1. If C t (u(t)) is a posynomial cost function for all t, and the set of admissible control actions U is described by posynomial inequalities and monomial equalities, then the following is a geometric program: minimize u(0),u(1),...,u(Tc−1)

Proof. See Appendix B.3.

The choice of the cost function and the budget B may be difficult to discern. It may be unclear how reductions of mobility in a given category impact the economy, and the choice of a budget for those mobility restrictions engenders an implicit trade-off between minimizing deaths and economic costs. The choice of these cost functions is up to the decision maker and, in practice, could leverage available economic data about employment, revenue losses and social-wellness. To choose such a budget B once a cost function has been defined, we propose a principled approach to obtain the minimal budget under which hospitals remain below capacity, preventing the steep increase in deaths due to an overwhelmed healthcare system. This is achieved by solving an auxiliary optimal control problem that aims to find the minimum cost required to keep the number of hospitalized individuals H(t) below a threshold τ H at all times. This auxiliary optimal control problem is also a geometric program as long as the cost functions C t (u(t)) are posynomials (for example, found via posynomial fitting [31] on economic data); this is stated in Theorem 2 and the proof follows from Lemma 2.

Theorem 2. If C t (u(t)) is a posynomial cost function for all t, and the set of admissible control actions U is described by posynomial inequalities and monomial equalities, then the minimal budget required to keep hospitalizations below a given threshold τ H is given by

where u is the solution to the geometric program minimize u(0),...,u(Tc−1)

subject to H(t) ≤ τ H , t = 1, . . . , T c ,

Proof. See Appendix B.4.

The budget B obtained from Theorem 2 can be seen as a conservative cost which only guarantees that hospital operations remain within capacity, avoiding overflow. The decision maker should then use a budget B ≥ B in implementing the results from Theorem 1 to obtain a less conservative control input u(t).

As mentioned previously, in practice some individuals may not follow guidance regarding mobility restrictions; thus, our control strategy may be implemented in receding horizon. Since a decision maker may wish to set mobility restrictions on a rolling basis (e.g., weekly), they may plan ahead for several months, implement restrictions within the time frame of interest, then re-plan at the end of the time period, and repeat. Since the decision maker may tune the different categories of mobility variables independently, we allow each of the components k = 1, . . . , K of the mobility control action u(t) to vary between a lower bound u k > 0 (representing full lockdown) and an upper bound u k (representing no restrictions at all). Moreover, to reflect the fact that decision makers want to avoid dramatic changes to restrictions on a consistent basis, we enforce that the policy must remain fixed for periods of one week (seven days), and that each component of mobility may change no more than ∆%, i.e.,

, . . . , T }. However, it may be the case that categories of mobility are dependent, and cannot be tuned arbitrarily; we explore this idea in detail in Appendix A.1. These conditions lead to the set of admissible control actions

u(s) = u(s + 1) = · · · = u(s + 6), s ∈ {0, 7, . . .}}.

In practice the cost function C t (u(t)) may be supplied by a decision maker, or may be found via posynomial fitting [31] using economic data from a region. In the absence of such data, we choose a time-invariant cost function C(u(t)) that satisfies the requirements of being convex in log-scale and decreasing, given by

where c = (c 1 , . . . , c K ) is a relative cost weighting of the mobility categories. This relative cost could reflect that visits lost to, for example, healthcare facilities are more costly than visits lost to retail venues. As shown in the lemma below, this selection of cost function and set of allowable control actions renders (11) and (13) geometric programs.

(a) Measured mobility data m(t) for Philadelphia, PA.

(b) Value of minimal-cost control action u (t) (lower means less mobility allowed).

(c) Actuation cost Ct(u (t)) for control action u (t).

(d) Hospitalized individuals, with hospital bed threshold τ H shown as the dashed line. We also include the hospitalizations predicted based on the true mobility data m(t) as a baseline. Parameters of the models used herein were learned as described in Section 4.3.

We obtain the budget B to use in geometric program (11) by taking the total cost of the minimal-cost control strategy, i.e., B = Tc−1 t=0 Ct(u (t)). Control actions in Figure ( a) for categories except transit stations are similar, and are overlaid.

We demonstrate the effectiveness of our control approach with a case study for the counties in the greater Philadelphia area, illustrated in Figure 4 and Figure 5. All simulations were performed on a laptop computer with 16GB of RAM and a 2.2GHz Intel Core i7 CPU. The parameters of our compartmental model as well as the mobility mapping β(u(t)) are learned from data as described in Section 4.3 using the proposed multitask learning framework. Our models are trained using recent data before the widespread usage of vaccines, from August 1st to October 31st, 2020, and tested from November 1st-30th, 2020. In particular, in Figure 5 (c) we illustrate the number of cumulative deaths predicted using the optimal control strategy u (t) as compared to the number of cumulative deaths predicted based on the true mobility data m(t) from Philadelphia.

J o u r n a l P r e -p r o o f We invoke Lemma 3 to select the cost function C t (u(t)) and set of allowable control actions U. To elucidate the set of allowable control actions U, we select the values u k and u k independently for each category based on the mobility data used in our multitask learning framework. We set the limit on the relative change in mobility, ∆, to 10%. We further re-scale the mobility data to be in [0, 1] K , which has been shown to speed up convergence of learning methods [53] . For simplicity, we assign equal costs to each category of mobility, so that c k = 1 for all k ∈ {1, . . . , K}.

We show the minimal-cost control action u (t) computed by solving (13) in Figure 4 . As mentioned previously, interior point methods solve these problems very efficiently; full discussions on the computational complexity of our approach are in Appendix A.3. Figure 4 (a) illustrates the measured mobility data m(t) in Philadelphia County in order to compare with the designed control strategies. In Figure 4 (b) we show the individual mobility control strategies for each category; recall that a lower value of u k (t) corresponds to a lower value of mobility allowed for category k, which incurs a higher cost. In order to trade off between reduction in infections and economic costs, mobility in parks is allowed to be larger than all other categories. Mobility is kept consistently low across the time horizon to keep the number of hospitalized individuals under control, but allowed to increase later on, reducing the overall cost as an economic trade-off. In Figure 4 (d) we see the result of Lemma 1, namely that the number of hospitalizations in the linear model H(t), which is used in the GP (13) , upper-bounds the hospitalizations in the nonlinear modelH(t). Indeed, we see that our GP is providing a conservative estimate for the control action u (t): even though H(t) is near the hospitalization threshold τ H , the predicted hospitalizations under the nonlinear modelH(t) are approaching zero at the end of the time horizon ; we explore this phenomenon further in Appendix A.2.

In Figure 5 we show the results of Theorem 1 by solving the GP (11) . While this problem is related to the GP (13), the objective in this case is to minimize deaths subject to the budget B computed by solving (13) . Hence, while the mobility control strategy u (t) and corresponding cost are similar to what is shown in Figure 4 , there is a sharper initial restriction to mobility which is lifted in a more gradual manner. Figure 5 (c) shows the number of cumulative deaths using the mobility control action u (t) in blue, as compared to the deaths incurred without any control, i.e., with the measured mobility pattern data m(t). To confirm the accuracy of these predictions, we also show the number of deaths recorded in the same time horizon as the dashed black line. Indeed, by solving the GP (11) and implementing the associated control strategy, our model predicts over 200 lives could have been saved between August 1st and November 30th. As before we show the predictions using the linear model, which upper-bound the predictions in the nonlinear model; this further illustrates the conservatism of our control approach. 

In this paper we presented a multitask learning and nonlinear optimal control framework that aims to bridge the gap between optimal control theory of epidemic models and applicable data-driven models for analyzing the spread of COVID-19 and other future epidemics. To identify the parameters of our model, we propose a multitask learning approach that leverages mobility and epidemic data from multiple regions to capture how daily changes in mobility patterns affect the spread of the disease, and to accurately predict the resulting daily and cumulative deaths. Using this data-driven model we present a nonlinear optimal control framework using geometric programming to efficiently design nonpharmaceutical interventions to limit the spread of the epidemic while obeying a budget constraint on the economic loss incurred. Furthermore, we present a principled method for determining such a budget based on eliminating excess deaths due to over-utilization of hospital resources. We validate both our model and our control framework in a case study on the greater Philadelphia area.

In the future, this work could be extended to accommodate for robustness considerations as well as stochastic transitions in the epidemic layer, which introduce the additional challenge of expressing chance constraints as posynomial functions. The success of geometric programming in our work comes from expressing states of the system as posynomials on the mobility variables, allowing for the potential extension to models with more sophisticated mappings from human-mobility to epidemic dynamics. For example, generalized geometric programming admits functions that are max-monomials or posynomials with fractional exponents [31] , which opens the door to modeling epidemic dynamics and human-mobility using ReLU neural networks or posynomial approximations to 20 J o u r n a l P r e -p r o o f arbitrary functions. Furthermore, since geometric programs can be solved efficiently, our approach could be applied to models with higher complexity; for example, networked metapopulation models which can make use of more granular datasets. While mobility data may be expressed in terms of several different categories, in practice these categories are highly correlated; the correlation between mobility categories in Philadelphia is shown in Figure A.6(a) . As such, it may not be reasonable to assume that a decision maker may tune these categories independently and arbitrarily. However, it may also be difficult to elucidate the explicit relationships present from data alone. As a compromise, we may perform dimensionality reduction on the measured mobility pattern data via principal component analysis (PCA). In particular, since our data is in the form of time series, we take the first differences of the mobility data m d (t) := m(t) − m(t − 1) to ensure stationarity, find the K × K covariance matrix C of the K-dimensional observations m d (1), . . . , m d (T c ), and compute the principal components w k of C [53]. We then project into the space of the mobility pattern data asm(t) := [w 1 , . . . , w l ] m(t) for some number of principal components l. As shown in Figure A.6(b) , the first two principal components are enough to describe over 98% of the variance in mobility data in the greater Philadelphia region over the time period of interest. For this reason, we present additional results in which this two-dimensional principal component projection,m(t), is used in place of the full mobility data from the Google Mobility Report, m(t). In particular, our multitask learning approach usesm(t) to learn the mobility mapping β(·), as well as all other parameters in our model. We

J o u r n a l P r e -p r o o f then use the mapping and parameters to design control strategies via the GPs (11) and (13) . A reasonable budget B is deduced by solving (13) using this two-dimensional mobility data, and then a control strategy to minimize deaths u (t) is designed by solving (11) . These results are illustrated in Figure A. 7. Compared to using the full-dimensional mobility data, this approach does not control the pandemic as quickly and, hence, saves fewer lives. These results illustrate that, should it be possible, it is advantageous to independently control different categories of mobility. However, even if it is not possible to execute fine-grained control across many categories of mobility, this multitask learningdriven nonlinear optimal control framework is still able to reduce the number of deaths while respecting limits on the economic costs incurred.

(a) Value of optimal minimal-death control action u (t) (lower means less mobility allowed).

(b) Actuation cost Ct(u (t)) with control action u (t); average daily budget B /Tc shown dashed in red. Example of optimal minimal-death control strategy u (t) with lower-dimensional mobility data for Philadelphia County, PA, obtained by solving the minimal-death GP (11), from September 1st to November 30th, 2020. In (c) and (d), we plot the prediction using the linear model with the strategy u (t), H(t) and D(t), predictions using the nonlinear model with the strategy u (t),H(t) andD(t), and the predictions using the true mobility data m(t) as a baseline for no intervention,Ĥ(t) andD(t). Parameters of the models used herein were learned as described in Section 4.3 using the principal component projections of mobility datã m(t), with the budget B taken from the solution to the minimal-cost GP (13). (13) to find the budget B(τ H ), then take multiples of this budget from 1× to 10×. Trends appear consistent across hospitalization thresholds τ H , but the difference between the models quickly shrinks as the budget B(τ H ) grows.

As shown in Lemma 1, the linear epidemic model is always an upper bound to the nonlinear model. However, it is of practical importance to investigate how tight this bound may be. As such, we present results on the gap between the final states in the linear and nonlinear model in Figure A. 8. For each value of the hospitalization threshold τ H , we solve the minimal cost GP (13) to find the minimal budget B(τ H ). As mentioned in Section 5, this budget is minimal in the sense that it is the lowest cost incurred while respecting the threshold on the number of hospitalizations, τ H , but in practice a larger budget is desirable. As such, we vary the budget for the minimal death GP (11) between B(τ H ) and 10B(τ H ), and record the final number of hospitalizations in the linear model, H(T c ), and the nonlinear model,H(T c ), for T = 61 days. We also record the cumulative number of deaths in the linear and nonlinear models (D(T c ) and D(T c ), respectively). We then plot the difference between the quantities for the linear and nonlinear models as functions of the hospitalization threshold τ H and the respective budgets B(τ H ). Lower values of this difference correspond to a smaller gap between the linear and nonlinear models. As we can see in Figure A .8, budgets closer to B(τ H ), i.e., the minimal feasible budget to respect τ H , lead to a much larger gap between the models; this trend is consistent across values of τ H . We observe threshold behavior, whereby a moderate increase in the budget leads to a dramatic decrease in the gap between the models. This threshold behavior explains the difference between the linear and nonlinear models in the simulations of Section 5 and Appendix A.1, as we employ a budget close to the minimal cost obtained from solving the GP (13) . Hence, there exists a moderate trade-off between the prescribed budget and the tightness of the upper bounds given by the states of the linear model. Fortunately, these results illustrate that implementing the designed strategies on the true nonlinear system can lead to better outcomes than expected using any budget. In order to validate the efficiency and scalability of our approach, we present data on the size and time complexity of the GPs constructed herein. These results are not exhaustive, but meant to illustrate the practical applicability of our approach. For example, computational complexity may be reduced by dimensionality reduction techniques as in Appendix A.1. All experiments were performed on a laptop with an Intel Core i7 processor running at 2.2GHz and 16GB of RAM. These results illustrate the tractability of using geometric programming to solve problems with a large number of decision variables and contraints. (13) and minimal-deaths GP (11) instances solved herein, over multiple time horizons Tc. Space complexity is measured in terms of the number of decision variables and constraints of the GP in posynomial form and in the equivalent convex form, the latter of which the solver uses to find the optimal solution. Time complexity measures the running time for the solver to instantiate and solve the program.

Situation report 1

at-home orders across the country., COVID-19 community mobility reports

Our ongoing list of how countries are reopening, and which ones remain under lockdown, BUSINESS INSIDER

Comparing the accuracy of several network-based COVID-19 prediction algorithms

COVID-19 dynamics across the US: A deep learning study of human mobility and social behavior

Mobility network modeling explains higher SARS-CoV-2 infection rates among disadvantaged groups and informs reopening strategies, medRxiv

Analysis and control of epidemics: A survey of spreading processes on complex networks

The GLEaMviz computational tool, a publicly available software to explore realistic epidemic spreading scenarios at the global scale

Multiscale, resurgent epidemics in a hierarchical metapopulation model

A Polya contagion model for networks

Modeling the spatial spread of infectious diseases: The global epidemic and mobility computational model

Impact of non-pharmaceutical interventions (NPIs) to reduce COVID-19 mortality and healthcare demand

Optimal covid-19 quarantine and testing policies

An optimal predictive control strategy for COVID-19 (SARS-CoV-2) social distancing policies in Brazil

Modelling the impact of testing, contact tracing and household quarantine on second waves of COVID-19

A spatiotemporal epidemic model to quantify the effects of contact tracing, testing, and containment

Controlling epidemic spread: Reducing economic losses with targeted closures

A closed-loop framework for inference, prediction and control of SIR epidemics on networks

Optimal resource allocation for control of networked epidemic models

Optimal vaccine allocation to control epidemic outbreaks in arbitrary networks

Optimal resource allocation for network protection against spreading processes

Virus spread in networks

Epidemic spreading in real networks: An eigenvalue viewpoint

Optimal patching in clustered epidemics of malware

Market-based control of epidemics

Optimal and sub-optimal quarantine and isolation control in sars epidemics

Model predictive control: Theory and practice-a survey

Robust and optimal predictive control of the covid-19 outbreak

Covid-19 community mobility reports

A tutorial on geometric programming

Coronavirus (COVID-19) data in the united states

Convex optimization

Geometric programming for communication systems

Geometric programming for optimal positive linear systems

A primal-dual interior-point algorithm for nonsymmetric exponential-cone optimization

Modelling the covid-19 epidemic and implementation of population-wide interventions in italy

Presumed asymptomatic carrier transmission of covid-19

Asymptomatic transmission, the Achilles' heel of current strategies to control covid-19

Serial interval of SARS-CoV-2 was shortened over time by nonpharmaceutical interventions

Multitask learning

Autograd: Effortless gradients in NumPy

Adam: A method for stochastic optimization

COVID-19: four fifths of cases are asymptomatic, China figures indicate

Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19)

The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application

Temporal dynamics in viral shedding and transmissibility of COVID-19

Estimating COVID-19 underreporting across 86 nations: implications for projections and control

Differential effects of intervention timing on COVID-19 spread in the united states

Clinical presentation and virological assessment of hospitalized cases of coronavirus disease 2019 in a travel-associated transmission cluster

Appendix B.1. Proof of Lemma 1 Clearly, since in our model there is no possibility of reinfection,S(t) ≤ S 0 = S(t). We now proceed by induction on t. For the base case t = 1, recall that the initial conditions are identical for both the linearized and the true model. Thus, clearly from (2)-(7), the compartments have identical values for t = 1. Now, assuming the bounds hold for t = k − 1, we havẽWe can rewrite equations (2)-(5) in matrix form by defining a state vector] to obtain the dynamicsIt follows thatRecalling that the mobility mapping β(u(s)) is a posynomial, each of the matrices M s have posynomial entries on β(u(s)), and thus, on u(s); thus, it fol-as it is the product of matrices with entries that are posynomials on u(s) for s ∈ {0, 1, . . . , t − 3}. Moreover, from (7) we can see that D(t) is simply a sum of positive constants and positive scalar multiples of H(t) and is also a posynomial. as it is the sum of posynomials, since γ D ∈ (0, 1] and thus γ ∞ ∈ [0, ∞) (defining γ ∞ = 0 when γ D = 1). Moreover, the constraint H(t) ≤ τ H is clearly posynomial on {u(s)} t−3 s=0 for each 1 ≤ t ≤ T c by Lemma 2. By assumption the cost function C t (u(t)) is posynomial on the decision variables u(t) for all 0 ≤ t ≤ T c − 1, and thus the constraintt=0 . Finally, by assumption the set U is described via posynomial inequalities and monomial equalities on u(t) for every 0 ≤ t ≤ T c − 1. Thus, (11) is a geometric program, and the globally optimal solution u may be found efficiently. Similarly to the proof of Theorem 1, we may invoke Lemma 2 to show that the constraint H(t) ≤ τ H is posynomial on {u(s)} t−3 s=0 for each 1 ≤ t ≤ T c . Again similarly to the proof of Theorem 1, by assumption on C t (u(t)) the sum Tc−1 t=0 C t (u(t))] and, hence, the objective function, is posynomial on {u(t)} Tc−1 t=0 , and the set U is described via posynomial inequalities and monomial equalities on u(t) for every 0 ≤ t ≤ T c − 1. Hence, (13) is a geometric program, whose globally optimal solution u may be found efficiently. Using this solution, we may compute the minimal budget to prevent hospital overflow as B = Tc−1 t=0 C t (u(t)).

By definition in (14) , for all {(u 1 (t), . . . , u K (t))} Tc t=0 ∈ U we have u k (t) −1 ≤ u −1 k , u k (t) ≤ u k , and we may express the other inequalities as u k (t) −1 u k (t−1) ≤ (1 − ∆) −1 , and u k (t)u k (t − 1) −1 ≤ 1 + ∆ for each k ∈ {1, . . . , K}. Moreover, the equality constraints may be satisfied by considering only one decision variable for each seven day period. Hence, U may be described by posynomial inequalities. Next, notice thatClearly, since the values c k , u k , and u k are constant, and u k < u k implies u −1 k − u −1 k > 0, the left-hand term above is a posynomial in the entries of u(t), and the right-hand term is constant. Hence, the function C t (u(t)) is a posynomial in u(t) = (u 1 (t), . . . , u K (t)) shifted by a constant (which does not affect the optimal values of u(t) in (13)). Considering the budget-constraint inequality in (11), from above we have which is clearly a posynomial inequality amenable to geometric programming.