key: cord-0180884-kwz4pos8
authors: Krechetov, Mikhail; Sikaroudi, Amir Mohammad Esmaieeli; Efrat, Alon; Polishchuk, Valentin; Chertkov, Michael
title: Prediction and Prevention of Pandemics via Graphical Model Inference and Convex Programming
date: 2021-09-09
journal: nan
DOI: nan
sha: 96f3fe7d3b825a9bf6557c48e997ef8bd2d323ec
doc_id: 180884
cord_uid: kwz4pos8

Hard-to-predict bursts of COVID-19 pandemic revealed significance of statistical modeling which would resolve spatio-temporal correlations over geographical areas, for example spread of the infection over a city with census tract granularity. In this manuscript, we provide algorithmic answers to the following two inter-related public health challenges. (1) Inference Challenge: assuming that there are $N$ census blocks (nodes) in the city, and given an initial infection at any set of nodes, what is the probability for a subset of census blocks to become infected by the time the spread of the infection burst is stabilized? (2) Prevention Challenge: What is the minimal control action one can take to minimize the infected part of the stabilized state footprint? To answer the challenges, we build a Graphical Model of pandemic of the attractive Ising (pair-wise, binary) type, where each node represents a census track and each edge factor represents the strength of the pairwise interaction between a pair of nodes. We show that almost all attractive Ising Models on dense graphs result in either of the two modes for the most probable state: either all nodes which were not infected initially became infected, or all the initially uninfected nodes remain uninfected. This bi-modal solution of the Inference Challenge allows us to re-state the Prevention Challenge as the following tractable convex programming: for the bare Ising Model with pair-wise and bias factors representing the system without prevention measures, such that the MAP state is fully infected for at least one of the initial infection patterns, find the closest, in $l_1$ norm, set of factors resulting in all the MAP states of the Ising model, with the optimal prevention measures applied, to become safe.

We follow our previous work 1 in justification for the use of the Graphical Models (GM) to study and mitigate pandemics. Therefore, we start from providing a brief recap of the prior literature on modeling of the epidemics, describe the logic which led us in 1 to the Ising Model (IM) formulation, and then state formally the inference and prevention problems addressed in the manuscript.

Difficulty in both predicting and neutralizing the spread of pandemics is a major social challenge of humanity. Technically speaking, we are yet to design a coherent data lifecycle for modeling and prevention both in terms of the global strategies and local tactics. To address the challenge, we must devise a hierarchy of spatio-temporal models with different resolutionsfrom individual to community, county to the city, and from the moment a pathogen first enters our bodies, to days of disease development and to community transmission. Importantly, the models should be efficient in computing probabilistic predictions (for instance, offering the marginal probability heat map for the city neighborhoods to transition from the current/prior state of infection to the projected/a-posteriori state in two weeks).

Epidemiology and Mathematical Biology experts have relied in the past on a number of modeling approaches. The Agent-Based-Models (ABMs), introduced in epidemiology in 2004-2008 2-7 , have complemented the earlier compartmental models [8] [9] [10] [11] . Using ABMs, even though not exclusive to epidemiology 12, 13 , became a breakthrough in the field, as they allowed to make a significant improvement in the quality of predictions, especially in the spatio-temporal resolution of how the disease spreads and how one can mitigate its spread. The models became and remained a core part of the epidemiology data life-cycle. (See for instance 14, 15 for most recent bibliography.) The ABMs provide a detailed prediction of how pandemics spread within counties, cities, and regions. A majority of the country-, city-or county-scale testbeds testing various mitigation strategies are resolved nowadays with ABMs. In particular, recently ABMs have been used extensively to inform public health in (non-pharmaceutical) interventions against the spread of COVID-19 [16] [17] [18] [19] [20] , and verify new strategies like test-trace-quarantine 15 , among many other applications.

There are two major problems with the modeling of pandemic. First, many parameters need to be calibrated on data. Second, even when calibrated for the current state of pandemic the models which are too detailed become impractical for making a forecast and for developing prevention strategies -both requiring checking multiple (forecast and/or prevention) scenarios. Using ABMs, which are clearly over-modeled (too detailed) is especially problematic in the context of the latter. For example, the open-source ABM solver FLUTE 21 developed originally for modeling influenza, works with data that are acquired through Geographic Information Systems (GIS) on the scale of census tracts or communities, which is a very reasonable scale of spatial resolution to understand the dynamics of pandemics on a local scale. FLUTE populates each of the communities with thousands to millions of inhabitants in order to account for their daily patterns of travel. We believe that constructing effective Graphical Models (GM) of Pandemics with community-scale spatial resolution and then modeling pairwise (and possibly higher-order) epidemic interactions between communities directly, without introducing the thousands-to-millions of dummy agents, will complement (as discussed in the next paragraph), but also improve upon ABMs by being more efficient, robust and easier to calibrate.

An important, and possibly one of the first, Graphical Model (GM) of the COVID-19 pandemic was proposed in 22 . Dynamic bi-partite GMs connecting census tracts to specific Points Of Interest (non-residential locations that people visit such as restaurants, grocery stores and religious establishments) within the city and studying dynamics of the four-state (Susceptible, Exposed, Infectious and Removed) of a census tract (graph node) on the graph, were constructed in 22 for major metro-area in USA based on the SafeGraph mobility data 23 .

In fact, similar dynamic GMs, e.g. of the Independent Cascade Model (ICM) type [24] [25] [26] [27] [28] , were introduced even earlier in the CS/AI literature in the context of modeling how the rumors spread over social networks (with a side reference on using ICM in epidemiology). As argued in 1 the Independent Cascade Models (ICMs) can be adapted to modeling pandemics. (Another interesting use of the ICM to model COVID-19 pandemic was discussed in 29 .) In its minimal version, an ICM of Pandemic can be built as follows. Assume that the virus spreads in the community (census tract) sufficiently fast, say within five days -which is the estimate for the early versions of COVID-19 median incubation period. If an infected person enters a community/neighborhood but does not stay there, he infects others with some probability. If a single resident of the community becomes infected, all other residents are assumed infected as well (instantaneously). The model is a discrete-time dynamic model in which nodes in a network are in one of the three states: Susceptible, Infected, or Removed. The nodes represent communities/neighborhoods. A contact between an Infected community/node and another community which is Susceptible has an assigned probability of disease transmission, which can also be interpreted as the probability of turning the S state into I state. Consistently with what was described above, the network is represented as a graph, where nodes are tracts and edges, connecting two tracts, have an associated strength of interaction representing the probability for the infection to spread from one node to its neighbor. A seed of the infection is injected initially at random, for example, mimicking an exogenous super-spreader infection event in the area; examples could include political or religious gatherings. See Figure 1 illustrating dynamics of the cascade model over 3-by-3 grid graph. Color coding of nodes is according to Susceptible=blue, Infected=red, Removed=black. Given the starting infection configuration, each infected community can infect its graph-neighbor community during the next time step with the probability associated with the edge connecting the two communities. Then the infected community moves into the removed state. The attempt to infect each neighbor is independent of all other neighbors. This creates a cascading spread of the virus across the network. The cascade stops in a finite number of steps, thereby generating a random Removed pattern, shown in black in the Fig. 1 , while other communities which were never infected (remain Susceptible) are shown in blue.

It was shown in 1 that with some regularization applied, statistics of the terminal state of the Cascade Model of Pandemic turns into a Graphical Model of the attractive Ising Model type.

This manuscript Road Map. Working with the Ising Model of Pandemic, we start the technical part of the manuscript by posing the Inference/Prediction Challenge in Section Ising Model of Pandemic. Here, the problem is stated, first, as the Maximum A-Posteriori over an attractive Ising Model, and we argue, following the approach which is classic in the GM literature, that problem can be re-stated as a tractable LP. We then proceed to Section Prevention Challenge to pose the main challenge addressed in the manuscript -the Prevention Challenge -as the two-level optimization with inner step requiring resolution of the aforementioned Prediction Challenge. Aiming to reduce the complexity of the Prevention problem, we turn in Section Geometry of the MAP States to the analysis of the conditions in the formulation of the Prediction Challenge, describing the Safety domain in the space of the Ising Model parameters. We show the Safety domain is actually a polytope, even though exponential in the size of the system. We proceed in Section Projecting to the Safe Polytope with analysis of the Prevention Challenge, discussing the interpretation of the problem as a projection to the Safety Polytope from the polytope exterior, needed when the bare prediction suggests that system will be found with high probability outside of the Safety Polytope. Section Two Polarized Modes is devoted to approximation which allows an enormous reduction in the problem complexity. We suggest here that if the graph of the system is sufficiently dense, the resulting MAP solution may only be in one of the two polarized states (a) completely safe (no other nodes except the initially infected) pick the infection, or (b) the infection is spread over the entire system. We support this remarkable simplification by detailed empirical analysis and also by some theoretical arguments. Section Results is devoted to the experimental illustration of the methodology on the practical example of the Graphical Model of Seattle. The manuscript is concluded in Section Discussion with a brief summary and discussion of the path forward.

We conclude this introductory Section with a general comment about relation between ABM and GM approaches. Focusing on GMs we did not mean to suggest that the GM approach is outright better than the ABM approach. In fact, we do believe that the two are complementary. ABM is more accurate but computationally heavy, while GM is coarser but computationally lighter. We are also convinced that it will be advantageous in the future to consider the two in tandem, e.g., with the parameters of the GM trained on the synthetic data generated by the ABM.

As argued in 1 the terminal state of a dynamic model generalizing the ICM model can be represented by the Ising Model of Pandemic (IMP), defined over graph G = (V , E ), where V is the set of N = |V | nodes and E is the set of undirected edges. The IMP, parameterized by the vector of the node-local biases, h = (h a |a ∈ V ) ∈ R N , and by the vector of the pair-wise (edge) interactions, J = (J ab |{a, b} ∈ E ), describes the following Gibbs-like probability distribution for a state,

where any node, a ∈ V can be found in either S-(susceptable, never infected) state, marked as x a = −1, or R-(removed, i.e. infected prior to the termination) state, marked as x a = +1. In Eq. (1), E(x|J, h) and Z(J, h) are model's energy function and partition function respectively:

In what follows, we will focus on finding the Maximum-A-Posteriori (MAP) state of the IMP conditioned to a particular initialization -setting a subset of nodes, I ∈ V , to be infected. We coin the MAP problem Inference Challenge:

In this paper, we focus on finding the mode of distribution defined in eq. 1. This problem is called maximum a posteriori (MAP) estimation, and the most probable states are called MAP states, or ground states (in physics literature). Moreover, in our setting, some nodes are initialized to plus one state (we denote this set of nodes by I). The corresponding binary optimization problem is formulated as follows:

where we emphasize dependence of the MAP solution on the set of the initially infected nodes, I . Note that in general finding x (MAP) is NP-hard 30 . However if J > 0 element-wise, i.e. the Ising Model is attractive (also called ferromagnetic in statistical physics), Eq. (4) becomes equivalent to a tractable (polynomial in N) Linear Programming (see 31 and references therein). In fact, the IMP is attractive, reflecting the fact that the state of a node is likely to be aligned with the state of its neighbor.

Let us also emphasize some other features of the IMP:

1. G should be thought of as an "interaction" graph of a city, reflecting transportation, commutes, and other forms of interactions between populations with the homes at the two nodes (census tracts) linked by an edge. The strength of a particular J ab shows the level of interaction associated with the edge {a, b}.

A component, h a , of the vector of local biases, h, is reflecting a-node specific factors such as immunization level, imposed quarantine, and degree of compliance with the public health measures (e.g., wearing masks and following other rules). Large negative/positive h a shows that residents of the census tract associated with the node a are largely healthy/infected.

If solution of the Inference Challenge problem is such that the R-subset of the MAP solution,

is sufficiently large, we would like to mitigate the infection, therefore setting the Prevention Challenge discussed in the next Section.

Let us assume that modification of J and h are possible and consider the space of all feasible J and h. We will then identify Safe Domain as a sub-space of feasible J and h such that for all the initial sets of the initially infected nodes, I , considered the resulting "infected" subset, R(I , J, h), is sufficiently small. A more accurate definition of the Safe Domain follows. Then, we rely on the definition to formulate the control/mitigation problem coined Prevention Challenge. At this stage, we would also like to emphasize that studying the geometry of the Safe Domain is one of the key contributions of this manuscript. Definition. Consider IMP over G = (V , E ) and with the parameters (J, h). Let us also assume that the set of initially infected nodes, I , is drawn from the list, ϒ. We say that (J, h) is in the k-Safe Domain if for every I from ϒ the number of R-nodes in the MAP solution (4), is at most k, i.e.

where

Prevention Challenge: Given (J (0) , h (0) ) describing the bare status of the system (city) which is not in the k-Safe Domain, and given the cost of the

, what is least expensive change to (J (0) , h (0) ) state of the system which is in the the k-Safe Domain? Formally, we are interested to solve the following optimization:

Expressing it informally, the Prevention Challenge seeks to identify a minimal correction (thus "corr" as the upper index) (J (corr) , h (corr) ), which will move the system to the safe regime from the unsafe bare one, (J (0) , h (0) ). The measures may include limiting interaction along some edges of the graph, thus modifying some components of J, or enforcing local biases, e.g., increasing level of vaccination, at some component of h.

Given that condition in Eq. (6) also requires solving Eq. (4) for each candidate (J, h), the Prevention Challenge formulation is a difficult two-level optimization. However, as we will see in the next Section, the condition in Eq. (6) (and thus the inner part of the aforementioned two-level optimization) can be re-stated as the requirement of being inside of a polytope in the (J, h) space. In other words, the (k)-Safe Domain is actually a polytope in the (J, h) space.

Before solving the Prevention Challenge problem, we want to shed some light on the geometry of the MAP states. We work here in the space of all the Ising models over a graph G = (V , E ), where each of the models is specified by (J, h) .

Proposition. Safe Domain of a graph G = (V , E ) with N = |V | nodes is a polytope in the space of all feasible parameters, (J, h), defined by an exponential in N number of linear constraints.

Remark. The Proposition allows us, from now on, to use Safe Polytope instead of the Safe Domain.

Proof of the Proposition. The space of all the Ising models is divided into 2 N regions by the corresponding MAP states. Moreover, the boundary between any pair of neighboring regions is linear: consider two states x (i) and x ( j) , and denote (J, h) (i) (resp. (J, h) ( j) ) the set of all the Ising models with the MAP state x (i) (resp. x ( j) ), then (J, h) (i) and (J, h) ( j) are separated by the equation, E(x (i) | J, h) = E(x ( j) | J, h), which is linear in (J, h) . For a subset, R ⊆ V , of nodes, let x (R) be the state in which, x a = +1, ∀a ∈ R, x a = −1, ∀a / ∈ R. Let X (R) be the set of all the MAP states, x, such that ∀a ∈ R, x a = +1 (while other nodes, i.e. b ∈ V \ R, are not constrained, x b = ±1). Then the k-Safe Polytope, which we denote, SP(k), is defined by at most

were some of these linear inequalities on the right hand side may be redundant.

Remark. In the case of k = 1 (which, obviously, applies only if all the initial infections are at a single nodes, i.e. ∀I ∈ ϒ, |I | = 1), there are at most, N · (2 N−1 − 1) linear inequalities.

We illustrate the geometry of the Ising model over the triangle graph (three nodes connected in a loop, K 3 ) in Fig. 2a and Fig. 2b . For both illustrations, we fix the h value to −1 at all the nodes, and we are thus exploring the remaining three degrees of freedom, J 12 , J 13 , J 23 (since J is symmetric), which corresponds to exploring interactions within the class of attractive Ising models, ∀a, b = 1, 2, 3 : J ab ∈ R + .

First, we consider the case when the only node a = 1 is infected. In this simple setting there are four possible MAP states, (x 1 , x 2 , x 3 ) ∈ {(+1, −1, −1), (+1, −1, +1), (+1, +1, −1), (+1, +1, +1)}, shown in Fig. (2a) as green, blue, yellow and red, respectively. Finally, in the figure Fig. (2b) we plot the Safe Polytope SP(1). We observe that the two "polarized" MAP states, (+1, −1, −1) and (+1, +1, +1), are seen most often among the samples, while domain occupied by the other two "mixed" MAP states, (+1, −1, +1) and (+1, +1, −1) is much smaller, with the two modes positioned on the interface between the two polarized states.

As will be shown below in the next Section, the polarization phenomena with only two "polarized" MAP states, which we coin in the following the two polarized modes, which we see on this simple triangle example, is generic for the attractive Ising model.

Definition. Consider a particular subset of the initially infected nodes, I (where thus, ∀a ∈ I : x a = +1). We call the MAP state of the model polarized when one of the following is true: (i) only initially infected nodes show +1 within the MAP solution, ∀a ∈ I : x a = +1, ∀b ∈ V \ I : x b = −1 or (ii) all nodes within the MAP state show +1, ∀a ∈ V : x a = +1. We call a MAP state mixed otherwise.

Experimenting with many dense graphs, which are typical in the pandemic modeling of modern cities with extended infrastructures and multiple destinations visited by many inhabitants, we observe that the two polarized MAP states dominate generically, while the mixed states are extremely rare. Fig. 3a illustrates results of one our ensemble of random IMPs' experiments. We, first, fix N to 20, pick M such that M ≤ N(N − 1)/2 = 190 and then generate at random M edges connecting the 20 nodes. Then, for each of the random graphs (characterized by its own M) we generate 500 random samples of (J, h), representing attractive Ising models. Finally, we find the MAP state for each IMP instance, count the number of mixed states and show the dependence of the fractions of the mixed states (in the sample set) in the Fig. (3a) . A fast decrease of the proportion of the mixed states is observed with an increase in M.

Extension of these experiments (see Fig. (3b) ) suggests that when we consider an ensemble of IMPs over graphs with N nodes and the average degree α = O(1) which is sufficiently large (so that the graph is sufficiently dense) and increase N, we observe that the Mixed State Probability (MSP), or equivalently proportion of the mixed-to-polarized states, decreases dramatically. Moreover, based on the experiments, we conjecture that the MSP decays to zero at α > α c , but it saturates at α < α c , where α c is the threshold depending on the ensemble details. This threshold behavior is akin to the phase transition that occurred in many models of the spin glass theory 32 and many models of the Computer Science and Theoretical Engineering defined over random graphs and considered in the thermodynamic limit, i.e. at N → ∞. See e.g. 33 (application in the Information Theory, and specifically in the theory of the Low Density Parity Check Codes) 34 (applications in the Computer Science, and specifically for random SAT and related models) and references therein. We postpone further discussions of the conjecture for a future publication (see also brief discussion in Section Discussion).

We will continue discussion of the two-mode solution in the next Section.

In this Section we aim to summarize all the findings so far to resolve the Prevention Challenge formulated in Section Prevention Challenge, specifically in Eq. (7) stating the problem as finding a minimal projection to the Safety Domain/Polytope from its exterior. The task is well defined, but in general, and as shown in Section Geometry of the MAP States, it is too complex -as the description of the Safety Polytope (number of linear constraints, required to define it) is exponential in the system size (number of nodes in the graph). However, the two-mode approximation, introduced in Section Two Polarized Modes, suggests a path forward: use the two-mode approximation and therefore remove all the linear constraints but one, separating the two polarized states. Let us denote the two-mode approximation of the Safe Polytope by SP(ϒ), where thus k in the original Safe Polytope, SP(k), is replaced by the set ϒ of all the initial infection patters. Then we write,

where, ∀a ∈ I : x by the l 1 -norm, l 2 -norm and their convex combination, to present the result of our experiments in a quasi-realistic setting describing a (hypothetical) pandemic attack and optimal defense, i.e., prevention scheme.

We illustrate our methodology on a case study of the city of Seattle. (To verify scalability of the approach we have also experimented with larger system, e.g., models of New-York City and of the State of Wisconsin. This is still work in progress, partially reported in ? .) Seattle has 131 Census Tracts. Each Census Tract includes 1 to 10 Census Block Groups with 600 to 3000 residents and represents 1200 to 8000 population. Boundaries separating Census Tracts are designed to represent natural or urban landmarks and also to be persistent over time 35 . To reduce complexity, we merge census tracts into 20 regions. See Fig. 4 . To prepare this splitting of Seattle into 20 regions/nodes, we utilize geo-spatial information from the TIGER/Line Shapefiles project provided by U.S. Census Bureau 36 . The travel data of Seattle was extracted from the Safegraph dataset 37 , which provides anonymized mobile tracking data. Each data point in the Safegraph database describes the number of visits from a Census Block to a specific point of interest represented by latitude and longitude. Mobility data associated with travelers crossing the boundaries of Seattle was ignored. We then follow the methodology developed in 1 to combine the aggregated travel data with the epidemiological data, representing current state of infection in the area. This results in the estimation of the pair-wise interactions, J, parameterizing the Ising Model of Pandemic. We also come up with an exemplary (uniform over the system) local biases, h, completing the definition of the model. (We remind that the prime focus of the manuscript is on developing methodology which is AI sound and sufficiently general. Therefore, the data used in the manuscript are roughly representative of the situation of interest, however not fully practical.) We consider a situation with different levels of infection and chose (J (0) , h (0) ) stressed enough, that is resulting in the prediction (answer to our Prediction Challenge), which lands the system in the dangerous domainoutside of the Safety Polytope.

In all of our experiments, we have used the general-purpose Gurobi optimization solver 39 to compute the MAP states and thus to validate the two-mode assumption. (We have also experimented with CVXPY 40 , but found it performing slower than Gurobi, at least over the relatively small samples considered in this proof of principles study. In the future, we plan to use existing, or developing new, LP solvers designed specifically for finding the MAP state of the attractive Ising model.) To illustrate our Prevention strategy, we took the Seattle data described above, and fed it as an input into the optimization (10), describing projection to the Safety Polytope, where C(·; ·) is substituted by the l 1 norm. CVXPY solver was used for this convex optimization task. Our code (python within jupyter notebook) is available at ? . Table 1 shows results of our Prevention experiments on the Seattle data. We analyze , first, l 1 projection to SP(ϒ) where ϒ consists of all the initial infection patterns consisting of up to k nodes. In all of our experiments, the values of the field 7/12 vector h (uniform across the system) was fixed to −1. We observe that the number of constraints grows exponentially with k; however, the cost of intervention remains roughly the same. We intend to analyze the results of this and other (more realistic) experiments in future publications aimed at epidemiology experts and public health officials. , h (0) ), characterizing stressed but unmitigated regime, and the optimal prevention regime, resulting in ( J (corr) , h (corr) ) computed according to Eq. (10).

Our initial reason for choosing l 1 in the experiments discussed above was based on empirical expectation that l 1 , as the closest of the convex options to l 0 , will enforce sparsity of the optimal reduction in the pair-vise influences/traffic patterns. However, by choosing to work with the l 1 -norm, we certainly did not mean to suggest that other norms cannot be used, especially if propounded by public health and/or government experts -after all it is the professional judgment of these experts which is expected to drive practical choices of the cost function. Therefore, we also experimented with the l 2 norm and more generally with a convex mixture (interpolation between) l 1 and l 2 norms. These richer experiments are illustrated in Fig. 5 and Fig. 6 . Fig. 5 , where color coding of the (i, j) element of the symmetric 20 × 20 matrix is chosen proportional to ( J (corr) − J (0) ) i j . Also, the histogram sub-figure show clearly that reduction mainly applies to immediate vicinity of the highest degree node in the downtown area of the Seattle city.

In this manuscript, written specifically for an interdisciplinary audience of researchers in mathematical, physical and engineering sciences, we follow our prior work 1 (aimed primarily at experts in public health and epidemiology) and explain challenges in inference (prediction) and control (prevention) of pandemics. We use the language of GMs, which is one powerful tool in the modern arsenal of Data Science and AI, and state the Prediction Challenge as a MAP optimization over an attractive Ising model, which can be expressed generically as a solution of a tractable Linear Programming (LP). We then turn to the analysis of the prevention problem, which one formulates if the aforementioned prediction solution suggests that the probability of significant infection is above a pre-defined (by the public health experts) tolerance threshold. We show that in its simplest formulation, the prevention problem is equivalent to finding minimal l 1 projection to the safety polytope, where the latter is defined by solving the aforementioned prediction problem. In general, the polytope does not allow a description non-exponential in the size of the system. However, we suggested an approximation that allows to approximate the safety polytope efficiently -that is, linearly in the number of the initial infection patterns. The approximation is justified (empirically, with supporting theoretical arguments, however not yet backed by a mathematically rigorous theory) in the case when the interaction graph of the system (e.g., related to the system/city transportation and human-to-human interaction network) is sufficiently dense. We conclude by providing a quasi-realistic experimental demonstration on the GM of Seattle. As a disclaimer but also a path forward, we would like to point out that this manuscript is not there yet where we would dream it to be in terms of practical use by heath-care authorities/practitioners. However, the manuscript suggests an important, and yet missing in the literature, path towards the applications. We focused in the manuscript on developing methodology which we intend to make useful for practical purposes in the future. This will require accurate calibration of the pairwise and singleton terms in the Ising (Graphical) models, e.g., as mentioned above, based on "learning" the rates from much better developed ABMs. When such calibration of the parameters in the Ising model is done, the pair-wise terms (J) and the singleton terms (h) may be interpreted as corresponding to pair-wise "traffic" or "influences" between aggregated areas and singleton biases, e.g., related to degree of vaccination or compliance with the local public health mandates. Then, the outcome of this paper methodology will become practical, e.g., resulting in (optimal) recommendations of the traffic/influence and local modifications to avoid a disastrous spread of infection.

We conclude with an incomplete list of technical challenges, presented in the order of importance (subjective), which need to be resolved to make the powerful GM approach to pandemic prediction and prevention practical:

• Build a hierarchy of Agent Based and Probabilistic Graphical Models which allow more accurate (than Ising model) representation of the infection patterns over geographical and community graphs. (See also our remark above about making the modeling more realistic.) The models may be both of the static (like Ising) or dynamic (like Independent

Cascade Model) types. Extend the notion of the Safety Region (polytope) to the new GM of pandemics.

• Consider the case when the resolution of the Prediction Challenge problem returns a positive answer -most likely future state of the system is safe, and then develop the methodology which allows estimating the probability of crossing the safety boundary. In other words, we envision formulating and solving in the context of the GM a problem which is akin to the one addressed in 41 : estimate the probability of finding the system outside of the Safety Polytope.

• Construct other (than two-mode) approximations to the Safety Polytope. In particular, approximations built on sampling of the boundaries of the safety polytope and learning (possibly reinforcement learning) are needed.

• Develop the asymptotic (thermodynamic limit) theory which allows validating (and/or correcting systematically) the efficient (two-mode and other) approximations of the Safety Polytope.

Graphical Models of Pandemic

Modelling disease outbreaks in realistic urban social networks

Containing pandemic influenza at the source

Strategies for containing an emerging influenza pandemic in Southeast Asia

Strategies for mitigating an influenza pandemic

Mitigation strategies for pandemic influenza in the united states

Modeling targeted layered containment of an influenza pandemic in the united states

The Prevention of Malaria

A contribution to the mathematical theory of epidemics

Disease of Humans: Dynamics and Control

The mathematics of infectious diseases

Think Complexity: Complexity Science and Computational Modeling

Population health methods: Agent based modeling

Covasim: An agent-based model of covid-19 dynamics and interventions

Impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand

impact of non-pharmaceutical interventions (npis) to reduce covid-19 mortality and healthcare demand

Covid-19 confirmed and forecasted case data

Agent-based modelling for sars-cov-2 epidemic prediction and intervention assessment: A methodological appraisal

Multiple epidemic wave model of the covid-19 pandemic: Modeling study

Flute, a publicly available stochastic influenza epidemic simulation model

Mobility network models of COVID-19 explain inequities and inform reopening

Safegraph social distancing metrics. san francisco, ca: Safegraph inc

Maximizing the spread of influence through a social network

Learning the graph of epidemic cascades

Inferring networks of diffusion and influence

Cuttingedge: Influence minimization in networks

Discriminative learning of infection models

A time-dependent sir model for covid-19 with undetectable infected persons

On the computational complexity of ising spin glass models

The power of lp relaxation for map inference

Spin Glass Theory and Beyond

Modern Coding Theory

A. Information, physics, and computation

United States Census Bureau. United states census bureau. tiger line shapefiles technical documentation

SafeGraph COVID-19 Data Consortium

Office of Planning & Community Development, Seattle. Census tract map of seattle

Importance sampling the union of rare events with an application to power systems analysis

This work was supported in part by NSF via #2027072 "RAPID:Infer and Control Global Spread of Corona-Virus with Graphical Models" project. The work of M.K. was supported by the Analytical center at Skoltech (subsidy agreement 000000D730321P5Q0002, Grant No. 70-2021-00145 02.11.2021). The work of V.P. was partially supported by the Swedish Research Council and the Swedish Transport Administration.