Prediction and prevention of pandemics via graphical model inference and convex programming

Hard-to-predict bursts of COVID-19 pandemic revealed significance of statistical modeling which would resolve spatio-temporal correlations over geographical areas, for example spread of the infection over a city with census tract granularity. In this manuscript, we provide algorithmic answers to the following two inter-related public health challenges of immense social impact which have not been adequately addressed (1) Inference Challenge assuming that there are N census blocks (nodes) in the city, and given an initial infection at any set of nodes, e.g. any N of possible single node infections, any \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$N(N-1)/2$$\end{document}N(N-1)/2 of possible two node infections, etc, what is the probability for a subset of census blocks to become infected by the time the spread of the infection burst is stabilized? (2) Prevention Challenge What is the minimal control action one can take to minimize the infected part of the stabilized state footprint? To answer the challenges, we build a Graphical Model of pandemic of the attractive Ising (pair-wise, binary) type, where each node represents a census tract and each edge factor represents the strength of the pairwise interaction between a pair of nodes, e.g. representing the inter-node travel, road closure and related, and each local bias/field represents the community level of immunization, acceptance of the social distance and mask wearing practice, etc. Resolving the Inference Challenge requires finding the Maximum-A-Posteriory (MAP), i.e. most probable, state of the Ising Model constrained to the set of initially infected nodes. (An infected node is in the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$+ \, 1$$\end{document}+1 state and a node which remained safe is in the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$- \, 1$$\end{document}-1 state.) We show that almost all attractive Ising Models on dense graphs result in either of the two possibilities (modes) for the MAP state: either all nodes which were not infected initially became infected, or all the initially uninfected nodes remain uninfected (susceptible). This bi-modal solution of the Inference Challenge allows us to re-state the Prevention Challenge as the following tractable convex programming: for the bare Ising Model with pair-wise and bias factors representing the system without prevention measures, such that the MAP state is fully infected for at least one of the initial infection patterns, find the closest, for example in \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_1$$\end{document}l1, \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_2$$\end{document}l2 or any other convexity-preserving norm, therefore prevention-optimal, set of factors resulting in all the MAP states of the Ising model, with the optimal prevention measures applied, to become safe. We have illustrated efficiency of the scheme on a quasi-realistic model of Seattle. Our experiments have also revealed useful features, such as sparsity of the prevention solution in the case of the \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$l_1$$\end{document}l1 norm, and also somehow unexpected features, such as localization of the sparse prevention solution at pair-wise links which are NOT these which are most utilized/traveled.

www.nature.com/scientificreports/ Given the starting infection configuration, each infected community can infect its graph-neighbor community during the next time step with the probability associated with the edge connecting the two communities. Then the infected community moves into the removed state. The attempt to infect each neighbor is independent of all other neighbors. This creates a cascading spread of the virus across the network. The cascade stops in a finite number of steps, thereby generating a random Removed pattern, shown in black in the Fig. 1, while other communities which were never infected (remain Susceptible) are shown in blue. It was shown in 1 that with some regularization applied, statistics of the terminal state of the Cascade Model of Pandemic turns into a Graphical Model of the attractive Ising Model type.
This manuscript Road Map. Working with the Ising Model of Pandemic, we start the technical part of the manuscript by posing the Inference/Prediction Challenge in Section "Ising model of pandemic". Here, the problem is stated, first, as the Maximum A-Posteriori over an attractive Ising Model, and we argue, following the approach which is classic in the GM literature, that problem can be re-stated as a tractable LP. We then proceed to Section "Prevention challenge" to pose the main challenge addressed in the manuscript-the Prevention Challenge-as the two-level optimization with inner step requiring resolution of the aforementioned Prediction Challenge. Aiming to reduce the complexity of the Prevention problem, we turn in Section "Geometry of the MAP states" to the analysis of the conditions in the formulation of the Prediction Challenge, describing the Safety domain in the space of the Ising Model parameters. We show the Safety domain is actually a polytope, even though exponential in the size of the system. We proceed in Section "Projecting to the safe polytope" with analysis of the Prevention Challenge, discussing the interpretation of the problem as a projection to the Safety Polytope from the polytope exterior, needed when the bare prediction suggests that system will be found with high probability outside of the Safety Polytope. Section "Two polarized modes" is devoted to approximation which allows an enormous reduction in the problem complexity. We suggest here that if the graph of the system is sufficiently dense, the resulting MAP solution may only be in one of the two polarized states (a) completely safe (no other nodes except the initially infected) pick the infection, or (b) the infection is spread over the entire system. We support this remarkable simplification by detailed empirical analysis and also by some theoretical arguments. Section "Results" is devoted to the experimental illustration of the methodology on the practical example of the Graphical Model of Seattle. The manuscript is concluded in Section "Discussion" with a brief summary and discussion of the path forward.
We conclude this introductory section with a general comment about relation between ABM and GM approaches. Focusing on GMs we did not mean to suggest that the GM approach is outright better than the ABM approach. In fact, we do believe that the two are complementary. ABM is more accurate but computationally heavy, while GM is coarser but computationally lighter. We are also convinced that it will be advantageous in the future to consider the two in tandem, e.g., with the parameters of the GM trained on the synthetic data generated by the ABM.

Ising model of pandemic
As argued in 1 the terminal state of a dynamic model generalizing the ICM model can be represented by the Ising Model of Pandemic (IMP), defined over graph G = (V, E) , where V is the set of N = |V| nodes and E is the set of undirected edges. The IMP, parameterized by the vector of the node-local biases, h = (h a |a ∈ V) ∈ R N , and by the vector of the pair-wise (edge) interactions, J = (J ab |{a, b} ∈ E) , describes the following Gibbs-like probability distribution for a state, x = (x a = ±1|a ∈ V) ∈ 2 |V| , associated with V: where any node, a ∈ V can be found in either S-(susceptable, never infected) state, marked as x a = −1 , or R-(removed, i.e. infected prior to the termination) state, marked as x a = +1 . In Eq. (1), E(x|J, h) and Z(J, h) are model's energy function and partition function respectively: In what follows, we will focus on finding the Maximum-A-Posteriori (MAP) state of the IMP conditioned to a particular initialization-setting a subset of nodes, I ∈ V , to be infected. We coin the MAP problem Inference challenge.

Inference challenge
In this paper, we focus on finding the mode of distribution defined in Eq. (1). This problem is called maximum a posteriori (MAP) estimation, and the most probable states are called MAP states, or ground states (in physics literature). Moreover, in our setting, some nodes are initialized to plus one state (we denote this set of nodes by I). The corresponding binary optimization problem is formulated as follows:   www.nature.com/scientificreports/ where we emphasize dependence of the MAP solution on the set of the initially infected nodes, I. Note that in general finding x (MAP) is NP-hard 30 . However if J > 0 element-wise, i.e. the Ising Model is attractive (also called ferromagnetic in statistical physics), Eq. (4) becomes equivalent to a tractable (polynomial in N) Linear Programming (see 31 and references therein). In fact, the IMP is attractive, reflecting the fact that the state of a node is likely to be aligned with the state of its neighbor.
Let us also emphasize some other features of the IMP: 1. (G) should be thought of as an "interaction" graph of a city, reflecting transportation, commutes, and other forms of interactions between populations with the homes at the two nodes (census tracts) linked by an edge. The strength of a particular J ab shows the level of interaction associated with the edge {a, b}. 2. A component, h a , of the vector of local biases, h, is reflecting a-node specific factors such as immunization level, imposed quarantine, and degree of compliance with the public health measures (e.g., wearing masks and following other rules). Large negative/positive h a shows that residents of the census tract associated with the node a are largely healthy/infected.
If solution of the Inference Challenge problem is such that the R-subset of the MAP solution, is sufficiently large, we would like to mitigate the infection, therefore setting the Prevention Challenge discussed in the next Section.

Prevention challenge
Let us assume that modification of J and h are possible and consider the space of all feasible J and h. We will then identify Safe Domain as a sub-space of feasible J and h such that for all the initial sets of the initially infected nodes, I , considered the resulting "infected" subset, R(I, J, h) , is sufficiently small. A more accurate definition of the Safe Domain follows. Then, we rely on the definition to formulate the control/mitigation problem coined Prevention Challenge. At this stage, we would also like to emphasize that studying the geometry of the Safe Domain is one of the key contributions of this manuscript.
Definition Consider IMP over G = (V, E) and with the parameters (J, h). Let us also assume that the set of initially infected nodes, I , is drawn from the list, ϒ . We say that (J, h) is in the k-Safe Domain if for every I from ϒ the number of R-nodes in the MAP solution (4), is at most k, i.e.
Prevention challenge Given (J (0) , h (0) ) describing the bare status of the system (city) which is not in the k-Safe Domain, and given the cost of the (J, h) change, C (J, h); (J (0) , h (0) ) , what is least expensive change to (J (0) , h (0) ) state of the system which is in the the k-Safe Domain? Formally, we are interested to solve the following optimization: Expressing it informally, the Prevention Challenge seeks to identify a minimal correction (thus "corr" as the upper index) (J (corr) , h (corr) ) , which will move the system to the safe regime from the unsafe bare one, (J (0) , h (0) ) . The measures may include limiting interaction along some edges of the graph, thus modifying some components of J, or enforcing local biases, e.g., increasing level of vaccination, at some component of h.
Given that condition in Eq. (6) also requires solving Eq. (4) for each candidate (J, h), the Prevention challenge formulation is a difficult two-level optimization. However, as we will see in the next Section, the condition in Eq. (6) (and thus the inner part of the aforementioned two-level optimization) can be re-stated as the requirement of being inside of a polytope in the (J, h) space. In other words, the (k)-Safe Domain is actually a polytope in the (J, h) space.

Geometry of the MAP states
Before solving the Prevention Challenge problem, we want to shed some light on the geometry of the MAP states. We work here in the space of all the Ising models over a graph G = (V, E) , where each of the models is specified by (J, h).
Proposition Safe Domain of a graph G = (V, E) with N = |V| nodes is a polytope in the space of all feasible parameters, (J, h), defined by an exponential in N number of linear constraints.

Remark
The Proposition allows us, from now on, to use Safe Polytope instead of the Safe Domain.

Proof of the Proposition
The space of all the Ising models is divided into 2 N regions by the corresponding MAP states. Moreover, the boundary between any pair of neighboring regions is linear: consider two states x (i) and . www.nature.com/scientificreports/ ) the set of all the Ising models with the MAP state were some of these linear inequalities on the right hand side may be redundant.

Remark
In the case of k = 1 (which, obviously, applies only if all the initial infections are at a single nodes, i.e. ∀I ∈ ϒ, |I| = 1 ), there are at most, N · (2 N−1 − 1) linear inequalities.
We illustrate the geometry of the Ising model over the triangle graph (three nodes connected in a loop, K 3 ) in Fig. 2a and b. For both illustrations, we fix the h value to − 1 at all the nodes, and we are thus exploring the remaining three degrees of freedom, J 12 , J 13 , J 23 (since J is symmetric), which corresponds to exploring interactions within the class of attractive Ising models, ∀a, b = 1, 2, 3 : J ab ∈ R + .
First, we consider the case when the only node a = 1 is infected. In this simple setting there are four possible MAP states, (x 1 , Fig. 2a as green, blue, yellow and red, respectively. Finally, in Fig. 2b we plot the Safe Polytope SP(1) . We observe that the two "polarized" MAP states, (+1, −1, −1) and (+1, +1, +1) , are seen most often among the samples, while domain occupied by the other two "mixed" MAP states, (+1, −1, +1) and (+1, +1, −1) is much smaller, with the two modes positioned on the interface between the two polarized states.
As will be shown below in the next Section, the polarization phenomena with only two "polarized" MAP states, which we coin in the following the two polarized modes, which we see on this simple triangle example, is generic for the attractive Ising model.

Two polarized modes
Definition Consider a particular subset of the initially infected nodes, I (where thus, ∀a ∈ I : x a = +1 ). We call the MAP state of the model polarized when one of the following is true: (i) only initially infected nodes show + 1 within the MAP solution, ∀a ∈ I : x a = +1, ∀b ∈ V \ I : x b = −1 or (ii) all nodes within the MAP state show + 1 , ∀a ∈ V : x a = +1 . We call a MAP state mixed otherwise.
Experimenting with many dense graphs, which are typical in the pandemic modeling of modern cities with extended infrastructures and multiple destinations visited by many inhabitants, we observe that the two polarized MAP states dominate generically, while the mixed states are extremely rare. Figure 3a illustrates results of one our ensemble of random IMPs' experiments. We, first, fix N to 20, pick M such that M ≤ N(N − 1)/2 = 190 and then generate at random M edges connecting the 20 nodes. Then, for each of the random graphs (characterized by its own M) we generate 500 random samples of (J, h), representing attractive Ising models. Finally, we find the MAP state for each IMP instance, count the number of mixed states and show the dependence of the fractions of the mixed states (in the sample set) in the Fig. (3a). A fast decrease of the proportion of the mixed states is observed with an increase in M.
Extension of these experiments (see Fig. 3b) suggests that when we consider an ensemble of IMPs over graphs with N nodes and the average degree α = O(1) which is sufficiently large (so that the graph is sufficiently dense) and increase N, we observe that the Mixed State Probability (MSP), or equivalently proportion of the www.nature.com/scientificreports/ mixed-to-polarized states, decreases dramatically. Moreover, based on the experiments, we conjecture that the MSP decays to zero at α > α c , but it saturates at α < α c , where α c is the threshold depending on the ensemble details. This threshold behavior is akin to the phase transition that occurred in many models of the spin glass theory 32 and many models of the Computer Science and Theoretical Engineering defined over random graphs and considered in the thermodynamic limit, i.e. at N → ∞ . See e.g. 33 (application in the Information Theory, and specifically in the theory of the Low Density Parity Check Codes) 34 (applications in the Computer Science, and specifically for random SAT and related models) and references therein. We postpone further discussions of the conjecture for a future publication (see also brief discussion in Section "Discussion"). We will continue discussion of the two-mode solution in the next Section.

Projecting to the safe polytope
In this Section we aim to summarize all the findings so far to resolve the Prevention Challenge formulated in Section "Prevention challenge", specifically in Eq. (7) stating the problem as finding a minimal projection to the Safety Domain/Polytope from its exterior. The task is well defined, but in general, and as shown in Section "Geometry of the MAP states", it is too complex-as the description of the Safety Polytope (number of linear constraints, required to define it) is exponential in the system size (number of nodes in the graph). However, the two-mode approximation, introduced in Section "Two polarized modes", suggests a path forward: use the twomode approximation and therefore remove all the linear constraints but one, separating the two polarized states. This formula is the final result of this manuscript analytic evaluation. In the next Section we use Eq. (10), with C(·; ·) substituted by the l 1 -norm, l 2 -norm and their convex combination, to present the result of our experiments in a quasi-realistic setting describing a (hypothetical) pandemic attack and optimal defense, i.e., prevention scheme.

Results
Seattle data. We illustrate our methodology on a case study of the city of Seattle. (To verify scalability of the approach we have also experimented with larger system, e.g., models of New-York City and of the State of Wisconsin. This is still work in progress, partially reported in 35 .) Seattle has 131 Census Tracts. Each Census Tract includes 1 to 10 Census Block Groups with 600 to 3000 residents and represents 1200 to 8000 population. Boundaries separating Census Tracts are designed to represent natural or urban landmarks and also to be persistent over time 36 . To reduce complexity, we merge census tracts into 20 regions. See Fig. 4. To  Mobility data associated with travelers crossing the boundaries of Seattle was ignored. We then follow the methodology developed in 1 to combine the aggregated travel data with the epidemiological data, representing current state of infection in the area. This results in the estimation of the pair-wise interactions, J, parameterizing the Ising Model of Pandemic. We also come up with an exemplary (uniform over the system) local biases, h, completing the definition of the model. (We remind that the prime focus of the manuscript is on developing methodology which is AI sound and sufficiently general. Therefore, the data used in the manuscript are roughly representative of the situation of interest, however not fully practical.) We consider a situation with different levels of infection and chose (J (0) , h (0) ) stressed enough, that is resulting in the prediction (answer to our Prediction Challenge), which lands the system in the dangerous domain-outside of the Safety Polytope.

Convex projection.
In all of our experiments, we have used the general-purpose Gurobi optimization solver 40 to compute the MAP states and thus to validate the two-mode assumption. (We have also experimented with CVXPY 41 , but found it performing slower than Gurobi, at least over the relatively small samples considered in this proof of principles study. In the future, we plan to use existing, or developing new, LP solvers designed specifically for finding the MAP state of the attractive Ising model.) To illustrate our Prevention strategy, we took the Seattle data described above, and fed it as an input into the optimization (10), describing projection to the Safety Polytope, where C(·; ·) is substituted by the l 1 norm. CVXPY solver was used for this convex optimization task. Our code (python within jupyter notebook) is available at 35 . Table 1 shows results of our Prevention experiments on the Seattle data. We analyze , first, l 1 projection to SP(ϒ) where ϒ consists of all the initial infection patterns consisting of up to k nodes. In all of our experiments, the values of the field vector h (uniform across the system) was fixed to −1 . We observe that the number of constraints grows exponentially with k; however, the cost of intervention remains roughly the same. We intend to analyze the results of this and other (more realistic) experiments in future publications aimed at epidemiology experts and public health officials.
Our initial reason for choosing l 1 in the experiments discussed above was based on empirical expectation that l 1 , as the closest of the convex options to l 0 , will enforce sparsity of the optimal reduction in the pair-vise www.nature.com/scientificreports/ influences/traffic patterns. However, by choosing to work with the l 1 -norm, we certainly did not mean to suggest that other norms cannot be used, especially if propounded by public health and/or government experts-after all it is the professional judgment of these experts which is expected to drive practical choices of the cost function. Therefore, we also experimented with the l 2 norm and more generally with a convex mixture (interpolation between) l 1 and l 2 norms. These richer experiments are illustrated in Figs. 5 and 6. Figure 5 compares initial infectious patterns and optimal mitigation in the l 1 (left sub-figure) and l 2 (right sub-figure) cases. The experiments suggest (consistently with the original hypothesis) that significant reduction (of influence/traffic) in the case of l 1 norm is observed at a fewer pair-wise links and, most interestingly, these reductions are NOT all along the links which are most intense (traveled less) in the original instance. On the contrary, in the l 2 case there are more edges with comparable reductions and these reductions are largely across the links which show heavy influences/traffic already in the bare cases (without correction) which require prevention.  Table 1. Summary of our prevention experiments on the Seattle data. k, in the first column, is the maximal number of nodes in the initially infected patterns (all accounted for to construct the k-Safe Polytope). The second column shows number of linear constraints characterizing the k-Safe Polytope. Respective Run Time and Cost are shown in the 3rd and 4th column, where Cost shows the difference in l 1 norm between the (J (0) , h (0) ) , characterizing stressed but unmitigated regime, and the optimal prevention regime, resulting in ( J (corr) , h (corr) ) computed according to Eq. (10).

Discussion
In this manuscript, written specifically for an interdisciplinary audience of researchers in mathematical, physical and engineering sciences, we follow our prior work 1 (aimed primarily at experts in public health and epidemiology) and explain challenges in inference (prediction) and control (prevention) of pandemics. We use the language of GMs, which is one powerful tool in the modern arsenal of Data Science and AI, and state the Prediction Challenge as a MAP optimization over an attractive Ising model, which can be expressed generically as a solution of a tractable Linear Programming (LP). We then turn to the analysis of the prevention problem, which one formulates if the aforementioned prediction solution suggests that the probability of significant infection is above a pre-defined (by the public health experts) tolerance threshold. We show that in its simplest formulation, the prevention problem is equivalent to finding minimal l 1 projection to the safety polytope, where the latter is defined by solving the aforementioned prediction problem. In general, the polytope does not allow a description non-exponential in the size of the system. However, we suggested an approximation that allows to approximate the safety polytope efficiently -that is, linearly in the number of the initial infection patterns. The approximation is justified (empirically, with supporting theoretical arguments, however not yet backed by a mathematically rigorous theory) in the case when the interaction graph of the system (e.g., related to the system/ city transportation and human-to-human interaction network) is sufficiently dense. We conclude by providing a quasi-realistic experimental demonstration on the GM of Seattle. As a disclaimer but also a path forward, we would like to point out that this manuscript is not there yet where we would dream it to be in terms of practical use by heath-care authorities/practitioners. However, the manuscript suggests an important, and yet missing in the literature, path towards the applications. We focused in the manuscript on developing methodology which we intend to make useful for practical purposes in the future. This will require accurate calibration of the pairwise and singleton terms in the Ising (Graphical) models, e.g., as mentioned above, based on "learning" the rates from much better developed ABMs. When such calibration of the parameters in the Ising model is done, the pair-wise terms (J) and the singleton terms (h) may be interpreted as corresponding to pair-wise "traffic" or "influences" between aggregated areas and singleton biases, e.g., related to degree of vaccination or compliance with the local public health mandates. Then, the outcome of this paper www.nature.com/scientificreports/ methodology will become practical, e.g., resulting in (optimal) recommendations of the traffic/influence and local modifications to avoid a disastrous spread of infection. We conclude with an incomplete list of technical challenges, presented in the order of importance (subjective), which need to be resolved to make the powerful GM approach to pandemic prediction and prevention practical: • Build a hierarchy of Agent Based and Probabilistic Graphical Models which allow more accurate (than Ising model) representation of the infection patterns over geographical and community graphs. (See also our remark above about making the modeling more realistic.) The models may be both of the static (like Ising) or dynamic (like Independent Cascade Model) types. Extend the notion of the Safety Region (polytope) to the new GM of pandemics. • Consider the case when the resolution of the Prediction Challenge problem returns a positive answermost likely future state of the system is safe, and then develop the methodology which allows estimating the probability of crossing the safety boundary. In other words, we envision formulating and solving in the context of the GM a problem which is akin to the one addressed in 45 : estimate the probability of finding the system outside of the Safety Polytope. • Construct other (than two-mode) approximations to the Safety Polytope. In particular, approximations built on sampling of the boundaries of the safety polytope and learning (possibly reinforcement learning) are needed. • Develop the asymptotic (thermodynamic limit) theory which allows validating (and/or correcting systematically) the efficient (two-mode and other) approximations of the Safety Polytope.