key: cord-0575949-ck9kxz2f authors: Cutura, Gojko; Li, Boning; Swami, Ananthram; Segarra, Santiago title: Deep Demixing: Reconstructing the Evolution of Epidemics Using Graph Neural Networks date: 2020-11-18 journal: nan DOI: nan sha: 27cc2c00ca4c9766471efbdb735f70f32071b7c9 doc_id: 575949 cord_uid: ck9kxz2f We study the temporal reconstruction of epidemics evolving over networks. Given partial or aggregated temporal information of the epidemic, our goal is to estimate the complete evolution of the spread leveraging the topology of the network but being agnostic to the precise epidemic model. We overcome this lack of model awareness through a data-driven solution to the inverse problem at hand. In particular, we propose DDmix, a graph conditional variational autoencoder that can be trained from past epidemic spreads and whose latent space seeks to capture key aspects of the underlying (unknown) spreading dynamics. We illustrate the accuracy and generalizability of DDmix and compare it with non-graph-aware learning algorithms through numerical experiments on epidemic spreads simulated on synthetic and real-world networks. Networks or graphs have emerged as useful models to represent complex interconnected systems and data defined on them [1]. These representations have wide applicability across multiple domains, including neuroscience [2], sociology [3], and urban planning [4]. In an attempt to better understand and learn from data defined on networks, classical signal processing and machine learning methods have recently been extended to encompass this data type [5]. These novel tools have shown remarkable performance in a variety of popular network science problems such as node classification [6], link prediction [7], and several inference tasks related to partially observed network processes [8, 9]. Within the range of network processes, the use of graphs for modeling and understanding epidemics is a fertile subfield [10].
Networks provide versatile modelling tools where nodes might represent anything from single individuals [11] to large cities [12], and edges can encode different mechanisms of disease propagation (airborne, contact, or vector transmission) across the nodes of the network. Classical graph features such as node centralities and connectivity measures can then be used to inform public health measures, e.g., immunization strategies and lockdown procedures [13, 14]. In this paper, we go beyond established classical methods and propose a novel neural network architecture to solve the challenging problem of temporally reconstructing an epidemic. More precisely, given partial or aggregated temporal information of the epidemic, we want to infer the evolution of the spread. Since this is an inherently ill-posed inverse problem, one might rely on precise knowledge of the network process or on structural features of the initial condition to resolve this underdeterminacy [8]. By contrast, we propose a model-inspired data-driven solution. Namely, we put forth a conditional variational auto-encoder (CVAE) [15] based on graph neural networks (GNNs) [6, 16] that is trained to solve the inverse problem from available data and, importantly, whose architecture and training loss are inspired by the locality of the spreading mechanism.

(Research was sponsored by the Army Research Office and was accomplished under Cooperative Agreement Number W911NF-19-2-0269. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. Emails: cg203066m@student.etf.bg.ac.rs, {boning.li, segarra}@rice.edu, ananthram.swami.civ@mail.mil.)

Related work.
We can model an epidemic spread as the temporal evolution of a signal defined on the nodes of a graph. From this viewpoint, epidemic reconstruction boils down to different well-studied problems depending on the observation model. More precisely, if the state of some nodes is observed and we want to infer the state of the remaining nodes, the problem can be modeled as the interpolation of graph signals [17, 18]; if the final state of every node is observed, the problem resembles blind deconvolution on graphs [8]; and if a temporally aggregated signal for each node is observed, then the problem boils down to demixing of graph signals [19]. This last body of work is the one that best fits our observation model. However, existing tools [19] were derived for linear network processes that cannot accurately model epidemic spreads, thus prompting the need for non-linear methods such as DDmix, the one derived here. From the perspective of computational epidemiology, temporal reconstruction has also received attention [20, 21]. For example, under the assumption that an epidemic spread follows the SI model, [22] recovers multiple source nodes from a single snapshot of the complete graph. Similarly, [23] focuses on identifying key spreaders under the independent-cascade model. In general, this body of work largely relies on precise knowledge of the epidemic model or of the time of contagion for a subset of nodes. We depart from this paradigm and, instead, rely on observed past data to learn how to solve the reconstruction problem. Related to our proposed architecture, CVAEs [15] have been shown to be effective in solving prediction and data generation problems in non-graph settings [24, 25] and have been used for the reconstruction of videos from temporally aggregated data [26]. Motivated by this success, some of these tools have been extended to graph settings [27, 28] and mostly applied as node embedding procedures [29].
To the best of our knowledge, this is the first implementation of a graph CVAE for the study of epidemics and, more generally, for the solution of inverse problems related to network processes. Contribution. The contributions of our paper are twofold: i) We propose DDmix, a novel graph CVAE architecture for the temporal reconstruction of partially-observed network processes; and ii) We successfully implement DDmix to infer the evolution of epidemics, surpassing non-graph-aware deep learning methods in terms of accuracy and generalizability. We model our interconnected system as an undirected graph G = (V, E) with N nodes in V representing individuals and edges (i, j) ∈ E encoding the possibility of contagion between nodes i and j. The graph structure can be represented using the symmetric adjacency matrix A ∈ R^{N×N}, where A_ij = A_ji = 1 for all (i, j) ∈ E, and A_ij = A_ji = 0 otherwise. We model the state of the nodes at any given time using graph signals, i.e., maps x : V → R from the node set into the reals. Graph signals can be conveniently represented as vectors x ∈ R^N, where the entry x_i collects the value of the graph signal at node i. Specifically, we consider a time series of graph signals {y^(t)}_{t=1}^T, where y^(t) ∈ {0, 1}^N and y_i^(t) = 1 indicates that node i was in the infected state at time instant t while y_i^(t) = 0 indicates that it was not infected at that time. Although not assumed to be known by our proposed solution, for simplicity in this paper we focus on a particular epidemic model, namely the well-established SIRS model [11]. This is a parametric stochastic model that, given y^(1), determines the probability distribution of the signals y^(t) for t > 1. More precisely, the SIRS model is defined by three parameters (the infection probability β, the healing probability δ, and the probability of losing immunity γ) and every node is in one of three states: susceptible (S), infected (I), or recovered (R).
At every discrete time step t: i) Susceptible nodes can get infected independently with probability β by each of their infected neighbors in G, ii) Infected nodes recover with probability δ, and iii) Recovered nodes become susceptible with probability γ. Under an (unknown) SIRS spread {y^(t)}_{t=1}^T of interest, our partial information is given by a single graph signal x = f({y^(t)}_{t=1}^T) that captures some observable feature of the epidemic. In this paper we focus on the case where x represents a temporal aggregation of the infections, i.e.,

x = (1/T) ∑_{t=1}^T y^(t).   (1)

With this notation in place, we can formally state our problem.

Problem 1. Consider an unknown epidemic {y^(t)}_{t=1}^T that spreads over a known graph G. Given the temporally aggregated information x as in (1), estimate the complete evolution of the spread {y^(t)}_{t=1}^T.

Before presenting our proposed solution to Problem 1 in the next section, a few comments are in order. First, the observation model is realistic and reasonable in practice: we get to observe whether a person was infected over a given period of time as well as the length of the infection (either self-reported or estimated by proxies such as serology tests), but we do not have precise temporal discrimination of when the infection started and ended. Second, Problem 1 can be reinterpreted as one of signal demixing on graphs, since we are given the (temporal) aggregation of many signals and we want to tell them apart leveraging the graph structure. This is a challenging and highly underdetermined problem even in the simpler setting of linear network processes. Finally, to solve this ill-posed and non-linear problem, most existing approaches assume precise knowledge of the epidemic model [22, 23]. By contrast, we take a data-driven perspective where we assume nothing about the underlying epidemic model other than it being driven by the topology of G, and we use past known observation pairs (x, {y^(t)}_{t=1}^T) to train our model.
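To make the spreading model concrete, the SIRS dynamics and the aggregated observation in (1) can be sketched as follows. This is a toy simulation: the state coding (0 = S, 1 = I, 2 = R) and all function names are ours for illustration, not the authors' code.

```python
import numpy as np

def simulate_sirs(A, state0, T, beta, delta, gamma, rng):
    """One SIRS realization on a graph with adjacency A; returns (Y, x)."""
    N = A.shape[0]
    state = state0.copy()
    Y = np.zeros((N, T))                     # Y[:, t] = y^(t), infection indicators
    for t in range(T):
        infected = (state == 1)
        Y[:, t] = infected.astype(float)
        # i) each infected neighbor independently infects a susceptible node w.p. beta
        n_inf_neigh = A @ infected.astype(float)
        p_inf = 1.0 - (1.0 - beta) ** n_inf_neigh
        nxt = state.copy()
        nxt[(state == 0) & (rng.random(N) < p_inf)] = 1   # S -> I
        nxt[infected & (rng.random(N) < delta)] = 2       # ii) I -> R
        nxt[(state == 2) & (rng.random(N) < gamma)] = 0   # iii) R -> S
        state = nxt
    x = Y.mean(axis=1)                       # temporal aggregation as in (1)
    return Y, x

# 4-node path graph with node 0 initially infected
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
state0 = np.array([1, 0, 0, 0])
Y, x = simulate_sirs(A, state0, T=20, beta=0.15, delta=0.1, gamma=0.01,
                     rng=np.random.default_rng(0))
```

Training pairs (x, Y) for DDmix can be generated by repeating such simulations with random initial conditions.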
Crucially, only a few known pairs are needed, and these need not belong to the same graph G mentioned in Problem 1 since our proposed solution effectively generalizes across graph distributions and sizes, as we illustrate in Section 4. Assume that we have access to M pairs (x, Y) where, for notational simplicity, we have collected the T vectors in {y^(t)}_{t=1}^T as columns of Y ∈ R^{N×T}. Our goal is to leverage these observations to estimate the conditional distribution p(Y|x) for the particular scenario of interest. In this way, given a new observation x, we can sample our candidate temporal evolution Y from p(Y|x). In determining p(Y|x), we adopt a CVAE probabilistic model [15] with a latent variable z that seeks to model features of the temporal variation of the epidemic that have been collapsed in x. Intuitively, we want z to capture key aspects of the underlying (unknown) spreading dynamics. We model z as conditionally Gaussian given x, i.e., p_φ(z|x) = N(μ_φ(x), σ_φ^2(x)). Notice that we have made explicit our focus on a parametric form of the conditional probability, where both the mean and the standard deviation of z are functions of x dependent on the parameters φ. Following the CVAE framework, we define the distribution of our variable of interest as

p_θ(Y|x, z) = N(g_θ(x, z), σ_y^2 I).   (2)

Under our probabilistic model, (2) reveals that if we have access to the true latent variable z in combination with our observation x, then we can apply a parametric deprojection function g_θ to get the expected value of the temporal evolution Y, where σ_y^2 is a common noise variance for all entries. From the two conditional probabilities introduced, it follows that we can compute the distribution of interest as

p(Y|x) = ∫ p_θ(Y|x, z) p_φ(z|x) dz.   (3)

However, solving the integral in (3) can become intractable even for fairly simple parametrizations of p_θ and p_φ. Thus, determining the parameters θ and φ that maximize the likelihood of the observed M pairs (x, Y) is a challenging endeavor in general.
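Once the parametric functions are learned, a candidate evolution is obtained by ancestral sampling: draw z from the conditional prior given x and decode with the deprojection function. The sketch below uses linear toy stand-ins for the learned networks; all functions here are hypothetical and serve only to show the sampling flow.

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 3                            # nodes, time steps

def prior_net(x):
    """Toy stand-in for (mu_phi(x), sigma_phi(x)) of p_phi(z|x)."""
    return 0.5 * x, 0.1 * np.ones_like(x)

def deprojection_net(x, z):
    """Toy stand-in for g_theta(x, z): N x T infection probabilities."""
    logits = np.outer(x + z, np.ones(T))
    return 1.0 / (1.0 + np.exp(-logits))

x = rng.random(N)                      # aggregated observation
mu, sigma = prior_net(x)
z = mu + sigma * rng.standard_normal(N)  # reparametrized draw z ~ N(mu, sigma^2)
Y_hat = deprojection_net(x, z)           # candidate temporal evolution
```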
Instead, we follow the well-accepted route of variational inference and establish a tractable loss inspired by the evidence lower bound (see [30] for details) that can be optimized via stochastic gradient descent. To present this loss, we first need to introduce a parametrization for the posterior probability of z given Y, namely q_ψ(z|Y) = N(μ_ψ(Y), σ_ψ^2(Y)). We then consider the following compound loss consisting of two fitting terms L1(·) and L2(·), two regularization terms R1(·) and R2(·), and their relative importance given by the scalar weights {η_i}_{i=1}^3:

L = L1(q_ψ(z|Y), p_φ(z|x)) + η_1 L2(Y, Ŷ) + η_2 R1(θ, φ, ψ) + η_3 R2(Ŷ).   (4)

In our implementation, we select the loss L1(·) as the KL divergence between the argument distributions, which ensures that, during testing, our draws from p_φ(z|x) will be close to the more informative draws from q_ψ(z|Y). For the reconstruction loss L2(·), we recall that the entries of Y are binary (either infected or not) and select an entry-wise binary cross-entropy loss. In terms of regularization, we implement in R1(·) an ℓ2 penalty on all our parameters. More interestingly, in R2(·) we incorporate the knowledge that the evolution of the epidemic should be local in G. To be precise, we define

R2(Ŷ) = ∑_{t=2}^T ∑_{i=1}^N [ŷ_i^(t) − ((I + A)ŷ^(t−1))_i]_+,

where [·]_+ denotes the positive projection and ŷ^(t) is the t-th column of Ŷ = g_θ(x, ẑ). Intuitively, in R2(·) we penalize the appearance of an infected node at time t when neither itself nor any of its neighbors was infected at time t − 1.
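The locality penalty R2(·) just described admits a direct implementation. The sketch below is illustrative (not the authors' code) and assumes the reconstruction Ŷ is stored as an N × T array.

```python
import numpy as np

def locality_penalty(Y_hat, A):
    """Sum of positive parts of Y_hat[:, t] minus the (self + neighbor)
    infection support at t-1, over t = 2, ..., T."""
    N = A.shape[0]
    reach = np.eye(N) + A                 # a node "reaches" itself and its neighbors
    prev = reach @ Y_hat[:, :-1]          # support reachable from time t-1
    return np.maximum(Y_hat[:, 1:] - prev, 0.0).sum()

A = np.array([[0, 1, 0],
              [1, 0, 0],
              [0, 0, 0]], dtype=float)   # node 2 is isolated
Y_hat = np.array([[1.0, 0.0],
                  [0.0, 1.0],            # node 1: neighbor 0 infected at t-1, no penalty
                  [0.0, 1.0]])           # node 2: no infected neighbor at t-1, penalized
```

On this example only the isolated node's sudden infection is penalized, matching the stated intuition.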
Finally, notice that the loss in (4) is actually a random quantity since the latent variable ẑ used in the computation of L2(·) and R2(·) is drawn from q_ψ(z|Y). However, we can still optimize over it via the reparametrization trick [30].

Fig. 1: Overall view of our proposed deep demixing (DDmix) architecture based on graph neural networks for the temporal reconstruction of epidemics. The three subnetworks (prior, posterior, and deprojection) parametrize key functions in the assumed probabilistic model.

We implement the parametric functions in our model (g_θ and the means and standard deviations of p_φ and q_ψ) using graph neural networks. Our overall deep demixing architecture (DDmix) is illustrated in Fig. 1. An essential building block of our model is the Graph U-Net (g-U-Net) [28], which consists of down-sampling and up-sampling graph convolutional network (GCN) layers [6], alongside graph pooling layers (gPool and gUnpool). More precisely, the output of a generic GCN layer is given by H = ReLU(Ã H_0 W), where H_0 is the input to the layer, Ã is a normalized adjacency matrix, and W is a matrix of trainable weights for that layer. The gPool layer performs a global max-pooling operation on graph data by introducing a trainable projection vector that maps the features of a given node onto a scalar; see [28] for details. On the other hand, the gUnpool layer restores the original graph structure by performing the inverse operation to the corresponding gPool layer. In our implementation, the four g-U-Nets included in Fig. 1 have depth equal to one, i.e., they have exactly one pair of pooling and unpooling layers, where the pooling operation halves the number of nodes in the graph. For each of the g-U-Net blocks, the number of input and output node features (signal values per node) can be seen at the top left and right corners, respectively. The prior and posterior networks are used to parametrize p_φ(z|x) and q_ψ(z|Y), respectively.
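The generic GCN layer H = ReLU(Ã H_0 W) can be sketched as follows. We use the common symmetric normalization Ã = D^{-1/2}(A + I)D^{-1/2}; this is one standard choice, which the text does not explicitly pin down, so treat it as an assumption of the sketch.

```python
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalized adjacency with self-loops (one common choice)."""
    A_hat = A + np.eye(A.shape[0])        # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def gcn_layer(A_tilde, H0, W):
    """Generic GCN layer: neighborhood aggregation, linear map, ReLU."""
    return np.maximum(A_tilde @ H0 @ W, 0.0)

A = np.array([[0.0, 1.0], [1.0, 0.0]])    # two connected nodes
H0 = np.array([[1.0], [0.0]])             # one input feature per node
W = np.array([[1.0, -1.0]])               # trainable weights: 1 -> 2 features
H = gcn_layer(normalized_adjacency(A), H0, W)
```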
The prior network takes as input the aggregated graph signal x with a single feature per node and passes it through the g-U-Net block while increasing the number of features per node to T. Two parallel GCN layers are then applied to the output of the g-U-Net, resulting in the mean μ_φ and standard deviation σ_φ of the Gaussian distribution p_φ. We design the posterior network analogously, with the only difference being that the number of features per node in the input signal Y is T, corresponding to the complete, uncollapsed time series. The obtained parameters are then used to sample from the corresponding distribution (denoted by the red triangles in Fig. 1), where the posterior network is used during training and the prior network during testing. Finally, the deprojection network takes as input both the observed x and the drawn latent variable ẑ and seeks to output a good estimate Ŷ = g_θ(x, ẑ) of the temporal evolution. In our implementation, we first pass x through a g-U-Net block while increasing the number of node features to T. The output of this block is concatenated with ẑ, resulting in a graph signal with 2T features per node. This signal is then passed through our last g-U-Net block, which combines the information in x and ẑ and reduces the number of node features to T to match the length of the time series being estimated. Through synthetic and real-world graphs, we illustrate the behavior of DDmix in diverse settings. In evaluating its performance, we compare DDmix with three baselines: a multilayer perceptron (MLP), a convolutional neural network across nodes (CNN-nodes), and a convolutional neural network across time (CNN-time). It should be noted that, although data-driven, these three conventional machine learning baselines are graph agnostic. More precisely, MLP incorporates fully-connected layers, thus overlooking the notion of locality in G. CNN-nodes, relying on convolutional filters, assumes a notion of locality inherited from the indexing of the nodes, which need not align with the true notion of locality driving the underlying epidemic.
Finally, CNN-time ignores the effect of interconnections between nodes and seeks to solve the temporal reconstruction for each node independently. By contrast, DDmix explicitly incorporates the graph structure in its architecture, resulting in higher performance and enhanced generalizability. For our synthetic graphs, we use a random geometric graph generator that places N nodes uniformly at random in the unit cube. An edge is inserted between two nodes if their Euclidean distance is at most d_r. For generating graphs of size N ∈ {100, 250, 500, 1000} nodes, we use d_r ∈ {0.25, 0.15, 0.1, 0.075}, respectively. SIRS parameters (cf. Section 2) are set to β = 0.15, δ = 0.1, and γ = 0.01. We set η_1 = η_3 = 1 and η_2 = 10^-6 in the loss (4). We use the Adam optimizer with a fixed learning rate of 10^-2, β_1 = 0.9, and β_2 = 0.999. The models are trained in mini-batches of size 4 for at most 50 epochs, with early stopping if the validation loss does not improve for 5 epochs. Unless otherwise stated, we train the models with 4500 realizations of 20-step data on a 100-node random graph and use as figure of merit the mean squared error (MSE) of the reconstruction, given by (1/(NT)) ||Y − Ŷ||_F^2.

Varying the density of the graph. In this experiment, we test the candidate models using data randomly generated on three different graphs with the same number of nodes (N = 100) as the training graph but different underlying topologies. Specifically, their connectivity radii are, respectively, equal to (d′_r = d_r), larger than (d′_r = 1.2 d_r), and smaller than (d′_r = 0.7 d_r) that of the training graph. For each test graph, we test the temporal reconstruction of T = 10 and T = 20 consecutive steps. Table 1 summarizes the models' performance, revealing that DDmix significantly outperforms the baseline methods.
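For concreteness, the synthetic data pipeline described above (random geometric graphs, SIRS spreads, and the MSE figure of merit) can be sketched as follows. This is an illustrative mock-up rather than the paper's code; in particular, the per-step SIRS transition rule (infection probability 1 − (1 − β)^m for a susceptible node with m infected neighbors) is our assumption, since the text only fixes the parameter values.

```python
import numpy as np

rng = np.random.default_rng(1)

def random_geometric_graph(N, d_r):
    """N nodes placed uniformly at random in the unit cube; an edge is
    inserted iff the Euclidean distance between two nodes is <= d_r."""
    pos = rng.random((N, 3))
    dist = np.linalg.norm(pos[:, None] - pos[None, :], axis=-1)
    A = (dist <= d_r).astype(float)
    np.fill_diagonal(A, 0.0)                  # no self-loops
    return A

def simulate_sirs(A, T, beta=0.15, delta=0.1, gamma=0.01):
    """Discrete-time SIRS spread: S->I w.p. 1-(1-beta)^m given m infected
    neighbors, I->R w.p. delta, R->S w.p. gamma. Returns the N x T matrix
    Y of per-day infection indicators, starting from one random seed."""
    N = A.shape[0]
    state = np.zeros(N, dtype=int)            # 0 = S, 1 = I, 2 = R
    state[rng.integers(N)] = 1                # random initial infection
    Y = np.zeros((N, T))
    for t in range(T):
        Y[:, t] = (state == 1)
        m = A @ (state == 1)                  # infected neighbors per node
        new_inf = (state == 0) & (rng.random(N) < 1 - (1 - beta) ** m)
        new_rec = (state == 1) & (rng.random(N) < delta)
        new_sus = (state == 2) & (rng.random(N) < gamma)
        state[new_inf], state[new_rec], state[new_sus] = 1, 2, 0
    return Y

def mse(Y, Y_hat):
    """Figure of merit: (1/(NT)) * ||Y - Y_hat||_F^2."""
    return np.linalg.norm(Y - Y_hat, 'fro') ** 2 / Y.size
```

One training realization then amounts to drawing `Y = simulate_sirs(A, T)` and collapsing it into the aggregated observation x that the models must demix.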
This observation aligns well with our hypothesis: due to their lack of dependence on the underlying graph structure, non-graph-aware methods are unable to adjust their outputs to topologies unseen during training. By contrast, the graph-convolution-based DDmix incorporates the adjacency matrix in its architecture and is thus more robust to topological perturbations. Given identical collapsed signals on differently structured graphs, the CNN- and MLP-based models would always output identical reconstructions, whereas DDmix can adapt its reconstruction to the changing topology.

Varying the size of the graph. We also test the models on data generated on graphs of different sizes. We fix the number of steps to T = 20 and apply models trained on 100-node data to graphs of size N ∈ {100, 250, 500, 1000}. As one might expect, the MLP model cannot handle inputs of varying sizes and can therefore only be tested for N = 100. The variation of the MSE for the other methods as a function of test graph size is depicted in Fig. 2 (left). The MSE curves in this plot show that DDmix consistently outperforms the competing approaches. More importantly, DDmix is more robust to the growing size of the test graphs, leading to increased performance gaps for N ∈ {250, 500}.

Varying the size of the training set. Finally, we are interested in the minimal training set that each model needs to achieve high accuracy and good generalizability. In Fig. 2 (right) we illustrate how the MSE generally drops as more data (from 0 to 4500 samples) are used for training each model (N = 100 and T = 20). Initially, all models perform poorly with 4 or fewer training samples. Starting at 8 training samples, the models (except for MLP) begin to show improved performance, saturating at around 1000 samples. MLP, having the largest number of parameters and not exploiting any locality, cannot effectively learn in this setting.
Most interestingly, DDmix is able to generalize fairly well with only 8 independent training samples, far fewer than the competing approaches require. The fact that DDmix learns to recover collapsed signals from so few samples indicates that our proposed architecture imposes the right implicit bias by incorporating the graph topology.

Reconstructing an epidemic in a primary school. We study the identification of epidemic sources in a real-world primary school network. Using a public dataset [32, 33] containing 2 days of temporal contacts among 232 children from 10 school classes, we construct 2 daily contact graphs, one for training (N = 226) and one for testing (N = 228). The graphs are undirected and unweighted, and an edge exists between a pair of children if they had more than 5 face-to-face contacts in the corresponding day. Furthermore, during training, a subset of 4 classes is dropped in order to speed up the process and improve generalizability. During testing, the graph contains all classes and children. The synthetic epidemic data consist of T = 20 days of SIRS epidemic evolution with random initial infection locations. We simulate 1000 samples for training and 1000 for testing. Based on this real-world interpersonal network, we evaluate the demixing models by their ability to correctly locate the class where the epidemic started. Given a reconstructed Ŷ, we determine a top-k ranking of the source class as follows. First, we determine the earliest day for which a prediction larger than 0.5 was made for at least one node, i.e., the smallest t′ for which there exists some i such that ŷ_i^(t′) > 0.5. We then focus on day t′ and rank the classes based on their largest prediction of infection, i.e., we rank each class C based on max_i ŷ_i^(t′) over all nodes i ∈ C. We say that a top-k ranking yields accuracy equal to 1 if it contains the true source class and 0 otherwise.
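The ranking procedure just described admits a direct implementation; the sketch below is illustrative (the function names and the handling of the corner case where no prediction ever exceeds the threshold are our own choices).

```python
import numpy as np

def rank_source_classes(Y_hat, node_class, thresh=0.5):
    """Rank candidate source classes from a reconstruction Y_hat (N x T).

    Find the earliest day t* on which some node's predicted infection
    exceeds `thresh`, then order the classes by their largest prediction
    on that day (descending)."""
    node_class = np.asarray(node_class)
    exceeds = (Y_hat > thresh).any(axis=0)        # one flag per day
    if not exceeds.any():
        return []                                 # no confident day found
    t_star = int(np.argmax(exceeds))              # earliest flagged day
    classes = sorted(set(node_class.tolist()))
    score = {c: Y_hat[node_class == c, t_star].max() for c in classes}
    return sorted(classes, key=lambda c: -score[c])

def topk_accuracy(ranking, true_class, k):
    """1 if the true source class appears among the top-k, else 0."""
    return 1.0 if true_class in ranking[:k] else 0.0
```

Averaging `topk_accuracy` over test realizations yields the top-k accuracies reported below.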
The average accuracy of DDmix is {25.9, 61.2, 82.9}% for the top-1, top-3, and top-5 predictions, respectively. This significantly outperforms CNN-nodes' {12.8, 36.8, 60.0}% and CNN-time's {11.6, 30.4, 53.5}% mean accuracies. Accurately tracing the source of epidemics is an increasingly important problem, thus motivating part of our future work.

We introduced DDmix, a novel graph CVAE architecture for the temporal reconstruction of network epidemics from aggregated observations. Being agnostic to the model of epidemic spread, DDmix relies on the network topology and the observation of past epidemics to solve this inverse (temporal demixing) problem. Through numerical experiments, we showed that DDmix outperforms non-graph-aware learning techniques in accuracy as well as in generalizability across graph distributions and sizes. Current and future research goals include: i) the use of DDmix for epidemic prediction and its evaluation on a variety of real epidemic processes; ii) the development of architectures geared towards the identification of key epidemic actors (such as sources or super-spreaders) without the need for full temporal reconstruction; and iii) the study of other observation models where data is only observed at a subset of nodes or where temporal data is randomly missing.
[1] Social and Economic Networks
[2] Brain network efficiency is influenced by the pathologic source of corticobasal syndrome
[3] Random graph models of social networks
[4] Modeling urban structures using graph-based spatial patterns
[5] Graph signal processing: Overview, challenges, and applications
[6] Semi-supervised classification with graph convolutional networks
[7] Link prediction based on graph neural networks
[8] Blind identification of graph filters
[9] Estimating network processes via blind identification of multiple graph filters
[10] Mathematics of epidemics on networks: From exact to approximate models
[11] A node-based SIRS epidemic model with infective media on complex networks
[12] Multi-city modeling of epidemics using spatial networks: Application to 2019-nCoV (COVID-19) coronavirus in India
[13] Network properties of Salmonella epidemics
[14] Social network-based distancing strategies to flatten the COVID-19 curve in a post-lockdown world
[15] Learning structured output representation using deep conditional generative models
[16] Convolutional neural network architectures for signals supported on graphs
[17] Discrete signal processing on graphs: Sampling theory
[18] Sampling of graph signals with successive local aggregations
[19] Demixing and blind deconvolution of graph-diffused sparse signals
[20] Bayesian reconstruction of disease outbreaks by combining epidemiologic and genomic data
[21] Reconstructing an epidemic over time
[22] Spotting culprits in epidemics: How many and which ones?
[23] Finding effectors in social networks
[24] Molecular generative model based on conditional variational autoencoder for de novo molecular design
[25] Anomaly detection with conditional variational autoencoders
[26] Visual deprojection: Probabilistic recovery of collapsed dimensions
[27] Variational graph auto-encoders
[28] Graph U-Nets
[29] Inductive representation learning on large graphs
[30] Auto-encoding variational Bayes
[31] A guide to convolution arithmetic for deep learning
[32] Mitigation of infectious disease at school: Targeted class closure vs school closure
[33] High-resolution measurements of face-to-face contact patterns in a primary school