key: cord-0958170-9ha9obet
authors: Valente, Angelo; Domenico, Manlio De; Artime, Oriol
title: Non-Markovian random walks characterize network robustness to nonlocal cascades
date: 2022-04-18
journal: Physical review. E
DOI: 10.1103/physreve.105.044126
sha: 47ab4e458641cbad3ddab10ccfe3f2b2013af9ec
doc_id: 958170
cord_uid: 9ha9obet

Localized perturbations in a real-world network have the potential to trigger cascade failures at the whole system level, hindering its operations and functions. Standard approaches analytically tackling this problem are mostly based either on static descriptions, such as percolation, or on models where the failure evolves through first-neighbor connections, crucially failing to capture the nonlocal behavior typical of real cascades. We introduce a dynamical model that maps the failure propagation across the network to a self-avoiding random walk that, at each step, has a probability to perform nonlocal jumps toward operational systems' units. Despite the inherent non-Markovian nature of the process, we are able to characterize the critical behavior of the system out of equilibrium, as well as the stopping time distribution of the cascades. Our numerical experiments on synthetic and empirical biological and transportation networks are in excellent agreement with theoretical expectation, demonstrating the ability of our framework to quantify the vulnerability to nonlocal cascade failures of complex systems with interconnected structure.

Introduction.-A wide variety of complex systems is structured in terms of nodes, representing systems' units, and links, encoding the different types of interactions among them. Yet, any trustworthy model aiming at reproducing observations and making principled predictions needs to incorporate some dynamical behavior on the network [1, 2] . In fact, understanding the interplay between structure and dynamics is still one of the major challenges in network science [3] [4] [5] [6] [7] . A central question concerns the robustness of a system against perturbations [8] , since it can advance the development of powerful analytical techniques to explain and unravel rich phenomenology [9] , as well as it can provide a solid ground for informed interventions, e.g., disease containment based on contract tracing [10] and immunization [11] , hate speech counter-measures in online environments [12] or the characterization of vulnerabilities in infrastructural networks against natural disasters [13] .

A main assumption behind the analysis of robustness is that for a system to be functional, it needs to be connected. Hence, a first quantitative proxy to assess the robustness to failures is s, the normalized size of the largest connected component; thereby concepts and techniques from percolation theory become useful [14] . In percolation, a given fraction φ ∈ [0, 1] of nodes (or links), either selected uniformly at random or based on topological or nontopological descriptors [15] [16] [17] , is removed from the network. Other quantities are computed along with s as a function of φ, showing interesting phenomenology such as multiple [18, 19] or abrupt phase transitions [20] , and a robust-yet-fragile effect for networks with a broad enough degree distribution [21] . Percolation quantities are also employed to assess the robustness under cascading failure problems, such as the value of s when the cascade stops spreading. These models are dynamical, in the sense that a small perturbation placed in the network evolves according to some rules [22] [23] [24] [25] [26] , which depend on the phenomenon one is trying to model. For the sake of mathematical tractability, cascades are assumed to spread via direct contacts.

Be it because the physical mechanisms behind the failure propagation permit far-off malfunctions, be it because the knowledge on the observed network topology is incomplete and the failure propagates through hidden or unobserved edges, real-world cascades display nonlocal features. To name but a few examples of empirical nonlocal cascades, where node or links failures did not always occur in the neighborhood of previous ones, we have the 1996 disturbance of the Western Systems Coordinating Council (WSCC) system [27] , the 2003 blackout in the northeastern region of the USA [28] , or the air-traffic disruption due to the eruption of the Icelandic volcano Eyjafjallajökul [29, 30] . From a modeling standpoint, some mechanisms like flow redistribution can lead to nonlocal spreading of failures [31] [32] [33] -with the possibility of abrupt transitions, see e.g., Refs. [33] [34] [35] -but the mathematical treatment has been hitherto underresearched due to its sophistication and there is no direct way to control the underlying properties of the nonlocal events, seriously undermining our understanding of the phenomenon.

In this article we overcome these longstanding limitations by introducing an analytically solvable model based on a class of self-avoiding random walks (SARW) [36] , which are used to model failures that propagate across the network while combining, probabilistically, local transitions and nonlocal jumps.

We first describe the model and then show how to compute the timedependent degree distribution in the surviving network. We obtain important time-dependent percolation quanti-FIG. 1: Evolution of a network N 0 under a self-avoiding teleporting random walk-(SATRW) based dismantling process. N t is the residual network after t steps of SATRW and "step t" denotes the choice on N t−1 of the t th node to visit (in orange).

ties and first-stop properties of the cascades which characterize the critical behavior of the process, therefore offering an estimate of the robustness of the system as a function of time, whose validity is tested in several scenarios. Finally, we validate our theory against synthetic and empirical networks.

Mapping cascade failure to self-avoiding teleporting random walks.-In the following, we assume that the cascade unfolds in a timescale much faster than the recovery of nodes, and that a disrupted unit cannot be visited more than once by the failure. This fact causes the failure to be no longer Markovian and, for modeling purposes, a natural choice is to consider a SARW-like dynamics on the network. To cope with the nonlocality, we introduce a teleporting probability: At each step t the failure proceeds as in a SARW -uniformly choosing an operational neighbor and transitioning there-with probability 1 − α ∈ [0, 1]; otherwise, with probability α it teleports to any operative node according to a teleporting rule T t (k), in principle time-and degree-dependent; see a sketch in Fig. 1 . If the failure arrives at a degree-0 node, then it automatically teleports to an operative node. We name this stochastic process the self-avoiding teleporting random walk (SATRW). Notice that our model interpolates between percolation, which can be seen as a purely nonlocal phenomenon, if α = 1 and φ = 1 − t/N 0 , where N 0 1 is the initial network size, and the purely local process of a growing SARW when α = 0 [37] [38] [39] .

The evolution of the SATRW can be viewed as if every visited node is removed from the network, along with all the edges connected to it. This suggests to associate a time-dependent sequence of residual networks {N t } t≥0 to each possible SATRW on a fixed graph N 0 and study its average evolution. Narrowing up to configuration model networks, the natural way to characterize such a sequence is via the temporal degree distributions {p t } t≥0 . Similarly, we can define the time-dependent excess distribution {q t } t≥0 [40] and a new quantity {d t } t≥1 , convenient for the mathematical treatment, standing for the timedependent probability distribution governing the degree of the node visited at step t, namely the transition from N t−1 to N t . For example, d t (0) denotes the probability that the walker arrives at a degree-0 node in N t−1 , whose deletion leads to N t . Two teleportation rules are considered. If the probability to teleport to a degree-r node on N s is denoted as T s (r), then the uniform teleportation is

where a node is chosen uniformly at random in the residual network, and the biased teleportation is

where a node is chosen with probability proportional to its degree in the residual network, see the Supplementary Material (SM) for details. By definition, q s (−1) := 0. Degree distributions and giant components.-It is possible to recursively express p t and d t in terms of variables in times t − 1 and t − 2, resulting in an effective quasi-Markovianity despite the infinite memory of the self-avoiding behavior. We obtain (see the Appendix for details)

valid for degrees k = 0, 1, . . . , N t − 1, where N s := N − s is the number of nodes in N s and r t is the expected value of d t . For r = 0, 1, . . . , N t , we have

Solving the above system of coupled equations, we gain information about the degree distribution of the residual network and the degree-dependent probability to find the walker in a functional node. In Figs. SM1-SM4 we compare the analytical approximation against simulations, finding a perfect agreement.

With p t at hand, we can compute the fractional size of the giant component s t of N t through the system [1] where g t and h t are the time-dependent probability generating functions respectively of the degree and the excess degree distribution,

Application to synthetic networks.-We next illustrate the validity of our theory against simulations of the SATRW nonlocal process on synthetic networks with homogeneous and heterogeneous connectivity distribution, namely, Erdős-Rényi (ER) and scale-free (SF) networks, respectively defined by

where k denotes the average degree and γ > 0. We add constraints to these parameters in order to generate connected networks without topological correlations, namely, k ≥ log N 0 for ER [41] and k min ≥ 2 and k max ≈ √ N 0 for SF [42] . We present in Fig. 2 the time evolution of the size of the giant component for the different topologies and teleporting rules. A general trend that appears in all cases is that the network properties ( k for ER, γ for the SF; note, though, that a change in γ induces a variation in the mean degree) and the teleportation rule do not affect the decay rate of s t at the beginning of the cascade spreading. However, there is a strong impact when approaching the dismantling point: For a fixed teleportation parameter α, reducing the mean connectivity in ER graphs leads a faster disintegration as it could be expected. Moreover, when α grows, so nonlocality is enhanced, the final fragmentation occurs faster. This behavior no longer holds in SF nets. In fact, we observe that the critical point might increase or decrease when the teleportation parameter α is varied, evincing the nontrivial interplay that topology and nonlocality have in the robustness of interconnected systems. Nonlocal cascades.-So far we have assumed that when the walker steps into a degree-0 node, it is forced to teleport according to T t so it keeps exploring the network until all nodes are removed. However, real cascading processes normally do not dismantle the entire network but cease at a certain point leaving a part of the structure unaffected. This motivates us to incorporate a stopping criterion to the SATRW: When the walker reaches a degree-0 node, either it teleports with probability α or it stops with probability 1 − α. We call this a nonlocal cascade.

To assess the robustness to nonlocal cascades, we are interested in the size of the giant component when the cascade stops, S (STOP) . This is a stochastic variable, and we can compute its moments. To do so, first we need the stopping time distribution e(t) for t = 1, . . . , N 0 , that reads (see Appendix for details)

We note that an increase of the teleportation parameter is always associated with a larger stopping time, but the influence of specific network parameters strictly depends on the general topology. On one hand, in ER networks, e(t) tends to concentrate on higher times as k increases [see Fig. SM6(a) and (b) ]. On the other hand, in SF networks, e(t) is quite insensitive to γ, and thus to the mean degree [see Fig. SM6 (c) and (d)]. The physical intuition behind this phenomenon is that, for fixed α, an increase of the mean degree in ER graphs yields a uniform increase of the number of links per node, hence we expect longer stopping times as k grows since at each time step the probability to find a functional neighbor of a previously failed node is high. For SF this is not the case, though: additional links are likely concentrated around hubs which, when disrupted, do not allow to easily find still-operational nodes in a neighborhood, leading to similar stopping times regardless the value of γ.

It is instructive to inspect the average size of the giant component at the cascade stop, given by

a quantity that brings together the time-dependent percolation theory developed for the SATRW, Eqs. (5), with its first-stop properties, Eq. (9). We report its behavior in Fig. 3 . Several trends are noticeable, present in both ER and SF networks and for the two types of teleportation. First, the higher it is the nonlocality of the cascade process, the more destructive it becomes, since the probability of stopping is smaller. Second, the biased teleportation dismantles the network in a more efficient way than uniform teleportation, specially for values of the teleportation parameter α > 0.5. Third, the topology matters if α is not close to 1: The hierarchical structure typical of SF networks helps to stop nonlocal cascades early, regardless of topological details governed by γ, while ER networks have a compact and homogeneous structure and the higher the average degree, the easier it is for nonlocal cascades to spread. These last results seem to contradict what we reported in Fig. 2 : ER networks with high k were considered robust (large critical point) under the SATRW-based dynamics, while they are found to be more fragile (low value of S (STOP) ) to nonlocal cascades as k increases. We interpret this as a dynamical version of the robust-yetfragile phenomenon: An avalanche of nonlocal failures can quickly destroy the giant component if it is able to spread, but if there is a chance for it to stop, then the topology of the network might effectively hinder such diffusion. This also holds true for SF networks, as their topology is very good at stopping cascades quite early in their evolution, but if failures are allowed to keep progressing then the network is dismantled in a similar timescale to those of ER networks, which is a topology that is not good at blocking cascades. This evinces, once more, the nontrivial relation between dynamics and topology, and sheds light on the importance of the metrics one looks at when assessing robustness.

Application to empirical networks.-The analytical results have been derived assuming that the degree of two adjacent nodes is not correlated and that the network is tree-like. This prompts us to ask how our theory performs when applied in empirical systems, that frequently display a wealth of different topological correlations and for which these approximations may fail.

In the following, we show that we can capture well the evolution of the nonlocal cascade by focusing on two real topologies with a moderate value of degree assortativity [1] . The first one is a network of air traffic routes from the Federal Aviation Administration (FAA) of the National Flight Data Center, USA [43] . In this context, a node malfunction could be seen as an airport being shut down, e.g., due to meteorological events. The second network is the C. elegans protein-protein interactome [44] , where the perturbation could be understood as an initial inhibition of a certain protein, e.g., via a protein synthesis inhibitor, which is responsible for the activation/inhibition of other proteins in the PPI [45] . We show in Fig. 4 the results for these empirical systems. The agreement between theory and simulations is good, for both the expected value of the giant component at the cascade stop (main panel) and the SATRW dynamics without stopping criterion (insets). These results agree with the trends reported in the uncorrelated SF networks: When a significant teleporting effect is present (α far from zero), the targeted (biased) evolution of a failure disconnects the networks quicker than the random counterpart, but if it proceeds in cascade, the networks topology manages to stop it promptly.

Conclusions.-Localized network disruptions in empirical settings might trigger nonlocal effects in terms of cascade failures whose propagation is usually more complex than first-neighbor search widely adopted in the literature. To better reconcile theory and observation, we have introduced a dynamical model of nonlocal failure spreading that combines local and nonlocal effects. We have characterized the rich critical behavior of our model by providing analytical expressions for several quantities employed to assess the system's robustness, such as the time-dependent degree distribution, the size of the giant component in the residual network as the process evolves, and the cascade stopping time distribution, among others. These descriptors display an excellent agreement with simulations in synthetic systems characterized by different types of complexity in terms of the heterogeneity of their structural connectivity. We find remarkable differences between homogeneous and heterogeneous systems, e.g., their dependence, or lack thereof, on the particular network parameters. However, we also report some hidden similarities between them, such as a dynamical version of the popular robust-yet-fragile feature to static attacks. It is worth noticing that, despite our framework is expected to work for locally tree-like networks lacking topological correlations, such as degreedegree ones, it still works in empirical settings as we have shown for the case of a biomolecular system, namely the interactome of the nematode C. elegans, and an infrastructural system, namely a national air traffic network.

We envision a plethora of future generalizations for our model. One is to modify the failure dynamics itself, e.g., by including a branching mechanism of the self-avoiding walkers or by exploring more complicated stopping criteria. Similarly for the teleportation rule: it remains an open question what properties of the jumps would induce an abrupt/discontinuous percolation transition. Another direction regards the architecture on which the spreading takes place, e.g., by relaxing the assumption of uncorrelated networks and including, explicitly, topological correlations in a controlled way to understand their effect. Moreover, many networks, such as the infrastructural ones, usually show a multilayer, interdependent structure [46] [47] [48] [49] , which could be included in an adequate extension of our model. Anyway, our findings provide a solid ground for the analytical study of network robustness, in particular, and for nonlocal non-Markovian processes, in general.

The effect of a Self-Avoiding Teleporting Random Walk (SATRW) on a network N 0 is described by the time-dependent sequence of residual networks {N t } t≥0 . We are interested in studying the average evolution of such a sequence through that of two time-dependent distributions: p t , which is the degree distribution of N t , and d t , standing for the degree distribution of the node visited at step t (transition from N t−1 to N t ). In this Appendix we show how to compute both distributions, as well as the stopping time distribution when a stopping criterion is added to the evolution of the SATRW.

Computation of p t .-Rather than trying to directly derive an expression for p t we focus, for convenience, on the average number of degree-k nodes in N t , namely

Here N t is the total number of nodes of N t . How does N t (k) change in one step, from N t−1 (k) to N t (k)? Analyzing the degree of the node visited at step t, hereinafter appealed as "deleted", along with the degrees of its neighbours, we distinguish three different contributions to this quantity, see (a) The deleted node could have k adjacent nodes, and this happens with probability d t (k). Thus, on average, d t (k) degree-k nodes are deleted;

(b) The degree-(k + 1) neighbors of the deleted node will lose an edge, then becoming degree-k nodes in the next residual network N t . If the deleted node has degree s, denote the number of these neighbors in N t−1 with N t−1 (k + 1|s). Summing over all the possibilities, the number of such nodes is equal to s times the probability that a neighbour of a randomly chosen node has degree k + 1 (or equivalently, excess degree k), leading to

The summation in Eq. (A.2) then becomes

having denoted with r t the expected value of d t .

(c) The degree-k neighbors of the deleted node will lose an edge too, so they will not have degree k in N t . As before, if the deleted node has degree s, the number of these neighbors is

Summing over all the possibilities, the number of such nodes is

Putting the pieces together, for k = 0, 1, . . . , N t − 1

where q t−1 (−1) := 0. We now come back to probabilities using Eq. (A.1), thus obtaining

Some approximations have been made to derive this expression, so the probability is not properly normalized. The normalization constant for p t , t ≥ 1, is

When dealing with degree distributions that assign very low probabilities to the tail, it is safe to assume p s (N (s)− 1) ≈ 0 for any time s ≥ 0. Moreover, d t (N t )/N t is a small number too for t small and/or N large, being d t a distribution. All things considered, the normalization constant deviates from 1 by a negligible quantity, at least when t is not too large.

Computation of d t .-In principle, there are two ways to step on a node during the walk: the walker either reaches it by following an edge (probability 1 − α) or teleports to it (probability α). d t is then a weighted average of two distributions with weights 1 − α and α.

1. In the teleporting case, the walker steps on a random node in N t−1 according to the specific teleportation rule. The probability to teleport to a degree-r node on N t−1 is denoted as T t−1 (r).

2. In the nonteleporting case, the walker reaches a degree-r node in N t−1 coming from a node present in N t−2 . The reached node has degree r+1 in N t−2 , therefore excess degree r there. Since configuration model networks exhibit no degree correlation, as we have already pointed out in the previous computations (see Eq. (A.4)), q t−2 (r) is a good approximation for this event to happen, no matter the degree of the node the walker comes from.

In practice, the degree of the previously deleted node should be considered as well because, in case this is equal to 0, the walker must proceed via teleportation with probability 1. This suggests to introduce the conditional probabilities d t (r|0) and d t (r|0 c ), i.e. the probability to reach a degree-r node at step t knowing that the walker comes from either a degree-0 node or not, events happening with probabilities d t−1 (0) and (1 − d t−1 (0)). In the first case we simply choose the next node via teleportation, while in the second case we weigh the two possibilities described above. For r = 0, 1, . . . , N t :

where the conditional probabilities are computable as Since the first step is always a teleportation,

Some comparisons between the theoretical degree distribution and the one from the simulated process are shown in Figs. SM1-SM4, depicting two particular ER and SF networks (the fractional steps for which p t reduces to a Dirac delta concentrated at 0 are not graphically considered). We stress that theoretical predictions (solid lines) and empirical simulations (dashed lines) match so well that they are almost indistinguishable.

Stopping time distribution.-For a nonlocal cascade, the stopping time t is the time corresponding to the last step. Let E t denote the event "The nonlocal cascade stops at time t" (equivalently, step t is the last one). We denote withē(t) the conditional probability to stop at time t knowing that step t has been reached. This event happens when the walker visits a degree-0 node at step t (probability d t (0)) but it is unable to proceed via teleportation (probability 1 − α):

The probability e(t) := P[E t ] is computable in terms of e(t). Indeed, e(1) =ē(1) and for t ≥ 2: 

In order to have a distribution with support included in the unitary interval [0, 1], we define the fractional stopping time distributionẽ(s), defined as

The theoretical fractional stopping time distributions e(s) for ER and SF networks are compared with the empirical ones in Fig. SM6 . Since the duration of the cascade strictly depends on α, we see thatẽ(s) peaks at higher and higher values as α increases. A first big difference emerging between the homogeneous and the heterogeneous case is the overall impact of the average degree on the fractional stopping time distribution, regardless of α and the teleportation rule. On the one hand,ẽ(s) shifts its mass more and more toward high times as the average degree increases in ER networks; on the other hand,ẽ(s) in SF networks seems not to be influenced by a change in the average degree caused by a variation of γ. This precludes the possibility of increasing the number of functioning (unvisited) nodes at the end of a nonlocal cascade by tuning the parameter γ if the graph has power-law degree distribution.

Another worth mentioning aspect is the effect of teleportation. When most of the nodes of the starting network N 0 have already been visited, many degree-0 nodes will appear. The degree-biased teleportation rule will avoid the walker to teleport to such nodes, reducing the number of times the walk has the chance to stop. On the contrary, the uniform teleportation rule ignores the degrees of the residual nodes, and it is therefore more likely to hop to isolated nodes, leading to smaller stopping times. However, this difference is evident only if α is big enough, otherwise the walk is expected to stop after a few times the walker has found itself in degree-0 nodes.

In this supplementary material we provide some supporting figures to the results discussed in the main paper. Figs. SM1-SM4 show the comparison between the theoretical expression of the time-dependent degree distribution p t (k) and its values when computed directly from the simulations. Several network topologies and teleporting rules are displayed: Erdős-Rényi (ER) networks with uniform teleportation in Fig. SM1 and with biased teleportation in Fig. SM2 ; scale-free (SF) networks with uniform teleportation in Fig. SM3 and with biased teleportation in Fig. SM4 . For each case, several cases are presented as a function of the teleportation probability α. Fig. SM5 complements Fig. 2 of the main text, displaying here the two cases not shown there (ER with uniform teleportation in Fig. SM5 (a) and SF with biased teleportation in Fig. SM5(b) ). Fig. SM6 supports what is discussed in the last section of the Appendix. It shows the stopping time distribution for different network topologies and different teleportating rules.

Finally, Fig. SM7 complements Fig. 3 of the main text, providing the single plots for each network topology and teleportation type, and curves with more values of k and γ. The network and teleportation type is indicated above each panel. Solid lines come from theory, markers from simulations, averaged over 900 realizations. Network size is N 0 = 10 3 , and for SF nets we set k min = 3

Dynamical Processes on Complex Networks

Universality in network dynamics

Multilayer networks

Networks beyond pairwise interactions: structure and dynamics

Dynamical systems on networks

Dynamics on networks: competition of temporal and topological correlations

Universal resilience patterns in complex networks

Critical phenomena in complex networks

Mobile applications to support contact tracing in the EU's fight against COVID-19: Common EU Toolbox for Member States (2020), Download from here

Immunization and targeted destruction of networks using explosive percolation

Effectiveness of dismantling strategies on moderated vs. unmoderated online social platforms

Small vulnerable sets determine large network cascades in power grids

Introduction to Percolation Theory

Network robustness and fragility: Percolation on random graphs

Percolation on featureenriched interconnected systems

Double percolation phase transition in clustered complex networks

Bond percolation on multiplex networks

Explosive percolation in random networks

Error and attack tolerance of complex networks

A simple model of global cascades on random networks

Suppressing cascades of load in interdependent networks

System crash as dynamics of complex networks

Cascading failures in bi-partite graphs: model for systemic risk propagation

Recent advances on failure and recovery in networks of networks

System Disturbances. Review of Selected 1996 Electric System Disturbances in North America

Blackout: What Happened, Why, and What Did We Learn?

Airspace restrictions in Europe

Cascade-based attacks on complex networks

Model for cascading failures in complex networks

Cascading failures in interdependent systems under a flow redistribution model

Network overload due to massive attacks

Abrupt transition due to non-local cascade propagation in multiplex systems

The Self-avoiding Walk

Kinetic growth walks on complex networks

A model of self-avoiding random walks for searching complex networks

Self-avoiding pruning random walk on signed network

Random graphs with arbitrary degree distributions and their applications

On the evolution of random graphs

Generation of uncorrelated random scale-free networks

KONECT: the Koblenz network collection

Empirically controlled mapping of the Caenorhabditis elegans proteinprotein interactome network

Multiscale statistical physics of the pan-viral interactome unravels the systemic nature of SARS-CoV-2 infections

Networks formed from interdependent networks

Mathematical formulation of multilayer networks

Emergence of network features from multiplexity

Arenas, The physics of spreading processes in multilayer networks