key: cord-0438078-0im43bk3 authors: Liu, Yangyang; Sanhedrai, Hillel; Dong, GaoGao; Shekhtman, Louis M.; Wang, Fan; Buldyrev, Sergey V.; Havlin, Shlomo title: Efficient network immunization under limited knowledge date: 2020-04-02 journal: nan DOI: nan sha: 46c97f10f1565b2587a04f19a70300530f639872 doc_id: 438078 cord_uid: 0im43bk3

Targeted immunization or attack of large-scale networks has attracted significant attention from the scientific community. However, in real-world scenarios, knowledge and observations of the network may be limited, precluding a full assessment of the optimal nodes to immunize (or remove) in order to avoid epidemic spreading such as that of the current COVID-19 epidemic. Here, we study a novel immunization strategy where only $n$ nodes are observed at a time and the most central among these $n$ nodes is immunized (or attacked). This process is repeated until a $1-p$ fraction of nodes is immunized (or attacked). We develop an analytical framework for this approach and determine the critical percolation threshold $p_c$ and the size of the giant component $P_{\infty}$ for networks with arbitrary degree distributions $P(k)$. In the limit of $n\to\infty$ we recover prior work on targeted attack, whereas for $n=1$ we recover the known case of random failure. Between these two extremes, we observe that as $n$ increases, $p_c$ increases quickly towards its optimal value under targeted immunization (attack) with complete information. In particular, we find a new scaling relationship between $|p_c(\infty)-p_c(n)|$ and $n$ as $|p_c(\infty)-p_c(n)|\sim n^{-1}\exp(-\alpha n)$. For scale-free (SF) networks, where $P(k)\sim k^{-\gamma}$, $2 <$ [...] $> p_c$. The giant component and $p_c$ characterize the efficiency of the immunization. The smaller the giant component, the better the immunization strategy. The larger $p_c$ is, the more efficient the immunization, since fewer immunization doses are needed to stop the epidemic.
We find an analytical relationship between $n$ and $p_c$ for both Erdős-Rényi (ER) and scale-free (SF) networks. Surprisingly, we also find that $p_c$ quickly reaches a plateau even for relatively small $n$, after which increasing $n$ has a negligible effect on $p_c$. This means that immunization with small $n$ (without knowing the whole structure) can dramatically improve the immunization.

Fig. 1 caption (partial): In this figure, we set $n = 3$ and only the degrees of nodes $v_1$, $v_2$ and $v_3$ are known. Given this limited information, the preventer would choose to immunize $v_3$, being unaware that an unobserved higher-degree node exists. At the next immunization or attack, only nodes which have not yet been immunized or removed will be observed.

[...] of nodes and edges, respectively. $N = |V|$ is the number of nodes in the network. We assume that a preventer or attacker does not know the overall network structure and instead has information on only a few nodes. Specifically, we randomly select $n$ nodes for which the preventer or attacker is assumed to know the node degree. The preventer or attacker then targets the node with the highest degree among these $n$. This procedure is repeated until a $1 - p$ fraction of nodes is immunized or removed from the network.

In Fig. 1, the limited information immunization (attack) is illustrated together with global targeted immunization on a network. Here a total of $n = 3$ nodes are observed. In panel (a), an individual with global information about the network structure chooses the highest-degree node $u$ to immunize (or remove). In panel (b), however, the individual knows the degrees of only 3 nodes at a time, i.e. $v_1$, $v_2$, $v_3$. Consequently, node $v_3$, which has the highest degree among them, $k = 4$ (marked in red), is immunized (or removed).

Suppose the degree distribution of a network is given by $P(k)$, and let $F(k) = \sum_{s=0}^{k} P(s)$ be the cumulative probability that the degree of a randomly chosen node is less than or equal to $k$.
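The selection procedure described above can be sketched as a small simulation. This is a minimal illustration rather than the authors' code: the graph generator `er_graph`, the helper names, the seed and the parameter values ($N = 2000$, mean degree 4, $p = 0.5$) are all assumptions made for the example. Nodes are ranked by their original degree, which is also an assumption about how degrees are counted once neighbors have been removed.

```python
import random
from collections import deque


def er_graph(num_nodes, mean_degree, rng):
    """Adjacency sets of a random graph with exactly N*<k>/2 distinct edges."""
    adj = [set() for _ in range(num_nodes)]
    target_edges = int(num_nodes * mean_degree / 2)
    edges = 0
    while edges < target_edges:
        u, v = rng.randrange(num_nodes), rng.randrange(num_nodes)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


def limited_knowledge_immunize(adj, n, p, rng):
    """Immunize a 1-p fraction of nodes: observe n random remaining nodes
    at a time and immunize the one with the highest (original) degree."""
    remaining = list(range(len(adj)))
    immunized = set()
    for _ in range(round((1 - p) * len(adj))):
        sample = rng.sample(range(len(remaining)), min(n, len(remaining)))
        best = max(sample, key=lambda i: len(adj[remaining[i]]))
        immunized.add(remaining[best])
        # Swap-remove so that immunized nodes are never observed again.
        remaining[best] = remaining[-1]
        remaining.pop()
    return immunized


def giant_component_fraction(adj, immunized):
    """Largest connected cluster of non-immunized nodes, as a fraction of N."""
    seen = set(immunized)
    largest = 0
    for start in range(len(adj)):
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        largest = max(largest, size)
    return largest / len(adj)


rng = random.Random(0)
graph = er_graph(2000, 4.0, rng)
s_random = giant_component_fraction(graph, limited_knowledge_immunize(graph, 1, 0.5, rng))
s_limited = giant_component_fraction(graph, limited_knowledge_immunize(graph, 10, 0.5, rng))
print(f"giant component, n=1: {s_random:.3f}, n=10: {s_limited:.3f}")
```

With half of the nodes immunized, observing $n = 10$ nodes at a time should leave a far smaller giant component than random selection ($n = 1$) does in this toy setting.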
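Furthermore, at an arbitrary time $t$ during the iterative percolation process, assume the distribution of the original degree (including the immunized neighbors) of the remaining nodes is $P(k, t)$, with cumulative distribution $F(k, t)$. Then, the degree distribution of the node which is immunized at time $t$ is given by

$$P_r(k, t) = F(k, t)^n - F(k-1, t)^n. \tag{1}$$

This formula can be recognized as being derived from order statistics, giving the maximum of several independent random variables [36]. For $k = 0$, Eq. (1) becomes $P_r(0, t) = F(0, t)^n$. Hence we define $F(-1, t) = 0$, and then Eq. (1) is valid for $k \geq 0$.

In a limited knowledge immunization or attack, each node's immunization changes the degree distribution of the remaining nodes in the following way

$$N(k, t+1) = N(k, t) - P_r(k, t), \tag{2}$$

where $N(k, t)$ is the number of nodes with degree $k$ at time $t$, and $P_r(k, t)$ is the likelihood that a node immunized at time $t$ has a degree $k$. Then, since $N(k, t) = (N - t)P(k, t)$, plugging Eq. (1) into Eq. (2) gives,

$$(N - t - 1)P(k, t+1) = (N - t)P(k, t) - \left[F(k, t)^n - F(k-1, t)^n\right],$$

which becomes in the continuous limit,

$$\frac{\partial}{\partial t}\left[(N - t)P(k, t)\right] = -\left[F(k, t)^n - F(k-1, t)^n\right],$$

and using $P(k, t) = \Delta F(k, t) \equiv F(k, t) - F(k-1, t)$, we obtain,

$$\Delta\left[\frac{\partial}{\partial t}\big((N - t)F(k, t)\big) + F(k, t)^n\right] = 0.$$

Note that $F(k = -1, t) = 0$, and thus the entire term inside the $\Delta$ is 0 for $k = -1$. Since the difference of this term vanishes, the term is also 0 for $k = 0$, and likewise for any $k \geq 0$. Thus, we get the following simple ordinary differential equation,

$$(N - t)\frac{\partial F(k, t)}{\partial t} = F(k, t) - F(k, t)^n, \tag{3}$$

with the initial condition $F(k, t = 0) = F(k)$. It can be shown (see Sec I in SM) that the solution of this ODE, Eq. (3), is

$$F(k, t) = \left[1 + \left(F(k)^{1-n} - 1\right)\left(1 - \frac{t}{N}\right)^{n-1}\right]^{-1/(n-1)}, \tag{4}$$

or equivalently, with $p = 1 - t/N$,

$$F_p(k) = \left[1 + \left(F(k)^{1-n} - 1\right)p^{\,n-1}\right]^{-1/(n-1)}, \tag{5}$$

where $F_p(k)$ is the cumulative distribution of the degree after immunizing (removing) a $1 - p$ fraction of nodes. For $n = 1$, the solution of Eq. (3) is $F_p(k) = F(k)$, as expected. Also, Eq. (5) converges to $F(k)$ in the limit $n \to 1$.

We can now obtain the degree distribution of the occupied nodes after a fraction $1 - p$ of nodes is immunized, which is given by

$$P_p(k) = F_p(k) - F_p(k-1). \tag{6}$$

The equation for $v$, the probability that a randomly chosen link leads to a node not in the giant component, is

$$1 - v = \sum_{k} \frac{k P(k)}{\langle k \rangle} P(\Theta|k)\left(1 - v^{k-1}\right), \tag{7}$$

where $\langle k \rangle = \sum_k k P(k)$ is the mean degree of the original network and $P(\Theta|k)$ is the probability of a node to be occupied given its degree is $k$. Using Bayes' theorem, we note that $P(k)P(\Theta|k) = P(\Theta)P(k|\Theta) = pP_p(k)$. Hence Eq.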
(7) becomes

$$1 - v = p\sum_{k} \frac{k P_p(k)}{\langle k \rangle}\left(1 - v^{k-1}\right). \tag{8}$$

The giant component $S \equiv P_\infty$ can be found by

$$S = p\sum_{k} P_p(k)\left(1 - v^{k}\right), \tag{9}$$

where $v$ is found from Eq. (8). At criticality, we take the derivative of both sides of Eq. (8) with respect to $v$ and substitute $v = 1$, representing the location where the first solution with $v < 1$ exists, as opposed to only the $v = 1$ solution. Thus, the critical condition is

$$p_c \sum_{k} k(k-1)P_{p_c}(k) = \langle k \rangle. \tag{10}$$

We now study our limited knowledge immunization (attack) strategy, i.e., the general result, Eqs. (8) and (9), on ER networks. First, we analyze the giant component $P_\infty$. For the case $n = 1$, limited knowledge immunization (or attack) reduces to the classical random attack, while $n \to \infty$ (meaning the global network is observed) corresponds to targeted attack [14, 29, 31, 32]. Using Eq. (9), the giant component $P_\infty$ can be solved numerically for any given $p$. In Fig. 2(a), simulations and analytic results are shown for the giant component $P_\infty$ as a function of $1 - p$ under limited information immunization with different $n$. As the knowledge index $n$ increases from 1 to $N$, limited knowledge immunization moves from being like random immunization (attack) to being like targeted immunization (attack). The simulations are in excellent agreement with the theoretical results (lines).

Next, we focus on the critical threshold, $p_c$, of limited knowledge immunization (attack). Overall, we find that a modest knowledge index, $n \sim 10$, already achieves nearly the same effect as targeted immunization (attack) with complete information. This can be seen by observing the critical threshold $p_c$ as a function of $n$ in Fig. 2(b). In Fig. 2(c) we show the variation of $p_c$ with $\langle k \rangle$ for several fixed $n$.

Next, the behavior of $p_c$ in the limit of large $n$ is derived analytically. By examining Eq. (5) we notice that when $n \to \infty$ there are two distinct behaviors depending on whether $k$ is small, $F(k) < p$; or $k$ is large, $F(k) > p$. It can be shown (see Sec II in SM) that the leading term behaves as,

$$F_p(k) \sim \begin{cases} F(k)/p - n^{-1}e^{-\alpha_k n}, & F(k) < p, \\ 1 - n^{-1}e^{-\alpha_k n}, & F(k) > p, \end{cases} \tag{11}$$

where $\alpha_k = |\log[p/F(k)]|$.
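As a numerical companion to the ER analysis, the following Python sketch evaluates the size of the giant component for a Poisson degree distribution. It is an illustration under stated assumptions, not the authors' implementation: it assumes the closed form $F_p(k) = [1 + (F(k)^{1-n} - 1)p^{n-1}]^{-1/(n-1)}$ for the remaining degree distribution, the self-consistent equation $1 - v = p\sum_k k P_p(k)(1 - v^{k-1})/\langle k\rangle$ and $P_\infty = p\sum_k P_p(k)(1 - v^k)$. The function names, the truncation `k_max` and the fixed-point solver are choices made only for this example.

```python
import math


def poisson_cdf(mean_degree, k_max=80):
    """Cumulative Poisson degree distribution F(0), ..., F(k_max)."""
    term, total, cdf = math.exp(-mean_degree), 0.0, []
    for k in range(k_max + 1):
        total += term
        cdf.append(min(total, 1.0))
        term *= mean_degree / (k + 1)
    return cdf


def remaining_cdf(F, p, n):
    """Assumed closed form F_p(k) = [1 + (F^(1-n) - 1) p^(n-1)]^(-1/(n-1))."""
    if n == 1:
        return list(F)  # random immunization leaves the distribution unchanged
    out = []
    for f in F:
        log_ratio = (n - 1) * math.log(p / f)  # log of (p/F)^(n-1)
        if log_ratio > 700.0:  # exp() would overflow; here F_p -> F/p
            out.append(min(f / p, 1.0))
        else:
            value = 1.0 + math.exp(log_ratio) - p ** (n - 1)
            out.append(min(value ** (-1.0 / (n - 1)), 1.0))
    return out


def giant_component(F, p, n, tol=1e-12, max_iter=100000):
    """Solve the self-consistent equation for v by fixed-point iteration
    (starting from v = 0) and return the giant component size."""
    P = [F[0]] + [F[k] - F[k - 1] for k in range(1, len(F))]
    mean_k = sum(k * pk for k, pk in enumerate(P))
    Fp = remaining_cdf(F, p, n)
    Pp = [Fp[0]] + [Fp[k] - Fp[k - 1] for k in range(1, len(Fp))]
    v = 0.0
    for _ in range(max_iter):
        new_v = 1.0 - p / mean_k * sum(
            k * Pp[k] * (1.0 - v ** (k - 1)) for k in range(1, len(Pp))
        )
        converged = abs(new_v - v) < tol
        v = new_v
        if converged:
            break
    return p * sum(Pp[k] * (1.0 - v ** k) for k in range(len(Pp)))


F = poisson_cdf(4.0)
for n in (1, 2, 10):
    print(n, round(giant_component(F, 0.5, n), 4))
```

For $n = 1$ the result should reduce to the classical random-percolation relation $S = p(1 - e^{-\langle k\rangle S})$, which offers a simple consistency check on the assumed formulas.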
In the limit $n \to \infty$, we recover the expected result for targeted immunization (attack), $F_p(k) = \min\{F(k)/p, 1\}$ [31, 32]. Plugging Eq. (11) into Eq. (10), and noting that from a sum of exponentials decaying with $n$ only the lowest decay rate contributes to the leading term, we obtain (see Sec II in SM)

$$|p_c^\infty - p_c| \approx A\, n^{-1} e^{-\alpha n}, \tag{12}$$

where $p_c^\infty = p_c(n \to \infty)$, and the decay rate $\alpha$ is now

$$\alpha = \min_k \left|\log\left[p_c^\infty / F(k)\right]\right|.$$

The pre-factor is $A = (2p_c^\infty k_{\rm slow})/(k_> k_<)$, where $k_<$ is the largest degree such that $F(k) < p_c^\infty$, $k_> = k_< + 1$, and $k_{\rm slow}$ is the degree which gives the lowest rate $\alpha$ (see illustration in SM). It is clear that $k_{\rm slow}$ must be $k_<$ or $k_>$ because $F(k)$ is monotonic. If $F(k_{\rm slow}) = F(k_>) = 1$, then $k_<$ should be taken as $k_{\rm slow}$, and the corresponding $\alpha$ should be taken. It should also be noted that if $k_{\rm slow}$ is not unique, it would simply change the pre-factor $A$ in Eq. (12). Another special case is where $F(k_{\rm slow}) = p_c^\infty$; then $|p_c^\infty - p_c| \sim 1/n$ (see Sec IV in SM).

Fig. 2(d) shows $\Delta p_c = |p_c^\infty - p_c|$ as a function of $n$. As expected from the theory, one can see that $\Delta p_c \sim 1/n$ for small $n$ and decays exponentially for large $n$. When $p_c \to 1$, which occurs for ER networks when $\langle k \rangle \to 1$, the power-law regime becomes much broader, as explained in Sec II of SM.

Next, we study SF networks with $P(k) = Ak^{-\gamma}$, $k = m, \cdots, K$, where $A = (\gamma - 1)m^{\gamma-1}$ is the normalization factor, and $m$ and $K$ are the minimum and maximum degree respectively [30]. Similar to ER networks, the size of the giant component, $P_\infty$, can be obtained from Eq. (9). Fig. 3(a) shows $P_\infty$ as a function of $1 - p$ for different $n$ values. The results demonstrate that SF networks become more immunized/vulnerable compared to ER networks under the immunization/attack as $n$ increases. Compared with ER networks, one can observe that slightly higher values of $n$ (more knowledge) are needed to reach the near-steady-state region of the fully targeted strategy.
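The approach of $p_c(n)$ to its targeted limit can also be checked numerically. The sketch below is hedged in the same way as any illustration: it assumes the closed form $F_p(k) = [1 + (F(k)^{1-n} - 1)p^{n-1}]^{-1/(n-1)}$ (with $F_p(k) = \min\{F(k)/p, 1\}$ for $n \to \infty$) and the critical condition $p_c\sum_k k(k-1)P_{p_c}(k) = \langle k\rangle$, and it locates $p_c$ by bisection, which presumes that the left-hand side grows monotonically with $p$. All function names and the choice of an ER network with $\langle k\rangle = 4$ are assumptions for the example.

```python
import math


def poisson_cdf(mean_degree, k_max=80):
    """Cumulative Poisson degree distribution F(0), ..., F(k_max)."""
    term, total, cdf = math.exp(-mean_degree), 0.0, []
    for k in range(k_max + 1):
        total += term
        cdf.append(min(total, 1.0))
        term *= mean_degree / (k + 1)
    return cdf


def remaining_cdf(F, p, n):
    """Assumed F_p(k); n may be an integer >= 1 or math.inf (targeted)."""
    if n == 1:
        return list(F)
    if math.isinf(n):
        return [min(f / p, 1.0) for f in F]
    out = []
    for f in F:
        log_ratio = (n - 1) * math.log(p / f)  # log of (p/F)^(n-1)
        if log_ratio > 700.0:  # avoid overflow; here F_p -> F/p
            out.append(min(f / p, 1.0))
        else:
            value = 1.0 + math.exp(log_ratio) - p ** (n - 1)
            out.append(min(value ** (-1.0 / (n - 1)), 1.0))
    return out


def criticality(F, p, n):
    """p * sum_k k(k-1) P_p(k) - <k>; this vanishes at p = p_c."""
    P = [F[0]] + [F[k] - F[k - 1] for k in range(1, len(F))]
    mean_k = sum(k * pk for k, pk in enumerate(P))
    Fp = remaining_cdf(F, p, n)
    second = sum(k * (k - 1) * (Fp[k] - Fp[k - 1]) for k in range(2, len(Fp)))
    return p * second - mean_k


def critical_threshold(F, n, iters=60):
    """Bisection for p_c, assuming criticality() is increasing in p."""
    lo, hi = 1e-9, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if criticality(F, mid, n) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


F = poisson_cdf(4.0)
pc_inf = critical_threshold(F, math.inf)
for n in (1, 2, 5, 10, 50):
    pc = critical_threshold(F, n)
    print(f"n={n:3d}  p_c={pc:.4f}  |p_c(inf) - p_c|={pc_inf - pc:.4f}")
```

For $\langle k\rangle = 4$ the random limit should give $p_c(1) = 1/\langle k\rangle = 0.25$ and the targeted limit $p_c^\infty \approx 0.638$; the printed differences show how quickly intermediate $n$ closes the gap.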
For SF networks with $2 < \gamma < 3$, under random immunization/attack ($n = 1$), it has been shown that $p_c = 0$ for an infinite system [30], while for high-degree immunization/attack ($n \to \infty$), $p_c > 0$ [31, 32]. Next we wish to find out for which $n$ the threshold $p_c$ becomes non-zero, and how it depends on the system size $N$. To this end, we analyze Eqs. (5) and (10) for large $k$ (high degrees govern the behavior in SF networks) and small $n$ and $p$ as follows (elaborated in SM). It can be shown that for large degrees,

$$F(k) \approx 1 - (k/m)^{1-\gamma}.$$

Substituting this into Eq. (5) and assuming $(k/m)^{\gamma-1} \gg n$ for large degrees, it can be concluded that

$$P_p(k) \approx p^{\,n-1}P(k). \tag{13}$$

In addition, we notice that $P_p(k)$ has a new natural cutoff, $K_p$, which depends on $p$ and $N$ as follows (see Sec III in SM),

$$K_p \sim m\left(Np^{\,n}\right)^{1/(\gamma-1)}.$$

This helps us to evaluate the second moment of $P_p(k)$,

$$\langle k^2 \rangle_p \approx \frac{\gamma-1}{3-\gamma}\, m^2\, p^{\,n-1}\left(Np^{\,n}\right)^{\beta},$$

where $\beta = (3 - \gamma)/(\gamma - 1)$. Considering this, and plugging Eq. (13) into Eq. (10), and keeping the leading terms in the limit of large $N$, we obtain (see Sec III in SM for more details)

$$p_c \approx C(n)\exp\left(-\frac{3-\gamma}{2}\,\frac{\log N}{n}\right), \tag{14}$$

where

$$C(n) = \left[\frac{(3-\gamma)\langle k \rangle}{(\gamma-1)m^2}\right]^{(\gamma-1)/(2n)}.$$

From Eq. (14), it is easy to see that if $n \ll \log N$ then $p_c \to 0$, while if $n \sim \log N$ then $p_c$ is non-zero. The pre-factor $C(n)$ depends on $n$ but not on $N$.

Fig. 4(a) shows $p_c$ versus $\gamma$. It is known that for $2 < \gamma < 3$ and $n = 1$, if $N \to \infty$ then $p_c \to 0$ [30]. For $n = 5$ as well, we can see that system size matters and $p_c$ decreases as $N$ increases. Fig. 4(b) shows that the scaling with $n/\log N$ of Eq. (14) is valid. Furthermore, it is seen in Fig. 4(b) that when $n$ is small or $N$ is large, such that $n/\log N \ll 1$ (in Fig. 4 it is 0.07), $p_c$ approaches 0. Fig. 4(c) supports the exponential scaling of $p_c$ versus $n^{-1}\log N$ obtained analytically in Eq. (14).

In summary, our results provide a framework for understanding and carrying out efficient immunization with limited knowledge. Especially in global pandemics such as the current COVID-19 pandemic, it is impossible to know the full interactions of all individuals.
Thus an effective way to limit spreading is to obtain information on a few ($n$) individuals and target the most central of these. For example, testers could stand at a supermarket and select a group of people entering the store simultaneously. Information on the connections of these people (e.g., the number of people they live with, and where and how often they meet other people) could be obtained quickly, for instance through cell phone tracking, and then the individual with the most connections in the group could be quarantined or immunized. Our results demonstrate that even when this is done in small groups of people (low $n$), it is possible to obtain a significant improvement in global immunization compared to randomly selecting individuals. In our model, this was seen in the reduced size of the giant component and the large critical threshold $p_c$. Overall, these findings could help to develop better ways of immunizing large networks and designing resilient infrastructure.

[...] Sanhedrai contributed equally to this work.
† To whom correspondence may be addressed.
Email: dfocus.gao@gmail.com or lsheks@gmail

[1] The structure and function of complex networks
[2] Statistical mechanics of complex networks
[3] Complex networks: structure, robustness and function
[4] Efficient immunization strategies for computer networks and populations
[5] The hidden geometry of complex, network-driven contagion phenomena
[6] Core percolation on complex networks
[7] Controllability of complex networks
[8] A simple model of global cascades on random networks
[9] Internet: Diameter of the world-wide web
[10] Network dynamics: Jamming is limited in scale-free systems
[11] Epidemic spreading in scale-free networks
[12] Collective dynamics of "small-world" networks
[13] Stability and topology of scale-free networks under attack and defense strategies
[14] Robustness of interdependent networks under targeted attack
[15] Assessing the vulnerability of the fiber infrastructure to disasters
[16] Globally networked risks and how to respond
[17] Modelling disease outbreaks in realistic urban social networks
[18] Introduction to percolation theory: revised second edition
[19] Phase transitions and critical phenomena
[20] Catastrophic cascade of failures in interdependent networks
[21] Random graphs with arbitrary degree distributions and their applications
[22] Networks formed from interdependent networks
[23] Robustness of network of networks under targeted attack
[24] Recent advances on failure and recovery in networks of networks
[25] Fractals and disordered systems
[26] Resilience of networks with community structure behaves as if under an external field
[27] Percolation points and critical point in the Ising model
[28] Influence maximization in complex networks through optimal percolation
[29] Error and attack tolerance of complex networks
[30] Resilience of the internet to random breakdowns
[31] Network robustness and fragility: Percolation on random graphs
[32] Breakdown of the internet under intentional attack
[33] Spectrum of controlling and observing complex networks
[34] Network observability transitions
[35] Observability of complex systems
[36] Introduction to mathematical statistics