Social Distancing Network Creation. Tobias Friedrich, Hans Gawendowicz, Pascal Lenzner, Anna Melnichenko. 2022-04-21. During a pandemic people have to find a trade-off between meeting others and staying safely at home. While meeting others is pleasant, it also increases the risk of infection. We consider this dilemma by introducing a game-theoretic network creation model in which selfish agents can form bilateral connections. They benefit from network neighbors, but at the same time, they want to maximize their distance to all other agents. This models the inherent conflict that social distancing rules impose on the behavior of selfish agents in a social network. Besides addressing this familiar issue, our model can be seen as the inverse to the well-studied Network Creation Game by Fabrikant et al. [PODC 2003] where agents aim at being as central as possible in the created network. Thus, our work is in line with studies that compare minimization problems with their maximization versions. We look at two variants of network creation governed by social distancing. In the first variant, there are no restrictions on the connections being formed. We characterize optimal and equilibrium networks, and we derive asymptotically tight bounds on the Price of Anarchy and Price of Stability. The second variant is the model's generalization that allows restrictions on the connections that can be formed. As our main result, we prove that Swap-Maximal Routing-Cost Spanning Trees, an efficiently computable weaker variant of Maximum Routing-Cost Spanning Trees, actually resemble equilibria for a significant range of the parameter space. Moreover, we give almost tight bounds on the Price of Anarchy and Price of Stability. These results imply that, compared to the well-studied inverse models, under social distancing the agents' selfish behavior has a significantly stronger impact on the quality of the equilibria, i.e., allowing socially much worse stable states. whole network into account. To better understand the dynamics arising in these decentralized settings and the network structures resulting from them, many influential game-theoretic network formation models have been introduced in the last decades [32, 9, 23, 7, 8]. The main research questions are: Do equilibrium networks, i.e., stable networks where no agent can improve by performing a local change, exist? What properties do these networks have? And how efficient are they compared to centrally computed optimal solutions? All of the above mentioned influential game-theoretic network formation models assume that the creation of an edge is costly but the agents benefit from having small distances to other agents in the network. However, departing from this standard assumption in the field, there are real-world settings that should better be modeled via an inverted utility function: neighbors yield benefit but being close to many agents is costly as it yields an increased risk. One example of this kind is financial networks. There, financial institutions benefit from working together but suffer from risks arising from one of them failing. Another example, which is the main motivation of our work, came up with the COVID-19 pandemic and is described by the now commonly used term social distancing.
It refers to reducing social contacts in order to contain the spread of a contagious virus in the population. While often mandated by the government, social distancing was performed by many people voluntarily. One of the main reasons is quite simple: While reducing social contacts is a restriction of the quality of life, it also reduces the probability of getting infected. Hence, the network of social interactions between people was sparsified by individual strategic decisions. In this work we introduce a novel game-theoretic network formation model in which selfish agents strategically form a social network under the influence of social distancing. Agents benefit from direct connections to other agents, modeling the positive effects of social contacts on their social life. However, at the same time they want to maximize their distances to all other agents in the network in order to reduce their risk of getting infected via an increased reaction time in case a contagious disease starts spreading in the network. Here we assume that a random network node becomes infected and that it is beneficial to be far away from the source of infection in order to gain valuable time for setting up counter-measures. The agents in our model act according to an inverted utility function, compared to the famous models by Jackson and Wolinsky [32] and Fabrikant et al. [23] . Thus, to the best of our knowledge, this is one of the rare cases of a game-theoretic model where both minimizing and maximizing the utility function has a natural interpretation. Another similar well-known example is the contrast between the Network Design Game with fair cost sharing by Anshelevich et al. [7] and the Selfish Routing model by Roughgarden and Tardos [46] . In both models the agents select paths in a given network but in the former sharing an edge is beneficial for the involved agents whereas in the latter edge sharing is detrimental. This difference yields vastly different behavior in terms of the quality of the equilibria. However, this is not obvious, as can also be seen by comparing classical minimization and maximization variants of optimization problems, e.g., Minimum Spanning Tree versus Maximum Spanning Tree or Shortest Path versus Longest Path. Sometimes, as with spanning trees, the inverse problems are almost identical, whereas sometimes, as with the path problems, the inverse problems may have completely opposite behavior. We set out to explore this comparison for the natural inverse counter-part to the well-known Network Creation Game by Fabrikant et al. [23] . Along the way, we will uncover a connection to the Maximum Routing-Cost Spanning Tree problem that is inverse to the well-studied Minimum Routing-Cost Spanning Tree problem [30] 2 . Before we start with the model definition, we introduce some notation regarding networks. A network is a tuple G := (V, E) where V is the set of nodes and E is the set of edges. An edge is represented by a set containing both incident nodes. If we do not give the tuple defining G explicitly, we denote the set of nodes of G as V G and the set edges of G as E G . We only consider unweighted undirected networks. For addition and removal of a single edge e, we write G + e := (V, E ∪ {e}) and G − e := (V, E \ {e}). A network G with V G ⊆ V and E G ⊆ E is called a subnetwork of G and denoted as G ≤ G. If G is connected and V G = V , G is a spanning subnetwork of G. Let n ∈ N denote the number of nodes. 
The set of all connected networks containing exactly n nodes will be referred to as G_n. For two nodes v, x ∈ V, we define d_G(v, x) as the distance between v and x in network G, that is, the number of edges on a shortest path from v to x in G. For convenience, we extend the definition of d_G to sets of nodes: Let v ∈ V be a node and M, N ⊆ V be sets of nodes. Then d_G(v, M) := Σ_{x∈M} d_G(v, x) and d_G(M, N) := Σ_{x∈M, y∈N} d_G(x, y). We call the special case d_G(v, V) the distances from/for v and d_G(V, V) the total/summed distances or routing costs of G. The degree of v in the network G is the number of edges that are incident to v and is denoted as deg_G(v). We call a tree which is a spanning subnetwork of G a spanning tree of G. A spanning tree of G with routing costs at least as high as the routing costs of any other spanning tree of G will be called a Maximum Routing-Cost Spanning Tree (MRCST). A spanning tree of G whose routing costs cannot be increased by swapping one edge is a Swap-Maximal Routing-Cost Spanning Tree (SMRCST). Now, we can define the game-theoretic model. Let H = (V, E) be a connected network. We call H the host network and its nodes agents. A state of the game G ≤ H is a spanning subnetwork of H. We only consider connected networks as host networks and states. Each agent v ∈ V selfishly tries to maximize its utility in state G given by u_v(G) := α deg_G(v) + d_G(v, V), where α ∈ R_{>0} is a global parameter. We will call α deg_G(v) the edge utility and d_G(v, V) the distance utility of v. Note that α is a parameter of the game, i.e., equal for all agents, that allows adjusting the agents' trade-off between edge utility and distance utility. Here α is the benefit of a single edge, i.e., the benefit for each direct neighbor in the network. For measuring the efficiency of the network G, we use the social welfare defined as SW(G) := Σ_{v∈V_G} u_v(G) = 2α|E_G| + d_G(V, V). This quantifies the well-being of the society of all agents. We call a network maximizing the social welfare for the host network H a social optimum and denote it as OPT_H. Agents are allowed to form connections bilaterally. More specifically, each agent can unilaterally remove any incident edge if this does not disconnect the network, and two agents together can form an edge between them if it is contained in the host network. If removing an edge strictly increases the utility of one of its incident nodes or adding an edge strictly increases the utility of both incident nodes, we call this an improving move. A network without improving moves is referred to as pairwise stable [32] or stable for short. We will call this model the Social Distancing Network Creation Game (SDNCG). In Section 2 we will restrict the host networks to complete networks K_n. We will call this restricted variant the complete Social Distancing Network Creation Game (K-SDNCG).
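To make these definitions concrete, here is a minimal sketch of how the utility and the social welfare can be computed. It is our own illustration, not part of the paper's formal treatment; it assumes the Python library networkx, and the helper names utility and social_welfare are ours.

```python
# Minimal sketch (assumes networkx): computes u_v(G) = alpha*deg_G(v) + d_G(v, V)
# and SW(G) = 2*alpha*|E_G| + d_G(V, V) exactly as defined above.
import networkx as nx

def utility(G, v, alpha):
    """Edge utility plus distance utility of agent v in state G."""
    dist = nx.single_source_shortest_path_length(G, v)
    return alpha * G.degree(v) + sum(dist.values())

def social_welfare(G, alpha):
    """Sum of all agents' utilities."""
    return sum(utility(G, v, alpha) for v in G.nodes)

alpha, n = 2.0, 6
print(social_welfare(nx.path_graph(n), alpha))      # 2*alpha*(n-1) + (n-1)n(n+1)/3 = 90.0
print(social_welfare(nx.complete_graph(n), alpha))  # alpha*n*(n-1) + n*(n-1) = 90.0
```

Note that nx.single_source_shortest_path_length also reports the node itself at distance 0, which does not affect the sums. In the example, α = n/3 and the two printed values coincide, illustrating the trade-off between edge utility and distance utility.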
Variants of game-theoretic network formation models have been studied extensively for decades and we refer to Jackson [31] for an overview. Closest to our work is the literature on the Network Creation Game (NCG) by Fabrikant et al. [23]. This influential model can be seen as the unilateral inverted variant of the K-SDNCG. There, an agent can buy any incident edge without the consent of the other endpoint for the price of α > 0. Each agent aims at minimizing its cost, which is defined as the sum of α times the number of bought edges and the sum of hop-distances to all agents. The authors of [23] show that Nash equilibria always exist, i.e., complete networks are stable for α ≤ 1 and stars are stable for α ≥ 1. However, besides these generic examples finding Nash equilibria is challenging since the NCG and many of its variants do not belong to the class of potential games [38, 36]. Besides finding equilibria, also computing a best possible strategy is challenging, since this problem was shown to be NP-hard in [23]. However, such strategies can be efficiently approximated with greedy strategy changes [39]. Regarding the quality of equilibrium states, the authors of [23] show that the PoA is in O(√α), that the PoA for tree Nash equilibria is constant, and that the PoS is at most 4/3. Later, a series of papers [2, 20, 42, 41, 4, 12, 5] improved the PoA bounds, with the best general upper bound of 2^{O(√(log n))} by Demaine et al. [20]. The latter also proved that the PoA is constant for α ∈ O(n^{1−ε}) for any fixed ε > 1/log n. For large α, it was shown by Bilò and Lenzner [12] that for α > 4n − 13 all Nash equilibria must be trees, and this bound was recently improved by Dippel and Vetta [21] to α > 3n − 3. This implies a constant PoA for α > 3n − 3. Finally, Álvarez and Messegué [5] established a constant PoA for α > n(1 + ε), for any ε > 0. The NCG was generalized by Demaine et al. [19] by introducing a host network that specifies which edges can be bought. They show that the PoA deteriorates by providing a lower bound of Ω(min{α/n, n²/α}) and upper bounds of O(√α) for α < n and O(min{√n, n²/α}) for α ≥ n. Interestingly, no results on the existence of equilibria are known. Recently, a further generalization that allows weighted host networks was proposed by Bilò et al. [11]. This variant has a tight PoA of (α + 2)/2 for metric weights. Later a tight bound of Θ(α) was shown for arbitrary weights [24]. Also a bilateral variant of the NCG was studied by Corbo and Parkes [18]. There, similar to our model, edges can only be established by bilateral consent of the involved nodes and both nodes have to pay α. (In terms of strategies, pairwise stability requires that the network is stable against unilateral deviations and against joint strategy changes by coalitions of agents of size two: the strategy space of any agent i ∈ V is the power set of V \ {i}, and an edge {u, v} is formed if and only if v is in agent u's strategy and u is in agent v's strategy.) The authors of [18] prove existence of pairwise stable networks, i.e., complete networks are stable for α ≤ 1 and stars are stable for α ≥ 1, they give a tight PoA bound of Θ(min{√α, n/√α}), and they show that the PoS is 1. To the best of our knowledge, the bilateral variant with a given host network has not yet been studied. Recently, also a bilateral variant modeling the formation of social networks was introduced [10]. The idea of a game-theoretic model of network formation in a context of spreading risk is not new. Goyal et al. [27] study a setting where a node is attacked and this attack spreads to all vulnerable neighbors. Agents strategically create edges and immunize themselves to maximize their connected component post attack. For this model, also the efficient computation of best strategies [25] and a variant with probabilistic spread [17] were studied. Moreover, there has been much research in the context of financial contagion, where agents benefit from collaborating, but also suffer from the risk of cascading failure arising with the collaboration [3, 29, 15, 1]. In particular, Blume et al.
[13] developed an elegant model where nodes form a network and then some randomly chosen nodes fail and this failure then spreads with some probability via the edges. The utility is a linear combination of the node degree and the risk of failing in the second phase. The virtue of this model is that utilities are based on a random process that realistically models the spread of a contagious infection. However, the major downside of this model is that the computation of the random process is #P-complete. Thus, since the agents cannot efficiently evaluate their own utilities, this model does not yield a realistic prediction of real-world behavior. While analyzing our model for general host networks, we consider Maximum Routing-Cost Spanning Trees. Routing costs have been studied much in mathematics, mostly under the name of the Wiener index [47]. Trees were of special interest and there has been much research on the Wiener index of trees with different properties. But although spanning trees minimizing the Wiener index were studied extensively, the concept of spanning trees maximizing the Wiener index received little attention [22, 48]. However, it was shown that finding or even approximating a tree maximizing the Wiener index is NP-hard [16, 26]. We introduce the Social Distancing Network Creation Game (SDNCG), a game-theoretic model in which selfish agents try to maximize their utility by strategically connecting to other agents and thereby creating a network. Each agent values direct connections to other agents but at the same time wants to maximize the distances to all other agents in order to lower their exposure and increase their reaction time to risks appearing in the network. In contrast to the similar model by Blume et al. [13], our model, while not modeling a perfectly realistic spread of the infection, has the advantage of an efficiently computable utility function. By using the distance to the other agents as part of the utility, it also accounts for reaction time: If an infection breaks out far away, an agent has more time to prepare or react to it. Another virtue of our model is that it is the inverse to the well-known Network Creation Game [23] and its bilateral variant [18]. Hence, we can study and compare the game-theoretic properties of the inverted models. To the best of our knowledge, this is one of the rare cases where both the minimization and the maximization of a utility function have a natural interpretation. Our results and the comparison with the inverted models are summarized in Table 1. We analyze two variants of the SDNCG. For the K-SDNCG, where we assume a complete host network, we characterize optimal and several stable networks and show that the PoS is 1. We provide an improving response cycle, which implies that equilibrium existence for the (K-)SDNCG cannot be derived from potential function arguments. Finally, we derive several bounds for the PoA which are tight for α ≥ n/2, asymptotically tight for α ≤ √n, and asymptotically tight up to a log-factor for α ≤ n/6 − 3. Table 1: An overview of our results and a comparison with the results for the inverted models. BNCG abbreviates the bilateral NCG by Corbo and Parkes [18], whereas H-NCG denotes the NCG on a host network by Demaine et al. [19].
Here, N_2 := (n − 1)²/4, N_3 := (n − 2)n(n + 2)/24, H denotes the host network, and P_n, K_n, S_n are the path, clique, and star networks on n nodes, respectively. For the SDNCG on arbitrary host networks we utilize Maximum Routing-Cost Spanning Trees for characterizing optimal networks for α ≤ 1. As our main result, we show that their locally optimal variant, the Swap-Maximal Routing-Cost Spanning Trees, and hence also Maximum Routing-Cost Spanning Trees, are pairwise stable for α ≤ n/3. We prove that computing the MRCST is NP-hard, while a SMRCST can be constructed efficiently. Thus, for the significant range of 1 ≤ α ≤ n/3, we not only have guaranteed equilibrium existence on any host graph, but we can compute stable states efficiently. This is in stark contrast to what is known for the inverse model studied by Demaine et al. [19]. Additionally, we approximate optimal networks and we derive several (tight) bounds on the PoA and the PoS. Compared with the NCG [23] and the bilateral NCG [18], we find that the results for the K-SDNCG regarding optimal and stable networks are analogous but reversed, with the spanning path taking over the role of the spanning star. Moreover, our PoA results for both the K-SDNCG and the SDNCG show that our maximization variant has a significantly worse PoA that is linear or almost linear in n, compared to the PoA upper bounds of o(n^ε) and O(min{√α, n/√α}) for the NCG and the bilateral NCG, respectively. As the main takeaway from our paper, this implies that under social distancing the agents' selfish behavior has significantly more impact on the quality of the equilibria. This calls for strong coordination mechanisms governing the network formation to avoid detrimental stable states. We analyze the properties of the K-SDNCG, i.e., the SDNCG on complete host networks. First, we characterize optimal networks and give some examples of stable networks, dependent on the relation between n and α. After that, we show several bounds on the PoA and PoS. Intuitively, for small α the distance utility dominates the social welfare. Hence, the path should be the optimum since it maximizes the total distances. For large α, the edge utility dominates, which leads to the clique being optimal since it maximizes the number of edges. Now we show that this intuition is indeed true. Moreover, the optimal construction is unique. Theorem 1. For α < n/3, the unique social optimum is the path. For α > n/3, the unique optimum is the clique. For α = n/3, the clique and the path are the only social optima. Proof. Šoltés [49] showed that for a fixed number of nodes and edges, the network maximizing the summed distances is unique and contains a clique and a path with at least two edges between one endpoint of the path and the clique. We call this a PathClique. (Note that the clique can be empty, resulting in just a path.) For a visualization, we refer to Figure 1. Note that the social optimum has to be such a network, since for every other network, there is a PathClique with the same number of edges but larger summed distances and therefore a larger social welfare. Let G be a PathClique with n nodes having a clique containing k nodes. Then the corresponding path contains n − k nodes. First, we show that, unless G is a path or a clique, we get a socially better network by adding or removing edges. Let G be neither a clique nor a path and let v be the endpoint of the path that is connected to the clique. Observe that removing an edge between v and the clique results in a PathClique.
(This is still true if there are only two edges connecting v to the clique: Removing one of these edges makes the remaining neighbor of v in the clique the new endpoint of the path and reduces the size of the clique by 1.) Therefore, this is the socially best way of removing an edge from G. Similarly, the best way of adding an edge to G is adding it between v and the clique, unless v is already fully connected to the clique, in which case it is best to add an edge between the neighbor of v on the path and the clique. We now make a case distinction. If v is fully connected to the clique, adding an edge decreases distances by 2(n − k − 1) and deleting an edge increases distances by 2(n − k). Since every edge contributes 2α to the social welfare, G can only be optimal if 2α ≤ 2(n − k − 1) and 2α ≥ 2(n − k), which is a contradiction. If v is not fully connected to the clique, adding an edge decreases distances by 2(n − k) and deleting an edge increases distances by 2(n − k). Thus, 2α = 2(n − k) is necessary for G to be socially optimal. But when 2α = 2(n − k), adding and deleting edges between v and the clique does not change the social welfare. Let G′ be the network obtained by fully connecting v to the clique. Then SW(G) = SW(G′). Additionally, G′ cannot be an optimum since it fulfills the conditions of the first case. Therefore, G cannot be socially optimal either. So, the social optimum network must be the path P_n or the clique K_n. We have SW(P_n) = 2α(n − 1) + (1/3)(n − 1)n(n + 1) and SW(K_n) = αn(n − 1) + n(n − 1). For α = n/3, we see that SW(P_n) = SW(K_n), for α < n/3 we have SW(P_n) > SW(K_n), and α > n/3 yields SW(K_n) > SW(P_n). Next, we have a look at the existence of pairwise stable networks. Similar to the social optimum, for small α, agents prefer large distances over many incident edges and therefore should remove as many edges as possible, leading to only trees being stable. Interestingly, the restrictions of pairwise stability lead to all trees being stable for small α, even if the distances are very small (like in a star). This is shown by the next theorem. Theorem 2. (1) For α ≤ 1, every tree is pairwise stable. For α < 1, any pairwise stable network is a tree. (2) For α ≥ 1, the clique is pairwise stable. (3) For α ≤ (n − 1)/2, the path is pairwise stable. (4) For α > n/2, the clique is the only pairwise stable network. Proof of (1). Let G be a tree. Since removing an edge from G would lead to G being disconnected, we only have to consider adding an edge. This would shorten the distances for both endpoints by at least 1. Since α ≤ 1, this is not an improvement for the agents. Now let α < 1 and let G be a network. If G contains a cycle, we can remove an edge without disconnecting the network. By doing this, the distances for both endpoints increase by at least 1. Since α < 1, this is an improvement for both agents. Proof of (2). Let G be a clique. Since adding an edge is not possible, we only look at removing an edge. This would result in a distance increase of 1 for both endpoints. Since α ≥ 1, this is not an improvement for either of the two incident agents. Proof of (3). Let G be the path consisting of n nodes v_1, ..., v_n, in that order. Since G is a tree, no edge can be removed without disconnecting the network. Therefore, the only possible move is adding an edge. Let 1 ≤ i < j ≤ n be such that v_i and v_j are not adjacent (i.e., j − i ≥ 2). If j ≤ n/2, adding the edge e := {v_i, v_j} shortens distances from v_i to at least n/2 nodes. The same holds for i ≥ n/2 regarding node v_j. It is easy to see that the distance decrease is minimal when j − i = 2. Intuitively, edge e has to be as central as possible. According to this, choosing i = ⌈n/2⌉ and j = i + 2 yields a distance decrease of at least (n − 1)/2 for at least one of the two incident nodes. Thus, if α ≤ (n − 1)/2, adding edge e is not an improving move and G is pairwise stable.
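The role of the threshold (n − 1)/2 can also be checked numerically. The following sketch is our own illustration (it assumes networkx): it adds the most central chord {v_i, v_{i+2}} to a path and confirms that at least one endpoint loses at least (n − 1)/2 in distance utility, so the addition is not an improving move for α ≤ (n − 1)/2.

```python
# Sketch (assumes networkx): adding the most central chord {v_i, v_{i+2}} to a
# path lowers the distance utility of at least one endpoint by >= (n-1)/2.
import networkx as nx

def dist_utility(G, v):
    return sum(nx.single_source_shortest_path_length(G, v).values())

n = 11
P = nx.path_graph(n)                    # nodes 0, ..., n-1 in path order
i, j = (n - 3) // 2, (n - 3) // 2 + 2   # central chord with j - i = 2
before_i, before_j = dist_utility(P, i), dist_utility(P, j)
P.add_edge(i, j)
dec_i = before_i - dist_utility(P, i)   # nodes beyond v_j get closer by 1
dec_j = before_j - dist_utility(P, j)   # nodes before v_i get closer by 1
print(dec_i, dec_j, max(dec_i, dec_j) >= (n - 1) / 2)   # 5 5 True
```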
Proof of (4). Let V be a set of n agents and G = (V, E) be a stable network. Let v ∈ V be a node having minimum total distances, i.e., d_G(v, V) ≤ d_G(x, V) for all x ∈ V. Assume towards a contradiction that the neighborhood of v is not a clique, i.e., that there are two neighbors x, y of v with {x, y} ∉ E. We observe that for each node z ∈ V, the distances d_G(v, z) and d_G(x, z) can only differ by 1, since v and x are neighbors. By choice of v, there are at least as many nodes that are closer to v than nodes that are closer to x. Therefore, there are at most n/2 nodes that are closer to x. Adding an edge between x and y can, for node y, only shorten distances to nodes which are closer to x than to v. Thus, this edge shortens the distances from y by at most n/2. The same holds for node x. Therefore, for α > n/2, this edge would improve the utility of agents x and y and, thus, G would not be stable. This contradicts our assumption. Thus, the neighborhood of v must be a clique. By induction, since G is connected, it must be a clique. Theorem 2 implies that socially optimal networks are also stable. In fact, they are stable for a wide range of α-values. The clique is stable for α ≥ 1, meeting the bound below which only trees are stable. Similarly, the path is stable for α ≤ (n − 1)/2, almost meeting the lower bound for only the clique being stable. Additionally, we observe that we only need two networks (path and clique) to provide pairwise stable networks for all possible values of α. For further constructions, we need the following definition. Let G be a network. We call G′ a clique network of G if it can be obtained by replacing each node of G by a clique of size at least 2 and, for each edge of G, connecting the two corresponding cliques completely bipartitely. By using only constant-size cliques, some properties of G (density, length of shortest paths) are preserved while the network is more stable against edge removal. Theorem 3. Let G′ be a clique network of some network G. Then, for α ≥ 1, no agent can improve its utility by removing an edge from G′. Proof. Removing any edge from G′ only affects the distances between its endpoints. These distances increase by 1. Since α ≥ 1, this is not an improvement for either endpoint. Finally, we show that stable states may not be found by simply letting agents iteratively play improving moves, i.e., via a sequential process of improving strategy changes. Figure 2 provides an example of a cyclic sequence of improving moves. This also implies that both the K-SDNCG and the SDNCG do not belong to the class of potential games [44], i.e., the existence of equilibria cannot be proven via potential function arguments. Proof. This is shown by the existence of improving cycles. See Figure 2 for an example. This figure shows a cyclic sequence of improving moves performed by n = 5 agents for α = 2.5. In each step, the nodes responsible for the next change are highlighted in orange. Note that the last step is isomorphic to the first step. In this section, we give a series of bounds for the Price of Anarchy and the Price of Stability; among them, that for α > n/2 the Price of Anarchy is 1. Proof of (1). Let α ∈ R_{>0}. Every connected network has at least n − 1 ∈ Ω(n) and at most n(n − 1)/2 ∈ O(n²) edges. Furthermore, the summed distances are at least n(n − 1) ∈ Ω(n²), since every node has at least distance 1 to every other node, and at most n³, since every node has at most distance n to every other node. With this, we get the trivial bound of PoA ≤ (αn(n − 1) + n³)/(2α(n − 1) + n(n − 1)) ∈ O(n). Proof of (2). For α ≤ 1, the social optimum is the path as shown in Theorem 1. It has social welfare of SW(P_n) = 2α(n − 1) + (1/3)(n − 1)n(n + 1) ∈ Ω(n³). Because of Theorem 2 and the star being a tree, it is pairwise stable for α ≤ 1. It has social welfare of SW(S_n) = 2α(n − 1) + 2(n − 1)² ∈ O(n²). Together with (1), this yields PoA ∈ Θ(n). Proof of (3). We construct a star-like network with cliques as leaves in the following way. Let furthermore M be a clique of size n − cd.
We now define our network G as We essentially connect the outer cliques K 1 , . . . , K d to the center clique M via d 2-cliques and each connection is fully bipartite (see Figure 3 ). Since n = |V G |, G is a network of the desired size. The figure shows a star-like clique network, where the center is formed by a clique M and each ray consists of two nodes vi, v i and a clique Ki. We now show that G is pairwise stable. We see that G is a clique network. Because of Theorem 3 and α > 1, G is stable against edge removal. On the other hand, adding an edge shortens distances to at least |K i | = c − 2 ≥ α nodes which means a distance decrease of at least α for the two incident nodes. This also does not increase their utility. Therefore, G is pairwise stable. For the center clique M , we see that For α < √ n, the socially optimal network is the path. With the previous calculations, we can now bound the Price of Anarchy as ∈ Ω(n). From (1), we have PoA ∈ O(n) and therefore PoA ∈ Θ(n). denotes the Hamming Distance between v and x. Let G be a clique network for G H with |V G | = n such that the sizes of the cliques replacing the nodes of G H differ by at most 1. Observe, that each clique is of size 2 or 3 if 2 · 2 d ≤ n < 3 · 2 d and of size 3 or 4 if 3 · 2 d ≤ n < 4 · 2 d . By Theorem 3 and since α ≥ 1, we know that G is stable against edge removal. We now show that adding an edge shortens the total distances for the incident nodes by at least n be the nodes corresponding to the cliques that contain v and x, respectively. Therefore, e : . Adding e to G H decreases the distances from v to another node y ∈ V G H if and only if bits of the label since the remaining bits are equal for v and x . Let be the number of the first d H (v , x ) bits of y equal to 1. The number of nodes where exactly of the first Observe that the distances from v to all nodes in other cliques in G are exactly the same as the distances from v to all other nodes in G H . The same holds for G + e and G H + e , with the exception of the distances from v to the (at most 3) nodes in the same clique as x. We distinguish two cases: If 2 · 2 d ≤ n < 3 · 2 d , each clique consists of 2 or 3 nodes. Therefore, we have a distance decrease of at least If 3 · 2 d ≤ n < 4 · 2 d , each clique consists of 3 or 4 nodes. This means, we have a distance decrease of at least Thus, edge additions are not beneficial for the incident agents if α ≤ n 6 − 3 and we conclude that the constructed network is stable for 1 ≤ α ≤ n 6 − 3. We also see that the number of edges m is in Θ(n log n) and the distance d(V G , V G ) is in Θ(n 2 log n). Since the social optimum for 1 ≤ α ≤ n 6 − 3 is the path P n , we get for the Price of Anarchy: ∈ Ω n log n . Proof of (5). We construct a path of cliques in the following way. Let 2 ≤ d ≤ n−6 2 be some even number and c = n−6 d . Furthermore, let K 1 , . . . , K d be d cliques consisting of c or c + 1 3 , v 3 be 6 more nodes. We now define the network G as Figure 4 shows a sketch of G. We observe that G is stable against edge removal because of Theorem 3, since α > n 6 −3 ≥ 1 and G being a clique network for the path. We now show that adding an edge is also not an improving move. We quickly see that, for a node v, adding an edge into the 2-neighborhood always shortens distances the least. We therefore only have to consider these edges. We observe that adding an edge between v 1 and v 3 (or because of symmetry, v 1 or v 3 ) decreases distances from v 1 to v 3 and all nodes in K d 2 +1 , . . . 
, K d and decreases distances from v 3 to v 1 and all nodes in K 1 , . . . , K d 2 by exactly 1. This means, we get Every other edge we could add decreases distances to all the cliques of one side of the path, resulting in larger distance decreases. This means that adding an edge is not an improving move for α ≤ n 2 − 2. Therefore, G is pairwise stable for the desired values of α. We now evaluate the number of edges. We have |E Ki | ∈ Θ(c 2 ). The number of edges between two neighboring cliques is also in Θ(c 2 ). This means that the total number of edges is |E G | ∈ Θ(dc 2 ). We also see that the diameter of G is d and therefore d G (V, V ) ∈ O(dn 2 ). Since α ∈ Θ(n), we get for the Price of Anarchy ∈ Ω n 3 n 5 2 + n 5 2 = Ω √ n . Proof of (6). This follows directly from the clique being socially optimal (see Theorem 1) and the only pairwise stable network (see Theorem 2). We have established that the Price of Anarchy is relatively high for α ≤ n 2 . It even meets the trivial upper bound of O(n) for a large range of α. In contrast to the high PoA values, we observe that the Price of Stability is independent of α and best possible. Proof. This follows directly from the path being stable and socially optimal for α ≤ n 3 and the clique being stable and socially optimal for α ≥ n 3 (see Theorem 1 and Theorem 2). From an efficiency point-of-view, the huge gap between the PoA and the PoS suggests that having an outside force assigning an initial strategy to all players is beneficial. That way, stability and optimal social welfare can be guaranteed. Without such coordination, the players could end up in socially bad equilibria or in a cyclic sequence of improving moves. We now analyze the SDNCG on arbitrary connected but not necessarily complete host networks. First, we analyze socially optimal networks and then we investigate the existence of pairwise stable networks. We prove our main result that establishes equilibrium existence on any connected host network for a wide parameter range of α. Finally, we derive bounds on the Price of Anarchy and the Price of Stability. Additionally, we show that computing the social optimum and the Maximum Routing-Cost Spanning Tree is NP-hard while computing a Swap-Maximal Routing-Cost Spanning Tree can be done in polynomial time. While for the K-SDNCG, we only have two possible social optima (dependent on α), this gets more complicated for general host networks. Of course, if they exist on general host networks, then the optima for the K-SDNCG are still the most efficient networks. Intuitively, if the host network does not contain a Hamilton path, then the social optimum should be a tree if α is small enough. Since all trees have the same number of edges, the social welfare of a tree is only influenced by the total distances. Remember that the spanning tree maximizing the total distances is by definition the Maximum Routing-Cost Spanning Tree (MRCST). We now show, that this intuition is indeed correct. (3) For α > 1 24 (n − 2)n(n + 2), H itself is the unique social optimum. Proof of (1). This follows directly from Theorem 1. Proof of (2). Let T opt be the MRCST of H and G some state of H, that is, a spanning subnetwork of H. Furthermore, let T be a spanning tree of G. Then SW(G) ≤ SW(T ), since we can construct G by adding edges to T and for every edge added, the social welfare goes up by α ≤ 1 and down by at least 1, because of distances decreasing. Since T opt maximizes the total distances, we have SW(T ) ≤ SW(T opt ) and therefore SW(G) ≤ SW(T opt ). 
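On very small host networks, the spanning trees from statement (2) can be computed directly. The following sketch is our own illustration (it assumes networkx, the helper names are ours, and the brute force enumerates all spanning trees, so it is only feasible for tiny instances): it finds an MRCST by exhaustive search and an SMRCST by repeatedly applying improving edge swaps starting from an arbitrary spanning tree.

```python
# Sketch (assumes networkx): brute-force MRCST on a tiny host network and a
# simple swap-based local search producing an SMRCST. Illustration only.
import itertools
import networkx as nx

def routing_cost(G):
    """d_G(V, V): sum of distances over all ordered pairs."""
    return sum(d for _, dist in nx.all_pairs_shortest_path_length(G) for d in dist.values())

def mrcst_bruteforce(H):
    """Spanning tree of H with maximum routing cost (exponential enumeration)."""
    n = H.number_of_nodes()
    best, best_cost = None, -1
    for edges in itertools.combinations(H.edges, n - 1):
        T = nx.Graph(list(edges))
        if T.number_of_nodes() == n and nx.is_tree(T):
            c = routing_cost(T)
            if c > best_cost:
                best, best_cost = T, c
    return best, best_cost

def smrcst_local_search(H):
    """Apply improving edge swaps to an arbitrary spanning tree until none exist."""
    T = nx.minimum_spanning_tree(H)   # any spanning tree works as a starting point
    improved = True
    while improved:
        improved = False
        for e in list(T.edges):
            for f in set(H.edges) - set(T.edges):
                T2 = nx.Graph(T)
                T2.remove_edge(*e)
                T2.add_edge(*f)
                if nx.is_connected(T2) and routing_cost(T2) > routing_cost(T):
                    T, improved = T2, True
                    break
            if improved:
                break
    return T, routing_cost(T)

H = nx.petersen_graph()               # small 3-regular host network on 10 nodes
_, opt_cost = mrcst_bruteforce(H)
_, local_cost = smrcst_local_search(H)
print(local_cost, opt_cost)           # swap-maximal cost is at most the maximum
```

By definition, the local-search result is only swap-maximal, so its routing cost can be strictly below that of the MRCST; comparing the two printed values on small instances illustrates this gap.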
We show that any edge added to any network shortens the total distances by at most (1/12)(n − 1)n(n + 1) = 2 · (1/24)(n − 1)n(n + 1). It is easy to see that adding an edge between the two endpoints of a Hamilton path maximizes the distance decrease. Since every added edge contributes 2α to the social welfare, this means that if α is larger than (1/24)(n − 1)n(n + 1), it is always socially better to add more edges to the network, resulting in H itself being optimal. We already know the social welfare of a Hamilton path P_n from Section 2. After adding an edge between the two endpoints of P_n, we get a cycle C_n. In this cycle, we see that for each node there are exactly two nodes for every possible distance 1 ≤ d < n/2 and an additional node at distance n/2 if n is even. This yields d_{C_n}(V, V) = (1/4)(n − 1)n(n + 1) for n odd and d_{C_n}(V, V) = (1/4)n³ for n even, while SW(P_n) = 2α(n − 1) + (1/3)(n − 1)n(n + 1). We then have SW(C_n) > SW(P_n) if and only if α > (1/24)(n − 1)n(n + 1) for n odd and if and only if α > (1/24)(n − 2)n(n + 2) for n even, where (1/24)(n − 2)n(n + 2) < (1/24)(n − 1)n(n + 1). Next, we discuss stable networks. In contrast to the K-SDNCG, it is not obvious that pairwise stable networks are guaranteed to exist for any connected host network. However, we can directly transfer the result that spanning trees are stable for small α. For large α, similar to the clique being the unique stable network for α > n/2 for complete host networks, as shown in Theorem 2, we show that the whole host network is pairwise stable. However, in contrast to the K-SDNCG, this is true only for much larger values of α. Theorem 9. Let H be a connected host network on n nodes. (1) For α ≤ 1, every spanning tree of H is pairwise stable, and for α < 1, any pairwise stable network is a tree. (2) For α > (1/4)(n − 1)², the host network H is the only pairwise stable network. Proof of (1). The proof is exactly the same as for (1) of Theorem 2. Proof of (2). Consider a network G on a host network H = (V, E_H). The largest distance decrease a node v ∈ V can suffer when forming an edge e ∈ E_H is when G is a path and v one of its endpoints connecting to the other endpoint x ∈ V. This move decreases the distances for v by at most (1/4)(n − 1)². Thus, since α > (1/4)(n − 1)², forming edges is always beneficial for the incident nodes. Similarly, edge removal always decreases the utility of the incident nodes. Therefore, the host network H is the only pairwise stable network. Contrasting statement (2) of Theorem 9, using H := C_n for odd n and α < (1/4)(n − 1)² shows that the host network is not necessarily pairwise stable. This example also shows that the optimum is not necessarily stable: For α ≥ (1/4)(n − 1)² and H := C_n as the host network, C_n is the only pairwise stable network but it is not the optimum for α < (1/24)(n − 2)n(n + 2). This is another significant difference to the K-SDNCG. Now that we characterized stable networks for extreme α-values, the question remains whether stable states also exist for in-between values. For the K-SDNCG, the path is stable for α up to (n − 1)/2. This is, of course, still true for non-complete host networks if they contain a Hamilton path. Since a Hamilton path (if it exists) is the MRCST, it is natural to suspect that the MRCST properties at least partially ensure stability for some α ≥ 1. However, even if true, the MRCST is still NP-hard to compute. Hence, in quest of an efficiently computable stable network, we introduce a less strict variant of MRCSTs which is only locally optimal: Swap-Maximal Routing-Cost Spanning Trees. Remember, a SMRCST is a spanning tree whose summed distances cannot be increased by removing one edge and adding another edge. As our main result, we now show that SMRCSTs (and therefore MRCSTs, too) are indeed stable beyond α ≤ 1, namely for all α ≤ n/3. Note that for the inverse model of the NCG on an arbitrary host network [19], so far no equilibrium existence statement is known. Proof. Let G with V_G = V_H and E_G ⊆ E_H be a SMRCST of H.
Since G is a tree, we only have to consider edge additions. We show that adding any edge decreases the summed distances for at least one of the edges endpoints by at least n 3 . This is sufficient to show the claim. Let e 1 ∈ E H \ E G be an edge not part of the SMRCST. Adding e 1 would form a cycle of length d ∈ N consisting of nodes v 1 , . . . , v d ∈ V with v 1 and v d being the nodes incident to e 1 . Let E C be the set of all edges on this cycle. Removing all edges in E C from G would create d trees rooted in v 1 , . . . , v d respectively. Let furthermore x 1 , . . . , x d be the number of nodes in each of the d trees. See Figure 5 for an illustration. Since G is a tree, there is exactly one path between each pair of nodes (which is also the shortest). For each edge e ∈ E G , we define d G (e) as the number of paths between two nodes in G which include e. We then can express the total distances as Note, that each path between two nodes contributes twice to the total distances (one for each node), which leads to the factor of 2. Let x := (x 1 , . . . , x d ). We now define for each edge e ∈ E C on the cycle This is the contribution of all the edges on the cycle to the total distances if we add e 1 to it and instead remove e from it. Note that c e1 is the value for the original network since G + e 1 − e 1 = G. We see that c e does not depent on the structure of the subtrees rooted in the v i but only on the number of nodes in each subtree. Since the number of paths going over an edge that is not on the cycle does not change when we add e 1 and remove e, we have We know that G is a SMRCST of H. This means that d G (V, V ) ≥ d G (V, V ) for any other spanning tree G which can be obtained from G by a swap of one edge. We therefore also have Now, we use the previous observations to formulate and solve a minimization problem which yields the desired bound. We start with some definitions. We call For each edge e ∈ E C , we call c e (x) (defined above) the cost of e. And lastly, we define the distance decrease ∆d as The goal then is: Find a node distribution x that fulfills 1 and minimizes ∆d(x). Observe that this indeed yields a lower bound for the distance decrease when adding e to G. If we show that this is at least n 3 , we proved the statement. Let x = (x 1 , . . . , x d ) ∈ N d be a node distribution minimizing ∆d(x). We first show the claim for d = 3 and d = 4. For d = 3, we have x 2 ≤ x 1 and x 2 ≤ x 3 from 1 and ∆d(x) = max{x 1 , x 3 }. Since we see that max{x 1 , x 4 } ≥ n 4 . We conclude that ∆d(x) = 2 max{x 1 , x 4 } ≥ n 2 > n 3 . For d > 4, we make a case distinction between d being odd and d being even and simplify the problem by doing several relaxation steps. For d ≤ 4, it is easy to show that ∆d(x) ≥ n 3 . For further steps, we allow x ∈ R d ≥1 . Note, that this only allows for smaller minima and therefore still yields a lower bound for the original problem. The high level idea of the following steps is that we can redistribute weights of the node distribution x without changing ∆(x) or violating 1 and thereby reducing the number of variables contained in x by setting most x i to 1. We now make a case distinction. Figure 5 (middle)) We will only consider the two constraints where This still yields a lower bound for the original problem since the constraints from 3 are a subset of the constraints from 1. We observe that the claim is trivially true for n 3 ≤ d+3 2 since ∆d(x) ≥ d+3 2 (tight for n = 5, d = 5, x = (1, 1, 1, 1, 1) ). 
This means, we can assume In the following, we perform a series of relaxations. For that, we define the node distributions x (1) , x (2) , x (3) such that for all 1 ≤ i ≤ d else, (Without loss of generality, we assume x . It is easy to see that ∆d(x) = ∆d x (1) = ∆d x (2) = ∆d x (3) . We show that c e1 x (i) ≥ c e2 x (i) and c e1 x (i) ≥ c e3 x (i) for 1 ≤ i ≤ 3. This means, x (3) is also a solution of the minimization problem. First, we show this for x (1) . Let 1 < p < m − 1. Consider x * , a modification of x where the weight of x p is distributed among x 1 and x m−1 as follows: To show that x * also fulfills (3), we show that are exactly the same with the only difference being the sum indices (first sum goes to m and second sum starts at m + 1). Because of symmetry, for m + 1 < p < d, we can similarly distribute weights from x p to x m+1 and x d . Using this for all 1 < p < m − 1 and m + 1 < p < d iteratively, we get exactly x (1) , which therefore still fulfills (3). Next, we show that x (2) fulfills (3), too. We have We can further assume that (2m − 3)x m+1 . This is because if (without loss of generality) (2m − 3)x 1 without changing ∆d(x (1) ). This increases c e1 (x (1) ) more than c e2 (x (1) ) and c e3 (x (1) ). Let (without loss of generality) x (1) else. and therefore ∆x (2) (3), we see that c e1 (x (2) ) ≥ c e3 (x (2) ) holds, too. With x (2) being symmetric, c e1 (x (2) ) ≥ c e2 (x (2) ) holds as well. This shows that x (2) also fulfills (3) . We see that and obtain The last step follows from x (2) m ≥ x 1 . We now prove the lower bound. We see that ∆d(x (3) ) is only dependent on x 1 which means we have to minimize x 1 . Since x (3) sums to n, we have n = 2x Observe that x (3) fulfills (3) if and only if ∆x (3) ≥ 0. Solving the quadratic equation, we see that this is only the case for With (4), we see that the term under the root is larger than This yields and therefore and where We observe that the claim is trivially true for n 3 ≤ d+6 2 , since ∆d(x) ≥ d+6 2 (tight for n = 6, d = 6, x = (1, 1, 1, 1, 1, 1) ). This means we can assume n > 3(d + 6) 2 , or rather n > 3(m + 3). Similar to the odd case, we perform a series of relaxations. For that, we define the node distributions x (1) , x (2) , x (3) , x (4) such that for all 1 ≤ i ≤ d else, (Without loss of generality, we assume x Again, it is easy to see that ∆d(x) = ∆d x (1) = · · · = ∆d x (4) . We show c e1 (1) ) follow the same way as in the odd case (with slight adjustments to the sum indices). This means that x (1) fulfills (5). Next, we show that x (2) fulfills (5), too. We have Similar to the odd case, we can further assume that (m − 1)x m+2 . If this is not the case and (without loss of generality) (m−1)x m+2 , we could move weight from x (1) m+2 to x (1) d without changing ∆d(x (1) ). This would increase c e1 (x (1) ) more than c e2 (x (1) ) and c e3 (x (1) ). Let (without loss of generality) x else. We now show that and We have and therefore Furthermore, we see that The inequality in the last step comes from m ≥ 3, y ≥ 0, x Now, we show that x (3) fulfills (5), as well. Without loss of generality, we can assume x and and therefore We also see that (3) does, too. This also means that the two inequalities from (5) are equivalent. Finally, we show that x (4) fulfills (5). Let y := We have and ∆ 3 x (4) = 4x and therefore The last step follows from x 1 , m ≥ 3 and y ≥ 0. 
Since x^(3) and x^(4) are symmetric, we also have ∆_2(x^(4)) − ∆_2(x^(3)) ≥ 0, and with x^(3) fulfilling (5), the node distribution x^(4) fulfills (5), too. Because of x^(4) being a node distribution with only two variables (x^(4)_1 = x^(4)_d and x^(4)_m = x^(4)_{m+1}), we can simplify ∆x^(4) := ∆_2(x^(4)) = ∆_3(x^(4)) and ∆d(x^(4)) accordingly. We know that x^(4) fulfills (5) if and only if ∆x^(4) ≥ 0. Solving the corresponding quadratic inequality and observing via (6) that the term under the root is large enough, we obtain the desired bound ∆d(x) ≥ n/3, which concludes the proof. Next, we show that we can find a SMRCST in polynomial time via Algorithm 1 and even guarantee some bounds on the social welfare of the resulting network. Our algorithm employs a greedy heuristic for finding a long path as a starting point and then performs improving edge swaps.
Algorithm 1: Computes a SMRCST for a given connected host network.
Input: connected host network H
Output: SMRCST T
1 P ← greedyLongPath(H);
2 T ← extend P to form a spanning tree of H;
3 while there are e ∈ E_T and e′ ∈ E_H \ E_T such that T − e + e′ is a spanning tree with larger total distances do
4     T ← T − e + e′;
5 return T;
Proof. It is easy to see that, by construction, T is always a spanning tree of H. The condition in the while-loop ensures that all possible swaps are tried. This means that the while-loop ends if and only if T is a SMRCST. Therefore, if the while-loop stops, the result is correct. In every iteration in which the while-loop does not stop, the total distances of T increase by at least 1. Since the tree maximizing the total distances is the path, we get its total distances (1/3)(n − 1)n(n + 1) ∈ O(n³) as an upper bound for the number of iterations. The runtime of Algorithm 1 is clearly dominated by the while-loop. Since T has n − 1 edges which can be removed and E_H \ E_T has O(m) possible edges to add, the number of possible swaps is in O(nm). For each swap, the total distances can be computed in O(n) [43]. Therefore, computing the condition can be done in O(n²m). Altering the current solution in the body of the while-loop only takes O(n) when using adjacency lists. Since there are at most O(n³) iterations of the while-loop, the overall runtime is in O(n⁵m). For the K-SDNCG, the social optima were also stable. For general host networks, this is not necessarily the case. However, we can show that for α ≤ n/3 there are stable states which approximate the social welfare better than with the trivial factor of O(n). Proof of (2). Let V := V_H and O := OPT_H. We have 2α|E_T| ≤ 2α|E_O| ≤ 2αm ∈ O(n²) and d_O(V, V) ∈ Ω(n²). We also know that Let V := V_H and O := OPT_H. Algorithm 1 uses the greedyLongPath algorithm to construct an initial tree that contains a path of length at least l ≥ m/n [35]. Every node in that tree has a distance of at least l/3 to at least l/3 of the nodes on the long path. This means that the total distances are at least n(l/3)² ∈ Ω(m²/n). Since Algorithm 1 only increases the total distances, this is a lower bound for every solution found by the algorithm, and therefore also for the MRCST. With this, we get the claimed bound. Since finding the MRCST is NP-hard [16], these are only existence results. However, the next theorem yields a bound for dense networks and the computed SMRCST from Algorithm 1. This means that, for α ≤ n/3 and a dense host network, we can compute a state which is pairwise stable and also has a favorable social welfare. We derive several bounds on the PoA and the PoS for the SDNCG. For the K-SDNCG, the PoA is already quite high for small α. The next theorem shows that this gets even worse for general host networks since the PoA is linear up to α ≤ n and super-constant for α ∈ o(n²). Theorem 14. (1) For all α, the Price of Anarchy is in O(n). (2) For α < 1, the Price of Anarchy is in Θ(n). (3) For 1 ≤ α ≤ n, the Price of Anarchy is in Θ(n). (4) For n < α ≤ n², the Price of Anarchy is in Ω(n²/α).
(5) For (1/4)(n − 1)² < α ≤ (1/24)(n − 2)n(n + 2), the Price of Anarchy is in Θ(1). (6) For α > (1/24)(n − 2)n(n + 2), the Price of Anarchy is 1. Proof of (1) and (2). This follows in the same way as (1) and (2) of Theorem 5. Proof of (3) and (4). Let W = (V_W, E_W) be a wheel network on n′ := n/2 nodes, i.e., V_W := {v_1, ..., v_{n′}}, where v_1, ..., v_{n′−1} form a cycle and the central node v_{n′} is adjacent to all other nodes. We then define the host network H as the clique network obtained by replacing every node of W by a clique of size 2. (See Figure 6 for an illustration.) For odd n, we instead replace the central node by a clique of size 3. We see that H contains n nodes, Θ(n) edges, and most importantly a Hamilton path. We also know that H is stable because of Theorem 3 and since no edge can be added. This yields the following lower bound for the Price of Anarchy: PoA ≥ SW(P)/SW(H) = (α(n − 1) + Θ(n³))/(αΘ(n) + Θ(n²)) ∈ Ω(min{n, n²/α}), which proves the claim. Proof of (6). This follows directly from the host network being socially optimal and the only stable network (see Theorem 7 and Theorem 9). (2) For α ≤ 1, the Price of Stability is 1. (3) For 1 < α ≤ n/3, the Price of Stability is in O(√n). (4) For (1/4)(n − 1)² < α ≤ (1/24)(n − 2)n(n + 2), the Price of Stability is in Θ(1). (5) For α > (1/24)(n − 2)n(n + 2), the Price of Stability is 1. Proof of (1), (4), and (5). This follows from Theorem 14 and the Price of Anarchy being an upper bound for the Price of Stability. Proof of (2). This follows directly from the MRCST being socially optimal and stable (see Theorem 7 and Theorem 9). We introduced and analyzed a natural game-theoretic model for network formation governed by social distancing. Besides modeling this timely issue, our model resembles the inverse of the well-known (bilateral) Network Creation Game [23, 18]. Thus, via our analysis we could explore the impact of inverting the utility function in a non-trivial strategic game. We find that this inverts some of the properties, like the rough structure of optimum states, while it also yields non-obvious insights. First of all, for the variant with non-complete host networks we could show a strong equilibrium existence result, whereas no such result is known for the inverse model. Moreover, we established that the PoA is significantly higher in the (K-)SDNCG compared to the (bilateral) NCG. This demonstrates that the impact of the agents' selfishness is higher under social distancing, which calls for external coordination. The most obvious open question for future work is to settle the equilibrium existence. Do pairwise stable states exist for all connected host networks H and all α? Another research direction would be to consider the unilateral variant of the SDNCG. While this no longer realistically models the formation of social networks, it might still yield interesting insights and it allows for studying stronger solution concepts like the Nash equilibrium or strong Nash equilibria, similar to [33, 6]. Also, altering the utility function, e.g., to using the maximum distance instead of the summed distances, or the probability of infection, similar to [13], seems promising. Finally, also considering weighted host networks, as in [11], where the edge weight models the benefit of the social interaction, would be an interesting generalization.
References
[1] Systemic risk and stability in financial networks.
[2] On Nash equilibria for a network creation game.
[3] Financial contagion.
[4] Network creation games: Structure vs anarchy. CoRR, abs/1706.09132.
[5] On the price of anarchy for high-price links.
[6] Strong price of anarchy.
[7] The price of stability for network design with fair cost allocation.
[8] Near-optimal network design with selfish agents.
[9] A noncooperative model of network formation. Econometrica.
[10] Selfish creation of social networks.
[11] Geometric network creation games.
[12] On the tree conjecture for the network creation game. Theory of Computing Systems.
[13] Network formation in the presence of contagious risk.
[14] On plane geometric spanners: A survey and open problems.
[15] Fire sales in a model of complexity.
[16] On the complexity of finding multi-constrained spanning trees.
[17] Network formation under random attack and probabilistic spread.
[18] The price of selfish behavior in bilateral network formation.
[19] The price of anarchy in cooperative network creation games.
[20] The price of anarchy in network creation games.
[21] An improved bound for the tree conjecture in network creation games. CoRR, abs/2106.05175.
[22] Wiener index of trees: Theory and applications.
[23] On a network creation game.
[24] Efficiency and stability in Euclidean network design.
[25] Efficient best response computation for strategic network formation under attack.
[26] On the approximability of some maximum spanning tree problems.
[27] Strategic network formation with attack and immunization.
[28] On the history of the minimum spanning tree problem.
[29] Systemic risk in banking ecosystems.
[30] Optimum communication spanning trees.
[31] Social and economic networks.
[32] A strategic model of social and economic networks.
[33] On strong equilibria and improvement dynamics in network creation games.
[34] The complexity of the network design problem.
[35] On approximating the longest path in a graph.
[36] On dynamics in selfish network creation.
[37] Worst-case equilibria.
[38] On dynamics in basic network creation games.
[39] Greedy selfish network creation.
[40] Network design and transportation planning: Models and algorithms. Transportation Science.
[41] Tree Nash equilibria in the network creation game.
[42] The price of anarchy in network creation games is (mostly) constant.
[43] How to compute the Wiener index of a graph.
[44] Potential games.
[45] Geometric spanner networks.
[46] How bad is selfish routing?
[47] Structural determination of paraffin boiling points.
[48] A survey on graphs extremal with respect to distance-based topological indices. MATCH (Mülheim an der Ruhr).
[49] Transmission in graphs: A bound and vertex removing.