key: cord-0567194-jeph0of4
authors: Mashkaria, Satvik; 'Odor, Gergely; Thiran, Patrick
title: On the robustness of the metric dimension of grid graphs to adding a single edge
date: 2020-10-21
journal: nan
DOI: nan
sha: fd4ffcf05584c9719c0e88361dd1e90afb51c957
doc_id: 567194
cord_uid: jeph0of4

The metric dimension (MD) of a graph is a combinatorial notion capturing the minimum number of landmark nodes needed to distinguish every pair of nodes in the graph based on graph distance. We study how much the MD can increase if we add a single edge to the graph. The extra edge can either be selected adversarially, in which case we are interested in the largest possible value that the MD can take, or uniformly at random, in which case we are interested in the distribution of the MD. The adversarial setting has already been studied by [Eroh et. al., 2015] for general graphs, who found an example where the MD doubles on adding a single edge. By constructing a different example, we show that this increase can be as large as exponential. However, we believe that such a large increase can occur only in specially constructed graphs, and that in most interesting graph families, the MD at most doubles on adding a single edge. We prove this for $d$-dimensional grid graphs, by showing that $2d$ appropriately chosen corners and the endpoints of the extra edge can distinguish every pair of nodes, no matter where the edge is added. For the special case of $d=2$, we show that it suffices to choose the four corners as landmarks. Finally, when the extra edge is sampled uniformly at random, we conjecture that the MD of 2-dimensional grids converges in probability to $3+mathrm{Ber}(8/27)$, and we give an almost complete proof.

The metric dimension (MD) of a finite, simple graph is a combinatorial notion first defined in 1975 by [41] and independently by [22] . It can be interpreted as the minimum number of landmark nodes that can distinguish every pair of nodes based on the graph distances from these landmark nodes. The MD of d-dimensional grid graphs with large side lengths is d, hence for these graphs the MD is consistent with our common-sense notions of dimension. On the theoretical side, the MD has deep connections to the automorphism group of the graph G [4, 8, 18] , and hence the graph isomorphism problem [3] . In applications, the MD is used to compute the minimum number of landmark nodes required in robot navigation [27, 40] , computational chemistry [11] , and network discovery [5] . A recent application that is gaining more and more interest is the problem of finding patient zero of an epidemic. Finding patient zero can be especially useful in the early stages of an epidemic, as it was in the case of COVID-19 in the beginning of 2020 in multiple countries including China [45] , Italy [10] and the Netherlands [2] . There are multiple mathematical models of the patient zero problem. The first model was introduced by [39] , who were interested in finding the source of a arXiv:2010.11023v2 [math.CO] 15 Nov 2021 rumour in a network. In this paper, we focus on the model of [35] , who introduced the problem of detecting the first node of an epidemic given the underlying graph and the time of infection of small subset of sensor nodes. In the case of a deterministically spreading epidemic, the minimum number of sensors required to detect patient zero has been connected to the MD by [43] . Indeed, in the deterministic case, if the time of infection of patient zero is also known, the times of infection of the sensor nodes can be converted to graph distances between the sensors and patient zero, and the number of sensors required to always detect patient zero equals the MD. In reality, epidemics are not deterministic and the time of infection of patient zero is not known, but the MD can still give information on the number of sensors required to detect patient zero [42] .

Since the MD is NP-hard to compute [27] and is approximable only to a factor of log(N ) [5, 23] , theoretical studies play an essential role in understanding the MD of large graphs. The MD of a wide range of combinatorial graph families have already been computed, we refer to [36] for a list of references. For applications on naturally forming networks like the patient zero detection problem, random graph models are the most appropriate tool for theoretical study. There are only a few results on the MD of random graphs, including Erdős-Rényi graphs [7] , a large class of random trees [32, 28] , and more recently random geometric graphs [29] . In the case of G(n, p) Erdős-Rényi random graphs, it has been shown that the MD goes through a non-monotone, zig-zag behavior as we vary the probability of connections p, and we let the number of nodes n tend to infinity [7] . Not only is the behaviour non-monotone, it is also not smooth in the parameters. For example for p = n − 1 2 we have MD ≈ log(n) but for p = n − 1 2 + we have MD ≈ √ n. For p = Θ(1) we have MD ≈ log(n) again. This surprising result raises the main question of the current paper: how robust is the notion of the MD to the addition or deletion of edges? This question has been already studied by [14] , who found that the MD was robust to edge deletions but not to edge additions (see more on the related work in combinatorics in Section 1.1). In this paper, we focus on more precise results on how large the increase of the MD can be if we add an edge to a general graph or a grid graph. We are interested both in the adversarial setting, where we look for an upper bound on the MD of the new graph no matter where the edge is added, and in the random setting, where we try determine distribution of the MD of the new graph on the addition of a uniformly randomly chosen edge.

Understanding the robustness of the MD to a singe edge addition or deletion has wide ranging practical implications. For the graphs whose MD is non-robust, the MD might not be a very informative notion for application purposes. This is especially true in the application settings where we only have a noisy estimate of the underlying network. For instance, in most papers on patient zero detection, the contact network is assumed to be completely known; an assumption which does not hold in reality. Indeed, the contact network is usually estimated [21] , which is a very challenging task [13] . With the exception of [44] , we are not aware of any theoretical work in the source detection community that addresses the question of robustness in the estimation or the number of required sensors, when our knowledge of the contact network is noisy. We note that robustness to node failures has been more extensively studied, see [24] and the several follow-up articles.

In different applications, where we know the underlying network exactly, non-robustness of the MD can hint at opportunities for improvement or vulnerabilities to malicious attacks depending on our goal in the specific application. For instance, in the source obfuscation problem, our goal is to spread some information in a network so that a few spy nodes are not able to detect the information source [17, 16] . These source obfuscation models are used to anonymize transactions on the Bitcoin network [6] . Similarly to the source detection problem, in source obfuscation the MD could serve as a proxy for how many spies are needed for the attacker to detect the source, and therefore the robustness of the MD translates to the robustness of privacy guarantees.

Our proofs rely on careful combinatorial analysis, and a detailed description of how the shortest paths change in a graph after adding an edge. In particular, when adding edge to graph, we study the set of node pairs between which the shortest paths are changed and unchanged. These sets depend on the extra edge, but otherwise they are highly structured. We are not aware whether this structure (described in Section 2) has been previously studied in the literature, but we believe it could bring an insight into different problems where the addition of a single edge is studied (i.e. wormhole attacks [25] and the dynamic all pairs shortest paths problem in data structures [12, 1] ).

The question of how much the MD of a graph can change on the addition of a single edge has been first studied for trees, where [11] found that on the addition of an arbitrary edge, the MD cannot increase by more than one, and cannot decrease by more than two. The result has been proved later in [15] . The work that is most similar to ours is [14] , where the change of the MD on a singe edge or vertex addition or deletion is studied in general graphs. The authors find that, similarly to trees, the decrease of the MD on edge additions cannot be more than two, however, the increase is not bounded by any constant in general graphs. The latter statement is supported by an example graph, where the addition of a single edge doubles the MD.

More distant but still relevant questions were studied by [33] and [44] . In [33] , the authors define the notion of the threshold dimension of a graph G as the minimum MD we can achieve by adding an arbitrary number of edges to G. Obviously, adding too many edges will bring G close to the complete graph, which has a very large MD, but the authors show that for some graphs G it is possible to add edges in a smart way to significantly reduce the MD. We note that in a different paper, Geneson and Yi have constructed connected graphs H and G such that H ⊂ G and the ratio of the metric dimensions of H and G is arbitrarily large [20] . The authors of [33] also connect the threshold dimension with the dimension of the Euclidean space in which the graph can be embedded.

In [44] , we are given k connected graphs and it is assumed that k − 1 edges are missing between them, which would connect all k components into a single one. The extended metric dimension is the number of landmarks we need to distinguish any pair of nodes, no matter where the k − 1 edges are. Note that as opposed to our setup, in the setup of [44] the landmarks are placed non-adaptively to the extra edges, in fact, the nodes must be distinguished without knowing the location of the extra edges.

Before summarizing the results we recall the rigorous definition of the metric dimension. Definition 1 (MD). Let G = (V, E) be a simple connected graph, and let us denote by d G (A, B) ∈ N the length of the shortest path (that is, the number of edges) between nodes A and B. A subset R ⊆ V is a resolving set in G if for every pair of nodes A = B ∈ V there is a distinguishing node X ∈ R for which d G (A, X) = d G (B, X). The minimum cardinality of a resolving set is the metric dimension of G, denoted by β(G).

The main contribution of our paper is a refined analysis on the increase of the MD on adding a single edge. In Section 3.1, we show an example graph where adding a particular edge increases the MD from Θ(log(N )) to Θ(N ), which is a much larger increase than in the example of [14] , where the MD only doubles. For a result in the opposite direction, in Section 3.2 we provide an upper bound on the MD of the graph with the extra edge in terms of the MD of two subgraphs of the original graph. We believe that this result can be used in several graph families to show that the exponential increase in Section 3.1 only happens for very special (in a sense very heterogeneous) graphs, and that in most cases the MD at most doubles. We prove this doubling upper bound for d-dimensional grid graphs in Section 4.1, and finally, we perform an even more refined analysis for the case of d = 2 in Section 4.2.

For the case d = 2, we conjecture that the limiting distribution of the MD after a uniformly random edge is added is 3 + Ber(8/27), where Ber is the Bernoulli distribution. The only part missing in proving this conjecture is a lower bound on the MD when the extra edge is in a specific configuration. Such lower bound proofs are especially tedious, since one must show that no set of landmark nodes of a certain size can distinguish every pair of nodes, which often leads to a long case-by-case analysis. Instead, we proved as much as we could reasonably write down in a paper, and state the rest of our results as a conjecture at the end of the paper (Conjecture 1). A similar approach was used in [30] when determining the MD of torus graphs.

2 Changes in the all-pairs shortest paths after adding an edge In this section we will develop tools to understand how the shortest paths change in a graph after adding an extra edge.

Let G = (V, E G ) be a connected simple graph, with vertex set V (we use the word vertex, node and point interchangeably) and edge set E G . We add an edge e between two non-adjacent vertices E and F to obtain a graph G = (V, E G ∪ {e}). Let d H (A, B) denote the length of the shortest path between vertices A and B in graph H. For simplicity, we will use the notation d G (A, B) = AB. Remark 1. If we want to reach vertex B from vertex A, there are three options: Either we do not use e at all, or we use e from E to F or we use e from F to E. Hence,

Clearly, we cannot increase the distance between two vertices by adding an edge, or in other words either d G (A, B) ≤ AB. Next, we describe the pairs of vertices whose distance decreased after adding the edge.

Definition 2 (special region). For any vertex A, R A = {Z ∈ V | d G (Z, A) < ZA}. We will refer R A as the special region of A.

The special region contains the vertices which will "use" the extra edge e to reach A. Formally, we can write this as Z ∈ R A is equivalent with d G (A, Z) = min(AE + 1 + F Z, AF + 1 + EZ) < ZA.

Definition 3 (normal region, normal vertex). N A = V \ R A will be referred as the normal region of A. We call the intersection of all normal regions as simply the normal region and we denote it by N . A vertex in the normal region is called a normal vertex.

The normal region can be succinctly expressed as N = {Z ∈ V | R Z = ∅}. For a normal vertex Z ∈ N we have d G (A, Z) = AZ for every vertex A, that is distances from or to these vertices Z are unchanged after adding edge e, which makes normal vertices the simplest type of vertices from the point of view of our analysis. The following claim helps us to characterize the normal region for any graph. Claim 1. The set of vertices V can be partitioned to the following three sets,

The intuition for Claim 1 is that if we are trying to reach A from some other node, we may want to use e in the EF direction if F is closer to A, we may want to use e in the F E direction if E is closer to A, and there is no gain in using e if E and F are almost equidistant to A. The three regions are illustrated in Figure 1 .

Proof. First assume that |AE − AF | ≤ 1. For an arbitrary vertex B in the graph, using triangular inequality,

Similarly,

Hence, by Remark 1, we have that d G (A, B) = AB. As this is true for any vertex B ∈ V , we must have A ∈ N .

Next assume AE − AF > 1. Then, by Remark 1,

which implies that A ∈ R E . The AE − AF < −1 case follows analogously. The usefulness of partitioning the vertices into R E , N and R F goes beyond just characterizing the normal region. Note that R E collects the vertices that use the edge in the F E direction (because F is closer to them), and R F collects the vertices that use the edge in the EF direction. There are no nodes that use the extra edge in both directions. Hence, if the two nodes are in the same special region R E or R F , they are using the extra edge in the same direction, and they cannot use the extra edge to reduce the distance between themselves. We formalize this intuition in the next claim. Proof. For both special and normal regions (anti-)reflexivity follows from the definition and symmetry follows from the symmetry of distances in both G and G . The anti-transitivity of special regions follows from Claim 2. Indeed, if A ∈ R B and B ∈ R C , then the pairs (A, B) and (B, C) are in different special regions R E or R F , which implies that A and C must be both in R E or R F and we cannot have A ∈ R C .

We are now ready to justify the illustration in Figure 1 .

Proof. For a vertex A ∈ R F , the statement F ∈ R A follows by the symmetric nature of special regions (Remark 2). The R A ⊆ R E is a simple consequence of anti-transitivity. Indeed, R A ∩ N is empty by definition, and R A ∩ R F is empty because we cannot have Z ∈ R F , Z ∈ R A and A ∈ R F all hold at the same time.

Next, we use the anti-transitivity property to make equation (1) more explicit. Claim 3. For any A, B ∈ V , we have

Proof. We consider only the case A ∈ R B and A ∈ R F ; the second case is symmetric, and the third holds by definition. By the definition of special regions, A ∈ R F is equivalent with d G (A, F ) = min(AF + 1 + EF, AE + 1 + F F ) < AF, which further implies AE + 1 < AF.

By the anti-transitivity of special regions, A ∈ R B and A ∈ R F together imply B ∈ R E , which is equivalent with

which further implies BF + 1 < BE.

which reduces to d G (A, B) = AE + 1 + F B since AE < AF and BF < BE.

We already used the intuition that vertices in special regions "gain" from the addition of the extra edge. We formalize this intuition in the next definition. Definition 4 (Gain, Gain max ). Let the decrease in the distance between two vertices due to edge e be denoted as

Let the maximum gain associated to a node A be denoted as We can also observe that, by Claim 1, A ∈ N if and only if Gain max (A) = 0. A similar statement hold for vertex F instead of E.

Proof. Suppose for contradiction that there is a node B ∈ V for which Gain(A, B) > Gain(A, E). Since A ∈ R E , this node B must be in R F , otherwise by Claim 2 we have Gain(A, B) = 0. Then, the following inequalities must hold:

The last inequality above contradicts the triangle inequality, and the proof is completed.

3 General graphs 3.1 An example with an exponential increase in the metric dimension

In this section, we give a construction for a graph G on 3n + log 2 (n) − 1 nodes with β(G ) ≤ log 2 (n) + 1, in which the increase in the metric dimension is at least n − log 2 (n) − 3 on adding a single (specific) edge. The idea is that in G , the vertices of R F can be efficiently distinguished only by some vertices in R E (but not by vertices in R F ). Then, after adding edge e, the vertices in R E can reach R F on new shortest paths, and they will not distinguish vertices in R F anymore. Hence R F will have to be distinguished by vertices in R F , which will require significantly more nodes. The construction is shown in Figure 2 for n = 8. 

1 . We connect all of the vertices of level 0 and 3 to F and E, respectively. We connect the vertices of level 1 (respectively, level 2) to the vertices of level 2 (resp., level 3) if and only if the vertices of both levels 1-2 (resp., 2-3) have the same index. Finally, we connect a vertex labelled i in level 1 to a vertex labelled j in level 0 if and only if the j th bit in the binary representation of i is one. For example, v

1 is connected only to v (0) log 2 (n) because binary representation of 1 is 0 . . . 01. This construction leads therefore to the following definition.

where bin(i) j denotes the j th bit of the binary representation of the number i.

Proof. We need to show that any pair of vertices in V \ S are distinguished. There are two possibilities for any pair of distinct vertices: either they are in different levels or in the same level. If they are on different levels, vertex F will distinguish them, because for any v

i , F ) = l + 1. If they are on the same level, the binary representations of their index i will differ at at least one position. Let the j th bit of both labels be different. Then, vertex v (0) j will distinguish them, because its distance to the vertex whose label has the j th bit equal to 1 is two hops shorter than its distance to the vertex whose label has the j th bit equal to 0. Therefore all pairs of points are distinguished, which completes the proof. Now we add an edge e between vertices E and F . The resulting graph G is shown in Figure 2b .

Proof. Notice that the set of nodes that can distinguish v

This is because all other nodes can reach both v k through E on their shortest path and E cannot distinguish any pair of nodes on level 3. Hence, distinguishing nodes on level 3 is equivalent to resolving a star graph, and the metric dimension of G is at least n − 2.

Combining Claims 4 and 5, we observe that the increase in the metric dimension of G on adding e is at least n − log 2 (n) − 3.

It has been shown in [14] that if G is obtained from G by adding an extra edge, then β(G ) ≥ β(G) − 2, and if there are no even cycles in G , then β(G ) ≤ β(G) + 1. However, in the previous section we saw an example where β(G ) was exponentially larger than β(G). In this section we provide an upper bound on β(G ) in terms of the MD of the subgraphs of G, which holds for all graphs G . Lemma 1. Let G = (V, E) be a connected graph, and let G be the graph obtained by adding edge e between vertices E and F as before.

and G 2 be subgraphs of G induced on vertex sets V 1 and V 2 , respectively. Then,

Proof. Let S 1 and S 2 be the resolving sets of minimum size of graphs G 1 and G 2 , respectively. We prove that

where N is the normal region. Consider two vertices X and Y . There are two cases:

Without loss of generality, let X, Y ∈ V 1 . Let A ∈ S 1 be the vertex which resolves X and Y in G 1 . Since

Without loss of generality, let X ∈ V 1 \ V 2 and Y ∈ V 2 \ V 1 . Note that by definition, V 1 \ V 2 and V 2 \ V 1 contain the nodes that are closer to E and F , respectively. Hence, we can always go through e when going from

Assume for contradiction that none of E and F distinguish X and Y . This implies that

Substituting values from (3) and (4) gives a contradiction. Hence, either E or F will distinguish these two vertices.

For every possible pair of vertices we showed a distinguishing vertex in S. Finally,

Next we present a graph G for which the upper bound of Lemma 1 is achieved. The graph has 74 vertices and it is drawn on Figure 3 . The four solid black nodes labelled as E in the figure represent a single vertex E in the graph G . Similarly, the four solid black nodes labelled F represent vertex F . All other nodes shown in the figure represent distinct nodes. The graph G is obtained by adding an edge between vertices E and F . In this setting, G 1 , defined in Lemma 1, will be the sub-graph induced by nodes having green and yellow outlines and G 2 will be the sub-graph induced by nodes having orange and red outlines. Proof. Notice that G 1 and G 2 are isomorphic, hence their metric dimensions must be equal as well. First we show β(G 1 ) ≤ 8. Indeed we have 8 triangles in G 1 , and selecting one degree 2 vertex in each triangle is enough to distinguish any two vertices. To show β(G 1 ) ≥ 8, observe that we need to select one vertex from each of the triangles. The equality β(G 1 ) = β(G 2 ) = 8 together with Lemma 1 proves β(G ) ≤ 18.

Next, we show that β(G ) ≥ 18. Again, notice that G contains 16 triangles, and we must select a vertex in each of them. Notice that even after we selected these 16 nodes, the solid colored pairs in Figure 3 are not distinguished. Moreover, it is not possible to distinguish all 4 of these solid colored pairs by adding a single vertex to the set. Indeed, if any of the green stroked nodes are selected, the green solid pair is not distinguished. A similar argument holds for all other colors. This shows that we must add at least two nodes to the initial 16, and the metric dimension of G is at least 18, which completes the proof.

The main technical result of this paper is on the metric dimension of the grid graph augmented with one edge. Definition 6. Let the d-dimensional grid graph with side lengths (n 1 , n 2 , . . . , n d ) be the Cartesian product of d paths indexed by i with length n i .

For grid G and vertices A and B, we denote the distance

We state and prove the general result for d-dimensional grid graphs in Section 4.1, and we focus on the case of the 2-dimensional grid for more precise results in Section 4.2.

We start by understanding the MD of the d-dimensional grid without any extra edges. The paper [27] claims that the MD of a d-dimensional grid is d, however, [9] shows by computer search that this statement is false for hypercubes of dimensions 5 ≤ d ≤ 8. It is not difficult to show that d is an upper bound, but it is believed asymptotically not to be tight when the side lengths are small. The paper [38] claims without proof that if all side lengths are n i = n, then

and they also prove

However, when the side length n is large, then the MD of d-dimensional grid is exactly d, which was shown in [19] . Before stating this lower bound on n for the MD to be exactly d, we include a non-asymptotic lower bound on the MD for grids with general side lengths. Lemma 2. Let G be a grid of dimension d with side lengths (n 1 , n 2 , . . . , n d ), and let us denote N Σ = i n i and N Π = i n i . Then

Proof. The distances in G range from 0 to i (n i − 1), which implies a total number of (N Σ − d + 1) β(G) possible distinct distance vectors. Since the distance vectors must be unique, the number of possible distinct vectors must be at least as large as the total number of vertices in G, or formally

Taking the logarithm of both sides and rearranging the terms gives the desired result.

The lower bound on n for the MD to be exactly d can be found as a corollary of Lemma 2. Corollary 1 (Theorem 5.1 [19] ). Let G be a grid of dimension d with equal side lengths (n, n, . . . , n).

Proof. The assumption n ≥ d d−1 is equivalent to n d d−1 ≥ nd, which, by taking the logarithm of both sides, gives

Combining inequalities (7) and (8) gives

Since it is well established that the MD of the d-dimensional grid is upper bounded by d, the proof is completed.

We need a slightly more technical lemma before stating our main results on the MD of the grid with an extra edge. Lemma 3. Let G = (V, E G ) be a grid graph of dimension d with side lengths (n 1 , n 2 , ..., n d ). Let E and F be the endpoints of the extra edge e. As defined in Lemma 1, let

We defer the proof to the end of the section, and we state and prove our main theorem for d-dimensional grid graphs.

For an edge e between any two vertices E and F in V ,

Moreover, the lower bound (7) in Lemma 2 holds for G as well.

Proof of Theorem 1.

Let G 1 and G 2 be the subgraphs of G induced on V 1 and V 2 , respectively. Lemma 3 implies that β(G 1 ) ≤ d, and β(G 2 ) ≤ d holds by symmetry. Finally, we apply Lemma 1 to arrive to

For the lower bound, since adding an edge only decreases the distances in the graph, the same proof as in Lemma 2 applies.

It is an interesting question, whether the upper bound in Theorem 1 can be improved to 2d by simply not including the two endpoints of the extra edge into the resolving set when applying Lemma 1. We saw in Claim 6, that the two endpoints are needed for general graphs, but we will see in the next section, that they are not needed for the 2-dimensional grid. We believe that the upper bound can be improved to 2d, but the proof is not straightforward. In the proof of the 2-dimensional case, we rely heavily on the observation that the normal region has a specific shape no matter where the extra edge is added. We show in Figure 4 that this is not true anymore even for d = 3. Indeed, the shape of the normal regions (and thus of sets V 1 and V 2 ) can be quite different for different configurations of the extra edge, which suggests that the number of cases can explode. We conclude the section by providing a proof for Lemma 3.

Proof of Lemma 3. The proof will consist of three parts. In the first part of the proof, we define our coordinate system so that the extra edge is oriented in a specific way. This part essentially breaks the symmetries of the grid, which will reduce the number of cases we need to inspect later in the proof. In the second part, we show that a set of d corners in V 1 , which we denote by O, resolves the grid G. Finally, in the third part of the proof, we show that the distance between any vertex X ∈ V 1 and any corner in O is the same in both G and G 1 . Hence, O will be a resolving set of G 1 as well, which proves that the MD of G 1 is upper bounded by d and completes the proof of the lemma.

Part 1: Without loss of generality, we can label the dimensions such that |x

i.e., the distance between E and F along the first dimension is the maximum among distances along all the dimensions. Now, again without loss of generality, we also assume that x

for any dimension j, we can reflect the grid along that dimension so that x

X , keeping all other coordinates unchanged. We summarize these assumptions, taken without loss of generality, below.

Assumption 1 (symmetry breaking). Without loss of generality, we assume that E and F satisfy

and x

(1)

The d-dimensional grid has 2 d d! symmetries for choosing a coordinate system (which form the hyperoctahedral group). Note that even after Assumption 1, we still have (d − 1)! ways of choosing the coordinates (each equation in (10) removes a factor of two, and equations (11) remove a factor of d). This is because we only require that |x

E | takes (one of) its maximum value(s) for i = 1, and we have no constraint on the order of the values for the other indices. Thus, Assumption 1 does not break all symmetries of the grid, only the ones necessary for the proof. This also means that although we exhibit only a single resolving set O, there are multiple sets of d corners in V 1 that resolve G. 

where O 1 is the all-ones vector of dimension d, and for j > 1 we get O j from O 1 by changing its j th entry to n j . Khuller et al. show that the set of the d corners of O form a resolving set of G, and we only need to show that all d corners of O belong to V 1 , that is O j E ≤ O j F holds for all j. Because of equations (10),

Next, we consider the corners O j for j > 1. Because of Assumption 1, we have

Reorganizing the terms and then adding n j − d + 1 to both sides of the inequality yields

Thus, all the corners in O lie inside V 1 .

Part 3: In this part of the proof, we show that

We show this by exhibiting a shortest path between X and O j in G such that all vertices on that path belong to V 1 . This will show that

The inequality in the opposite direction is trivial because G 1 is a subgraph of G, which means that we must have d G (X, O j ) = d G1 (X, O j ).

For j > 1, the shortest path between X and O j that we exhibit will have the following two parts:

1. decrease all the co-ordinates (in any order), except j, to 1 to reach X 1 = (1, ..., x (j) X , ..., 1).

2. increase the j th coordinate from x (j) X to n j in order to reach O j .

For j = 1, we simply decrease all the coordinates (in any order) to 1 to reach O 1 . Clearly, these define valid shortest paths in a grid graph, and next, we prove that we stay inside V 1 both throughout the first part (from X to X 1 ) and the second part (from X 1 to O j ) of the path.

We distinguish two cases based on the ordering of x (1) 

Since there are no other cases by Assumption 1, the inequality X 0 F − X 0 E ≥ 0 must always hold, which implies X 0 ∈ V 1 . Therefore, we showed that decrementing the first coordinate does not lead outside of V 1 , and the same argument works for any of the d coordinates.

Next, we show for the second part of the shortest path, that each vertex in the path from X 1 to O j with j > 1 belongs to V 1 . Let X y = (1, ..., y, .., 1) be a vertex with x (i)

Xy ≤ n j . Clearly, X y describes all intermediate vertices on the path between X 1 to O j . Then, since j > 1,

(12) Finally, by applying the triangle inequality to the right hand side of equation (12), we arrive to X y F − X y E ≥ 0, which implies that all the vertices X y in the path from X 1 from O j with j > 1 belong to V 1 . This concludes the proof of the lemma.

For the sake of simplicity, we slightly adjust our notation to the d = 2 case. Let G = (V, E G ) be a two-dimensional rectangle grid graph with m rows and n columns. Let the tuple (i, j) denote the vertex in i th column and j th row. The upper left, upper right, bottom right, bottom left corners are labeled as P = (1, 1), Q = (n, 1), R = (n, m), S = (m, 1), respectively (see Figure 5 ). Let e be the edge between vertices E = (x E , y E ) and F = (x F , y F ) with x E , x F ∈ {1, ..., n}, y E , y F ∈ {1, ..., m}, with the assumption that EF ≥ 2. Let G = (V, E G ∪ {e}) be the 2-dimensional grid augmented with one edge. Assumption 2 (symmetry breaking for d = 2). We assume that

Assumption 2 is just a special case of Assumption 1 for d = 2. Geometrically, it means that the edge is tilted right, F is below and to the left of E, and the angle between the edge and the horizontal axis is between 45 and 90 degrees (see Figure 5 ). As argued in the proof of Lemma 3, if the edge is in any other orientation, we can flip or rotate the grid horizontally and/or vertically to bring the edge in this orientation, hence Assumption 2 can be made without loss of generality. Proof. We start by making observations about which special regions the four corners P, Q, R, S belong to. First, notice that

Then, notice that

where the last inequality holds by Assumption 2. Claim 1 implies therefore that P belongs to either R F or N . Similarly, R belongs to either R E or N . In any case, by Claim 2, it can be deduced that

In fact, it turns out that R P ∪ R Q = R E and R R ∪ R S = R F , but we are not showing this because it is not needed in this proof. Instead, let Figure 5 ), and we note that the sets R P ∪ R Q , R S ∪ R R and R W partition the set of nodes V .

To prove the theorem, for any pair of nodes A, B, we are going to assign two of the corners {P, Q, R, S} in the resolving set, and we are going to show that one of the two must distinguish A and B. The assignment will depend on whether A and B belong to R S ∪ R R , R P ∪ R Q or R W . Moreover, we further divide the region R S ∪ R R to R R \ R S , R R ∩ R S and R S \ R S , and the region R P ∪ R Q to R Q \ R P , R Q ∩ R P and R P \ R Q , and we treat each subregion separately.

This would mean treating 7 · 7 = 49 cases, but we make some simplifications. Let us suppose that the first point A is in R W or in R Q . The cases when A falls in R P , R R or R S are very similar. We make no assumptions on where B falls, but combine similar cases. Finally, we arrive to 8 cases, which are presented in Table 1 . The table shows the various possibilities of regions where A and B can belong to (denoted by R 1 and R 2 ), the corresponding pair of corners which distinguish A and B, and the claim which proves this. We conclude the proof by stating and proving Claims 7 and 8. Since A ∈ R Q , A ∈ R S , B ∈ R S and B ∈ R Q , we have

Adding these equations gives BQ + AS = AF + EQ + BE + F S + 2.

Applying the triangle inequality to points B, E, Q and A, F, S and adding both the inequalities, we get BQ + AS ≤ BE + EQ + AF + F S, which contradicts (14) . A similar proof holds for A ∈ R P and B ∈ R R with corners P and R. Proof. The distances from A, B to P , Q in G are same as that in G, and we know [31] that the set of two adjacent corners is a resolving set of a rectangle grid.

Theorem 2 tells us that the MD of a grid and one extra edge must take a value from the set {2, 3, 4}, and in fact, all three values can occur. In Conjecture 1, we present a set of conditions, which we believe completely characterize the MD of a grid and one extra edge, but proving this conjecture seems tedious. Instead, we are interested in a probabilistic approach: what is the distribution of the MD when a uniformly randomly selected edge is added?

First we define some quantities which will be useful for the remaining section.

Gain = Gain(E, F ) = |y F − y E | + |x E − x F | − 1

The notion of Gain captures the maximum gain for any pair of nodes. The pair of vertices (E, F ) obviously have maximum gain, however, there can be other pairs which have the same gain. For two vertices X, Y , let us denote by Rec(X, Y ) the rectangle that has opposite corners X and Y , and sides parallel to the sides of the grid. Then, the pairs (A, B), with A ∈ Rec(E, Q), and B ∈ Rec(F, S) also have Gain(A, B) = Gain, since there is a shortest path between A and B in G that passes through both E and F . The notion of Gain has a very similar interpretation as Gain. We defined Gain as the gain between vertices F and (x F , y E ). Notice that (x F , y E ) is also a corner of Rec(E, F ). Therefore, while Gain is about the gain between the opposite corners, Gain is about the gain between the adjacent corners of the same rectangle (by symmetry the gain between E and (x E , y F ) is also Gain ). Similarly to Gain, there are many other pairs of vertex pairs (A, B) with Gain(A, B) = Gain . These are the pairs (A, B) with A ∈ Rec(P, (x F , y E )) and B ∈ Rec(F, R), and symmetrically the pairs with A ∈ Rec(R, (x E , y F )) and B ∈ Rec(E, P ). Roughly speaking, we could thus say that Gain is useful if we want to measure the distance between vertex pairs with one vertex close to S and the other close to Q, while Gain is useful if we want to measure the distance between vertex pairs with one vertex close to P and the other close to R.

One of the key steps of the main proof in this section will be about treating the case when Gain is very small. This is the case when the extra edge has (or is close to having) a 45 degree angle with the sides of the grid, and Rec(E, F ) is (or is close to being) a square. In the extreme case, when Gain = 0, no vertex pairs close to P and R use the extra edge, and the structure of the special and normal regions are different from the case when Gain ≥ 1. When Gain = 1, there are still some subtle but inconvenient structural differences compared to the Gain ≥ 2 case. Fortunately, since we are adopting a probabilistic framework, in the proof we will be able to ignore the cases with Gain ≤ 1, as these cases have a vanishing probability of occurring. Definition 8. Let P n be the probability distribution over potential extra edges e n = ((x E , y E ), (x F , y F )) that we can add to G n , where (x E , y E ) and (x F , y F ) are two uniformly random vertices of G n . Theorem 3. Let G n be the n × n grid and let G n = G n ∪ {e n } with e n sampled from distribution P n . Then, the following results hold:

According to Theorem 3, the asymptotic probability that the MD of the square grid with an extra edge is three is at least 19/27. We believe that it is also true that the MD is at least four when Gain is even and x E − x F ≥ Gain /2 + 2.

If we could prove this, we could state that the asymptotic probability of β(G ) being three is exactly 19/27, and β(G ) → Ber(8/27) + 3 in probability, where Ber(q) is a Bernoulli random variable with parameter q. We believe that a brute-force approach similar to the proof of Theorem 2 can work, but it requires a tedious case-by-case analysis that is out of scope of this paper.

The probabilistic formulation of Theorem 3 allows us to ignore the edge-cases that would be too tedious to check individually, but it introduces new challenges as well. In rest of this section, we explore these new challenges and we reduce equations (15)- (17) to technical Lemmas 4, 5 and 6, which are of deterministic nature. We give the proof of Theorem 3 at the end of this section, but we defer the proof of the technical lemmas to Section 4.2.4.

The specific edge-cases that we ignore using the probabilistic formulation are given in Assumption 3. Assumption 3 (edge-case removal). We assume that

3. none of E and F lie on the boundary of the grid.

In addition to Assumption 3, we are also going to make use of Assumption 2 as we did in the proof of Theorem 2. Assumptions 2 and 3 applied together have some additional implications. Remark 5. Assumption 2 and 3 together imply that

Using Assumption 2 in the probabilistic formulation is not as straightforward anymore, as symmetry breaking can also break the uniformity of the sampling of the extra edge. Indeed, sampling a random edge that satisfies Assumption 2 is not the same as sampling an edge from P n and rotating and reflecting it so that Assumption 2 is satisfied. In Claims 9 and 11, we are going to show that after removing only O(n 3 ) edges from V × V , and thus slightly changing the distribution P n , the symmetry breaking will not violate the uniformity of the sampling anymore.

Definition 9 (P,Q,P n , Q n ). Let P be the set of extra edges ((x E , y E ), (x F , y F )) that satisfy Assumption 3, and let Q the set of extra edges that satisfy both Assumptions 2 and 3. LetP n and Q n be the uniform probability distribution over P and Q, respectively.

In Claim, 9 we show that P n is close toP n , and in Claim 11 we show thatP n is close to Q n . These two claims allow us to use Q n instead of P n in the proof of Theorem 3.

Claim 9. For P n andP n given in Definitions 8 and 9, lim n→∞ P n −P n T V = 0.

Proof of Claim 9. The support of P n is V × V , and |V × V | = n 4 because each of the four coordinates x E , y E , x F and y F can take four values. Recall, that P ⊂ (V × V ), and the set (V × V ) \ P consists of the edges that do not satisfy Assumption 3. Therefore, to upper bound the cardinality of (V × V ) \ P, it is enough to upper bound the number of edges violating each of the conditions in Assumption 3. It is clear that the number of edges that violate the first condition is n 3 ; the coordinates x E , y E , y F can be chose arbitrarily n 3 different ways, and then setting x F = x E gives exactly one unique edge that violates the first condition. For a more insightful but less precise explanation, notice that the original set V × V had four degrees of freedom, and we lost one to violating the condition, hence we are left with three degrees of freedom and O(n 3 ) edges. It is not hard to see that we lose one degree of freedom to violate the second and third conditions as well, and therefore the number of edges violating these conditions are also O(n 3 ). We conclude that the number of edges in (V × V ) \ P are also of order O(n 3 ).

Then,

Definition 10 (H). Let us consider the following actions on the extra edges of the grid:

1. by h 1 the reflection along the vertical line through the midpoints of sides P Q and SR, 2. by h 2 the reflection along the horizontal line through the midpoints of sides P S and QR, 3. and by h 3 switching the two endpoints of the edge.

Let H be the group generated by h 1 , h 2 and h 3 acting on the edges.

Notice that group H acting on the edges is isomorphic to the Z 3 2 group. Indeed, all three actions have order two and commute with each other. Thus, H can be described as

. Also, notice that for e ∈ Q, applying h 1 , h 2 and h 3 flips the inequality labelled with the same index in Remark 5, and keeps the other two inequalities unchanged. Definition 11. Let h be a map, which for each edge e ∈ Q returns the set of edges that we get by applying the elements of H to e.

The sets h(e) can be seen as orbits of the edges under the action of H. Claim 10. With P, Q and h given in Definitions 9 and 11, the following three statements must hold:

1. |h(e)| = 8 for every e ∈ Q 2. the orbits of the edges in Q are disjoint, i.e., h(e 1 ) ∩ h(e 2 ) = ∅ for e 1 = e 2 ∈ Q 3. for every e ∈ P, there is an e 2 ∈ Q with e ∈ h(e 2 ).

Proof of Claim 10. Statement 1 follows from the observation that every non-trivial group action in H flips a different subset of the inequalities in Remark 5, and two edges cannot coincide if they satisfy different sets of inequalities. For statement 2, since H is a group, if two orbits h(e 1 ), h(e 2 ) have a non-empty intersection, we must have e 1 ∈ h(e 2 ). However, every non-trivial group action in H flips at least one of the inequalities of Remark 5, which implies that if we apply a non-trivial group action, the image of e 2 ∈ Q cannot be in Q. For statement 3, for edge e ∈ P, let v(e) ∈ {0, 1} 3 be a binary vector, whose i th entry indicates that e violates inequality i in Remark 5. Then h

is a group action that flips exactly the inequalities that are violated by e, and thus maps e into Q. Let the image of e under this action be e 2 , and then indeed, e ∈ h(e 2 ).

Claim 11. Let A n be a sequence of events defined on graph G n that are closed under the action of H. Then,

Proof of Claim 11. The three statements of Claim 10 together imply that the orbits h(e) of e ∈ Q partition P into sets of cardinality 8. A simple corollary is that |P| = 8|Q|.

Let us suppose that event A n is closed under the action of H, or formally as e ∈ A n implies h(e) ⊂ A n . This closedness property, combined with Claim 10 implies that the edges in A n can also be counted as 8 times the number of edges in A n ∩ Q. Then,P

Finally, we combine equation (18) with Claim 9 as

and the proof is completed.

Now we have all the ingredients to prove Theorem 3.

Proof of Theorem 3. Since all events in the statement of Theorem 3 are closed under the action of H on the square grid, Remark 11 shows that it is enough to prove equations (15)-(17) for distribution Q n . Note that because of statements 1 and 3 of Remark 5, min(|x E − x F |, |y E − y F |) = x E − x F for edges in Q. Hence, the second condition in (16) and (17) reduces to x E − x F < Gain /2 + 2 for distribution Q n .

The rest of the proof relies on Lemmas 4-6 given in Section 4.2.4, which have purely deterministic nature. Lemma 4 shows that for extra edges in Q (that is edges satisfying Assumption 2 and 3), the metric dimension of G will be at least three deterministically, which, combined with Theorem 2, gives equation (15) . Lemma 5 shows that there exists a resolving set of cardinality three for every extra edge in Q with an odd Gain . For the extra edges in Q with an even Gain and with x E − x F < Gain /2 + 2, there exist different resolving sets of cardinality three, which is proved in Lemma 6. Thus, Lemmas 4, 5 and 6 combined imply equation (16) .

Finally, we show equation (17) . Let us denote by C the subset of vertex pairs in Q that satisfy the condition in equation (17), i.e.,

Let the complement of C bē

Next, we calculate |C|/|Q|. Let x E − x F = a and y F − y E = b, which together with Assumption 3 gives b − a − 1 = Gain .

Then, the conditions on a, b that need to be satisfied for an edge to be inC can be reformulated as :

1. b − a is odd (equivalent to Gain is even)

as the extra edge is not horizontal nor vertical, and does not touch the boundary of the grid.

Let b − a = 2i + 1 with i ≥ 1. With this parameterization, the first two conditions are already obviously satisfied. Substituting a = b − 2i − 1 into a ≥ b/3 + 1 gives b ≥ 3(i + 1). Hence, for a fixed i, b can have values from 3(i + 1) to n − 2, and consequently, the maximum value that i can take is (n − 5)/2 . Note that for a given pair (a,b), there are (n − a − 1)(n − b − 1) possible edges in G n which do not touch the boundary. Therefore,

which reduces asymptotically to

Therefore,

Hence, Q n (C) = 1 − Q n (C) → 19/27, which shows equation (17) and completes proof of the theorem.

In the rest of this section we state and prove Lemmas 4-6, which we will do in Section 4.2.4. Before introducing these lemmas, we prove some claims that will be useful later. We start by simple claims in this subsection, then in Section 4.2.3 we prove more involved results that charaterize the normal and special regions of G .

The following claim shows that resolving sets must have nodes on the boundaries of the grid, which helps us reduce the number of subsets that we must prove are non-resolving. Claim 12. If R is any resolving set of G (the grid with extra edge EF) satisfying Assumption 2 and 3, there must be two vertices X and Y in R which satisfy following two properties:

1. They are on opposite boundaries of G 2. If one of them is a corner, the other one must be an adjacent corner.

Proof. Consider vertices A = (1, 2) and B = (2, 1). It is easy to see that only vertices on boundaries P Q and P S except corner P will be able to distinguish A and B as none of E and F is on the boundaries. So we need at least one vertex on the union of the boundaries P Q and P S, excluding P , in the resolving set. A similar argument holds for the other 4 corners, hence we can deduce the two required conditions. Proof. Note that Gain(A, B) > 0 indicates that A uses e to reach B. For this to happen, we must have A ∈ R F and B ∈ R E (or the other way around), in which case d G (A, B) = AE + 1 + F B. This gives

which has same parity as

which has the same parity as

Remark 6. Consider a vertex X and its 4 neighbouring vertices X 1 , X 2 , X 3 , X 4 . A single vertex in the graph cannot distinguish all of these 4 vertices.

Proof. Suppose that vertex A distinguishes all 4 vertices. By triangular inequality, AX i can only take 3 distinct values, namely AX − 1, AX and AX + 1. Hence, by pigeon hole principle, at least 2 vertices will have same distance to A.

We prove that in two dimensions, under Assumptions 2 and 3, the normal region takes a fairly regular shape. As shown in Figure 6 , we only have two cases based on the parity of Gain . This is in sharp contrast with higher dimensions, where the normal region can take very different shapes (see Figure 4 ).

The following quantities will be useful to describe the shape of the normal region. Definition 12 (α, β) . Let

Remark 7. We make the following observations about α and β under Assumptions 2 and 3.

1. Note that β − α = x E − x F . Assumptions 2 and 3 imply x E > x F , and since x E , x F are both integers, we know that β − α ≥ 1. Consequently, β > α holds.

2. Using Assumptions 2 and 3, we find the following useful equalities and inequalities:

and

(a) When Gain' is even, the height of N is 2.

(b) When Gain is odd, the height of N is 1. Figure 6 : Illustration for Claims 14 and 15. Brown region(including the boundary), which is just a set of line segments in the case when Gain is odd, is the normal region of the grid. Pink region(including the boundary) indicates the special region of a point A belonging to R E which lies on boundary P S of G .

Next, we express precisely the normal region of the grid. Claim 14. Under Assumptions 2 and 3,

Geometrically, the normal region will be the union of three strips of "height" 1 or 2: two horizontal strips with y-coordinates around α and β respectively, and a third strip at 45 degree angle joining the two horizontal strips (see Figure 6 ). By "height" here we mean the number of vertices corresponding to each x coordinate. The height of the strips depends on whether α and β are integers or not (and thus on the parity of Gain ): when Gain is even, α and β are integers and the height of the strip is 2; when Gain is odd, α and β are odd integers divided by two, and the height of the strip is 1.

The union of the three strips forms a single continuous strip, which separates the grid into two connected components along the y-axis. The y-coordinates of the strip lie completely between the y-coordinates of nodes E and F , which means that for every x-coordinate, the normal vertices are sandwiched between non-normal vertices along the y-axis.

Here we rely heavily on the inequality Gain ≥ 2 in Assumption 3; for Gain = 0 the normal region can touch the P Q and RS boundaries of the grid.

Proof of Claim 14. Let A = (x A , y A ) be a vertex in N . By Claim 1, being a normal vertex is equivalent to

First we show that we cannot have y A < y E . If y A < y E , equation (23) reduces to

Next, by the triangular inequality we have

and therefore by Definition 7,

Due to the assumption Gain ≥ 2, equations (24) and (25) contradict each other. Hence, we cannot have y A < y E .

Similarly it can be shown that we cannot have y A > y F . In short, for A to be a normal point, we must have y E ≤ y A ≤ y F (i.e., the y A coordinate must be between E and F ). This reduces equation (23) to

Now we are going to have three cases depending on whether

where α is given in Definition 12. This gives the first line of equation (22) . Similarly, it can be verified that for the other two possibilities x F ≤ x A ≤ x E and x E < x A , we get the remaining two lines. Now that N is explicitly written in terms of the coordinates of the nodes, we can leverage the partitioning in Claim 1 to do the same for R E and R F . However, we find it more instructive to express R E and R F implicitly using N , instead of explicit equations similar to equation (22) . Remark 8. Under Assumptions 2 and 3,

and

Proof. By Claim 14, the normal region splits V into two connected components, one containing E, which we denote by V F , and one containing F , which we denote by V E . Now we show that V E = R E and V F = R F . By Claim 1 it is clear that we cannot have two neighboring vertices A, B with A ∈ R E and B ∈ R F . Indeed the equations AE − AF > 1, BE − BF < −1, |AE − BE| ≤ 1 and |AF − BF | ≤ 1 cannot hold at the same time. By Claim 1, the vertices V \ N are partitioned into R E and R F , and since we cannot have two neighboring vertices split between R E and R F , each connected component V E and V F must be contained entirely in R E or R F . We also know that E ∈ R F and F ∈ R E , which implies that the only way to assign the vertices of V \ N into R E and R F is to have R E = V E and

In the next claim, we characterize the special regions of the nodes on the boundary P S of the grid. This will be useful in the subsequent results as we will be mainly dealing with nodes on the boundaries. 1. if A belongs to R E , i.e., k > α, with α given in Definition 12,

and

2. if A belongs to R F i.e. k < α − 1 and Gain ≥ 2,

and

Gain max (A) = Gain for k ≤ y E Gain − 2(k − y E ) for y E < k < α − 1.

By symmetry, the vertices A = (n, k) on boundary QR have a similar expression for their special region, however, we do not include this in the paper in the interest of space.

We will only cover the A ∈ R F case; the other case is analogous. We are interested in the nodes T ∈ R A , i.e., nodes that use edge e to reach A. By Remark 4, vertex E gets the maximum benefit from the extra edge, hence we expect R A to be a neighbourhood "centered" at E. However, R A cannot be a ball centered at E, because the directions are not equivalent. For instance, if T is in the rectangle formed by E and Q, then we can go to node E for "free", without sacrificing any of the gain we get by using the extra edge. This is because the shortest path from T to A in G passed through E anyways, so Gain(A, T ) = Gain max (A) (i.e., T also gets maximum benefit). Hence, all nodes in this rectangle will be in R A . For a different example, if T is in the rectangle formed by E and P , then going along the y axis towards E is "free", but going along the x axis towards is a "detour", hence there may be a threshold for x T below which the shortest path does not use the extra edge. We will make this intuition rigorous below.

Proof. First, we check that equation (30) agrees with the definition of Gain max . By Remark 4, we have

which for k ≥ y F implies

because of Definition 4, and for α < k < y F implies

Next, we need to find nodes T such that T E + 1 + F A < AT.

Combining equations (33) and (34) we get that

where T E x and T E y denote the distance along the x and y axes, respectively (e.g.,

There are five cases depending on where T could be:

Case 1: T is in the rectangle formed by nodes E and Q

In this case there exists a shortest path in grid from T to A which passes through E, and T will certainly use the edge e to reach A. Hence, this rectangle belongs to R A , which accounts for the first line of in equation (29) .

Case 2: T is in the rectangle formed by E and P

In this case AE x = AT x + T E x and AE y = AT y − T E y , which reduces equation (35) to

This accounts for the second set in the equation (29) .

Case 3: T is in the rectangle formed by E and R, and has y-coordinate less than k

In this case AE x = AT x − T E x and AE y = T E y + AT y , which reduces equation (35) to

This accounts for the third set in the (29) .

Case 4: T is in the rectangle formed by E and S, and has y-coordinate less than k

In this case AE x = AT x + T E x and AE y = T E y + AT y , which reduces equation (35) to

This accounts for the fourth set in the (29) .

Case 5: T has y-coordinate greater than or equal to k

In this case T E y = AE y + AT y , which reduces equation (35) to

where the last inequality follows from the triangle inequality. However, using equation (30) , for k ≥ y F we have

and for k < y F we have

which contradicts equation (36) . Therefore Case 5 is impossible.

Since the five cases cover the entire node set, the necessary and sufficient conditions for T ∈ R A are characterized, and this completes the proof.

Lemma 4. Under Assumptions 2 and 3, the metric dimension of G is at least 3.

Proof. Suppose that there exists two points X and Y that distinguish all points in the grid. By Claim 12, they have to be on opposite boundaries. Next, their maximum gains cannot exceed 1. Indeed, suppose for contradiction that Gain max (X) > 1, and that X ∈ R F . Then, by Remark 4 we have XF − (1 + XE) > 1, and thus the four neighboring vertices of F will all have have distance min(XF ± 1, XE + 2) = XE + 2 to X. By Remark 6, the four neighboring vertices of F cannot be distinguished by a single vertex Y , which contradicts our assumption that {X, Y } is a resolving set, and hence we must have Gain max (X) ≤ 1. By a symmetric argument, we also have Gain max (Y ) ≤ 1

We have two cases depending on the parity of Gain .

Case 1: Gain is even.

By Claim 13, we know that Gain max (X) is also even, and since Gain max (X) ≤ 1, it must equal to 0, which in turn implies that X is a normal vertex. By a symmetric argument, Y must be normal vertex too. Moreover, recall that X and Y must lie on opposite boundaries. Therefore, because of Claim 14, X is either (1, α − 1) or (1, α) and Y is either (n, β − 1) or (n, β), as these are the only normal vertices on the boundaries of G . As X and Y are normal vertices, edge e has no effect on the distances from any vertex of G to X and Y , which therefore remain the same as in the original grid G. But we know that the only resolving sets of the grid G that have cardinality 2 are two adjacent corners of G, which disqualifies X and Y from being a resolving set of G and thus G .

Case 2: Gain is odd.

Recall, that Gain max (X) ≤ 1, Gain max (Y ) ≤ 1, and both X and Y must lie on the boundary of G . Let us first rule out the possibility of X or Y being on the top/bottom boundaries P Q and RS. More specifically, we will show that there is no point X = (k, 1) with Gain max (X) ≤ 1. If X = (k, 1), then X ∈ R F and

where the inequality follows from the triangle inequality. Now, Assumption 3 states that Gain ≥ 2, and thus no point X = (k, 1) can have Gain max (X) ≤ 1 .

Now we consider the case when X and Y lie on P S and QR, respectively. We will check which vertices X = (1, k) and Y = (n, k) have Gain max ≤ 1. The Gain max of vertices X = (1, k) is expressed in equation (30) . Since Gain > Gain ≥ 2, only the Gain − 2(y F − k) term can equal 1. The term Gain − 2(y F − k) is an increasing linear function of k that takes the value 1 for only a single value of k, namely k = α + 1/2 = α + 1. Similarly, in (32) , the only value that satisfies Gain − 2(k − y E ) = 1 is k = α − 1. Consequently, the only vertices on P S that have Gain max (X) = 1 are X 1 = (1, α − 1) and X 3 = (1, α + 1). Similarly, the only vertices on QR that have Gain max (X) = 1 are Y 1 = (n, β − 1) and Y 3 = (n, β + 1). The only vertices on P S and QR that have Gain max (X) = 0 are the normal vertices X 2 = (1, α ) and Y 2 = (n, β ). Hence, we have

and,

To finish the proof, we are going to rule out the remaining nine resolving sets that can be formed by X 1 , X 2 , X 3 and Y 1 , Y 2 , Y 3 . Since Gain max (X 1 ) = Gain max (X 3 ) = 1, the expression for the special regions of X 1 and X 3 in Claim 15 simplifies to

and

By a symmetric argument,

and

Consider the rectangular sub-grid H formed by points X 1 , (n, α − 1), Y 3 , (1, β + 1) (see Figure 7) . The sub-grid H cannot intersect the special regions of X 1 , X 3 , Y 1 and Y 3 because (i) by equations (40) and (42), the special regions of X 1 and Y 1 have y-coordinate at least y F , (ii) by equations (41) and (43), the special regions of X 3 and Y 3 have y-coordinate at most y E , and (iii) the sub-grid H has y-coordinates more than y E and less than y F . The statement (iii) follows by equation (20) and the inequality Gain ≥ 2 from Assumption 3, since the lowest y-coordinate value of a vertex in H is

and by equation (21) and the inequality Gain ≥ 3 (which follows from the assumption that Gain is odd in addition to Assumption 3), since the highest y-coordinate value of a vertex in H is

Consequently, the distances in graph G between any point in H and X or Y are same as in G. By Remark 7, we also have α < β , which implies that X and Y cannot be adjacent corners of H, and they cannot resolve the sub-grid H.

Since we ruled out every pair of vertices X, Y for being a resolving set, the proof is concluded. Proof. By Claim 14, Y is a normal vertex. The only normal vertex on boundary PS is vertex (1, α ), and since by Remark 7 we have β > α , X cannot be a normal vertex. By Claim 8 we have X ∈ R E , and by Remark 3, vertex X has non-empty special region R X ⊆ R F (see the pink region in Figure 8 ).

Suppose for contradiction that there exist two distinct points A and B, which are not distinguished by the three points X, Y, Q in G . We separate three cases depending on the position of A and B:

Case 1: One of A and B is in R X , and the other is in N X Without loss of generality, we assume A ∈ R X and B ∈ N X . Since Y is a normal point and it does not distinguish A = (x A , y A ) and B = (x B , y B ), we have AY = BY , which can be expanded as

By the assumption that A and B are not distinguished by X, we have that d G (A, X) = d G (B, X). Since A ∈ R X and B ∈ N X , this yields that

Therefore,

where the last line follows form equation (45) . Equation (47) implies that Gain(A, X) must be even. By Claim 13, if Gain(A, X) is even then Gain must be even too, which contradicts our assumption that Gain is odd. 

Then, Gain should also be even by Claim 13, contradicting our assumption that Gain is odd.

We are left with the cases A, B ∈ N Q and A, B ∈ R Q . Notice that since we showed β + 1 < y F for odd Gain in equation (44) , and since by equation (21) we have β = y E + (Gain + 2)/2 > 1, neither F nor Q are on the horizontal line through X and Y . Hence, any pair of nodes A, B that are symmetric to the XY line are distinguished by both Q and F in graph G. We immediately see that if A, B ∈ N Q , the pair A, B is also by Q in G . If A, B ∈ R Q , by Claim 3 together with Q ∈ R F , and since F distinguishes A, B in G, we have d G (Q, A) = QE + 1 + F A = QE + 1 + F B = d G (Q, B) .

Hence, in every sub-case of Case 2 we showed that A, B must be distinguished by at least one of Q, X and Y in G .

Case 3: A, B ∈ R X :

By Claim 15, we have Q ∈ R X . The anti-transitivity property of special regions (Remark 2) implies that if A ∈ R X and Q ∈ R X , then A ∈ R Q and therefore A ∈ N Q . Similarly, we have B ∈ N Q , and we can deduce that the distances between Q and A, B are the same in graph G as in graph G. Moreover, since Y is a normal vertex, the distances distances between Y and A, B are the same in G as in G too.

Remark 3 together with X ∈ R E implies that we have R X ⊆ R F , and Remark 8 and Claim 14 together imply that every vertex in R F has y-coordinate at most β − 1 < β . Hence, both A and B are contained in the rectangular sub-grid with corners QY XP . Since Q and Y are adjacent corners of the sub-grid QY XP , they must resolve the entire sub-grid QY XP in graph G, including vertices A and B. Since distances from Y and Q to A, B are the same in graph G as in G, vertices Q and Y must distinguish A and B in G as well.

Thus, every vertex pair A, B is distinguished by some vertex in the set {X, Y, Q}, and the proof is concluded. Proof of Lemma 6. See Figure 9 for an illustration. Note that Y and Z are normal points, and that X ∈ R E . First we calculate Gain max (X). By equation (30) , since β − 1 ≤ y F because of equation (21),

Now we show that R X completely lies inside the rectangle P QT Z. Indeed, according to Claim 15, the largest y-coordinate of a point in special region of X will be

where the inequality follows by the assumption x E − x F < Gain /2 + 2. Hence, since we also have α < β by Remark 7, all points in the special region of X will have y-coordinate less than that of Z. Alternatively, denoting vertex (n, α − 1) by T , we have that R X is contained in the rectangle P QT Z.

Let us suppose for contradiction that there exist two distinct points A = (x A , y A ) and B = (x B , y B ) which are not distinguished by X, Y , Z. We distinguish three cases based on the positions of A and B: Figure 9 : Illustration for the proof of Lemma 6. Points X, Y , Z marked with red crosses form a resolving set. Y and Z are normal points, the pink region(including the boundary) is R X .

In this case all distances between X, Y, Z and A, B are the same in graph G as in graph G. It is easy to see that to be equidistant from X and Y , vertices A and B must be symmetric to the horizontal line through X and Y , in which case Z can distinguish A and B.

Case 2: A, B ∈ R X

In this case, we show that A and B cannot be equidistant from both Y and Z. Both A and B lie inside of R X , and thus the region P QT Z. Now we show that Y and Z resolve P QT Z in G, which implies that they resolve P QY Z in G because they are normal vertices. Our argument will be similar to the standard argument that shows that two adjacent corners resolve the grid. To be equidistant from Z, both of them should lie on a diagonal line parallel to P R, or equivalently,

(51) To be equidistant from Y , they should lie on a diagonal line parallel to QS, or equivalently,

However, equations (51) and (52) cannot hold simultaneously for A = B.

Case 3: One of A and B is in R X , and the other is in N X Without loss of generality, we assume that A ∈ R X and B ∈ N X . Since R X lies inside of P QT Z, we know that the y-coordinate of A is less than that of Z and Y , i.e., y A < α − 1 < β − 1. Since we have shown in Case 2 that Y and Z resolve P QT Z in G , B cannot lie in the region P QT Z. There are two other possibilities for where B could lie:

1. Let us assume that B lies in the region ZT Y X. Since Y does not distinguish A and B, we have AY = BY , which implies that x A + y A = x B + y B as in equation (52). Similarly, since Z does not distinguish A and B, we have AZ = BZ, which implies that x A + (α − 1 − y A ) = x B + y B − (α − 1). Subtracting the second equation from the first gives y A = α − 1 which contradicts equation (50).

2. Let us assume that B lies in the region XY RS, or equivalently, y B ≥ β − 1. Since we have assumed A and B to be equidistant from X and Z, we have AZ = BZ and BX = d G (A, X) = AX − Gain(A, X). Writing these equations in terms of the variables x A , y A , x B and y B gives

and

x A − 1 + (β − 1 − y A ) − Gain(A, X) = x B − 1 + y B − (β − 1).

Gain(A, X) = 2(β − α) = 2(x E − x F ).

(55)

Equations (55) and (49) together contradict the fact that Gain(A, X) ≤ Gain max (X).

We considered all cases and the proof is concluded.

Finally, we present our precise conjecture which completely characterizes metric dimension for any 2-dimensional grid graph augmented with one edge. We believe this can be proved by rigorous case-wise analysis but it is out of the scope of this paper. We have verified this conjecture for square grids with sizes up to 15 × 15 using simple C++ programs available at [37] . Note that the conjecture is stated not only for square grids but also for m × n rectangular grids, but for these graphs we only verified the conjecture for a few parameter values due to the increased number of cases. Conjecture 1. Let G be a 2-dimensional grid graph with m rows and n columns. Let e be the edge between vertices F = (x F , y F ) and E = (x E , y E ) with x F , x E ∈ {1, ..., n}, y F , y E ∈ {1, ..., m}, with the assumption that EF ≥ 2.

Let G = (V, E G ∪ {e}) be the grid augmented with one edge. Let Gain = |y E − y F | + |x F − x E | − 1 and Gain = ||y F − y E | − |x F − x E || − 1. • β(G ) = 2 if any of the following conditions is satisfied:

-Gain = 1 -Gain ≤ 1, Gain is odd and one of the endpoints is a corner of the grid.

-Gain ≥ 3, Gain is odd, Gain − Gain ≤ 2 and one of the endpoints is a corner of the grid.

-Gain is odd and both endpoints are corners of the grid.

• β(G ) = 3 for all other cases.

Fully dynamic all-pairs shortest paths with worst-case update-time revisited

Covid-19: patient zero in the netherlands

Base size, metric dimension and other invariants of groups and graphs

Mat Mihal'ák, and L Shankar Ram. Network discovery and verification

Dandelion: Redesigning the bitcoin network for anonymity

Metric dimension for random graphs

On the determining number and the metric dimension of graphs. the electronic journal of combinatorics

On the metric dimension of cartesian products of graphs

Covid-19: preparedness, decentralisation, and the hunt for patient zero

Resolvability in graphs and the metric dimension of a graph

A new approach to dynamic all pairs shortest paths

Six challenges in measuring contact networks for use in modelling

The effect of vertex or edge deletion on the metric dimension of graphs

A comparison between the metric dimension and zero forcing number of trees and unicyclic graphs

Hiding the rumor source

Spy vs. spy: Rumor source obfuscation

The difference between the metric dimension and the determining number of a graph

Extremal results for graphs of bounded metric dimension

Inferring networks of diffusion and influence

On the metric dimension of a graph

Approximation complexity of metric dimension problem

Fault-tolerant metric dimension of graphs

Wormhole attacks in wireless networks

Computing metric dimension and metric basis of 2d lattice of alpha-boron nanotubes

Landmarks in graphs

The metric dimension of critical galton-watson trees and linear preferential attachment trees

Localization game for random geometric graphs

Landmarks in torus networks

Metric bases in digital geometry

On the limiting distribution of the metric dimension for random forests

The threshold dimension of a graph

Sequential metric dimension for random graphs

Locating the source of diffusion in large-scale networks

On the metric dimension of HDN 3 and PHDN 3

Verification of a conjecture regarding metric dimension of a grid augmented with one edge

On metric generators of graphs

Rumors in a network: Who's the culprit?

On metric dimension in some hex derived networks

Leaves of trees

The effect of transmission variance on observer placement for source-localization

Network observability and localization of the source of diffusion based on a subset of nodes

Extending the metric dimension to graphs with missing edges

Strategies to trace back the origin of covid-19

The work presented in this paper was supported in part by the Swiss National Science Foundation under grant number 200021-182407.