key: cord-0058304-hlljec17 authors: Lange, Jan-Hendrik; Andres, Björn title: On the Lifted Multicut Polytope for Trees date: 2021-03-17 journal: Pattern Recognition DOI: 10.1007/978-3-030-71278-5_26 sha: 1d921a50df6d56f7760c979e8383e6d1c5b7e8a0 doc_id: 58304 cord_uid: hlljec17 We study the lifted multicut problem restricted to trees, which is np-hard in general and solvable in polynomial time for paths. In particular, we characterize facets of the lifted multicut polytope for trees defined by the inequalities of a canonical relaxation. Moreover, we present an additional class of inequalities associated with paths that are facet-defining. Taken together, our facets yield a complete totally dual integral description of the lifted multicut polytope for paths. This description establishes a connection to the combinatorial properties of alternative formulations such as sequential set partitioning. The lifted multicut problem [12] is a graph partitioning model based on realvalued costs attributed to pairs of nodes that are in distinct components. It has been applied successfully to inference tasks in areas such as image segmentation [2, 17] , object tracking [22] and motion segmentation [16] . In this paper we study the lifted multicut problem and its associated polyhedral geometry under the restriction that the underlying graph is a tree. This constitutes an extreme special case, since it is natural to consider only connected graphs and trees are minimally connected graphs. The resulting tree partition problem is equivalently formulated as the minimization of a multi-linear polynomial which exhibits a sparsity pattern that is determined by the tree. Therefore, it is clearly np-hard, as we show in Sect. 3 . In Sect. 4, we study the facial structure of the lifted multicut polytope for trees. We introduce a canonical relaxation in terms of node triplets and characterize under which circumstances the basic inequalities are facet-defining. Furthermore, we present another class of facet-defining inequalities, which we call intersection inequalities. Finally, we show that the facets that we present yield a complete totally dual integral description in the case of paths (Sect. 5). Overall, we contribute to the study of graph partition problems an analysis of the facial structure of the lifted multicut polytope for the extreme case of minimally connected graphs such as trees and paths. Our results establish connections between the lifted multicut problem on trees and pseudo-Boolean optimization as well as sequential set partitioning. Furthermore, our insights can accelerate solvers for the problem based on integer linear programming. Multicut Polytopes and Correlation Clustering. Partitions of a graph into an unconstrained number of components based on similarities or dissimilarities of neighboring nodes is referred to as weighted correlation clustering. Weighted correlation clustering has been studied for complete graphs [1] as well as for general graphs [6] . Due to the fact that any partition of a graph into connected components is characterized by the mathematical notion of a multicut, correlation clustering is closely related to the study of multicuts. The combinatorial polytopes associated with the multicuts of a graph have been studied, among others, most notably by [5, 7, 8, 10] . The more general case in which similarities or dissimilarities of non-neighboring nodes are taken into account as well has been introduced in [12] . The latter, more expressive formulation has found a number of applications in computer vision [2, 16, 17, 22] . Pseudo-Boolean Optimization. The equivalent formulation of the problem we study in the form of an unconstrained minimization of a multi-linear polynomial connects our work to the field of pseudo-Boolean optimization. The optimization of pseudo-Boolean functions plays an important role in machine learning, for instance, in MAP inference for computer vision. The general problem can be reduced to the quadratic case [3, 4] , which is responsible for the np-hardness of the problem. The combinatorial polytope associated with the linearization of quadratic pseudo-Boolean functions was studied extensively by [18] . Recent research also considers the linearization of more general multi-linear forms [19] . Computational approaches to pseudo-Boolean optimization based on the roof duality bound [11, 14] have been quite successful in practice [20] . A set partitioning problem where the elements are assumed to adhere to a linear order has been studied by Kernighan [15] , who devises a dynamic program to solve the problem in polynomial time. The algorithm essentially solves a shortest path problem in a directed acyclic graph. The corresponding integer linear programming formulation admits a totally unimodular constraint matrix [13] . We derive a complete polyhedral description for the equivalent formulation as a lifted multicut problem on paths. Let T = (V, E) be a tree. We use the short-hand notation uv = {u, v} for any pair of distinct vertices u, v ∈ V , which may or may not correspond to an edge. When convenient, we write n = |E| and m = | V 2 | = n(n−1)/2 for the number of edges and the total number of vertex pairs, respectively. For any pair of distinct nodes u, v ∈ V , denote by P uv the unique path from u to v in T . Moreover, we denote by d(u, v) the distance of u and v in T , i.e. the length of P uv . A multicut is a set of edges that are between the components of a partition of a graph [7] . The characteristic vector x ∈ {0, 1} E of a multicut (where x e = 1 if e ∈ E is cut) is lifted to the space {0, 1} ( V 2 ) by setting x uv = 1 if, and only if u and v are in distinct components of the underlying partition [12] . The lifted multicut problem on trees is to find a partition of a tree that minimizes a sum of costs associated with pairs of nodes that are in distinct components. It is formulated as an integer linear program as follows. Definition 1 (Lifted Multicut Problem). The lifted multicut polytope LMC w.r.t. T is defined as the convex hull of all x ∈ {0, 1} ( V 2 ) that satisfy the path and cut inequalities: The lifted multicut problem w.r.t. T and θ ∈ R ( V 2 ) is defined as Note that any vector x ∈ {0, 1} E is the characteristic vector of a multicut of T as any edge set of a tree defines a partition. The path and cut inequalities ensure that for any distinct pair of nodes u, v ∈ V it holds that x uv = 1 if, and only if the path P uv is cut at any of its edges. The lifted multicut problem on T can be equivalently formulated as the minimization of a particular multi-linear polynomial over binary inputs, which we refer to as tree partition problem. is called the instance of the tree partition problem w.r.t. T andθ. If T is a path, then we also refer to (TPP) as the path partition problem w.r.t. T andθ. It is straightforward to see, by a change of variables, that the problems (TPP) and (LMP) are equivalent (up to a constant). if, and only if, the unique x ∈ LMC such that x e = 1 − y e for all e ∈ E is a solution of problem (LMP) w.r.t. T and the cost vector θ = −θ. Proof. For any distinct pair of nodes u, v ∈ V , we set which implies Therefore, we can reformulate problem (TPP) in terms of the variables x uv by transforming the objective function according tō This leads to the linear combinatorial optimization problem where the definition of LMC captures the relationship (2). Apparently, problem (TPP) corresponds to the minimization of a certain class of pseudo-Boolean functions (PBF). More precisely, we call any n-variate PBF tree-sparse, if its multi-linear polynomial form can be aligned with a tree such that n = |E| and every non-zero coefficient corresponds to the edge set of a path in the tree. Similarly, we call it path-sparse if the tree is a path itself. Treesparse PBFs are exactly those PBFs that correspond to tree partition problems (TPP). Complexity. The tree partition problem, and thus problem (LMP), is np-hard in general (Lemma 2 below). However, the path partition problem is solvable in polynomial time [15] . Proof. If T is a star (see Fig. 1a for an example), then problem (TPP) is equivalent to the unconstrained binary quadratic program with |E| variables, which is well-known to be np-hard. In this section we study the facial structure of the lifted multicut polytope LMC. We characterize all trivial facets and offer an outer relaxation of LMC that is tighter than the standard relaxation given by [12] . In Sect. 5, we show that our results yield a complete totally dual integral (TDI) description of the lifted multicut polytope for paths. x 12 x 02 x 13 x 03 Fig. 1. a) A star with additional (thin, blue) edges between non-neighboring nodes corresponding to non-local variables. b) The node u(v) is the first internal node on the path Puv. c) A path of length at least three gives rise to an intersection inequality (10) . (Color figure online) We denote the standard outer relaxation of LMC by which is obtained by dropping the integrality constraints from the definition of LMC. In particular, any x ∈ P 0 satisfies all path and cut inequalities. Let u(v) be the first node on the path P uv that is different from both u and v (cf. Fig. 1b ) and consider the polytope This description is canonical in the sense that it only considers a quadratic number of node triplets, namely those which feature two neighboring nodes and an arbitrary third node. The following lemma states that P 1 is indeed an outer relaxation of LMC that is at least as tight as P 0 . Proof. We show first that LMC ⊆ P 1 . For this purpose, let This contradicts the fact that x satisfies all cut inequalities w.r.t. u(v), v and the path inequality w. ,v = 1 and x uv = 0. This contradicts the fact that x satisfies all cut inequalities w.r.t. u, v and the path inequality w.r.t. u(v), v. It follows that x ∈ P 1 . Now, we show that P 1 ⊆ P 0 . Let x ∈ P 1 . We need to show that x satisfies all path and cut inequalities. Let u, v ∈ V with d(u, v) ≥ 2. We proceed by induction on d (u, v) . If d(u, v) = 2, then the path and cut inequalities are directly given by the definition of P 1 (for the two possible orderings of u and v). If d(u, v) > 2, then the path inequality is obtained from x uv ≤ x u, u(v) + x u(v),v and the induction hypothesis for the pair Similarly, for any edge e on the path from u to v, we obtain the cut inequality w.r.t. e by using the induction hypothesis and x u(v),v ≤ x uv such that (w.l.o.g.) e is on the path from u(v) to v. It follows that x ∈ P 0 . In this section, we show which inequalities in the definition of P 1 define facets of LMC. Moreover, we present another type of inequalities associated with paths in T , which define facets of LMC. We note that further facets can be established by the connection of LMC to the multi-linear polytope and, as a special case, the Boolean quadric polytope [18] . (5) is satisfied with equality. We show that this implies Then the face of LMC induced by (5) has dimension at most m − 2 and hence cannot be a facet. In order to check that (6) holds, we distinguish the following three cases. If The inequality for some u, v ∈ V defines a facet of LMC if, and only if, v is a leaf of T . Proof. First, suppose v is not a leaf of T and let x ∈ LMC be such that (7) is satisfied with equality. Since v is not a leaf, there exists a neighbor w ∈ V of v such that P u(v),v is a subpath of P u(v),w We show that x additionally satisfies the equality and thus the face of LMC induced by (7) cannot be a facet. There are two possible cases: Either x uv = x u(v),v = 1, then x uw = x u(v),w = 1 as well, or ,w by contraction of the path P uv , so (8) holds. Now, suppose v is a leaf of T and let Σ be the face of LMC induced by (7) . We need to prove that Σ has dimension m−1. This can be done explicitly by showing that we can construct m−1 distinct indicator vectors 1 ww for w, w ∈ V as linear combinations of elements from the set 1} m be such that x uv ww = 0 for all w, w on the path P uv and x uv ww = 1 otherwise. Clearly, it holds that x uv ∈ LMC. Now, for u, v ∈ V we construct 1 uv recursively via the distance d (u, v) . Note that, except 1 u(v),v , all indicator vectors are constructed from vectors in S. Thus, we end up with m−1 vectors that are linearly independent and constructed as linear combinations of 1 − x uv ∈ S for u, v ∈ V . Hence, the set {1} ∪ {x uv | uv ∈ V 2 } is affine independent and the claim follows. Proof. We apply the more general characterization given by [12, Theorem 8] and [12, Theorem 9] . The nodes u, v ∈ V are a pair of ww -cut-vertices for some vertices w, w ∈ V (with at least one being different from u and v) if, and only if, u or v is not a leaf of V . Thus, the claim follows from [12, Theorem 9] . The second assertion follows from [12, Theorem 8] and the fact that we lift to the complete graph on V . Intersection Inequalities. We present another large class of non-trivial facets of LMC. For any u, v ∈ V with d(u, v) ≥ 3 consider the inequality which we refer to as intersection inequality. As an example consider the graph depicted in Fig. 1c with u = 0 and v = 3. Proof. Let x ∈ LMC ∩ Z m and suppose that either Then, since x satisfies all cut inequalities w.r.t. u, v(u), respectively u(v), v, and the path inequality w.r.t. u(v), v(u), it must hold that x u(v), v(u) = 0. Moreover, if even x u, v(u) = 0 = x u(v),v , then, by the same reasoning, we have x uv = 0 as well. Hence, x satisfies (10). Proof. Let Σ be the face of LMC induced by (10) for some u, v ∈ V with d(u, v) ≥ 3. We need to prove that Σ has dimension m − 1, which can be done explicitly by showing that we can construct m − 1 distinct indicator vectors 1 ww for w, w ∈ V as linear combinations of elements from the set We employ the same construction as used in the proof of Lemma 5. Observe that for any feasible x ∈ LMC ∩ Z m with x / ∈ S, we must have that ∈ S in the construction. Thus, we simply omit 1 u(v), v(u) from the construction. By the same reasoning, we conclude that the m constructed vectors } are affine independent, so the claim follows. In this section we show that the facets established in the previous section yield a complete description of LMC when T is a path. To this end, suppose that V = {0, . . . , n} and E = {i, i + 1} | i ∈ {0, . . . , n − 1} are linearly ordered. Therefore, T = (V, E) is path. We consider only paths of length n ≥ 2, since for n = 1, the polytope LMC = [0, 1] is simply the unit interval. Let P path be the polytope of all x ∈ R ( V 2 ) that satisfy x 0i ≤ x 0,i+1 ∀i ∈ {1, . . . , n − 1}, (13) Note that the system of inequalities (11)- (15) consists precisely of those inequalities which we have shown to define facets of LMC in the previous section. We first prove that P path indeed yields an outer relaxation of LMC. It holds that LMC ⊆ P path ⊆ P 1 . Proof. First, we show that LMC ⊆ P path . Let x ∈ LMC∩Z m , then x satisfies (11) and (14) by definition. Suppose x violates (12) , then x in = 1 and x i−1,n = 0. This contradicts the fact that x satisfies all cut inequalities w.r.t. i − 1, n and the path inequality w.r.t. i, n. So, x must satisfy (12) and, by symmetry, also (13) . It follows from Lemma 7 that x satisfies (15) as well and thus x ∈ P path . Next, we prove that P path ⊆ P 1 . To this end, let x ∈ P path . We show that x satisfies all inequalities (7) . Let u, v ∈ V with u < v − 1. We need to prove that both x u+1,v ≤ x uv and x u,v−1 ≤ x uv hold. For reasons of symmetry, it suffices to show only x u+1,v ≤ x uv . We proceed by induction on the distance of u from n. If v = n, then x u+1,n ≤ x un is given by (12) . Otherwise, we use (15) for j = u and k = v + 1 and the induction hypothesis on v + 1: It remains to show that x satisfies all inequalities (5) . Let u, v ∈ V with u < v−1. We proceed by induction on d(u, v) = u − v. If d(u, v) = 2, then (5) is given by (14) . If d(u, v) > 2, then we use (15) for j = u and k = v as well as the induction hypothesis on u, v − 1, which have distance d(u, v) − 1: Hence, x ∈ P 1 , which concludes the proof. As our main result, we prove that P path is in fact a complete description of LMC and, moreover, it is totally dual integral. For an extensive reference on the subject of total dual integrality we refer the reader to [21] . Definition 3. A system of linear inequalities Ax ≤ b with A ∈ Q k×m , b ∈ Q k is called totally dual integral (TDI) if for any c ∈ Z m such that the linear program max{c x | Ax ≤ b} is feasible and bounded, there exists an integral optimal dual solution. Total dual integrality is an important concept in polyhedral geometry as it provides a sufficient condition on the integrality of polyhedra according to the following fact. Proof. We rewrite system (11)-(15) more compactly in the following way. Introduce two artificial nodes −∞ and ∞ where we associate −∞ to any index less than 0 and ∞ to any index greater than n. Moreover, we introduce variables x ii for all 0 ≤ i ≤ n as well as x −∞,i and x i,∞ for all 1 ≤ i ≤ n − 1 and finally x −∞,∞ (cf. Fig. 2) . Now, the system (11)-(15) is equivalent to the system given the additional equality constraints (26) Let the system defined by (22)-(26) be represented in matrix form as Ax ≤ a, Bx = b. Note that P path is non-empty and bounded. Thus, to establish total dual integrality, we need to show that for any θ ∈ Z m+3n the dual program has an integral optimal solution. Here, the y variables, indexed by j, k, correspond to the inequalities (22) and the z variables, indexed by pairs of i, −∞ and ∞, correspond to the equations (23)-(26). Then, the equation system A y + B z = θ translates to where (28)-(30) hold for all 0 ≤ i < i + 1 < ≤ n and (31), (32) hold for all 1 ≤ i ≤ n − 1. Observe that (28) includes only y variables with indices of distance 2, (29) couples y variables of distance 3 with those of distance 2 and (30) couples the remaining y variables of any distance d > 3 with those of distance d − 1 and d − 2. Hence, any choice of values for the free variables z ii completely determines all y variables. This means we can eliminate y and reformulate the dual program entirely in terms of the z variables, as follows. It holds that We studied the lifted multicut polytope for the special case of trees lifted to the complete graph. We characterized a number of its facets and provided a tighter relaxation compared to the standard linear relaxation. Our analysis establishes a connection between the lifted multicut problem on trees and pseudo-Boolean optimization. Moreover, the described facets yield a complete totally dual integral description of the lifted multicut polytope for paths. This main results relates the geometry of the path partition problem to the combinatorial properties of the sequential set partitioning problem. Moreover, our insights can accelerate solvers for the tree and path partition problem based on integer linear programming. Correlation clustering Multicut brings automated neurite segmentation closer to human performance On quadratization of pseudo-Boolean functions Pseudo-Boolean optimization The partition problem Correlation clustering in general weighted graphs Complete descriptions of small multicut polytopes Geometry of Cuts and Metrics. AC A min-max relation for submodular functions on graphs A cutting plane algorithm for a clustering problem Roof duality, complementation and persistency in quadratic 0-1 optimization Analysis and optimization of graph decompositions by lifted multicuts Partitioning of sequentially ordered systems using linear programming Generalized roof duality Optimal sequential partitions of graphs Higher-order minimum cost lifted multicuts for motion segmentation Efficient decomposition of image and mesh graphs by lifted multicuts The Boolean quadric polytope: some characteristics, facets and relatives A polyhedral study of binary polynomial programs Optimizing binary MRFs via extended roof duality Theory of Linear and Integer Programming Multiple people tracking by lifted multicut and person re-identification for all 0 ≤ i ≤ ≤ n. Substituting the y variables in (31)-(33) yields the following equivalent formulation of the dual program (27):The variables z −∞,i , z i,∞ and z −∞,∞ occur only in a single equation each. Furthermore, the matrix corresponding to the inequality constraints satisfies the consecutive-ones property w.r.t. its rows. Therefore, the constraint matrix of the whole system is totally unimodular, which concludes the proof.Remark. The constraint matrix corresponding to the system (11)-(15) is in general not totally unimodular. A minimal example is the path of length 4. Proof. This is immediate from Lemma 9, Fact 1 and Theorem 1.Remark. The path partition problem admits a more efficient representation as a set partitioning problem as follows. For each 0 ≤ i ≤ ≤ n, letthen taking the dual of problem (35) and simplifying yields the problem0≤i≤k≤ ≤nEach variable λ i, corresponds to the component containing nodes i to . Problem (41) is precisely the sequential set partitioning formulation of the path partition problem as used by [13] . It admits a quadratic number of variables and a linear number of constraints (opposed to a quadratic number of constraints in the description of LMC). The integrality constraint need not be enforced, since the constraint matrix is totally unimodular.