title: A Resource Constraint Approach for One Global Constraint MINLP
authors: Muts, Pavlo; Nowak, Ivo; Hendrix, Eligius M. T.
date: 2020-08-20
journal: Computational Science and Its Applications - ICCSA 2020
DOI: 10.1007/978-3-030-58808-3_43

Many industrial optimization problems are sparse and can be formulated as block-separable mixed-integer nonlinear programming (MINLP) problems, where low-dimensional sub-problems are linked by a (linear) knapsack-like coupling constraint. This paper investigates exploiting this structure using decomposition and a resource constraint formulation of the problem. The idea is that one outer approximation master problem handles sub-problems that can be solved in parallel. The steps of the algorithm are illustrated with numerical examples, which show that convergence to the optimal solution requires only a few steps of solving sub-problems in lower dimension.

Mixed Integer Nonlinear Programming (MINLP) is a paradigm to optimize systems with both discrete variables and nonlinear constraints. Many real-world problems are large-scale, coming from areas like machine learning, computer vision, supply chain and manufacturing problems, etc. A large collection of real-world MINLP problems can be found in MINLPLib [13]. In general, these problems are hard to solve. Many approaches have been implemented over the last decades. Most deterministic solvers for nonconvex MINLP apply a single global branch-and-bound (BB) search tree [3,4] and compute a lower bound by a polyhedral relaxation, like ANTIGONE [8], BARON [11], Couenne [1], Lindo API [7] and SCIP [12]. The challenge for these methods is to handle a rapidly growing global search tree, which fills up the computer memory.

An alternative to BB is successive approximation. Such methods solve an optimization problem without handling a single global search tree. The Outer Approximation (OA) method [5,6,9,15] and the Extended Cutting Plane method [14] solve convex MINLPs by successive linearization of nonlinear constraints. One of the challenges we are dealing with is how to handle this for nonconvex MINLP problems.

In this paper, we focus on practical, potentially high-dimensional problems which in fact consist of sub-problems that are linked by one coupling constraint. This opens the opportunity to reformulate the problem as a resource constraint bi-objective problem, similar to the multi-objective view used in [2] for integer programming problems. We investigate the potential of this approach, combining it with Decomposition-based Inner- and Outer-Refinement (DIOR), see [10]. To investigate the question, we first write the general problem formulation with one global constraint and show that this can be approached by a resource-constrained formulation in Sect. 2. Section 3 presents a possible algorithm to solve such problems aiming at a guaranteed accuracy. The procedure is illustrated stepwise in Sect. 4. Section 5 summarizes our findings.

We consider block-separable (or quasi-separable) MINLP problems of the form

    min c^T x   s.t.   a^T x ≤ b,   x_k ∈ X_k, k ∈ K,                (1)

where the set of decision variables x ∈ R^n is partitioned into |K| blocks x = (x_k)_{k∈K}. The dimension of the variables x_k ∈ R^{n_k} in block k is n_k, such that n = Σ_{k∈K} n_k. The vectors x̲, x̄ ∈ R^n denote lower and upper bounds on the variables. The linear constraint a^T x ≤ b is called the resource constraint and is the only global link between the sub-problems.
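To make the block-separable structure concrete, the following Python sketch stores an instance of (1) as a list of blocks. It is our own illustration, not code from the paper, and the type names are hypothetical.

```python
# Illustrative data structure for a block-separable instance of (1):
# one linear objective c^T x, one global resource constraint a^T x <= b,
# and local feasible sets X_k described by bounds and constraints g_ki <= 0.
from dataclasses import dataclass
from typing import Callable, List
import numpy as np

@dataclass
class Block:
    c: np.ndarray                            # objective part c_k
    a: np.ndarray                            # resource part a_k (a_k != 0)
    lb: np.ndarray                           # lower bounds on x_k
    ub: np.ndarray                           # upper bounds on x_k
    g: List[Callable[[np.ndarray], float]]   # local constraints g_ki(x_k) <= 0

@dataclass
class BlockSeparableMINLP:
    blocks: List[Block]
    b: float                                 # resource budget

    def objective(self, xs: List[np.ndarray]) -> float:
        # c^T x = sum_k c_k^T x_k (block-separable)
        return sum(float(blk.c @ x) for blk, x in zip(self.blocks, xs))

    def resource(self, xs: List[np.ndarray]) -> float:
        # a^T x = sum_k a_k^T x_k, the only coupling between blocks
        return sum(float(blk.a @ x) for blk, x in zip(self.blocks, xs))
```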
We assume that the part a_k of a corresponding to block k has a_k ≠ 0; otherwise the corresponding block can be solved independently. The constraints defining the feasible set X_k are called local. Set X_k is defined by linear and nonlinear local constraint functions g_ki: R^{n_k} → R, which are assumed to be bounded on the box [x̲_k, x̄_k]. The linear objective function is defined by

    c^T x = Σ_{k∈K} c_k^T x_k.

The multi-objective approach of [2] is based on focusing on the lower dimensional space of the global constraints of the sub-problems rather than on the full n-dimensional space. We will outline how this relates to so-called Bi-Objective Programming (BOP) sub-problems based on a resource-constrained reformulation of the MINLP. If the MINLP (1) has a huge number of variables, it can be difficult to solve it in reasonable time; in particular, if the MINLP is defined by discretization of some infinite-dimensional variables, like in stochastic programming or in differential equations. For such problems, a resource-constrained perspective can be promising. The idea is to view the original problem (1) in n-dimensional space from two objectives: the objective function and the global resource constraint. Define the 2 × n_k matrix C_k with first row c_k^T and second row a_k^T, and consider the transformed feasible set

    V_k = {C_k x_k : x_k ∈ X_k} ⊂ R^2.

The resource-constrained formulation (RCP) of (1) is

    min Σ_{k∈K} v_k0   s.t.   Σ_{k∈K} v_k1 ≤ b,   v_k ∈ V_k, k ∈ K.   (5)

The approach to be developed uses the following property.

Proposition 1. Solving (1) is equivalent to the problem

    min Σ_{k∈K} v*_k0(v_k1)   s.t.   Σ_{k∈K} v_k1 ≤ b               (6)

of finding the appropriate values of v_k1, where v*_k0(v_k1) is the optimal value of sub-problem RCP_k given by

    v*_k0(v_k1) = min c_k^T x_k   s.t.   a_k^T x_k ≤ v_k1,  x_k ∈ X_k.

Proof. From the definitions it follows that the optimum of (6) coincides with the optimum of the resource-constrained formulation (5), and hence of (1).

This illustrates the idea that, by looking for the right assignment v_k1 of the resource, we can solve the lower dimensional sub-problems in order to obtain the complete solution. This provokes considering the problem as a potentially continuous knapsack problem. Approaching the problem as such would lead to having to solve the sub-problems many times in order to fill a grid of state values with a value function and to use interpolation. Instead, we investigate the idea of considering the problem as a bi-objective one, where we minimize both v_k0 and v_k1.

A bi-objective consideration of (5) changes the focus from the complete image set V_k to the relevant set of Pareto optimal points. Consider the sub-problem BOP_k of block k:

    min C_k x_k   s.t.   x_k ∈ X_k,                                   (8)

where both components of C_k x_k are to be minimized. The Pareto front of BOP_k (8) is defined by the set of vectors v_k = C_k x_k with x_k ∈ X_k with the property that there does not exist another feasible solution w = C_k y_k with y_k ∈ X_k which dominates v_k, i.e., for which w_i ≤ v_ki for i = 0, 1, and w_i < v_ki for i = 0 or i = 1. We will call an element of the Pareto front a nondominated point (NDP). In other words, an NDP is a feasible objective vector for which none of its components can be improved without making at least one of its other components worse. A feasible solution x_k ∈ X_k is called efficient (or Pareto optimal) if its image v_k = C_k x_k is an NDP, i.e. it is nondominated. Let us denote the Pareto front of NDPs of (8) by V*_k, and let V* be the set of points v with v_k ∈ V*_k for all k ∈ K.

Proposition 2. An optimal solution v* of (5) is attained at NDPs, i.e. v*_k ∈ V*_k for all k ∈ K.

Proof. Assume there exist parts v*_k ∉ V*_k of an optimal solution v*, i.e. these parts are dominated. This implies that for each such part there exists w_k ∈ V*_k with w_k ≤ v*_k. Consider v̄, the corresponding solution where in v* the parts v*_k are replaced by w_k. Then v̄ is feasible for RCP given Σ_{k∈K} v̄_k1 ≤ Σ_{k∈K} v*_k1 ≤ b, and its objective value is at least as good, Σ_{k∈K} v̄_k0 ≤ Σ_{k∈K} v*_k0, which means that the optimum is attained at NDP point v̄ ∈ V*.
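The dominance concepts above are easy to visualise numerically. The following sketch samples feasible points of a hypothetical block (box plus one invented local constraint, all data chosen for illustration only), maps them to the resource space by C_k, and filters the nondominated points; the paper itself obtains such points by solving sub-problems exactly, not by sampling.

```python
# Sketch of the resource-space view: map sampled feasible points x_k of one
# block through C_k (rows c_k^T and a_k^T) and keep the nondominated points.
import numpy as np

def pareto_filter(V: np.ndarray) -> np.ndarray:
    """Return the nondominated rows of V (componentwise minimization)."""
    keep = []
    for i, v in enumerate(V):
        dominated = any(np.all(w <= v) and np.any(w < v)
                        for j, w in enumerate(V) if j != i)
        if not dominated:
            keep.append(i)
    return V[keep]

rng = np.random.default_rng(1)
c_k = np.array([-1.0, -2.0])                   # hypothetical block data
a_k = np.array([2.0, 1.0])
C_k = np.vstack([c_k, a_k])                    # the 2 x n_k matrix C_k

X_k = rng.uniform(0.0, 2.0, size=(500, 2))     # sampled box points
X_k = X_k[X_k[:, 0] ** 2 + X_k[:, 1] <= 3.0]   # a local constraint g_k <= 0
V_k = X_k @ C_k.T                              # images v_k = C_k x_k
print(pareto_filter(V_k))                      # approximate Pareto front V_k^*
```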
In bi-objective optimization, an NDP can be computed by optimizing a weighted sum of the objectives:

    min d^T C_k x_k   s.t.   x_k ∈ X_k.                               (10)

For a positive weight vector d ∈ R^2, an optimal solution of (10) is an efficient solution of (8), i.e., its image is nondominated. Such a solution and its image are called a supported efficient solution and a supported NDP, respectively. Thus, an efficient solution x_k is supported if there exists a positive vector d for which x_k is an optimal solution of (10); otherwise x_k is called unsupported.

To illustrate the concepts, we use a simple numerical example, which can also be used to follow the steps of the algorithms to be introduced. It has |K| = 2 blocks of two variables each, and the local constraints are given by functions g_11, ... The optimal solution is x = (1, 1.5, 2, 2.5) with objective value −8.5. The corresponding points in the resource space are v_1 = (−4, 3.5) and v_2 = (−4.5, 6.5). Figure 1 sketches the resource spaces V_1 and V_2 with the corresponding Pareto fronts; the dominated part of V_k not covered by the Pareto front V*_k is drawn in blue. The number of supported Pareto points is limited.

The example suggests that we should look for an optimal resource v_k in a two-dimensional space, which seems more attractive than solving an n-dimensional problem. Meanwhile, sub-problems should be solved as few times as possible. According to Proposition 1, the optimization of problem (1) reduces to finding v*_k in the search space, which according to Proposition 2 can be found in the box [v̲_k, v̄_k] spanned by the Ideal and Nadir points of the Pareto front. First of all, if the global constraint is not binding, the dual variable μ = 0 and the sub-problems can be solved individually. However, usually μ > 0 gives us a lead of where to look for v*_k. Notice that if v*_k is a supported NDP, it can be found by optimizing the weighted sub-problem

    y_k(β) ∈ argmin (1, β) C_k x_k   s.t.   x_k ∈ X_k                 (11)

for β = μ; in that case, v*_k = C_k y_k(β). However, we do not know the dual value μ beforehand and, moreover, v*_k can be a nonsupported NDP. Notice that the resulting solution y_k(β) is an efficient point and C_k y_k(β) is a supported NDP for β ≥ 0. To look for the optimum, we first create an outer approximation (OA) W_k of V*_k by adding cuts, using an LP master problem to estimate the dual value and iteratively solving (11). Second, we will use the refinement of W_k in a MIP OA, which may also generate nonsupported NDPs.

Initially, we compute the Ideal and Nadir points v̲_k and v̄_k for each block k. This is done by solving (11) with β = ε and β = 1/ε, respectively. Let r_1 = C_k y_k(ε) and r_2 = C_k y_k(1/ε). These vectors bound the search space for each block and initiate a set P_k of outer cuts. We use the set R_k = {(r, β)} of supported NDPs r with corresponding weight β to define the local cut sets

    P_k = {v ∈ R^2 : (1, β) v ≥ (1, β) r, (r, β) ∈ R_k}.

An initial cut is generated using the direction vector orthogonal to the segment between r_1 and r_2, i.e. we find y_k(β_k0) in (11) with

    β_k0 = (r_10 − r_20) / (r_21 − r_11).                             (12)

Notice that if r_1 is also a solution of this problem, then apparently there does not exist a (supported) NDP v to the left of the line through r_1 and r_2, i.e. with (1, β) v < (1, β) r_1. In that case, the hyperplane through r_1 and r_2 is the lower left part of the convex hull of the Pareto front, and basically we can stop the search for supported NDPs for the corresponding block.

An LP outer approximation of (5) is given by

    min Σ_{k∈K} w_k0   s.t.   Σ_{k∈K} w_k1 ≤ b,   w_k ∈ P_k ∩ [v̲_k, v̄_k], k ∈ K.   (13)

It generates a relaxed solution w and an estimate β of the optimal dual μ. Then β is used to generate more support points by solving problem (11). Notice that for several values of β the same support point C_k y_k(β) may be generated. However, each value leads to another cut in P_k. Moreover, the solution w of (13) will be used later for the refinement of the outer approximation.
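The scalarization (11) and the initial weight (12) can be imitated over a finite set of candidate images. In the sketch below, the toy set V is loosely modelled on block 1 of the example (its Ideal and Nadir points), and the computed initial weight indeed reproduces β_10 ≈ 0.6957; the argmin over a finite set merely stands in for the exact MINLP sub-problem.

```python
# Sketch of the weighted-sum step (11) and the initial cut weight (12),
# over a finite candidate set V of resource-space points (illustration only).
import numpy as np

def weighted_min(V: np.ndarray, beta: float) -> np.ndarray:
    """Supported NDP: argmin over V of (1, beta) @ v, cf. (11)."""
    return V[np.argmin(V @ np.array([1.0, beta]))]

V = np.array([[-8.0, 11.5], [-6.0, 8.0], [-4.0, 3.5], [0.0, 0.0]])  # toy set
eps = 1e-3
r1 = weighted_min(V, eps)        # approximately the minimum cost point
r2 = weighted_min(V, 1.0 / eps)  # approximately the minimum resource point
# Weight of the first cut, orthogonal to the segment r1-r2, cf. (12):
beta0 = (r1[0] - r2[0]) / (r2[1] - r1[1])
print(r1, r2, beta0)             # for block 1 of the example: beta0 ~ 0.6957
```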
A sketch of the algorithm is given in Algorithm 1. An outer approximation W_k of the Pareto front V*_k is based on the observation that the cone {v ∈ R^2 : v ≥ v̲_k} contains the Pareto front. Basically, we refine this set as a union of cones based on all found NDPs. We keep a list P_k = {p_k0, p_k1, ..., p_k|P_k|} of NDPs of what we will define as the extended Pareto front, ordered according to the objective value, p_k00 < p_k10 < ... < p_k|P_k|0. Initially, the list P_k contains the supported NDPs we found in R_k. However, we will generate more, potentially nonsupported, NDPs in a second phase. Using list P_k, the front V*_k is (outer) approximated by W_k = ∪_i W_ki, where each cone W_ki is spanned by the consecutive NDPs p_ki and p_k,i+1.

Algorithm 1. Generate OA
 1: function initOA(ε, q_max)
 2:   for k ∈ K do
 3:     stop_k ← false
 4:     use (8) to determine v̲_k and v̄_k via r_1 and r_2
 5:     determine β_k0 of (12) and solve (11) with β_k0
 6:     if r_1 ∈ argmin of (11) then
 7:       stop_k ← true, stop searching for supported NDPs
 8:   q ← 1
 9:   (w, β_q) ← (primal, dual) solution of (13)
10:   repeat
11:     for k ∈ K do
12:       if not stop_k and β_q ≠ β_k0 then
13:         y_k(β_q) ← solution of (11); store (C_k y_k(β_q), β_q) in R_k
14:     q ← q + 1; (w, β_q) ← (primal, dual) solution of (13), using R_k to define P_k
15:   until (∃ j = 1, ..., q−1 : |β_q − β_j| < ε) or (q = q_max) or (∀ k ∈ K: stop_k)
16:   return R_k, w, stop_k

We solve a MIP master problem (starting from the last solution found by the LP OA) to generate a solution w, for which potentially not for all blocks k we have w_k ∈ V*_k. Based on the found points, we generate new NDPs v_k to refine the set W_k, until convergence takes place. Moreover, we check whether v_k is a supported NDP, or generate a new supported NDP in order to add a cut to P_k. The master MIP problem is given by

    min Σ_{k∈K} w_k0   s.t.   Σ_{k∈K} w_k1 ≤ b,   w_k ∈ W_k, k ∈ K,   (14)

which can be implemented as a MIP by introducing binary variables δ_ki that assign w_k to one of the cones W_ki, with Σ_i δ_ki = 1 for each block k.

Basically, if the solution w of (14) corresponds to NDPs, i.e. w_k ∈ V*_k for all blocks k, we are done and have solved the problem according to Proposition 2. If w_k ∉ V*_k, we refine W_k such that w_k is excluded in the next iteration. In order to obtain refinement points p, we introduce the extended Pareto front, which also covers the gaps. Define the extended search area as V̂_k = V_k + R^2_+, i.e. the set of points dominated by some point of V_k. Then the extended Pareto front is given by the set of NDPs of V̂_k, which also covers the gaps of V*_k. To eliminate w_k in the refinement, we perform the line search v = w_k + λ(1, β)^T, λ ≥ 0, in the direction (1, β)^T, with β the current dual estimate. A possible MINLP sub-problem line search is the following:

    λ_k = min λ   s.t.   C_k x_k ≤ w_k + λ(1, β)^T,  x_k ∈ X_k,  λ ≥ 0,   (19)

and we take v_k = w_k + λ_k (1, β)^T as an NDP of the (extended) Pareto front. Notice that if λ_k = 0, apparently w_k ∈ V*_k.

Algorithm 2. OA refinement
 1: function OAsubs(ε)
 2:   take the results from Algorithm 1
 3:   for all k, initiate list P_k with the NDPs in R_k
 4:   w ← solution of LP OA master (13)
 5:   repeat
 6:     for k ∈ K do
 7:       if w_k ∉ P_k then
 8:         find v_k solving (19)
 9:         insert v_k in list P_k and update W_k
10:         if (w_k ≠ v_k) and (not stop_k) then
11:           determine a weight β for an additional cut
12:           solve (11) and add (C_k y_k(β), β) to R_k; update P_k
13:     w ← solution of MIP master (14)
14:   until ∀ k ∈ K, ∃ p ∈ P_k : ‖w_k − p‖ < ε

Moreover, we try to generate an additional cut by producing a supported NDP. The complete idea of the algorithm is sketched in Algorithm 2. Let z_k = w_k0 if w_k ∈ P_k, and z_k = v_k0 otherwise. Then Σ_{k∈K} z_k provides an upper bound on the optimal function value. In this way, an implementation can trace the convergence towards the optimum. Based on the concept that in each iteration the MIP OA iterate w is cut out due to the refinement in v_k, it can be proven that the algorithm converges to an optimal resource allocation v*.
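The line search (19) is a one-dimensional problem: increase λ until w_k + λ(1, β)^T becomes reachable, i.e. dominates the image of some feasible x_k. The sketch below is our illustration: it replaces the MINLP feasibility oracle of (19) by a finite set of achievable images and finds the minimal λ by bisection, exploiting that reachability is monotone in λ.

```python
# Sketch of the line search (19): starting from a master solution w_k, move in
# direction (1, beta) until v = w_k + lambda*(1, beta) is reachable, i.e. some
# achievable image u satisfies u <= v componentwise. Reachability is checked
# against a finite set V of images (a stand-in for the exact MINLP oracle).
import numpy as np

def line_search(w: np.ndarray, beta: float, V: np.ndarray,
                lam_hi: float = 100.0, tol: float = 1e-6) -> float:
    d = np.array([1.0, beta])
    def reachable(lam: float) -> bool:
        v = w + lam * d
        return bool(np.any(np.all(V <= v + 1e-12, axis=1)))
    lo, hi = 0.0, lam_hi
    if reachable(lo):
        return 0.0            # w_k already lies in the extended search area
    while hi - lo > tol:      # bisection on the scalar lambda
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if reachable(mid) else (mid, hi)
    return hi

V = np.array([[-8.0, 11.5], [-4.0, 3.5], [0.0, 0.0]])  # toy achievable images
w = np.array([-6.0, 2.0])                              # infeasible master point
lam = line_search(w, 0.6957, V)
print(lam, w + lam * np.array([1.0, 0.6957]))          # NDP of extended front
```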
In this section, we use several instances to illustrate the steps of the presented algorithms. All instances use an accuracy of ε = 0.001, and we focus on the number of times the OA-MIP (14), sub-problem (11) and line-search sub-problem (19) are solved. First we go stepwise through the example problem. Second, we focus on a concave optimization problem known as ex2_1_1 of MINLPLib [13]. At last, we go through the convex and relatively easy version of ex2_1_1 in order to illustrate the difference in behaviour.

First of all, we build the OA W_k of the Pareto front following Algorithm 1. The Ideal and Nadir points for each block, spanning the boxes [(−8, 0)^T, (0, 11.5)^T] for block 1 and [(−6, 5)^T, (−3, 11)^T] for block 2, are found by minimizing cost and resource. Based on these extreme values, we run sub-problem (11) in step 5 of Algorithm 1, using the weights β_10 = 0.6957 and β_20 = 0.5 according to (12). For this specific example, we already reach the solution corresponding to (v_10, v_11, v_20, v_21) = (−4, 3.5, −4.5, 6.5)^T, which still has to be proven to be optimal. One can observe the first corresponding cuts in Fig. 2. We run the LP OA (13), which generates a dual value β = 0.6957; this corresponds to the angle of the first cut in the V_1 space. This means that sub-problem (11) does not have to be run for the first block, as it would generate the same cut. For the second block, it finds the same support point v_2 = (−4.5, 6.5)^T, but adds a cut with a different angle to P_2, as can be observed in the figure.

Algorithm 2 first stores the found extreme points and optimal sub-problem solutions in the point sets P_k. The Matlab script we used provides an LP-OA solution w subject to small numerical errors. Due to these errors, both blocks do not consider v_k = w_k and add a cut according to step 12 of the algorithm. Interestingly enough, the MIP OA in the space as drawn in Fig. 2 reaches an infeasible point w, i.e. w_k ∈ W_k but w_k ∉ V_k, further away from the optimum. This provides the incentive to find a point v_k in the extended Pareto set, as sketched in Fig. 3. This helps to reach the optimum with exact accuracy. For this instance, the MIP OA was solved twice and in the end contains 6 binary variables. Both blocks solve two sub-problems to reach the Ideal and Nadir points, and solve the line-search problem (19) twice. The first block solves sub-problem (11) 3 times and the second block once more, to generate an additional cut. In general, in each iteration at most two sub-problems are solved for each block; the idea is that this can be done in parallel. Notice that the refinement causes the MIP OA to gain at most |K| additional binary variables in each iteration.

The instance ex2_1_1 can be characterised as a worst-case type of knapsack problem, where the usual heuristic of selecting the items with the best benefit-weight ratio first provides the wrong solution. As we will observe, a similar behaviour can be found using the OA relaxations. All variables are continuous and we are implicitly minimizing a concave quadratic function. In our description, n = 10, c = (0, 1, 0, 1, 0, 1, 0, 1, 0, 1), a = (20, 0, 12, 0, 11, 0, 7, 0, 4, 0), b = 40, and K = {1, 2, 3, 4, 5} divides vector x into 5 equal blocks x_k = (x_{2k−1}, x_{2k}), k ∈ K. The local constraints are given by g_k(y, z) = q_k y − 50y^2 − z ≤ 0, with q = (42, 44, 45, 47, 47.5) as in the convex variant below. The optimal solution is x = (1, −8, 1, −6, 0, 0, 1, −3, 0, 0) with objective value −17. However, the LP OA relaxation provides a first underestimate of the objective of −18.9, where all sub-problems take point r_1 apart from the first one, where w_1 = (−2.4, 6). The solution and its consequence for the first iteration are sketched in Fig. 4.
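Before following the iterations further, the reconstructed instance data can be sanity-checked. Since each block objective q_k y − 50y^2 is concave in y, the per-block minimum is attained at a vertex y ∈ {0, 1} (assuming bounds y_k ∈ [0, 1]), so a short brute-force enumeration verifies both the reported optimum −17 and the failure of the greedy ratio heuristic:

```python
# Numerical check of the concave instance as reconstructed in the text:
# q = (42, 44, 45, 47, 47.5), resource weights (20, 12, 11, 7, 4), b = 40.
# Eliminating z_k = q_k*y_k - 50*y_k^2 leaves a concave block objective,
# so brute force over the vertices y_k in {0, 1} verifies the optimum.
from itertools import product

q = [42.0, 44.0, 45.0, 47.0, 47.5]
a = [20.0, 12.0, 11.0, 7.0, 4.0]
b = 40.0

best = (0.0, None)
for ys in product([0.0, 1.0], repeat=5):
    if sum(ai * yi for ai, yi in zip(a, ys)) <= b:
        obj = sum(qi * yi - 50.0 * yi ** 2 for qi, yi in zip(q, ys))
        if obj < best[0]:
            best = (obj, ys)
print(best)   # (-17.0, (1, 1, 0, 1, 0)): matches the reported optimum

# The greedy benefit-weight heuristic orders blocks by (50 - q_k)/a_k:
ratio = sorted(range(5), key=lambda k: (50.0 - q[k]) / a[k], reverse=True)
cap, val = b, 0.0
for k in ratio:
    if a[k] <= cap:
        cap -= a[k]; val += q[k] - 50.0
print(val)    # -16.5: the greedy solution is suboptimal, as noted in the text
```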
One can also observe the resulting line search that finds the first solution v_1 leading to the first refinement. The bad approximation of the first solution corresponds to the result of a greedy knapsack algorithm. For this instance, an iterative refinement is necessary, which generates the points v_k until convergence is reached. Another typical characteristic is the concave shape of the Pareto front. This implies that the first cut is in fact the green line between r_1 and r_2. This is detected in step 6 of Algorithm 1 and implies that there are no more supported NDPs than r_1 and r_2, so that one does not have to run sub-problem (11) anymore. However, the algorithm requires solving the line-search problem (19) for more and more sub-problems in each MIP OA iteration. In total, the algorithm requires 9 steps of the MIP OA algorithm to reach the accuracy. In the end, the problem contains 26 binary variables δ over all sub-problems. The intensity is best depicted by the refinement of W_1 in Fig. 5. The algorithm requires solving sub-problem (11) 3 times per block, to generate r_1 and r_2 and to find the cut. In total, it solved the line search (19) 21 times.

Fig. 6. At the left, the Pareto fronts of all sub-problems combined. The minimum resource point r_2 = (0, 0)^T is common to all sub-problems and the minimum cost solution is given by a blue circle. The first cut is illustrated by a green line. At the right, the OA of V_1 after one iteration. Actually, it stays the same during future iterations, as w_1 coincides numerically with a found NDP. (Color figure online)

Convex problems are usually considered easy to solve. The idea of having a so-called zero duality gap is captured by the LP OA already solving the problem when the dual value is used to generate cuts. We use the data of the former instance, but are now implicitly minimizing a convex quadratic function. Again n = 10, c = (0, 1, 0, 1, 0, 1, 0, 1, 0, 1) and a = (20, 0, 12, 0, 11, 0, 7, 0, 4, 0), but now taking b = 15 in order to have a binding global constraint. The local constraints are given by g_k(y, z) = −q_k y + 50y^2 − z, where q = (42, 44, 45, 47, 47.5). With the corresponding bounds on the variables, the optimal objective function value is −45.62. The Pareto fronts do not exhibit gaps, but are smooth quadratic curves. This means that solving the weighted problem (11) with different values of β provides different optima. In Fig. 6, all Pareto fronts are depicted together with the first generated cut. The LP OA relaxation of Algorithm 1 provides a first, relatively sharp, underestimate of the objective of −45.7. This illustrates the idea of decomposition for convex problems, where a good value for the dual provides the optimum for the primal problem. The big difference with the concave case is that now in each iteration in principle new cuts can be generated. Considering the first block V_1, this does not really happen, as the optimum w_1 coincides with a point v_1 on the Pareto front found by the line search. Figure 6 illustrates at the right the MIP OA based on the points in set P_1 on the Pareto front. Further refinement does not improve the bound, as the optimum has already been approximated in the first iteration.
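With the same reconstructed data and b = 15, eliminating z_k = 50y_k^2 − q_k y_k turns the convex instance into a convex QP in y, so a local NLP solver suffices to reproduce the reported optimum of about −45.62. A minimal check with SciPy, assuming bounds y_k ∈ [0, 1]:

```python
# Check of the convex instance (reconstructed data; b = 15). Minimizing
# sum_k (50*y_k^2 - q_k*y_k) subject to a^T y <= 15, 0 <= y <= 1 is a
# convex QP, so a local solver finds the global optimum.
import numpy as np
from scipy.optimize import minimize

q = np.array([42.0, 44.0, 45.0, 47.0, 47.5])
a = np.array([20.0, 12.0, 11.0, 7.0, 4.0])
b = 15.0

obj = lambda y: float(np.sum(50.0 * y ** 2 - q * y))
res = minimize(obj, x0=np.full(5, 0.2), method="SLSQP",
               bounds=[(0.0, 1.0)] * 5,
               constraints=[{"type": "ineq", "fun": lambda y: b - a @ y}])
print(res.fun, res.x)   # ~ -45.62, consistent with the reported optimum
```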
Mixed Integer Nonlinear Programming (MINLP) is a strong concept for formulating practical optimization problems. Solvers based on the branch-and-bound concept usually suffer from the number of variables n of the problem. For instances having one global inequality, we investigated the potential of using decomposition, requiring sub-problems of a smaller size n_k, k ∈ K, to be solved, with a MIP master problem exploiting the concept of resource-constrained programming and a Pareto front. The result of our investigation is a decomposition algorithm aiming at convergence up to a guaranteed accuracy. In each iteration, a MIP master problem is solved and at most 2|K| sub-problems, which can be solved in parallel. The process is illustrated with graphical examples which are solved up to an accuracy of ε = 0.001 after a few iterations.

At the moment, we are working on designing algorithms for instances with more than one global constraint, resulting in a Pareto set of higher dimension. The approach presented here can be extended and proven to converge. However, obtaining an algorithm with reasonable practical performance requires rethinking the cut generation in a resource space of higher dimension. We will report on this topic in future papers.

References
1. Branching and bounds tightening techniques for non-convex MINLP
2. Decomposition of loosely coupled integer programs: a multiobjective perspective
3. Non-convex mixed-integer nonlinear programming: a survey
5. An outer-approximation algorithm for a class of mixed-integer nonlinear programs
6. Solving mixed integer nonlinear programs by outer approximation
7. The global solver in the LINDO API
8. ANTIGONE: Algorithms for coNTinuous/Integer Global Optimization of Nonlinear Equations
9. The decomposition-based outer approximation algorithm for convex mixed-integer nonlinear programming
10. Decomposition-based inner- and outer-refinement algorithms for global optimization
11. A polyhedral branch-and-cut approach to global optimization
12. Decomposition in multistage stochastic programming and a constraint integer programming approach to mixed-integer nonlinear programming
13. MINLPLib: a library of mixed-integer and continuous nonlinear programming instances. http://www.minlplib.org
14. An extended cutting plane method for solving convex MINLP problems
15. Une méthode d'optimisation non linéaire en variables mixtes pour la conception de procédés

Acknowledgments. This paper has been supported by the Spanish Ministry (grant RTI2018-095993-B-100), in part financed by the European Regional Development Fund (ERDF), and by grant 03ET4053B of the German Federal Ministry for Economic Affairs and Energy.