
(1)

Advanced Decomposition Methods Part I: all is one, one is all

Antonio Frangioni

Dipartimento di Informatica, Università di Pisa

COST/MINO Ph.D. School on Advanced Optimization Methods

Roma — June 8, 2016

(2)

Outline

1 Block-Structured (Mixed-Integer NonLinear) Programs

2 Dual decomposition (Dantzig-Wolfe/Lagrangian/Column Generation)

3 Primal decomposition (Benders’/Resource)

4 The Integer Case

5 Example Applications (the Reformulation before the Reformulation)

6 Conclusions (for now)

(3)

Block-Structured Programs

Many applications of Mixed-Integer NonLinear Programming are large-scale: millions/billions of variables/constraints

Good news: (almost) all large-scale problems are block-structured. Usually several nested forms of structure, but two main ones:

[Figure: block-diagonal constraint matrix ≡ complicating constraints vs. staircase-structured constraint matrix ≡ complicating variables]

Relaxing constraints / fixing variables yields independent subproblems

=⇒ much easier because of size and/or structure (integrality, . . . )


(4)

Example I: Two-stage Stochastic (Linear) Programs

Problems involving decisions over time and uncertainty. First-stage (here-and-now) decisions x, constraints E_0 x ≤ b_0

Set S of scenarios, realization known only after deciding x. Recourse decisions z_s, different for each scenario s ∈ S, constraints E_{0s} x + E_s z_s ≤ b_s

Minimize here-and-now cost plus average cost of recourse actions:

min { c_0 x + Σ_{s∈S} π_s c_s z_s : E_0 x ≤ b_0 , E_{0s} x + E_s z_s ≤ b_s  s ∈ S }

Extends to multi-stage (structure repeats "fractally" into each E_s). Often other structures inside E_s, network a common one

Extends to nonlinear risk measures (CVaR, . . . ), integer variables, . . .

Many applications: energy[1], water, logistics, telecom, finance, . . .

[1] Tahanan, van Ackooij, F., Lacalandra “Large-scale Unit Commitment under uncertainty” 4OR 2015
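To make the block structure visible (an illustrative sketch, not on the original slide): for |S| = 2 the deterministic equivalent has the block-angular constraint matrix

\[
\begin{pmatrix} E_0 & & \\ E_{01} & E_1 & \\ E_{02} & & E_2 \end{pmatrix}
\begin{pmatrix} x \\ z_1 \\ z_2 \end{pmatrix} \le
\begin{pmatrix} b_0 \\ b_1 \\ b_2 \end{pmatrix}
\]

where the scenario blocks E_1, E_2 only touch their own z_s, and the first-stage columns (E_0, E_{01}, E_{02}) are the complicating variables: fixing x makes the scenarios independent.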

(5)

Example II: (Linear) Multicommodity Network Design

Graph G = (N, A), multicommodity network design model:

min Σ_{k∈K} Σ_{(i,j)∈A} d^k c^k_{ij} x^k_{ij} + Σ_{(i,j)∈A} f_{ij} z_{ij}   (1)

Σ_{(i,j)∈A} x^k_{ij} − Σ_{(j,i)∈A} x^k_{ji} = ( 1 if i = s_k ; −1 if i = t_k ; 0 otherwise )   i ∈ N , k ∈ K   (2)

Σ_{k∈K} d^k x^k_{ij} ≤ u_{ij} z_{ij}   (i,j) ∈ A   (3)

x^k_{ij} ∈ [0,1]   (i,j) ∈ A , k ∈ K   (4)

z_{ij} ∈ {0,1}   (i,j) ∈ A   (5)

K ≡ commodities ≡ (s_k, t_k, d_k) (not completely generic)

Pervasive structure in most of combinatorial optimization. Many applications: logistics, transportation, telecom, energy, . . .

Nonlinear objective functions/constraints (energy, delay[2], . . . )

[2] F., Galli, Scutellà "Delay-Constrained Shortest Paths: Approximation Algorithms and SOCP Models" JOTA, 2015


(6)

Dual decomposition, a.k.a.

Inner Approximation

Dantzig-Wolfe decomposition
Lagrangian Relaxation

Column Generation

(7)

Block-diagonal Convex (Linear) Program

Block-diagonal program: convex X, n “complicating” constraints (Π) max { cx : Ax = b, x ∈ X }

e.g, X = { x : Ex ≤ d } =N

k∈K Xk = { xk : Ekxk ≤ dk} (|K | large =⇒ (Π) verylarge), Ax = b linking constraints We can efficiently optimize upon X, for different reasons:

a bunch of (many, much) smaller problems instead of a large one X has (the Xk have)structure (shortest path, knapsack, . . . ) (much more so than solving the whole of (Π), anyway)

In other words wecould efficiently solve (Π) if linking constraints were removed: how to exploit it?
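Why relaxing the linking constraints decouples the blocks, in one line (a sketch, with c = (c^k)_{k∈K} and A = (A^k)_{k∈K} split column-wise over the blocks):

\[
\max\{\, (c - yA)x \,:\, x \in X \,\} \;=\; \sum_{k \in K} \max\{\, (c^k - yA^k)x^k \,:\, x^k \in X^k \,\}
\]

i.e., once Ax = b is dealt with, both the objective and the feasible set separate over k.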


(8)

Dantzig-Wolfe reformulation

Dantzig-Wolfe reformulation[3]: X convex =⇒ represent it by points

X = { x = Σ_{x̄∈X} x̄ θ_x̄ : Σ_{x̄∈X} θ_x̄ = 1 , θ_x̄ ≥ 0  x̄ ∈ X }

then reformulate (Π) in terms of the convex multipliers θ:

(Π)  max  c ( Σ_{x̄∈X} x̄ θ_x̄ )
     A ( Σ_{x̄∈X} x̄ θ_x̄ ) = b
     Σ_{x̄∈X} θ_x̄ = 1 , θ_x̄ ≥ 0  x̄ ∈ X

only n + 1 rows (but how many columns?)

note that "x̄ ∈ X" is an index, not a constraint (θ is the variable)

A rather semi-infinite program, but "only" x̄ ∈ ext X needed. Not that this makes it any less infinite, unless X is a polytope (compact polyhedron) =⇒ finite set of vertices

[3] Dantzig, Wolfe “The Decomposition Principle for Linear Programs” Operations Research 1960

(9)

Dantzig-Wolfe reformulation (cont.d)

Could this ever be a good idea? Actually, it could: polyhedra may have few faces and many vertices . . . or vice-versa:

n-cube |x_i| ≤ 1 ∀i: 2n faces, 2^n vertices
n-co-cube Σ_i |x_i| ≤ 1: 2^n faces, 2n vertices
(e.g., for n = 3: the cube has 6 facets and 8 vertices, the octahedron 8 facets and 6 vertices)

Except, most often the number of vertices is too large

The master is a (linear) program with (exponentially/infinitely) many columns. But, efficiently optimize over X =⇒ generate vertices (≡ columns)


(10)

Dantzig-Wolfe decomposition ≡ Column Generation

B ⊂ X (small), solve restriction of (Π) with X → B, i.e.,

(Π_B)  max  Σ_{x̄∈B} (c x̄) θ_x̄
       Σ_{x̄∈B} (A x̄) θ_x̄ = b
       Σ_{x̄∈B} θ_x̄ = 1 , θ_x̄ ≥ 0  x̄ ∈ B

"master problem" (B small, not too costly)

note how the parentheses have moved: linearity is needed (for now)

If B contains the "right" columns, x* = Σ_{x̄∈B} x̄ θ*_x̄ optimal for (Π). How do I tell if B contains the "right" columns? Use duality:

(∆_B)  min { yb + v : v ≥ c x̄ − y (A x̄)  x̄ ∈ B } = min f_B(y) , f_B(y) = max { c x̄ + y (b − A x̄) : x̄ ∈ B }

one constraint for each x̄ ∈ B

(11)

Dantzig-Wolfe decomposition ≡ Lagrangian relaxation

Dual of (Π): (∆) ≡ (∆_X) (many constraints); f_B = lower approximation of the Lagrangian function

(Π_y)  f(y) = max { cx + y (b − Ax) : x ∈ X }

Assumption: optimizing over X is "easy" for each objective =⇒ obtaining x̄ s.t. f(y) = c x̄ + y (b − A x̄) is "easy"

Important: (Π_y) Lagrangian relaxation[4], f(y) ≥ v(Π) = v(∆) ∀y, provided (Π_y) is solved exactly (or at least an f̄ ≥ f(y) is used)

Thus, (∆_B) outer approximation of the Lagrangian dual

(∆)  min f(y) = max { cx + y (b − Ax) : x ∈ X }

[4] Geoffrion “Lagrangean relaxation for integer programming” Mathematical Programming Study 1974
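The bound f(y) ≥ v(Π) is plain weak duality; a one-line proof, for any x feasible to (Π) and any y:

\[
cx \;=\; cx + y(b - Ax) \;\le\; \max\{\, cx' + y(b - Ax') \,:\, x' \in X \,\} \;=\; f(y)
\]

since Ax = b makes the added term vanish; maximizing over the feasible x gives v(Π) ≤ f(y) for every y.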


(12)

Lagrangian duality vs. Linear duality

Note about the LP case (X = { x : Ex ≤ d }):

(∆)  min_y  yb + max { (c − yA)x : Ex ≤ d }
   ≡ min_y  yb + min { wd : wE = c − yA , w ≥ 0 }
   ≡ min { yb + wd : wE + yA = c , w ≥ 0 }
   ≡ exactly the linear dual of (Π)

y* "partial" duals: the duals w of Ex ≤ d are "hidden" in the subproblem

There is only one duality

Will repeatedly come in handy

(13)

Dantzig-Wolfe decomposition ≡ Dual row generation

Primal/dual optimal solution x*/(v*, y*) out of (Π_B)/(∆_B); x* feasible to (Π), so optimal ⇐⇒ (v*, y*) feasible to (∆)

⇐⇒ v* ≥ (c − y*A)x  ∀x ∈ X
⇐⇒ v* ≥ max { (c − y*A)x : x ∈ X }

In fact: v* ≥ (c − y*A)x̄ ≡ y*b + v* ≥ f(y*) =⇒

v(Π) ≥ cx* = y*b + v* ≥ f(y*) ≥ v(∆) ≥ v(Π) =⇒ x*/(v*, y*) optimal

Otherwise, B = B ∪ { x̄ }: add new column to (Π_B) / row to (∆_B), rinse, repeat

Clearly finite if ext X is, globally convergent anyway:

the cutting-plane algorithm for convex programs[5] (applied to (∆))

[5] Kelley “The Cutting-Plane Method for Solving Convex Programs” J. of the SIAM, 1960

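A minimal runnable sketch of the loop of slides (10)-(13), read on the dual side as Kelley's method for (∆); everything concrete here is an illustrative assumption, not from the slides: X = [0,1]^n so the oracle is solvable by inspection, random data, artificial bounds |y_i| ≤ M keeping the first master bounded, and an ad-hoc tolerance.

```python
# Sketch: cutting-plane / column-generation loop for
# (Delta) min f(y) = yb + max{(c - yA)x : x in X}, with X = [0,1]^n.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, m = 8, 3                            # x-variables, linking rows Ax = b
A, c = rng.normal(size=(m, n)), rng.normal(size=n)
b = A @ np.full(n, 0.5)                # b chosen so that (Pi) is feasible
M = 1e3                                # artificial bounds on y (assumption)

def oracle(y):
    """Lagrangian subproblem max{(c - yA)x : x in [0,1]^n}: over a box it
    separates by coordinate, so pick the bound matching the sign."""
    r = c - y @ A
    x = (r > 0).astype(float)
    return x, y @ b + r @ x            # new point xbar, value f(y)

B = [oracle(np.zeros(m))[0]]           # initial bundle: one column
for it in range(1, 101):
    # master (Delta_B): min yb + v  s.t.  v >= c xbar - y (A xbar) for
    # all xbar in B, rewritten as  -(A xbar) y - v <= -c xbar
    A_ub = np.array([np.concatenate([-(A @ x), [-1.0]]) for x in B])
    b_ub = np.array([-(c @ x) for x in B])
    res = linprog(np.concatenate([b, [1.0]]), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(-M, M)] * m + [(None, None)], method="highs")
    y = res.x[:m]                      # res.fun = f_B(y), a lower bound
                                       # on v(Delta) if M is large enough
    xbar, f_y = oracle(y)              # f(y) >= v(Delta): an upper bound
    if f_y - res.fun <= 1e-8 * (1.0 + abs(f_y)):
        break                          # f and f_B agree at y: optimal
    B.append(xbar)                     # new column of (Pi_B) = row of (Delta_B)

print(f"v(Delta) ~ {f_y:.6f} after {it} oracle calls")
```

Read in the primal, the same loop is exactly Dantzig-Wolfe column generation: each x̄ returned by the oracle is a new column of (Π_B), and f(y) − f_B(y) is the stopping gap.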

(14)

Geometry of the Lagrangian dual

y f x2

y*

fB x3

x4 x1

x5

x6

v*

v = fB(y) lower bound on v (ΠB)

Optimal solutionx¯ gives separatorbetween (v, y) and epi f (c ¯x , A¯x ) = new row in (∆B) (subgradient of f at y)


(17)

Dantzig-Wolfe decomposition ≡ Inner Approximation

"Abstract" view of (Π_B): conv(B) inner approximation of X

(Π_B)  max { cx : Ax = b , x ∈ conv(B) }

x* solves the Lagrangian relaxation of (Π_B) with y*, i.e.,

x* ∈ argmax { (c − y*A)x : x ∈ conv(B) }

=⇒ (c − y*A)x ≤ (c − y*A)x* for each x ∈ conv(B) ⊆ X

(c − y*A)x̄ = max { (c − y*A)x : x ∈ X } ≥ (c − y*A)x*

Column x̄ has positive reduced cost:

(c − y*A)(x̄ − x*) = (c − y*A)x̄ − cx* + y*b = (c − y*A)x̄ − v* > 0

=⇒ x̄ ∉ conv(B) =⇒ makes sense to add x̄ to B


(18)

Geometry of Dantzig-Wolfe/Column Generation

[Figure: conv(B), X, the constraint Ax = b and the tilted objective c − y*A]

c − y*A separates conv(B) ∩ {Ax = b} from all x ∈ X better than x*

Thus, optimizing it allows finding new points (if any)

Issue: conv(B) ∩ {Ax = b} must be nonempty


(21)

Extension I: the Unbounded Case

X unbounded ⇐⇒ rec X ⊃ {0} =⇒ f(y) = v(Π_y) = ∞ happens

X = conv( ext X = X_0 ) + cone( ext rec X = X_∞ )

B = (B_0 ⊂ X_0) ∪ (B_∞ ⊂ X_∞) = { points x̄ } ∪ { rays χ̄ } =⇒

(Π_B)  max  c ( Σ_{x̄∈B_0} x̄ θ_x̄ + Σ_{χ̄∈B_∞} χ̄ θ_χ̄ )
       A ( Σ_{x̄∈B_0} x̄ θ_x̄ + Σ_{χ̄∈B_∞} χ̄ θ_χ̄ ) = b
       Σ_{x̄∈B_0} θ_x̄ = 1
       θ_x̄ ≥ 0  x̄ ∈ B_0 , θ_χ̄ ≥ 0  χ̄ ∈ B_∞

In (∆_B), constraints y(A χ̄) ≥ c χ̄ (a.k.a. "feasibility cuts")

(Π_y) unbounded ⇐⇒ (c − yA)χ̄ > 0 for some χ̄ ∈ rec X (violated constraint) =⇒ B_∞ = B_∞ ∪ { χ̄ }

(∆) = min { f(y) : y ∈ Y }, (Π_y) provides either subgradients of f (a.k.a. "optimality cuts"), or violated valid inequalities for Y[5]


(22)

Extension II: the Nonlinear Case

Nonlinear case: c(·) concave, A(·) component-wise convex

(Π)  max { c(x) : A(x) ≤ b , x ∈ X }
(∆)  min { f(y) = yb + max { c(x) − yA(x) : x ∈ X } : y ≥ 0 }

Any x̄ ∈ X still gives f(y) ≥ c(x̄) + y(b − A(x̄)), same (∆_B) / (Π_B)

c( Σ_{x̄∈B} x̄ θ_x̄ ) ≥ Σ_{x̄∈B} c(x̄) θ_x̄ (c(·) concave), A( Σ_{x̄∈B} x̄ θ_x̄ ) ≤ Σ_{x̄∈B} A(x̄) θ_x̄ ≤ b (A(·) convex) =⇒

(Π_B) safe inner approximation (v(Π_B) ≤ v(Π))

Basically everything keeps working, but you may need constraint qualification[6] (usually easy to get)

[6] Lemaréchal, Hiriart-Urruty "Convex Analysis and Minimization Algorithms" Springer, 1993

(23)

Primal decomposition, a.k.a.

Outer Approximation
Benders' decomposition
Resource decomposition


(24)

Staircase-structured Convex (Linear) Program

Staircase-structured program: convex X, "complicating" variables x

(Π)  max { cx + ez : Dx + Ez ≤ d , x ∈ X }

e.g., Dx + Ez ≤ d ≡ D^k x + E^k z^k ≤ d^k  k ∈ K (|K| large) =⇒

Z(x) = { z : Ez ≤ d − Dx } = ⊗_{k∈K} Z^k(x) , Z^k(x) = { z^k : E^k z^k ≤ d^k − D^k x }

We can efficiently optimize upon Z(x), for different reasons:

a bunch of (many, much smaller) problems instead of a large one
Z(x) has (the Z^k(x) have) structure (shortest path, knapsack, . . . )
(much more so than solving the whole of (Π), anyway)

In other words we could efficiently solve (Π) if linking variables were fixed: how to exploit it?

(25)

Benders’ reformulation

Benders' reformulation: define the convex value function

(B)  max { cx + v(x) = max { ez : Ez ≤ d − Dx } : x ∈ X }

(note: clearly v(x) = −∞ happens)

Clever trick[7]: use duality to reformulate the inner problem

v(x) = min { w(d − Dx) : w ∈ W = { w : wE = e , w ≥ 0 } }

so that W does not depend on x

As usual, W = conv( ext W = W_0 ) + cone( ext rec W = W_∞ ) =⇒

(B)  max  cx + v
     v ≤ w̄(d − Dx)   w̄ ∈ W_0
     0 ≤ ω̄(d − Dx)   ω̄ ∈ W_∞
     x ∈ X

still very large, but we can generate w̄ / ω̄ by computing v(x)

[7] Benders "Partitioning Procedures for Solving Mixed-Variables Programming Problems" Numerische Mathematik, 1962

(26)

Benders’ decomposition

Select (small) B = (B_0 ⊂ W_0) ∪ (B_∞ ⊂ W_∞), solve master problem

(B_B)  max  cx + v
       v ≤ w̄(d − Dx)   w̄ ∈ B_0
       0 ≤ ω̄(d − Dx)   ω̄ ∈ B_∞
       x ∈ X

     = max { cx + v_B(x) : x ∈ X ∩ V_B } , where

v_B(x) = min { w̄(d − Dx) : w̄ ∈ B_0 } ≥ v(x) , V_B ⊇ dom v

Find (primal) optimal solution x*, compute v(x*), get either w̄ or ω̄, update either B_0 or B_∞, rinse & repeat

Benders' decomposition ≡ Cutting Plane approach to (B)[5]

Spookily similar to the Lagrangian dual, ain't it?

Except, constraints are now attached to dual objects w̄ / ω̄
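A minimal runnable sketch of this loop (an illustration, not from the slides); the concrete choices are assumptions: X is a box, z ≥ 0 is added so the subproblem is bounded, the data is built so the subproblem is feasible for every x ∈ X (complete recourse, hence only optimality cuts w̄ ∈ B_0 arise and B_∞ stays empty), and an artificial cap VMAX keeps the first master bounded.

```python
# Sketch: Benders decomposition for max{cx + ez : Dx + Ez <= d, x in X}.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
nx, nz, mc = 3, 5, 4                   # sizes of x, z, linking rows
D, E = rng.normal(size=(mc, nx)), rng.normal(size=(mc, nz))
cx = rng.normal(size=nx)
ez = -np.abs(rng.normal(size=nz))      # ez <= 0 keeps the subproblem bounded
d = np.abs(D).sum(axis=1) + 1.0        # d - Dx >= 1 on X = [0,1]^nx, so z = 0
                                       # is always feasible: complete recourse

def subproblem(x):
    """v(x) = max{ez : Ez <= d - Dx, z >= 0}; returns v(x) and a dual w
    giving the valid cut v <= w(d - Dx) (weak duality)."""
    res = linprog(-ez, A_ub=E, b_ub=d - D @ x, method="highs")
    return -res.fun, -res.ineqlin.marginals  # HiGHS duals of <= rows are <= 0

B0, VMAX = [], 1e4
for it in range(1, 101):
    # master (B_B): max cx + v s.t. v <= wbar(d - Dx) for wbar in B0,
    # rewritten as (wbar D) x + v <= wbar d;  x in [0,1]^nx, v <= VMAX
    A_ub = np.array([np.concatenate([w @ D, [1.0]]) for w in B0]) if B0 else None
    b_ub = np.array([w @ d for w in B0]) if B0 else None
    res = linprog(np.concatenate([-cx, [-1.0]]), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(0, 1)] * nx + [(None, VMAX)], method="highs")
    x = res.x[:nx]                     # -res.fun >= v(Pi): an upper bound
    vx, w = subproblem(x)              # cx x + v(x) <= v(Pi): a lower bound
    if -res.fun - (cx @ x + vx) <= 1e-8 * (1.0 + abs(vx)):
        break                          # master and value function agree at x
    B0.append(w)                       # new optimality cut

print(f"v(Pi) ~ {cx @ x + vx:.6f} after {it} cuts")
```

Without complete recourse the subproblem can be infeasible, and one would also collect rays ω̄ ∈ B_∞ yielding the feasibility cuts 0 ≤ ω̄(d − Dx); the assumptions above sidestep that branch of the algorithm.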

(27)

Benders is Lagrange . . .

Block-diagonal case

(Π)  max { cx : Ax = b , Ex ≤ d }
(∆)  min { yb + wd : wE + yA = c , w ≥ 0 }

Think of y as complicating variables in (∆), you get

(∆)  min  yb + min { wd : wE = c − yA , w ≥ 0 }
   = min  yb + max { (c − yA)x : Ex ≤ d }

i.e., the Lagrangian dual of (Π)

The value function of (∆) is the Lagrangian function of (Π)


(28)

. . . Lagrange is Benders . . .

Dual of (Π) (linear case X = { x : Ax = b }):

(Π)  max { cx + ez : Dx + Ez ≤ d , Ax = b }
(∆)  min { yb + wd : yA + wD = c , wE = e , w ≥ 0 }

Lagrangian dual of the dual constraints yA + wD = c (multiplier x):

(∆)  max_x  min { yb + wd + (c − yA − wD)x : wE = e , w ≥ 0 }
   = max_x  cx + min_y { y(b − Ax) } + min { w(d − Dx) : wE = e , w ≥ 0 }
   = max { cx + max { ez : Ez ≤ d − Dx } : Ax = b }

i.e., Benders' reformulation of (Π)

The Lagrangian function of (∆) is the value function of (Π)

(29)

. . . and Both are the Cutting Plane Algorithm

Both Lagrange and Benders boil down to

min { φ(λ) : λ ∈ Λ }

with Λ and φ convex, nondifferentiable, both only implicitly known by means of a (potentially costly) oracle that, given λ̄, provides:

either φ(λ̄) < ∞ and ḡ ∈ ∂φ(λ̄) ≡ φ(λ) ≥ φ(λ̄) + ḡ(λ − λ̄)
or φ(λ̄) = ∞ and a valid cut for Λ violated by λ̄

"Natural" algorithm: the Cutting Plane method[5]

≡ revised simplex method with mechanized pricing in the discrete case

Many other variants/algorithms possible (cf. Part II)


(30)

The Nonlinear Case

Each f(x, ·) and G(x, ·) concave, Z convex:

(Π)  max { f(x, z) : G(x, z) ≥ 0 , x ∈ X , z ∈ Z }
(B)  max { v(x) : x ∈ X }  where  v(x) = max { f(x, z) : G(x, z) ≥ 0 , z ∈ Z }

(B) ≡ (Π) without assumptions on f(·, z), G(·, z) and X (hard)

Which duality would you use? Lagrangian[8], of course:

v(x) = min { max { f(x, z) + λG(x, z) : z ∈ Z } : λ ≥ 0 }

Under appropriate constraint qualification, two cases occur:

either ∃ λ̄ ≥ 0 , z̄ ∈ Z s.t. v(x) = f(x, z̄) + λ̄G(x, z̄) > −∞
or v(x) = −∞ =⇒ { z ∈ Z : G(x, z) ≥ 0 } = ∅ =⇒ ∃ ν̄ ≥ 0 , z̄ ∈ Z s.t. max { ν̄G(x, z) : z ∈ Z } = ν̄G(x, z̄) < 0

[8] Geoffrion “Generalized Benders Decomposition” JOTA, 1972

(31)

The Nonlinear Case (cont.d)

General form of the master problem:

(B)  max  v
     v ≤ max { f(x, z) + λ̄G(x, z) : z ∈ Z }   λ̄ ∈ Λ_0
     0 ≤ max { ν̄G(x, z) : z ∈ Z }   ν̄ ∈ Λ_∞
     x ∈ X

Er . . . how on Earth do you manage those nasty "max"? Must be that the "max" can be done independently of x! Example: f(z_i) concave, univariate

max { Σ_i x_i f(z_i) : Σ_i x_i z_i ≤ c , z_i ≥ 0 , Ax ≤ b , x ≥ 0 }

v(x) = min_λ { Σ_i max { x_i (f(z_i) − λz_i) : z_i ≥ 0 } + λc }

v(x) ≤ Σ_i x_i max { f(z_i) − λ̄z_i : z_i ≥ 0 } + λ̄c

can optimize on the z independently from the x =⇒ "normal" linear cuts


(32)

The Integer Case

(33)

Block-structured Integer Programs

What if X combinatorial (e.g., X = { x ∈ Z^n : Ex ≤ d })?

(Π)  max { cx : Ax = b , x ∈ X }

The Lagrangian dual is

(∆)  min_y  yb + max { (c − yA)x : x ∈ X }

nothing changes if we can still efficiently optimize over X, e.g. due to size (decomposition) and/or structure (integrality)

. . . except we are solving a different problem:

(Π̄)  max  c ( Σ_{x̄∈X} x̄ θ_x̄ )
      A ( Σ_{x̄∈X} x̄ θ_x̄ ) = b
      Σ_{x̄∈X} θ_x̄ = 1 , θ_x̄ ≥ 0  x̄ ∈ X
    ≡ max { cx : Ax = b , x ∈ conv(X) }

i.e., a (potentially good) relaxation of (Π)


(34)

Block-structured Integer Programs (cont.d)

Good news: (Π̄) better (not worse) than the continuous relaxation (conv(X) ⊆ { x ∈ R^n : Ex ≤ d })

Bad news: if (Π_y) "too easy" (conv(X) = { x ∈ R^n : Ex ≤ d }, a.k.a. the integrality property), then (Π̄) same as the continuous relaxation

Trade-off: (Π_y) must be easy, but not too easy (no free lunch)

Anyway, at best gives good bounds =⇒ Branch & Bound with DW/Lagrangian/CG ≡ Branch & Price

Branching nontrivial: may destroy subproblem structure =⇒ branch on x (but (Π_B) is on θ)

Lamentably little support from off-the-shelf tools: the master problem gives a valid bound only at termination, although the subproblem always gives one (but not associated to a continuous feasible solution)

(35)

Digression: How to Choose your Lagrangian relaxation

There may be many choices:

(Π)  max { cx : Ax = b , Ex ≤ d , x ∈ Z^n }

(Π'_y)  max { cx + y(b − Ax) : x ∈ X' = { x ∈ Z^n : Ex ≤ d } }
(Π''_w)  max { cx + w(d − Ex) : x ∈ X'' = { x ∈ Z^n : Ax = b } }

The best between (∆') and (∆'') depends on integrality of X', X'':

if both have it, both (∆') and (∆'') ≡ continuous relaxation
if only one has it, the one that does not
but if both don't have it?

Here comes Lagrangian decomposition[9] (scale by 1/2):

(Π) ≡ max { cx' + cx'' : x' ∈ X' , x'' ∈ X'' , x' = x'' }

(Π_λ)  max { (c + λ)x' : x' ∈ X' } + max { (c − λ)x'' : x'' ∈ X'' }

(∆̄) ≡ (Π̄)  max { cx : x ∈ conv(X') ∩ conv(X'') }

better than both (but need to solve two hard subproblems)

[9] Guignard, Kim “Lagrangean decomposition: a model yielding stronger lagrangean bounds” Math. Prog., 1987


(36)

Geometry of Lagrangian Decomposition

[Figure: red and blue feasible regions, their intersection in grey]

Intersection between red and blue ≡ grey ≡ continuous relaxation

Lagrangian relaxation of blue constraints shrinks the red (=⇒ grey) part
Lagrangian relaxation of red constraints shrinks the blue (=⇒ grey) part
Lagrangian decomposition (both red and blue constraints) shrinks both =⇒ the grey part more

But the intersection of convex hulls is larger (bad) than the convex hull of the intersection


(44)

Digression: Alternative Good Formulations for conv (X )

(Under mild assumptions) conv(X) is a polyhedron =⇒

conv(X) = { x ∈ R^n : Ẽx ≤ d̃ }

There are good formulations for the problem. Except, practically all good formulations are too large

[Figure: the formulation Ex ≤ d vs. the (much larger but tighter) formulation Ẽx ≤ d̃ around Ax = b]

Very few exceptions (integrality property ≈ networks)

Good part: working in the natural variable space. But a few more variables can do the work of a lot more constraints

(45)

Row generation/polyhedral approaches

The good news is: rows can be generated incrementally

[Figure: an outer approximation of conv(X) around Ax = b is tightened by adding rows violated by the current optimum]

Relevant concept: separator


(48)

Branch & Cut

R = (small) subset of row (indices), E_R x ≤ d_R reduced set

Solve the outer approximation of (Π̄):

(Π̄_R)  max { cx : Ax = b , E_R x ≤ d_R }

feed the separator with the primal optimal solution x*

Separator for (several sub-families of) facets of conv(X). Several general approaches, countless specialized ones

Most often separators are hard combinatorial problems themselves (though using general-purpose MIP codes is an option[10])

May tail off: branching useful far before having solved (Π̄) with the full row set

[10] Fischetti, Lodi, Salvagnin “Just MIP it!” MATHEURISTICS, Ann. Inf. Syst., 2009

(49)

Branch & Cut vs. Branch & Price

Which is best?

Row generation naturally allows multiple separators; very well integrated in general-purpose solvers (but harder to exploit "complex" structures)

Column generation naturally allows very unstructured separators; simpler to exploit "complex" structures (but much less developed software tools)

Column generation is row generation in the dual

Then, of course, Branch & Cut & Price (nice, but software issues remain and possibly worsen)


(50)

Staircase-structured Integer Programs

What if X = { x ∈ Z^n : Ex ≤ d } combinatorial?

(Π̄)  max { cx + ez : Ax + Bz ≤ b , x ∈ X }

Nothing changes . . . except (B_B) now is combinatorial =⇒ hard

However (B_W) now is equivalent to (Π̄) =⇒ no branching needed (unless for solving (B_B)) =⇒ no Branch & Benders'

Conversely, everything breaks down if z ∈ Z^m: there is no (workable, exact) dual of an Integer Program

Can do with "approximated" duals (strong formulations, RLT[11], . . . ) but equivalence lost =⇒ branching again

[11] Sen, Sherali “Decomposition with branch-and-cut approaches for two-stage stochastic MIP” Math. Prog., 2006

(51)

Example Applications, a.k.a.

the Reformulation before the Reformulation


(52)

(Very) Classical decomposition approaches for (2SILP)

Here-and-now decisions are naturally complicating variables. The (expected) value function decomposes by scenario:

v(x) = c_0 x + Σ_{s∈S} π_s min { c_s z_s : E_s z_s ≤ b_s − E_{0s} x }

Alternative approach: split variables, introduce copy constraints

min  c_0 x + Σ_{s∈S} π_s c_s z_s
     E_0 x ≤ b_0
     E_{0s} x_s + E_s z_s ≤ b_s , x_s = x   s ∈ S

and relax them in a Lagrangian fashion (sketched below)

The Lagrangian approach chooses all variables for all scenarios (no infeasibility), tries to make the here-and-now decisions agree by changing prices. Difference more pronounced in multi-stage programs
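Spelling out the relaxation step (a sketch; the multipliers λ_s are the "prices" just mentioned): dualizing the copy constraints x_s = x with multipliers λ_s makes the objective separate into a first-stage problem plus |S| independent scenario problems,

\[
\min\Big\{ \big(c_0 - \sum_{s \in S}\lambda_s\big) x \,:\, E_0 x \le b_0 \Big\}
\;+\; \sum_{s \in S} \min\big\{\, \pi_s c_s z_s + \lambda_s x_s \,:\, E_{0s} x_s + E_s z_s \le b_s \,\big\}
\]

each solvable separately; the dual master then adjusts the λ_s to drive the scenario copies x_s towards agreement.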

(53)

Classical decomposition approaches for (MCND)

Design (z) variables are "naturally" linking / complicating. What remains is flows/paths: convex even if integer

Benders' cuts are metric inequalities[12] defining multiflow feasibility

Resource decomposition[13]: add artificial linking variables

d^k x^k_{ij} ≤ u^k_{ij} ,  Σ_{k∈K} u^k_{ij} ≤ u_{ij}

Different possible linking constraints:

relaxing (3) =⇒ flow (shortest path) relaxation (integrality property ≡ "easy")
relaxing (2) =⇒ knapsack relaxation (only one integer variable per problem)

different efficiency (algorithm-dependent[14,15]), others possible

[12] Costa, Cordeau, Gendron "Benders, metric and cutset inequalities for multicommodity capacitated network design" Comput. Optim. Appl., 2009
[13] Kennington, Shalaby "An Effective Subgradient Procedure for Minimal Cost Multicommodity Flow Problems" Man. Sci., 1977
[14] Crainic, F., Gendron "Bundle-based relaxation methods for multicommodity capacitated fixed charge network design" Disc. Appl. Math., 2001
[15] F., Gendron, Gorgone "On the computational efficiency of subgradient methods: a case study in combinatorial optimization" Math. Prog. Comp. (submitted), 2015


(54)

Conclusions

(for now)

(55)

Conclusions (part I)

Structured (Integer) Programs are challenging, but structure can be exploited: main tools are reformulation + duality

Two different approaches, "primal" and "dual". Different twists, different conditions to work:

who is complicating (constraints vs. variables), but tricks (≡ other reformulations) can be used to create the desired structure
who is reformulated (subproblem vs. master problem)
where integrality/nonconvexity can be (subproblem vs. master problem)
where branching/cutting is done (subproblem vs. master problem)
where/which nonlinearities can be easily dealt with

(For linear programs) Lagrange is Benders' in the dual, and vice-versa. Both boil down to the 50+-years-old Cutting Plane algorithm[5]

Has it aged well? We'll see tomorrow

