Column Generation
One of the recurring ideas in optimization is that of decomposition. The idea of a decomposition method is to remove some of the variables from a problem and handle them in a master problem. The resulting subproblems are often easier and more efficient to solve. The decomposition method iterates back and forth between the master problem and the subproblem(s), exchanging information in order to solve the overall problem to optimality.
From the Decomposition Theorem 3.16 and Corollary 3.61 we know that any point $x \in P_2$ is of the form
\[
x = \sum_{k\in K} \lambda_k x^k + \sum_{j\in J} \mu_j r^j \tag{10.2}
\]
with $\sum_{k\in K} \lambda_k = 1$, $\lambda_k \ge 0$ for $k \in K$, and $\mu_j \ge 0$ for $j \in J$. The vectors $x^k$ and $r^j$ are the extreme points and extreme rays of $P_2$. Using (10.2) in (10.1) yields:
\[
\begin{aligned}
\max\ & c^T\Big(\sum_{k\in K}\lambda_k x^k + \sum_{j\in J}\mu_j r^j\Big) && \text{(10.3a)}\\
& A^1\Big(\sum_{k\in K}\lambda_k x^k + \sum_{j\in J}\mu_j r^j\Big) \le b^1 && \text{(10.3b)}\\
& \sum_{k\in K}\lambda_k = 1 && \text{(10.3c)}\\
& \lambda \in \mathbb{R}_+^K,\ \mu \in \mathbb{R}_+^J && \text{(10.3d)}
\end{aligned}
\]
Rearranging terms, we see that (10.3) is equivalent to the following Linear Program:
\[
\begin{aligned}
\max\ & \sum_{k\in K} (c^T x^k)\lambda_k + \sum_{j\in J} (c^T r^j)\mu_j && \text{(10.4a)}\\
& \sum_{k\in K} (A^1 x^k)\lambda_k + \sum_{j\in J} (A^1 r^j)\mu_j \le b^1 && \text{(10.4b)}\\
& \sum_{k\in K} \lambda_k = 1 && \text{(10.4c)}\\
& \lambda \in \mathbb{R}_+^K,\ \mu \in \mathbb{R}_+^J && \text{(10.4d)}
\end{aligned}
\]
Let us compare the initial formulation (10.1) and the equivalent one (10.4):

  Formulation    number of variables    number of constraints
  (10.1)         $n$                    $m_1 + m_2$
  (10.4)         $|K| + |J|$            $m_1 + 1$
In the transition from (10.1) to (10.4) we have decreased the number of constraints by $m_2 - 1$, but we have increased the number of variables from $n$ to $|K| + |J|$, which is usually much larger than $n$ (as an example where $|K| + |J|$ is exponentially larger than $n$, consider the unit cube $\{x \in \mathbb{R}^n : 0 \le x \le 1\}$, which has $2n$ constraints but $2^n$ extreme points). Thus, the reformulation may seem like a bad move. The crucial point is that we can still apply the Simplex method to (10.4) without actually writing down the huge formulation.
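The count in the cube example can be checked directly. The following sketch (the helper name `cube_counts` is ours) enumerates the $2^n$ vertices and compares them with the $2n$ facet constraints:

```python
from itertools import product

def cube_counts(n):
    """Return (#facet constraints, #extreme points) of the unit cube [0,1]^n."""
    vertices = list(product([0, 1], repeat=n))  # every 0/1 vector is a vertex
    return 2 * n, len(vertices)

print(cube_counts(10))  # (20, 1024): 20 constraints, 1024 extreme points
```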
Let us abbreviate (10.4) (after the introduction of slack variables) by
\[
\max \left\{ w^T \eta : D\eta = d,\ \eta \ge 0 \right\}. \tag{10.5}
\]
The pricing step works as follows. Given a basis $B$ and a corresponding basic solution $\eta = (\eta_B, \eta_N) = (D_B^{-1} d, 0)$, the Simplex method solves $y^T D_B = w_B^T$ to obtain $y$. If $w_N - y^T D_N \le 0$, then $y$ is feasible for the dual (10.6) and we have an optimal solution for (10.5), since
\[
w^T \eta = w_B^T \eta_B = y^T D_B \eta_B = y^T d = d^T y,
\]
and for any pair $(\eta', y')$ of feasible solutions for (P) and (D), respectively, we have
\[
w^T \eta' \le (D^T y')^T \eta' = (y')^T D \eta' = d^T y'.
\]
If, on the other hand, $w_i - y^T D_{\cdot,i} > 0$ for some $i \in N$, we can improve the solution by adding variable $\eta_i$ to the basis and throwing out another index. We first express the new variable in terms of the old basis, that is, we solve $D_B z = D_{\cdot,i}$. Let $\eta(\varepsilon)$ be defined by $\eta_B(\varepsilon) := \eta_B - \varepsilon z$, $\eta_i(\varepsilon) := \varepsilon$ and zero for all other variables. Then
\[
w^T \eta(\varepsilon) = w_B^T(\eta_B - \varepsilon z) + \varepsilon w_i = w^T \eta + \varepsilon\,\big(w_i - y^T D_{\cdot,i}\big),
\]
since $w_B^T z = y^T D_B z = y^T D_{\cdot,i}$. Thus, for $\varepsilon > 0$ the new solution $\eta(\varepsilon)$ is better than the old one $\eta$. The Simplex method now chooses the largest possible value of $\varepsilon$ such that $\eta(\varepsilon)$ remains feasible, that is, $\eta(\varepsilon) \ge 0$. This operation makes one of the old basic variables $j \in B$ become zero. The selection of $j$ is usually called the ratio test: $j$ is an index in $B$ with $z_j > 0$ that minimizes the ratio $\eta_i/z_i$ over all $i \in B$ having $z_i > 0$.
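One pricing/ratio-test iteration as described above can be sketched with NumPy as follows; the dense linear solves, tolerances, and the toy instance are illustrative, not from the text:

```python
import numpy as np

def simplex_iteration(D, d, w, B):
    """One pricing + ratio-test step for max{w^T eta : D eta = d, eta >= 0},
    starting from the basis with column indices B. Returns (new basis, eps),
    or None if the reduced costs prove optimality."""
    m, n = D.shape
    N = [i for i in range(n) if i not in B]
    DB = D[:, B]
    eta_B = np.linalg.solve(DB, d)        # basic solution eta_B = D_B^{-1} d
    y = np.linalg.solve(DB.T, w[B])       # pricing: y^T D_B = w_B^T
    red = w[N] - y @ D[:, N]              # reduced costs of nonbasic variables
    if np.all(red <= 1e-9):
        return None                       # y is dual feasible: optimal
    i = N[int(np.argmax(red))]            # entering variable
    z = np.linalg.solve(DB, D[:, i])      # express new column in old basis
    ratios = [eta_B[j] / z[j] if z[j] > 1e-9 else np.inf for j in range(m)]
    jpos = int(np.argmin(ratios))         # ratio test: leaving position
    eps = ratios[jpos]
    newB = list(B); newB[jpos] = i
    return newB, eps

# toy instance: max eta_3 s.t. eta_1 + eta_3 = 2, eta_2 + eta_3 = 2
D = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
d = np.array([2.0, 2.0]); w = np.array([0.0, 0.0, 1.0])
newB, eps = simplex_iteration(D, d, w, [0, 1])  # eta_3 enters with eps = 2
```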
Thus, $x^k$ will be the variable entering the basis in the Simplex step (or, in other words, $\begin{pmatrix} A^1 x^k \\ 1 \end{pmatrix}$ will be the column entering the Simplex tableau).
Case 2: The problem (10.7) is unbounded.
Then there is an extreme ray $r^j$ for some $j \in J$ with $(c^T - \bar{y}^T A^1) r^j > 0$. The reduced cost of $r^j$ is given by
\[
w_{|K|+j} - y^T D_{\cdot,|K|+j} = c^T r^j - y^T \begin{pmatrix} A^1 r^j \\ 0 \end{pmatrix} = c^T r^j - \bar{y}^T A^1 r^j > 0.
\]
So $r^j$ will be the variable entering the basis, or $\begin{pmatrix} A^1 r^j \\ 0 \end{pmatrix}$ will be the column entering the Simplex tableau.
We have decomposed the original problem (10.1) into two problems: the master problem (10.4) and the pricing problem (10.7). The method works with a feasible solution for the master problem (10.4) and generates columns of the constraint matrix on demand. Hence, the method is also called a column generation method.
We now turn to integer programs. Decomposition methods are particularly promising if the solution set of the IP which we want to solve has a "decomposable structure". In particular, we will be interested in integer programs where the constraints take on the following form:
\[
\begin{aligned}
A^1 x^1 + A^2 x^2 + \dots + A^K x^K &= b\\
D^1 x^1 &\le d^1\\
D^2 x^2 &\le d^2\\
&\ \ \vdots\\
D^K x^K &\le d^K
\end{aligned} \tag{10.8}
\]
Here the blocks of constraints $D^k x^k \le d^k$ for the variable vectors $x^1, \dots, x^K$ are independent except for the linking constraint $\sum_{k=1}^K A^k x^k = b$. The integer program which we want to solve is:
\[
z = \max\left\{ \sum_{k=1}^K (c^k)^T x^k : \sum_{k=1}^K A^k x^k = b,\ x^k \in X^k \text{ for } k = 1,\dots,K \right\}. \tag{10.9}
\]
In the sequel we assume that each set $X^k$ contains a large but finite number of points:
\[
X^k = \left\{ x^{k,t} : t = 1,\dots,T_k \right\}. \tag{10.10}
\]
An extension of the method also works for unbounded sets $X^k$, but the presentation becomes more technical since we then also need to handle extreme rays. In the Dantzig-Wolfe decomposition in Section 10.1 we reformulated the problem to be solved by writing every point as a convex combination of extreme points (in the case of bounded sets there are no extreme rays). We will adapt this idea to the integer case.
Substituting into (10.9) gives an equivalent integer program, the IP master problem:
\[
\begin{aligned}
\text{(IPM)}\qquad z = \max\ & \sum_{k=1}^K \sum_{t=1}^{T_k} \big((c^k)^T x^{k,t}\big)\lambda_{k,t} && \text{(10.11a)}\\
& \sum_{k=1}^K \sum_{t=1}^{T_k} \big(A^k x^{k,t}\big)\lambda_{k,t} = b && \text{(10.11b)}\\
& \sum_{t=1}^{T_k} \lambda_{k,t} = 1 \quad \text{for } k = 1,\dots,K && \text{(10.11c)}\\
& \lambda_{k,t} \in \{0,1\} \quad \text{for } k = 1,\dots,K \text{ and } t = 1,\dots,T_k && \text{(10.11d)}
\end{aligned}
\]
In order to solve the IP master problem (10.11) we first solve its Linear Programming relaxation, which is given by
\[
\begin{aligned}
\text{(LPM)}\qquad z_{LPM} = \max\ & \sum_{k=1}^K \sum_{t=1}^{T_k} \big((c^k)^T x^{k,t}\big)\lambda_{k,t} && \text{(10.12a)}\\
& \sum_{k=1}^K \sum_{t=1}^{T_k} \big(A^k x^{k,t}\big)\lambda_{k,t} = b && \text{(10.12b)}\\
& \sum_{t=1}^{T_k} \lambda_{k,t} = 1 \quad \text{for } k = 1,\dots,K && \text{(10.12c)}\\
& \lambda_{k,t} \ge 0 \quad \text{for } k = 1,\dots,K \text{ and } t = 1,\dots,T_k && \text{(10.12d)}
\end{aligned}
\]
The method to solve (10.12) is the same we used for the reformulation in Section 10.1: a
column generation technique allows us to solve (10.12) without writing down the whole
Linear Program at once. We always work with a small subset of the columns.
Observe that for each variable $\lambda_{k,t}$ the constraint matrix of (10.12) contains the column
\[
\begin{pmatrix} A^k x^{k,t} \\ e_k \end{pmatrix}
\]
with objective coefficient $(c^k)^T x^{k,t}$, where $e_k = (0,\dots,0,1,0,\dots,0)^T \in \mathbb{R}^K$ is the $k$-th unit vector (the entry 1 is in position $k$).
Let $\pi_i$, $i = 1,\dots,m$, be the dual variables associated with the linking constraints (10.12b) and $\mu_k$, $k = 1,\dots,K$, be the dual variables corresponding to the constraints (10.12c). Constraints (10.12c) are also called convexity constraints. The Linear Programming dual of (10.12) is given by:
\[
\begin{aligned}
\text{(DLPM)}\qquad \min\ & \sum_{i=1}^m b_i \pi_i + \sum_{k=1}^K \mu_k && \text{(10.13a)}\\
& \pi^T (A^k x^k) + \mu_k \ge (c^k)^T x^k \quad \text{for all } x^k \in X^k,\ k = 1,\dots,K && \text{(10.13b)}
\end{aligned}
\]
Suppose that we have a subset of the columns which contains at least one column for each $k = 1,\dots,K$ such that the following Restricted Linear Programming Master Problem is feasible:
\[
\begin{aligned}
\text{(RLPM)}\qquad \bar{z}_{RLPM} = \max\ & \bar{c}^T \bar\lambda && \text{(10.14a)}\\
& \bar{A}\bar\lambda = \bar{b} && \text{(10.14b)}\\
& \bar\lambda \ge 0 && \text{(10.14c)}
\end{aligned}
\]
The matrix Ā is a submatrix of the constraint matrix, λ̄ denotes the restricted set of variables
and c̄ is the corresponding restriction of the cost vector to those variables. Let λ̄∗ be an
optimal solution of (10.14) and (π, µ) ∈ Rm × RK be a corresponding dual solution (which
is optimal for the Linear Programming dual to (10.14)).
Clearly, any feasible solution to (RLPM) (10.14) is feasible for (LPM) (10.12), so we get
\[
\bar{z}_{RLPM} = \bar{c}^T \bar\lambda^* = \sum_{i=1}^m \pi_i b_i + \sum_{k=1}^K \mu_k \le z_{LPM} \le z.
\]
The pricing step is essentially the same as in Section 10.1. We need to check whether for each $x \in X^k$ the reduced cost is nonpositive, that is, whether $(c^k)^T x - \pi^T A^k x - \mu_k \le 0$ (equivalently, this means checking whether $(\pi,\mu)$ is feasible for the Linear Programming dual (DLPM) (10.13) of (LPM) (10.12)). Instead of going through the reduced costs one by one, we solve $K$ optimization problems. While in Section 10.1.2 the corresponding optimization problem was a Linear Program (10.7), here we have to solve an integer program for each $k = 1,\dots,K$:
\[
\begin{aligned}
\zeta_k = \max\ & \big((c^k)^T - \pi^T A^k\big)x - \mu_k && \text{(10.15a)}\\
& x \in X^k = \left\{ x^k \in \mathbb{Z}_+^{n_k} : D^k x^k \le d^k \right\} && \text{(10.15b)}
\end{aligned}
\]
Two cases can occur: $\zeta_k \le 0$ for all $k$, or $\zeta_k > 0$ for some $k$ (the case that (10.15) is unbounded is impossible since we have assumed that $X^k$ is a bounded set).
Case 2: $\zeta_k \le 0$ for $k = 1,\dots,K$
In this case, the dual solution $(\pi,\mu)$ which we obtained for the dual of (RLPM) (10.14) is also feasible for the Linear Programming dual (DLPM) (10.13) of (LPM) (10.12), and we have
\[
z_{LPM} \le \sum_{i=1}^m \pi_i b_i + \sum_{k=1}^K \mu_k = \bar{c}^T \bar\lambda^* = \bar{z}_{RLPM} \le z_{LPM},
\]
so $\bar{z}_{RLPM} = z_{LPM}$ and the current solution $\bar\lambda^*$ is optimal for (LPM).
We can derive an upper bound for $z_{LPM}$ during the run of the algorithm by using Linear Programming duality. By definition, we have
\[
\zeta_k \ge \big((c^k)^T - \pi^T A^k\big)x^k - \mu_k \quad \text{for all } x^k \in X^k,\ k = 1,\dots,K.
\]
Let $\zeta = (\zeta_1,\dots,\zeta_K)$. Then it follows that $(\pi, \mu+\zeta)$ is feasible for the dual (DLPM) (10.13). Thus, by Linear Programming duality we get
\[
z_{LPM} \le \sum_{i=1}^m b_i \pi_i + \sum_{k=1}^K \mu_k + \sum_{k=1}^K \zeta_k. \tag{10.16}
\]
Finally, we note that there is an alternative stopping criterion to the one we have already derived in Case 2 above. Let $(\bar{x}^1,\dots,\bar{x}^K)$ be the $K$ solutions of (10.15), so that $\zeta_k = ((c^k)^T - \pi^T A^k)\bar{x}^k - \mu_k$. Then
\[
\sum_{k=1}^K (c^k)^T \bar{x}^k = \sum_{k=1}^K \pi^T A^k \bar{x}^k + \sum_{k=1}^K \mu_k + \sum_{k=1}^K \zeta_k. \tag{10.17}
\]
So, if $(\bar{x}^1,\dots,\bar{x}^K)$ satisfies the linking constraint $\sum_{k=1}^K A^k \bar{x}^k = b$, then we get from (10.17) that
\[
\sum_{k=1}^K (c^k)^T \bar{x}^k = \pi^T b + \sum_{k=1}^K \mu_k + \sum_{k=1}^K \zeta_k. \tag{10.18}
\]
The quantity on the right hand side of (10.18) is exactly the upper bound (10.16) for
the optimal value zLPM of the Linear Programming Master. Thus, we can conclude that
(x̄1 , . . . , x̄K ) is optimal for (LPM).
\[
\begin{aligned}
z_{IP} = \min\ & \sum_{k=1}^K x_0^k && \text{(10.19a)}\\
& \sum_{k=1}^K x_i^k \ge b_i, \quad i = 1,\dots,m && \text{(10.19b)}\\
& \sum_{i=1}^m w_i x_i^k \le W x_0^k, \quad k = 1,\dots,K && \text{(10.19c)}\\
& x_0^k \in \{0,1\}, \quad k = 1,\dots,K && \text{(10.19d)}\\
& x_i^k \in \mathbb{Z}_+, \quad i = 1,\dots,m,\ k = 1,\dots,K && \text{(10.19e)}
\end{aligned}
\]
Assume that $W = 20$ and that we have the following orders for paper:

  i    width $w_i$    number of orders $b_i$
  1    9              511
  2    8              301
  3    7              263
  4    6              383

Then $\sum_i b_i = 1458$. If we apply a branch-and-bound technique (see Chapter 8), the first step would be to solve the LP-relaxation of (10.19). It turns out that the optimum solution value of this LP is 557.3. The optimum value $z_{IP}$ equals 601, which means that the LP-relaxation is rather weak.
Let us define a pattern to be a vector $p = (a_1^p,\dots,a_m^p) \in \mathbb{Z}_+^m$ which indicates that $a_i^p$ rolls of width $w_i$ should be cut from a roll. A pattern is feasible if $\sum_{i=1}^m w_i a_i^p \le W$. By $P$ we denote the set of all feasible patterns. We introduce the new variables $\lambda_p \in \mathbb{Z}_+$ for $p \in P$, where $\lambda_p$ denotes the number of rolls of width $W$ which are cut according to pattern $p$.
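For the example above ($W = 20$, widths 9, 8, 7, 6) the set $P$ is small enough to enumerate by brute force; a sketch (the function name is ours):

```python
def feasible_patterns(widths, W):
    """All nonzero integer vectors a with sum_i widths[i]*a[i] <= W."""
    def extend(i, room):
        if i == len(widths):
            yield ()
            return
        for a in range(room // widths[i] + 1):   # copies of width widths[i]
            for rest in extend(i + 1, room - a * widths[i]):
                yield (a,) + rest
    return [p for p in extend(0, W) if any(p)]

P = feasible_patterns([9, 8, 7, 6], 20)
print(len(P))  # 18 feasible patterns
```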
Then we can also formulate the cutting-stock problem alternatively as
\[
\begin{aligned}
z_{IP} = \min\ & \sum_{p\in P} \lambda_p && \text{(10.20a)}\\
& \sum_{p\in P} a_i^p \lambda_p \ge b_i, \quad i = 1,\dots,m && \text{(10.20b)}\\
& \lambda_p \ge 0, \quad p \in P && \text{(10.20c)}\\
& \lambda_p \in \mathbb{Z}, \quad p \in P. && \text{(10.20d)}
\end{aligned}
\]
Of course, the formulation (10.20) contains a huge number of variables. Again, the corresponding Linear Programming relaxation is obtained by dropping the integrality constraints (10.20d); we denote its value by $z_{LP}$:
\[
\begin{aligned}
\text{(LPM)}\qquad z_{LP} = \min\ & \sum_{p\in P} \lambda_p && \text{(10.21a)}\\
& \sum_{p\in P} a_i^p \lambda_p \ge b_i, \quad i = 1,\dots,m && \text{(10.21b)}\\
& \lambda_p \ge 0, \quad p \in P && \text{(10.21c)}
\end{aligned}
\]
As in the previous section, let us first solve the LP-relaxation of the LP-master problem. We start with a subset $P' \subseteq P$ of the columns such that the Restricted Linear Programming Master Problem
\[
\begin{aligned}
\text{(RLPM)}\qquad z' = \min\ & \sum_{p\in P'} \lambda_p && \text{(10.22a)}\\
& \sum_{p\in P'} a_i^p \lambda_p \ge b_i, \quad i = 1,\dots,m && \text{(10.22b)}\\
& \lambda_p \ge 0, \quad p \in P' && \text{(10.22c)}
\end{aligned}
\]
is feasible. Such a subset $P'$ can be found easily; let us show this for our example.
We have four widths which we need to cut: 9, 8, 7, 6. If we cut only rolls of width 9, then we can cut two such rolls out of any roll of width 20, so a first pattern is $(2,0,0,0)^T$. Similarly, for the widths 8 and 7 we obtain the patterns $(0,2,0,0)^T$ and $(0,0,2,0)^T$. Finally, for width 6 we can cut three rolls, which gives us the pattern $(0,0,0,3)^T$. So the initial restricted LP-master is
\[
\begin{aligned}
\min\ & \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 && \text{(10.23a)}\\
& \begin{pmatrix} 2 & 0 & 0 & 0\\ 0 & 2 & 0 & 0\\ 0 & 0 & 2 & 0\\ 0 & 0 & 0 & 3 \end{pmatrix}
\begin{pmatrix} \lambda_1\\ \lambda_2\\ \lambda_3\\ \lambda_4 \end{pmatrix}
\ge \begin{pmatrix} 511\\ 301\\ 263\\ 383 \end{pmatrix} && \text{(10.23b)}\\
& \lambda \ge 0. && \text{(10.23c)}
\end{aligned}
\]
In order to emulate the pricing step of the Simplex method, we need to look at the LP-dual of (10.21), which is
\[
\begin{aligned}
\text{(DLPM)}\qquad \max\ & \sum_{i=1}^m b_i \pi_i && \text{(10.24a)}\\
& \sum_{i=1}^m a_i^p \pi_i \le 1, \quad p \in P && \text{(10.24b)}\\
& \pi \ge 0. && \text{(10.24c)}
\end{aligned}
\]
We need to look at the reduced costs of all patterns $p$ and check whether they are nonnegative. The reduced cost of a pattern $p$ is given by
\[
\bar{c}_p = 1 - \sum_{i=1}^m a_i^p \pi_i. \tag{10.26}
\]
Again, observe that checking whether $\bar{c}_p \ge 0$ for all patterns $p$ is the same as checking whether the optimum solution $\pi$ for the restricted primal is feasible for the dual master (DLPM).
If the optimum value of (10.27) is greater than 1, we have found a pattern with negative reduced cost. If the optimum value is at most 1, then there is no pattern with negative reduced cost. Problem (10.27) is a Knapsack Problem! That this is the case is not a coincidence of our example.
For the cutting-stock problem, using (10.26), a pattern with minimum reduced cost is obtained by solving the Knapsack Problem:
\[
\begin{aligned}
\max\ & \sum_{i=1}^m \pi_i a_i && \text{(10.28a)}\\
& \sum_{i=1}^m w_i a_i \le W && \text{(10.28b)}\\
& a \in \mathbb{Z}_+^m. && \text{(10.28c)}
\end{aligned}
\]
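The whole loop for the example can be sketched in a few lines: solve the restricted master (10.22) as an LP, read off the duals $\pi$, and price with the knapsack (10.28) solved by dynamic programming over the capacity. This sketch assumes SciPy's `linprog` (HiGHS) for the restricted master; tolerances and the iteration cap are ad hoc:

```python
import numpy as np
from scipy.optimize import linprog

def price_pattern(widths, W, pi):
    """Solve the knapsack (10.28) by DP over the capacity; return (value, pattern)."""
    dp = [0.0] * (W + 1)
    take = [-1] * (W + 1)              # item added last at capacity c; -1 = slack
    for c in range(1, W + 1):
        dp[c] = dp[c - 1]              # option: waste one unit of capacity
        for i, wi in enumerate(widths):
            if wi <= c and dp[c - wi] + pi[i] > dp[c] + 1e-12:
                dp[c], take[c] = dp[c - wi] + pi[i], i
    a, c = [0] * len(widths), W
    while c > 0:                       # reconstruct the optimal pattern
        if take[c] == -1:
            c -= 1
        else:
            a[take[c]] += 1
            c -= widths[take[c]]
    return dp[W], a

def cutting_stock_lp(widths, W, demand, max_iter=100):
    """Column generation for (10.21), seeded with the patterns of (10.23)."""
    patterns = [[(W // w if j == i else 0) for j in range(len(widths))]
                for i, w in enumerate(widths)]
    for _ in range(max_iter):
        A = np.array(patterns, dtype=float).T
        res = linprog(np.ones(len(patterns)),
                      A_ub=-A, b_ub=-np.array(demand, dtype=float),
                      method="highs")
        pi = -res.ineqlin.marginals    # duals of the covering constraints
        value, a = price_pattern(widths, W, pi)
        if value <= 1 + 1e-9:          # no negative reduced cost: LP optimal
            return res.fun, patterns, res.x
        patterns.append(a)             # add the improving column
    raise RuntimeError("column generation did not converge")

z, pats, lam = cutting_stock_lp([9, 8, 7, 6], 20, [511, 301, 263, 383])
```

For the data of the example this loop reproduces the value $z_{LP} = 600.375$ stated below.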
Thus, we can continue and obtain an optimum solution of (LPM) (10.21). For the cutting-stock problem and many other problems, this optimum solution is quite useful, and the relaxation (LPM) is much stronger than the first relaxation. We will examine its exact strength in Section 10.2.3.
For the moment, let us have a closer look at the optimum solution for our example. It is given by the following patterns with corresponding values:
\[
\lambda_{(2,0,0,0)} = 255.5,\quad \lambda_{(0,2,0,0)} = 87.625,\quad \lambda_{(0,0,2,1)} = 131.5,\quad \lambda_{(0,1,0,2)} = 125.75, \tag{10.29}
\]
so that $z_{LP} = 600.375$.
What can we learn from this solution? By the fact that (LPM) is a relaxation of (IPM) we get $z_{IP} \ge z_{LP} = 600.375$. Since the optimum solution of (IPM) must have an integral value, we can even round up and get
\[
z_{IP} \ge 601. \tag{10.30}
\]
Suppose now that we round up our values from (10.29). This gives us the vector $x' := (256, 88, 132, 126)$. Clearly, $x'$ must be feasible for (IPM), since all demands are satisfied by the rounding process. The solution value of $x'$ is $256 + 88 + 132 + 126 = 602$. So, by (10.30) we have
\[
z_{IP} \in \{601, 602\}.
\]
The estimate $z_{IP} \le z_{LP} + m$ in (10.31) is pessimistic: in our example the cost was increased only by 1 and not by 4.
We can also round down the entries of the optimum LP-solution, thereby obtaining an infeasible solution. For the example we get the solution $(255, 87, 131, 125)$ of value 598, and we are missing 1 roll of width 9, 2 rolls of width 8, 1 roll of width 7 and 2 rolls of width 6. This residual demand can be covered by three additional patterns (for instance $(1,1,0,0)^T$, $(0,1,1,0)^T$ and $(0,0,0,2)^T$); cutting each of them once, we obtain a feasible solution of value $598 + 3 = 601$, which must be optimal! Of course, in general such a solution is not that easy to see.
We first investigate what kind of bounds the Master Linear Program provides us with.

Theorem 10.1 The optimal value $z_{LPM}$ of the Master Linear Program (10.12) satisfies:
\[
z_{LPM} = \max\left\{ \sum_{k=1}^K (c^k)^T x^k : \sum_{k=1}^K A^k x^k = b,\ x^k \in \operatorname{conv}(X^k) \text{ for } k = 1,\dots,K \right\}.
\]
Proof: We obtain the Master Linear Program (10.12) by substituting $x^k = \sum_{t=1}^{T_k} \lambda_{k,t} x^{k,t}$, $\sum_{t=1}^{T_k} \lambda_{k,t} = 1$ and $\lambda_{k,t} \ge 0$ for $t = 1,\dots,T_k$. This is equivalent to substituting $x^k \in \operatorname{conv}(X^k)$. ✷
The form of the integer program (10.9) under study suggests an alternative approach to solving the problem. We could dualize the linking constraints to obtain the Lagrangean dual (see Section 6.3)
\[
w_{LD} = \min_{u\in\mathbb{R}^m} L(u),
\]
where
\[
\begin{aligned}
L(u) &= \max\left\{ \sum_{k=1}^K (c^k)^T x^k + u^T\Big(b - \sum_{k=1}^K A^k x^k\Big) : x^k \in X^k \text{ for } k = 1,\dots,K \right\}\\
&= \max\left\{ u^T b + \sum_{k=1}^K \big((c^k)^T - u^T A^k\big) x^k : x^k \in X^k \text{ for } k = 1,\dots,K \right\}.
\end{aligned}
\]
Observe that the calculation of $L(u)$ decomposes automatically into $K$ independent subproblems:
\[
L(u) = u^T b + \sum_{k=1}^K \max\left\{ \big((c^k)^T - u^T A^k\big) x^k : x^k \in X^k \right\}.
\]
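This decomposition can be checked on a toy instance by comparing the decomposed evaluation of $L(u)$ with a brute-force evaluation over the joint feasible set; all data below is illustrative, not from the text:

```python
from itertools import product
import numpy as np

# Two blocks with explicitly listed feasible sets X^1 and X^2 (illustrative data).
X = [[np.array(v) for v in [(0, 0), (1, 0), (0, 1)]],
     [np.array(v) for v in [(0, 0), (1, 1), (2, 0)]]]
c = [np.array([3.0, 1.0]), np.array([1.0, 2.0])]
A = [np.eye(2), np.eye(2)]
b = np.array([2.0, 1.0])

def L_decomposed(u):
    """L(u) = u^T b + sum_k max { ((c^k)^T - u^T A^k) x^k : x^k in X^k }."""
    return u @ b + sum(max(float((c[k] - A[k].T @ u) @ x) for x in X[k])
                       for k in range(len(X)))

def L_bruteforce(u):
    """Evaluate the same Lagrangean function over the joint set X^1 x X^2."""
    return max(float(sum(c[k] @ xs[k] for k in range(len(X)))
                     + u @ (b - sum(A[k] @ xs[k] for k in range(len(X)))))
               for xs in product(*X))
```

Both evaluations agree for every $u$, which is exactly the algebraic identity behind the decomposition.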
Theorem 6.17 tells us that
\[
w_{LD} = \max\left\{ \sum_{k=1}^K (c^k)^T x^k : \sum_{k=1}^K A^k x^k = b,\ x^k \in \operatorname{conv}(X^k) \text{ for } k = 1,\dots,K \right\}.
\]
Comparing this result with Theorem 10.1 gives us the following corollary:
Corollary 10.2 The value of the Linear Programming Master and the Lagrangean dual
obtained by dualizing the linking constraints coincide:
zLPM = wLD .
✷
where
\[
X^k = \left\{ x^k \in \mathbb{B}^{n_k} : D^k x^k \le d^k \right\}
\]
for $k = 1,\dots,K$. As in Section 10.2 we reformulate the problem. Observe that in the binary case each set $X^k$ is automatically bounded, $X^k = \{ x^{k,t} : t = 1,\dots,T_k \}$, so that
\[
X^k = \left\{ x^k \in \mathbb{R}^{n_k} : x^k = \sum_{t=1}^{T_k} \lambda_{k,t} x^{k,t},\ \sum_{t=1}^{T_k} \lambda_{k,t} = 1,\ \lambda_{k,t} \in \{0,1\} \text{ for } t = 1,\dots,T_k \right\}.
\]
Let $\bar{x}^k = \sum_{t=1}^{T_k} \bar\lambda_{k,t} x^{k,t}$ and $\bar{x} = (\bar{x}^1,\dots,\bar{x}^K)$. Since all the $x^{k,t} \in X^k$ are different binary vectors, it follows that $\bar{x}^k \in \mathbb{B}^{n_k}$ if and only if all $\bar\lambda_{k,t}$ are integral.
So, if $\bar\lambda$ is integral, then $\bar{x}$ is an optimal solution for (10.33). Otherwise, there is $k_0$ such that $\bar{x}^{k_0} \notin \mathbb{B}^{n_{k_0}}$, i.e., there is $j_0$ such that $\bar{x}_{j_0}^{k_0} \notin \mathbb{B}$. So, as in the branch-and-bound scheme of Chapter 8, we can branch on the fractional variable $\bar{x}_{j_0}^{k_0}$: we split the set $S$ of all feasible solutions into $S = S_0 \cup S_1$, where
\[
S_0 = S \cap \left\{ x : x_{j_0}^{k_0} = 0 \right\}, \qquad S_1 = S \cap \left\{ x : x_{j_0}^{k_0} = 1 \right\}.
\]
This is illustrated in Figure 10.1(a). We could also branch on the column variables and split $S$ into $S = S_0' \cup S_1'$, where
\[
S_0' = S \cap \left\{ \lambda : \lambda_{k_1,t_1} = 0 \right\}, \qquad S_1' = S \cap \left\{ \lambda : \lambda_{k_1,t_1} = 1 \right\},
\]
and $\lambda_{k_1,t_1}$ is a fractional variable in the optimal solution $\bar\lambda$ of the Linear Programming Master (see Figure 10.1(b)). However, branching on the column variables leads to a highly unbalanced tree for the following reason: the branch with $\lambda_{k_1,t_1} = 0$ excludes just one column, namely the $t_1$-th feasible solution of the $k_1$-th subproblem. Hence, branching on the original variables is usually more desirable, since it leads to more balanced enumeration trees.
Figure 10.1: Splitting the feasible set $S$: (a) branching on an original variable ($x_t^k = 0$ vs. $x_t^k = 1$, giving $S_0$ and $S_1$); (b) branching on a column variable ($\lambda_{k,t} = 0$ vs. $\lambda_{k,t} = 1$, giving $S_0'$ and $S_1'$).
We return to the situation where we branch on an original variable $x_{j_0}^{k_0}$. We have
\[
x_{j_0}^{k_0} = \sum_{t=1}^{T_{k_0}} \lambda_{k_0,t}\, x_{j_0}^{k_0,t}.
\]
Recall that $x_{j_0}^{k_0,t} \in \mathbb{B}$, since each vector $x^{k_0,t} \in \mathbb{B}^{n_{k_0}}$.
If we require that $x_{j_0}^{k_0} = 0$, this implies that $\lambda_{k_0,t} = 0$ for all $t$ with $x_{j_0}^{k_0,t} > 0$; put differently, $\lambda_{k_0,t}$ can be positive only for those $t$ with $x_{j_0}^{k_0,t} = 0$. Similarly, if we require that $x_{j_0}^{k_0} = 1$, this implies that $\lambda_{k_0,t} = 0$ for all $t$ with $x_{j_0}^{k_0,t} = 0$.
In summary, if we require in the process of branching that $x_{j_0}^{k_0} = \delta$ for some fixed $\delta \in \{0,1\}$, then $\lambda_{k_0,t} > 0$ is possible only for those $t$ with $x_{j_0}^{k_0,t} = \delta$. Thus, the Master Problem at node
$S_\delta = S \cap \{ x : x_{j_0}^{k_0} = \delta \}$ is given by:
\[
\begin{aligned}
\text{(IPM)}(S_\delta)\qquad \max\ & \sum_{k\ne k_0} \sum_{t=1}^{T_k} \big((c^k)^T x^{k,t}\big)\lambda_{k,t} + \sum_{t:\,x_{j_0}^{k_0,t} = \delta} \big((c^{k_0})^T x^{k_0,t}\big)\lambda_{k_0,t} && \text{(10.36a)}\\
& \sum_{k\ne k_0} \sum_{t=1}^{T_k} \big(A^k x^{k,t}\big)\lambda_{k,t} + \sum_{t:\,x_{j_0}^{k_0,t} = \delta} \big(A^{k_0} x^{k_0,t}\big)\lambda_{k_0,t} = b && \text{(10.36b)}\\
& \sum_{t=1}^{T_k} \lambda_{k,t} = 1 \quad \text{for } k \ne k_0 && \text{(10.36c)}\\
& \lambda_{k,t} \in \{0,1\} \quad \text{for } k = 1,\dots,K \text{ and } t = 1,\dots,T_k && \text{(10.36d)}
\end{aligned}
\]
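The column filtering that defines (IPM)($S_\delta$) is a one-liner in practice; a sketch (names are ours) that selects the indices $t$ with $x_{j_0}^{k_0,t} = \delta$:

```python
def allowed_columns(columns, j0, delta):
    """Indices t of the columns x^{k0,t} that survive at node S_delta,
    i.e. those with x_{j0}^{k0,t} == delta; for all other t the variable
    lambda_{k0,t} is fixed to zero."""
    return [t for t, x in enumerate(columns) if x[j0] == delta]

cols = [(0, 1, 0), (1, 1, 0), (1, 0, 1), (0, 0, 1)]   # four binary columns x^{k0,t}
print(allowed_columns(cols, 0, 1))  # [1, 2]: columns with first entry 1
```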
The new problems (10.36) have the same form as the original Master Problem (10.34). The only difference is that some columns, namely all those with $x_{j_0}^{k_0,t} \ne \delta$, are excluded. This means in particular that for $k \ne k_0$ each subproblem