The Decision Rule Approach to Optimisation Under Uncertainty


180 Queens Gate, London SW7 2AZ, United Kingdom.

Abstract

Decision-making under uncertainty has a long and distinguished history in operations

research. However, most of the existing solution techniques suffer from the curse of

dimensionality, which restricts their application to small and medium-sized problems,

or they rely on simplifying modelling assumptions (e.g. absence of recourse actions).

Recently, a new solution technique has been proposed, which we refer to as the decision rule approach. By approximating the feasible region of the decision problem, the

decision rule approach aims to achieve tractability without changing the fundamental

structure of the problem. In this paper, we survey the major theoretical results relating

to this approach, and we investigate its potential in operations management.

1 Introduction

Operations managers frequently take decisions whose consequences extend well into the

future. Inevitably, the outcomes of such choices are affected by significant uncertainty:

changes in customer taste, technological advances and unforeseen stakeholder actions all have

a bearing on the suitability of these decisions. It is well documented in theory and practice

that disregarding this uncertainty often results in severely suboptimal decisions, which can

in turn lead to the underperformance or complete breakdown of production processes. Yet,


researchers and practitioners frequently neglect uncertainty and instead focus on the expected

or most likely market developments. We argue in the following that this is caused by the

inherent limitations of the mainstream approaches to decision-making under uncertainty.

While our argument applies to other methods as well, we will restrict our discussion to

stochastic programming and robust optimisation. The remainder of the paper advocates the

decision rule approach as a promising candidate to remedy some of these shortcomings.

Stochastic programming models the uncertain problem parameters as a random vector that follows a known distribution. The two major types of stochastic programs

are recourse problems and chance-constrained problems. In the classical two-stage recourse

problem, the decision maker selects a here-and-now decision before the realisations of the

uncertain parameters are known. This decision is complemented by a wait-and-see action

that is chosen after the parameter realisations are observed. The goal is to select here-and-now and wait-and-see decisions that optimise a risk functional, such as the expected value

or the variance of a cost function. For example, a production planning problem may ask for

a production schedule that minimises the expected inventory holding and backlogging costs

when customer demands are uncertain. A two-stage recourse formulation of this problem

could involve a here-and-now manufacturing decision that is complemented by a sales decision

once the customer demands are observed.

Two-stage recourse problems naturally generalise to multi-stage recourse problems, where a sequence of uncertain parameters is observed over time. In this setting,

the decision maker chooses a recourse action whenever some of the parameter realisations are

observed. For instance, a multi-stage formulation of our production planning problem could

accommodate a weekly revision of the production schedule to avoid inventory depletions

or overruns. Although two-stage and multi-stage recourse problems are very similar, their

computational complexity differs considerably. While the exact solution of linear two-stage

recourse problems is #P-hard [24], approximate solutions can be found quite efficiently via

Monte Carlo sampling techniques [13,54]. Multi-stage recourse problems, on the other hand,

are believed to be computationally intractable already when medium-accuracy solutions are

sought [57]. For this reason, the majority of recourse problems studied in the operations

management literature are based on two-stage formulations.
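To make the two-stage setting concrete, the following sketch solves a toy recourse problem by sample average approximation; the demand distribution and cost rates are hypothetical and not taken from the paper.

```python
import numpy as np

# Toy two-stage recourse problem solved by sample average approximation (SAA).
# Here-and-now: a production quantity x, chosen before demand is observed.
# Wait-and-see: leftover stock incurs holding cost, unmet demand is backlogged.
rng = np.random.default_rng(0)
demand = rng.uniform(50, 150, size=10_000)   # sampled demand scenarios
holding, backlog = 1.0, 4.0                  # hypothetical unit costs

def expected_cost(x):
    # Recourse cost averaged over the sampled scenarios.
    return np.mean(holding * np.maximum(x - demand, 0)
                   + backlog * np.maximum(demand - x, 0))

best = min(range(50, 151), key=expected_cost)
print(best)  # near the 0.8-quantile of demand (the critical fractile), about 130
```

The optimal order follows the classical newsvendor fractile backlog/(holding+backlog) = 0.8, which illustrates how the wait-and-see cost shapes the here-and-now decision.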


Chance-constrained problems, in contrast, involve only here-and-now decisions. The goal is to optimise a deterministic objective, subject to the

requirement that constraints involving uncertain parameters are satisfied with at least a pre-specified probability. Going back to our production planning problem, a chance-constrained

model could ask for a minimum cost production plan that satisfies the uncertain demands

with a probability of at least 95%. Unfortunately, apart from some benign special cases,

chance-constrained problems are notoriously difficult to solve [52]. Indeed, verifying the

satisfaction of a chance constraint requires multidimensional integration, which becomes

computationally demanding if there are more than a few uncertain problem parameters. As

a result, one typically resorts to sampling approximations [14, 49] or reformulations using

conditional value-at-risk constraints [53]. Chance-constrained problems involving wait-and-see decisions are studied in [26, 47]. A more detailed review of stochastic programming can

be found in [13, 54].

Robust optimisation can be viewed as a variant of chance-constrained programming for problems where the constraints must be satisfied with probability one. Thus, a robust optimisation problem asks for a here-and-now decision that optimises a deterministic objective and at the same time satisfies a constraint set for every possible realisation of the uncertain

parameters. In our production planning example, a robust formulation may be appropriate

if lost sales should be avoided at all costs, or if the decision maker is unable to assign a probability distribution to the uncertain customer demands. Robust optimisation has gained wide

popularity in recent years due to its favourable computational complexity. Many classes

of optimisation problems, including linear programs, conic-quadratic programs and mixed-integer linear programs, allow for robust formulations that can be solved with essentially

the same effort as the corresponding deterministic problems [4,10]. Robust optimisation has

been extended to accommodate chance constraints [16] and various measures of risk [46].

Similar to chance-constrained stochastic programming, however, robust optimisation traditionally only accounts for here-and-now decisions. The robust optimisation literature is

surveyed in [4, 11].

The main shortcoming of both approaches is that they either do not account for recourse actions or that they are computationally demanding. Neglecting recourse possibilities can lead to overly conservative decisions as operations

managers typically have the option to revise plans once new information becomes available.

Likewise, solution techniques must scale gracefully with the size of the problem to be useful


for practitioners. To date, this dichotomy between tractability and modelling accuracy has

proved to be a major obstacle to the successful application of stochastic programming and

robust optimisation to real-life operations management problems.

In this paper, we make a case for the decision rule approach to decision-making under

uncertainty. The decision rule approach allows us to approximately solve two-stage and

multi-stage recourse problems, chance-constrained problems and robust optimisation problems in a computationally efficient way. The approach inherits the tractability of robust

optimisation, but it permits a faithful modelling of the dynamic aspects of a decision problem. The purpose of this paper is twofold. Firstly, we survey the theoretical findings relating

to the decision rule approach. To date, these results are scattered over several papers, and

due to their technical terminology, they have attracted little attention outside the mathematical

optimisation community. We aim to present the key findings of this field in a comprehensible

and unified framework, and we extend the methodology to problems with integer here-and-now decisions. The second purpose of the paper is to critically examine the potential of

the decision rule approach in operations management. To this end, we apply the approach

to two stylised case studies, and we compare decision rules with alternative methods for decision-making under uncertainty.

Stochastic programming has found numerous applications in operations management. Recourse problems and chance-constrained problems have been developed for the design of supply chains [33, 43, 55], the planning of facility layouts [22, 39, 58], production planning and inventory control [3, 45], production scheduling [50, 62] and project

management [21, 34]. There are far fewer applications of robust optimisation techniques

to operations management problems. The paper [12] studies the optimal control of a multi-echelon supply chain that is defined on a network. A robust inventory control problem with

ambiguous demands is studied in [56]. Extensions to simultaneous inventory control and dynamic pricing, as well as two-echelon supply chains with flexible commitment contracts,

can be found in [2, 5]. The papers [35, 40] develop robust formulations of well-known deterministic production scheduling problems, and robust project management problems are

studied in [19, 63].

Besides stochastic programming and robust optimisation, decision problems with uncertain parameters are often formulated as stochastic dynamic programs. For textbook


introductions to stochastic dynamic programming, as well as its tractable approximation via

neuro-dynamic programming, the reader is referred to [6, 51].

The remainder of the paper is structured as follows. Section 2 formalises the general decision problems that we are interested in. We introduce linear decision rules in Section 3,

and we generalise the approach to nonlinear decision rules in Section 4. Section 5 describes

an extension to integer here-and-now decisions. We close with two operations management

case studies in Section 6.

Notation For a square matrix A ∈ R^{n×n} we denote by Tr(A) the trace of A, that is, the sum of its diagonal entries. For A, B ∈ R^{m×n} the inequalities A ≤ B and A ≥ B are understood to hold component-wise, and A' denotes the transpose of A. For any real number c we define c⁺ = max{c, 0}. We define conv X as the convex hull of a set X.

2 Problem Formulation

2.1 The General Decision Problem

We study dynamic decision problems under uncertainty of the following general structure. A decision maker first observes an uncertain parameter ξ_1 ∈ R^{k_1} and then takes a decision x_1(ξ_1) ∈ R^{n_1}. Subsequently, a second uncertain parameter ξ_2 ∈ R^{k_2} is revealed, in response to which the decision maker takes a second decision x_2(ξ_1, ξ_2) ∈ R^{n_2}. This sequence of alternating observations and decisions extends over T stages, where at any stage t = 1,…,T the decision maker observes an uncertain parameter ξ_t and selects a decision x_t(ξ_1,…,ξ_t). We emphasise that a decision taken at stage t depends on the whole history of past observations ξ_1,…,ξ_t, but it may not depend on the future observations ξ_{t+1},…,ξ_T. This feature reflects the non-anticipative nature of the dynamic decision problem and ensures its causality. To simplify notation, we define the history of observations up to time t as ξ^t = (ξ_1,…,ξ_t) ∈ R^{k^t}, where k^t = Σ_{s=1}^t k_s. Moreover, we let ξ = (ξ_1,…,ξ_T) ∈ R^k denote the vector concatenation of all uncertain parameters, where k = k^T.


We assume that the decision taken at stage t incurs a linear cost c_t(ξ^t)' x_t(ξ^t), where the vector of cost coefficients depends linearly on the observation history, that is, c_t(ξ^t) = C_t ξ^t for some matrix C_t ∈ R^{n_t×k^t}. We also assume that the decisions are required to satisfy a set of linear inequality constraints to be detailed below. The decision maker's objective is to select the functions or decision rules x_1(·),…,x_T(·), which map observation histories to decisions, such that the expected total cost is minimised while all inequality constraints are satisfied. Formally, this decision problem can be represented as follows.

    minimise    E( Σ_{t=1}^T c_t(ξ^t)' x_t(ξ^t) )

    subject to  E( Σ_{s=1}^T A_{ts} x_s(ξ^s) | ξ^t ) ≥ b_t(ξ^t)        ∀ξ ∈ Ξ, t = 1,…,T        (P)
                x_t(ξ^t) ≥ 0

Here, E(·) denotes expectation with respect to the random parameter ξ, while Ξ stands for the range of all possible values that ξ can adopt. Below, we will refer to Ξ as the uncertainty set and to any ξ ∈ Ξ as a scenario. We will henceforth assume that Ξ is a bounded polyhedron of the form Ξ = {ξ ∈ R^k : Wξ ≥ h} for some W ∈ R^{l×k} and h ∈ R^l. Note that the stage-t decisions x_t(ξ^t) in problem P are parameterised in ξ^t, that is, every observation history ξ^t corresponding to some scenario ξ ∈ Ξ gives rise to n_t ordinary decision variables. Since the polyhedron Ξ typically contains infinitely many scenarios, problem P accommodates in fact infinitely many decision variables.

The constraints of problem P involve fixed recourse matrices A_{ts} ∈ R^{m_t×n_s} and uncertainty-affected right-hand side vectors b_t(ξ^t) ∈ R^{m_t}. We assume that b_t(ξ^t) = B_t ξ^t for some matrices B_t ∈ R^{m_t×k^t}. Note that the stage-t constraints are conditioned on the stage-t observation history ξ^t, where E(·|ξ^t) denotes conditional expectation with respect to ξ given ξ^t. Hence, E(·|ξ^t) treats ξ_1,…,ξ_t as deterministic variables and takes the expectation only with respect to the future observations ξ_{t+1},…,ξ_T. This implies that the stage-t constraints are parameterised in ξ^t in a similar fashion as the stage-t decisions. Indeed, every ξ^t corresponding to some scenario ξ ∈ Ξ gives rise to m_t ordinary linear constraints. Since Ξ has typically infinite cardinality, problem P thus accommodates infinitely many constraints. Intuitively, problem P can therefore be viewed as an infinite-dimensional generalisation of the standard linear program.


2.2 Expressiveness of Problem P

The general decision problem under uncertainty described in Section 2.1 provides considerable modelling flexibility. Indeed, as we will show in this section, problem P encapsulates

conventional deterministic and stochastic linear programs, robust optimisation problems and

tight convex approximations of chance-constrained programs as special cases.

Deterministic linear programs If the uncertainty set contains only one single scenario, that is, if Ξ = {ξ̂}, then problem P reduces to a deterministic linear program. In this case only the decisions and constraints corresponding to ξ = ξ̂ are relevant, while all (conditional and unconditional) expectations become redundant and can thus be eliminated. Introducing the finite problem data

    A = [ A_{11} … A_{1T} ]        [ b_1(ξ̂^1) ]            [ c_1(ξ̂^1) ]
        [   ⋮    ⋱    ⋮   ],   b = [     ⋮    ]    and  c = [     ⋮    ],
        [ A_{T1} … A_{TT} ]        [ b_T(ξ̂^T) ]            [ c_T(ξ̂^T) ]

problem P reduces to the standard linear program

    minimise    c' x
    subject to  Ax ≥ b,  x ≥ 0.                                        (LP)

Remark 2.1 (Deterministic Decisions and Constraints). Deterministic decisions and constraints can conveniently be incorporated into the general problem P by requiring that ξ_1 is equal to 1 for all ξ ∈ Ξ. This can always be enforced by appending the equation ξ_1 = 1 to the definition of Ξ and has the effect that, in the first stage, only the decisions and constraints corresponding to the scenario ξ_1 = 1 have physical relevance. From now on, we will always assume that k_1 = 1 and that any ξ = (ξ_1,…,ξ_T) ∈ Ξ satisfies ξ_1 = 1.

Stochastic programs with recourse Problem P reduces to a stochastic program with recourse if we ensure that the stage-t constraints are not affected by the future decisions x_{t+1}(ξ^{t+1}),…,x_T(ξ^T). This is achieved by setting A_{ts} = 0 for all t < s and has the effect that the term inside the conditional expectation of the stage-t constraint becomes independent of ξ_{t+1},…,ξ_T. Since E(·|ξ^t) treats ξ^t as a constant, the conditional expectation thus becomes redundant and can be omitted. Therefore, problem P reduces to the following multistage stochastic program in standard form [13, 36].

    minimise    E( Σ_{t=1}^T c_t(ξ^t)' x_t(ξ^t) )

    subject to  Σ_{s=1}^t A_{ts} x_s(ξ^s) ≥ b_t(ξ^t)        ∀ξ ∈ Ξ, t = 1,…,T        (SP)
                x_t(ξ^t) ≥ 0

Robust optimisation problems If the distribution of the uncertain parameters is not known or if the decision maker is very risk-averse, then it is not possible or unreasonable to minimise expected costs. In these situations a rational decision maker will minimise the worst-case costs, where the worst case (maximum) is evaluated with respect to all possible scenarios ξ ∈ Ξ; see e.g. [30] for a formal justification. As we have discussed in the introduction, such worst-case (robust) optimisation problems traditionally only involve here-and-now decisions [4, 11]. Problem P allows us to formulate a multi-stage generalisation of robust optimisation problems as follows.

    minimise    max_{ξ∈Ξ} Σ_{t=1}^T c_t(ξ^t)' x_t(ξ^t)

    subject to  Σ_{s=1}^t A_{ts} x_s(ξ^s) ≥ b_t(ξ^t)        ∀ξ ∈ Ξ, t = 1,…,T        (RO)
                x_t(ξ^t) ≥ 0

In order to see that RO is a special case of P, we consider an epigraph reformulation of the worst-case objective,

    max_{ξ∈Ξ} ( Σ_{t=1}^T c_t(ξ^t)' x_t(ξ^t) )
        = min_{τ∈R} { τ : τ ≥ max_{ξ∈Ξ} ( Σ_{t=1}^T c_t(ξ^t)' x_t(ξ^t) ) }
        = min_{τ∈R} { τ : τ ≥ Σ_{t=1}^T c_t(ξ^t)' x_t(ξ^t)  ∀ξ ∈ Ξ }.        (2.1)

Replacing the worst-case objective in RO with (2.1) transforms the robust optimisation problem RO into a variant of the stochastic programming problem SP with a particularly simple objective function (given by τ). As RO is a special case of SP and SP is a special case of P, we conclude that RO is indeed a special case of P.
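The epigraph trick used in (2.1) can be tried out on a toy robust problem with finitely many scenarios (all data hypothetical): the worst-case objective is replaced by an auxiliary variable τ bounded below by every scenario's cost.

```python
import numpy as np
from scipy.optimize import linprog

# Minimise over the simplex the worst case of c_s'x, rewritten via the
# epigraph reformulation as: min tau subject to tau >= c_s'x for every s.
C = np.array([[3.0, 1.0], [1.0, 4.0]])   # one cost vector c_s per scenario

c_obj = np.array([0.0, 0.0, 1.0])        # variables (x1, x2, tau); minimise tau
A_ub = np.hstack([C, -np.ones((2, 1))])  # c_s'x - tau <= 0
b_ub = np.zeros(2)
A_eq = np.array([[1.0, 1.0, 0.0]])       # x lies on the probability simplex
b_eq = np.array([1.0])
res = linprog(c_obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None), (0, None), (None, None)])
print(res.x, res.fun)  # x = (0.6, 0.4), worst-case cost 2.2
```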


Chance-constrained programs Let ℙ be the distribution of ξ. We can then formulate a multi-stage generalisation of chance-constrained programs as

    minimise    E( Σ_{t=1}^T c_t(ξ^t)' x_t(ξ^t) )

    subject to  ℙ( Σ_{t=1}^T a_{it}' x_t(ξ^t) ≥ b_i(ξ) ) ≥ 1 − ε_i,   i = 1,…,I,        (CC)
                x_t(ξ^t) ≥ 0        ∀ξ ∈ Ξ, t = 1,…,T,

where a_{it} ∈ R^{n_t}, b_i(ξ) ∈ R and ε_i ∈ [0, 1). Here, the ith constraint requires that the inequality Σ_{t=1}^T a_{it}' x_t(ξ^t) ≥ b_i(ξ) should be satisfied with probability at least 1 − ε_i. Chance constraints of this type are useful for modelling risk preferences and safety constraints in engineering applications. Note that a chance constraint with ε_i = 0 reduces to a robust constraint that must hold for all ξ ∈ Ξ. Therefore, chance constraints with ε_i > 0 can be viewed as soft versions of the corresponding robust constraints.
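A chance constraint of this form can be checked empirically by Monte Carlo simulation; the sketch below uses a hypothetical candidate decision x and a hypothetical right-hand side b(ξ) = 5 + ξ with standard normal ξ.

```python
import numpy as np

# Empirical check of a chance constraint P(a'x >= b(xi)) >= 1 - eps
# for a fixed candidate here-and-now decision (all data hypothetical).
rng = np.random.default_rng(3)
xi = rng.normal(0.0, 1.0, size=100_000)   # samples of the uncertain parameter
a = np.array([1.0, 2.0])
x = np.array([3.0, 2.0])                  # candidate decision, a'x = 7
prob = np.mean(a @ x >= 5.0 + xi)         # estimates P(xi <= 2)
print(prob >= 0.95)                       # the 95% chance constraint holds
```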

The chance constraints in CC admit tight conservative approximations. To this end, we introduce the loss functions L_i(ξ) = b_i(ξ) − Σ_{t=1}^T a_{it}' x_t(ξ^t). The ith chance constraint is therefore equivalent to the requirement that the smallest (1 − ε_i)-quantile of the loss distribution, which we denote by VaR_{ε_i}(L_i(ξ)), is nonpositive. To obtain a conservative approximation for the chance constraint, we introduce the conditional value-at-risk (CVaR) of L_i(ξ) at level ε_i, which is defined as

    CVaR_{ε_i}(L_i(ξ)) = min_{β_i∈R} { β_i + (1/ε_i) E([L_i(ξ) − β_i]⁺) }.

Due to its favourable theoretical and computational properties, CVaR has become a popular risk measure in finance. Rockafellar and Uryasev [53] have shown that the optimal β_i which solves the minimisation problem in the definition of CVaR coincides with VaR_{ε_i}(L_i(ξ)) and that the CVaR at level ε_i coincides with the conditional expectation of the right tail of the loss distribution above VaR_{ε_i}(L_i(ξ)). Thus, the following implication holds; see also Figure 1:

    CVaR_{ε_i}(L_i(ξ)) ≤ 0   ⟹   ℙ( L_i(ξ) ≤ 0 ) ≥ 1 − ε_i.

As pointed out by Nemirovski and Shapiro [48], the CVaR constraint on the left-hand side represents the tightest convex approximation for the chance constraint on the right-hand side of the above expression. By linearising the term [L_i(ξ) − β_i]⁺ in the definition of CVaR, the ith CVaR constraint can be re-expressed as the following system of linear inequalities:

    β_i + (1/ε_i) E(z_i(ξ)) ≤ 0,   z_i(ξ) ≥ b_i(ξ) − Σ_{t=1}^T a_{it}' x_t(ξ^t) − β_i,   z_i(ξ) ≥ 0,        (2.2)


Figure 1: Relationship between VaR_{ε_i}(L_i(ξ)) and CVaR_{ε_i}(L_i(ξ)) for each constraint i, at level ε_i.

which involve the deterministic (first-stage) variable β_i ∈ R and a new stochastic (stage-T) variable z_i(ξ) ∈ R. Replacing each chance constraint in CC with the corresponding system (2.2) of linear inequalities thus results in a problem of type P with expectation constraints. Therefore, chance-constrained problems of the type CC have tight conservative approximations within the class of problems P.

We remark that the general decision problem P is flexible enough to cover also hybrid models which combine various aspects of deterministic, stochastic, robust and chance-constrained programs in the same model.
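The Rockafellar–Uryasev representation underlying (2.2) can be verified on sampled losses: minimising β + E([L − β]⁺)/ε over β recovers the value-at-risk, and the minimum is the CVaR. The loss distribution below is hypothetical.

```python
import numpy as np

# Rockafellar-Uryasev: CVaR_eps = min_beta { beta + E[(L - beta)^+] / eps },
# and the minimising beta coincides with VaR_eps (the (1-eps)-quantile).
rng = np.random.default_rng(1)
losses = rng.normal(0.0, 1.0, size=100_000)   # hypothetical loss samples
eps = 0.05

def cvar_objective(beta):
    return beta + np.mean(np.maximum(losses - beta, 0.0)) / eps

betas = np.linspace(-3.0, 4.0, 2001)
vals = np.array([cvar_objective(b) for b in betas])
beta_star = betas[vals.argmin()]              # approximately VaR_eps
cvar = vals.min()                             # CVaR at level eps

var_emp = np.quantile(losses, 1 - eps)        # empirical (1 - eps)-quantile
print(abs(beta_star - var_emp) < 0.05, cvar >= var_emp)
```

This also illustrates why the CVaR constraint is conservative: CVaR always dominates VaR, so CVaR ≤ 0 forces the quantile below zero.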

3 Linear Decision Rules

The first part of this section describes a tractable approximation of problem P that restricts the space of the decision rules x_t(·), t = 1,…,T, to those that exhibit a linear dependence on the observed problem parameters ξ^t. The second part of the section explains how we can efficiently measure the optimality gap that we incur through this simplification.

As shown in Section 2.2, problem P encapsulates multi-stage stochastic programs. It is therefore clear that problem P is severely computationally intractable itself. A simple but effective approach to improve the tractability of problem P is to restrict the space of the decision rules x_t(·), t = 1,…,T, to those that exhibit a linear dependence on the observation history ξ^t. Remember that we stipulated in Remark 2.1 that k_1 = 1 and ξ_1 = 1 for all ξ ∈ Ξ. This implies that we actually optimise over all affine (i.e., linear plus a constant) decision functions of the non-degenerate uncertain parameters ξ_2,…,ξ_T if we optimise over all linear functions of ξ = (ξ_1,…,ξ_T).

In the rest of the paper we will assume that the conditional expectations E(ξ|ξ^t) are linear in the sense that there exist matrices M_t ∈ R^{k×k^t} such that E(ξ|ξ^t) = M_t ξ^t for all ξ ∈ Ξ. This assumption is non-restrictive. It is automatically satisfied, for instance, if the random parameters ξ_t are mutually independent. In this case the conditional expectations reduce to simpler unconditional expectations, and thus we find E(ξ|ξ^t) = (ξ_1,…,ξ_t, μ_{t+1},…,μ_T), where μ_t denotes the unconditional mean value of ξ_t. As ξ_1 = 1 for all ξ ∈ Ξ, we thus have

    E(ξ|ξ^t) = (ξ_1,…,ξ_t, μ_{t+1} ξ_1,…, μ_T ξ_1)'.

The last expression is manifestly linear in ξ^t. It is easy to verify that the conditional expectations remain linear when the process of the random parameters ξ_t belongs to the large class of autoregressive moving-average models.
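For mutually independent stages, the matrix M_t simply keeps the observed entries and fills in the future means (which, since ξ_1 = 1, enter as μ·ξ_1 and thus linearly). A tiny numerical sketch with hypothetical three-stage data and scalar parameters per stage:

```python
import numpy as np

# Build M_t so that E(xi | xi^t) = M_t xi^t for independent stages:
# identity on the observed history, unconditional means for the future,
# attached to the first coordinate because xi_1 = 1.
mu = np.array([1.0, 0.4, -0.2])      # unconditional means of xi_1, xi_2, xi_3
t = 2                                 # stages observed so far
M_t = np.zeros((3, t))
M_t[:t, :t] = np.eye(t)               # keep the observed entries
M_t[t:, 0] = mu[t:]                   # future entries: mean times xi_1 (= 1)

xi_hist = np.array([1.0, 0.7])        # observed history (xi_1, xi_2)
print(M_t @ xi_hist)                  # [1.0, 0.7, -0.2]
```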

For the further argumentation, we define the truncation operator P_t : R^k → R^{k^t} through P_t ξ = ξ^t. Thus, P_t maps any scenario ξ to the corresponding observation history ξ^t up to stage t. If we model the decision rule x_t(ξ^t) as a linear function of ξ^t, it can thus be expressed as x_t(ξ^t) = X_t ξ^t = X_t P_t ξ for some matrix X_t ∈ R^{n_t×k^t}. Substituting these linear decision rules into P yields the following approximate problem.

    minimise    E( Σ_{t=1}^T c_t(ξ^t)' X_t P_t ξ )

    subject to  E( Σ_{s=1}^T A_{ts} X_s P_s ξ | ξ^t ) ≥ b_t(ξ^t)        ∀ξ ∈ Ξ, t = 1,…,T        (P^u)
                X_t P_t ξ ≥ 0

The objective function of P^u can be simplified and re-expressed in terms of the second-order moment matrix M = E(ξξ') of the random parameters. Interchanging summation and expectation and using the cyclicity property of the trace operator, we obtain

    E( Σ_{t=1}^T c_t(ξ^t)' X_t P_t ξ ) = Σ_{t=1}^T E( ξ' P_t' C_t' X_t P_t ξ )
                                       = Σ_{t=1}^T E( Tr(P_t ξ ξ' P_t' C_t' X_t) )
                                       = Σ_{t=1}^T Tr(P_t M P_t' C_t' X_t).
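The trace identity can be sanity-checked numerically on random data (single stage, hypothetical dimensions); because the empirical second-moment matrix is built from the same samples, the two sides agree to machine precision.

```python
import numpy as np

# Check E(xi' P' C' X P xi) = Tr(P M P' C' X) with M = E(xi xi').
rng = np.random.default_rng(2)
k, kt, n = 5, 3, 4
P = np.hstack([np.eye(kt), np.zeros((kt, k - kt))])   # truncation operator
C = rng.normal(size=(n, kt))
X = rng.normal(size=(n, kt))

xis = rng.normal(size=(50_000, k))
M = xis.T @ xis / len(xis)            # empirical second-order moment matrix

A = P.T @ C.T @ X @ P                 # matrix of the quadratic form
lhs = np.mean(np.einsum('ni,ij,nj->n', xis, A, xis))
rhs = np.trace(P @ M @ P.T @ C.T @ X)
print(abs(lhs - rhs) < 1e-8)          # equality follows from trace cyclicity
```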

Similarly, we can reformulate the conditional expectation terms in the constraints of P^u as

    E( Σ_{s=1}^T A_{ts} X_s P_s ξ | ξ^t ) = Σ_{s=1}^T A_{ts} X_s P_s M_t P_t ξ.

Thus, the linear decision rule problem P^u is equivalent to

    minimise    Σ_{t=1}^T Tr(P_t M P_t' C_t' X_t)

    subject to  ( Σ_{s=1}^T A_{ts} X_s P_s M_t P_t − B_t P_t ) ξ ≥ 0        ∀ξ ∈ Ξ, t = 1,…,T.        (3.3)
                X_t P_t ξ ≥ 0

Although problem (3.3) has only finitely many decision variables, that is, the coefficients of the matrices X_1,…,X_T encoding the linear decision rules, it is still not suitable for numerical solution as it involves infinitely many constraints parameterised by ξ ∈ Ξ. The following proposition, which captures the essence of robust optimisation, provides the tools for reformulating the ξ-dependent constraints in (3.3) in terms of a finite number of linear constraints [4, 11].

Proposition 3.1. For any p ∈ N and Z ∈ R^{p×k}, the following statements are equivalent:
(i) Zξ ≥ 0 for all ξ ∈ Ξ;
(ii) there exists Λ ∈ R^{p×l} with ΛW = Z, Λh ≥ 0 and Λ ≥ 0.

Proof. We denote by Z_ℓ the ℓth row of the matrix Z. Then, statement (i) is equivalent to

    Z_ℓ ξ ≥ 0   for all ξ subject to Wξ ≥ h,        ℓ = 1,…,p
    ⟺ 0 ≤ min_ξ { Z_ℓ ξ : Wξ ≥ h },                 ℓ = 1,…,p
    ⟺ 0 ≤ max_λ { h'λ : W'λ = Z_ℓ', λ ≥ 0 },        ℓ = 1,…,p        (3.4)
    ⟺ ∃ λ_ℓ with W'λ_ℓ = Z_ℓ', h'λ_ℓ ≥ 0, λ_ℓ ≥ 0,  ℓ = 1,…,p.

The equivalence in the third line follows from linear programming duality. Interpreting λ_ℓ' as the ℓth row of a new matrix Λ ∈ R^{p×l} shows that the last line in (3.4) is equivalent to assertion (ii). Thus, the claim follows.
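The linear programming duality step in the proof can be illustrated numerically on a toy uncertainty set (hypothetical data): the primal minimum over the polyhedron coincides with the dual maximum.

```python
import numpy as np
from scipy.optimize import linprog

# Xi = [0, 1]^2 written as W xi >= h; check that
# min { z'xi : W xi >= h } equals max { h'lam : W'lam = z, lam >= 0 }.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]])
h = np.array([0.0, 0.0, -1.0, -1.0])
z = np.array([-2.0, 3.0])

primal = linprog(z, A_ub=-W, b_ub=-h, bounds=[(None, None)] * 2)  # min z'xi
dual = linprog(-h, A_eq=W.T, b_eq=z, bounds=[(0, None)] * 4)      # max h'lam
print(primal.fun, -dual.fun)  # both equal -2.0
```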

Using Proposition 3.1, one can reformulate the inequality constraints in (3.3) to obtain

    minimise    Σ_{t=1}^T Tr(P_t M P_t' C_t' X_t)

    subject to  Σ_{s=1}^T A_{ts} X_s P_s M_t P_t − B_t P_t = Λ_t W,   Λ_t h ≥ 0,   Λ_t ≥ 0        t = 1,…,T.        (P^u)
                X_t P_t = Γ_t W,   Γ_t h ≥ 0,   Γ_t ≥ 0

The decision variables in P^u are the entries of the matrices X_t ∈ R^{n_t×k^t}, Λ_t ∈ R^{m_t×l} and Γ_t ∈ R^{n_t×l} for t = 1,…,T. Note that the objective function as well as all constraints are

linear in these decision variables. Thus, P^u constitutes a finite linear program, which can be

solved efficiently with off-the-shelf solvers such as IBM ILOG CPLEX [1].

A major benefit of using linear decision rules is that the size of the approximating linear program P^u grows only moderately with the number of time stages. Indeed, the number of variables and constraints is quadratic in k, l, m = Σ_{t=1}^T m_t, and n = Σ_{t=1}^T n_t. Note that these numbers usually scale linearly with T, and hence the size of P^u typically grows only quadratically with the number of decision stages.

We close this section with two remarks about alternative approximation methods to

convert problem P to a finite linear program that is amenable to numerical solution.

Remark 3.2 (Scenario tree approximation). Instead of approximating the functional form

of the decision rules x_t(·), t = 1,…,T, we can improve the tractability of problem P by replacing the underlying process ξ_1,…,ξ_T of random parameters with a discrete stochastic

process. The resulting process can be visualised as a scenario tree, which ramifies at all

time points at which new problem data is observed (Figure 2). Scenario tree approaches to

stochastic programming have been studied extensively over the past decades; see e.g. the survey

paper [23] that accounts for the developments up to the year 2000. More recent contributions

are listed in the official stochastic programming bibliography [60]. In contrast to the decision

rule approach, scenario tree methods typically scale exponentially with the number of decision

stages. Figure 3 compares the scenario tree and the decision rule approximations.


Figure 2: Scenario tree with samples ξ̂^1,…,ξ̂^9.
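The scaling contrast between the two approximations can be made concrete with a back-of-the-envelope count (hypothetical sizes): a scenario tree with branching factor b has b^T leaves, whereas the number of linear decision rule coefficients grows only quadratically in T.

```python
# Scenario-tree size vs. linear-decision-rule size (hypothetical dimensions).
def tree_leaves(branching, T):
    # One leaf per scenario path: exponential in the number of stages.
    return branching ** T

def ldr_coefficients(k_per_stage, n_per_stage, T):
    # X_t has n_t x k^t entries with k^t = t * k_per_stage; the sum is O(T^2).
    return sum(n_per_stage * (t * k_per_stage) for t in range(1, T + 1))

print(tree_leaves(3, 10), ldr_coefficients(2, 2, 10))  # 59049 vs 220
```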

Remark 3.3 (Constraint sampling). We obtained a tractable approximation for problem P in two steps. First, we restricted the decision rules x_t(·), t = 1,…,T, to be linear functions of the observation histories ξ^t. Afterwards, we used linear programming duality to obtain a finite problem. We can derive a different approximation for problem P if we enforce the semi-infinite constraints in P^u only over a finite subset of samples ξ̂^1,…,ξ̂^K ∈ Ξ.

It has been shown in [14, 61] that a modest number K of samples suffices to satisfy the semi-infinite constraints in P^u with high probability. The advantage of such sampling-based

approaches is that they allow us to model more general dependencies between the problem

data (A, b, c) and the random parameters. However, we are not aware of any methods to

measure the optimality gap that we incur with sample-based methods.

The price that we have to pay for the favourable scaling properties of the linear decision

rule approximation is a potential loss of optimality. Indeed, the best linear decision rule can

result in a substantially higher objective value than the best general decision rule (which is

typically nonlinear). The difference Δ^u = min P^u − min P between the optimal values of the approximate and the original decision problem can be interpreted as the approximation error associated with the linear decision rule approximation. As P^u is a restriction of the minimisation problem P, Δ^u is necessarily nonnegative. Modellers should estimate Δ^u in


Figure 3: Comparison of the scenario tree (left) and the decision rule approximation (right). Scenario trees replace the process ξ_1,…,ξ_T of random parameters with a discrete stochastic process. The decision rule approach retains the original stochastic process, but it restricts the functional form of the decision rules x_t(·), t = 1,…,T.

order to assess the appropriateness of the linear decision rule approximation for a particular problem instance: a small Δ^u indicates that implementing the solution of P^u will incur a negligible loss of optimality, while a large Δ^u may prompt us to be more cautious and to try to improve the approximation quality (e.g. by using more flexible piecewise linear decision rules; see Section 4).

Generally speaking, there are two ways to measure the approximation error Δ^u. We can derive generic a priori bounds on the maximum value of Δ^u that can be incurred over a class of instances, or we can measure Δ^u a posteriori for a specific problem instance.

A priori bounds on Δ^u have a long history. In particular, linear decision rules have been

proven to optimally solve the linear quadratic regulator problem [6], while piecewise linear

decision rules optimally solve two-stage stochastic programs [28]. More recently, linear decision rules have been shown to optimally solve a class of one-dimensional robust control

problems [9] and two-stage robust vehicle routing problems [32]. On the other hand, the

worst-case approximation ratio for linear decision rules applied to two-stage robust optimisation problems with m linear constraints has been shown to be of the order O(√m); see [7].

Similar results have been derived for two-stage stochastic programs in [8].

Given their scarcity and their somewhat limited scope, it seems fair to say that a priori


bounds on Δ^u are at most indicative of the expressive power of linear and piecewise linear decision rules. It thus seems natural to consider a posteriori bounds on Δ^u that exploit

the specific structure of individual instances of the problem P. Unfortunately, the direct

computation of Δ^u for a specific instance of P would require the solution of P itself, which is intractable. In this section we demonstrate, however, that an upper bound on Δ^u can be

obtained efficiently by studying a dual decision problem associated with P.

Recall that every linear program min_x { c'x : Ax ≥ b, x ≥ 0 } has an associated dual linear program max_y { b'y : A'y ≤ c, y ≥ 0 }, which is based on the same problem data (A, b, c), such that the following hold: the minimum of the primal is never smaller than the maximum of the dual (weak duality), and if either the primal or the dual is feasible then the minimum of the primal coincides with the maximum of the dual (strong duality) [18]. There is a duality theory for decision problems of the type P which is strikingly reminiscent of the duality theory for ordinary linear programs. Following Eisner and Olsen [25], the dual problem corresponding to P can be defined as

    maximise    E( Σ_{t=1}^T b_t(ξ^t)' y_t(ξ^t) )

    subject to  E( Σ_{s=1}^T A_{st}' y_s(ξ^s) | ξ^t ) ≤ c_t(ξ^t)        ∀ξ ∈ Ξ, t = 1,…,T.        (D)
                y_t(ξ^t) ≥ 0

Note that the dual maximisation problem D is stated in terms of the same problem data as

the primal minimisation problem P. As for ordinary linear programs, dualisation transposes

the constraint matrices and swaps the roles of the objective function and right-hand side

coefficients. Dualisation also reverts the temporal coupling of the decision stages in the

sense that the sums in the constraints of D now run over the first index of the constraint

matrices. Thus, even if the primal stage-t constraint is independent of t+1 , . . . , T , implying

that the conditional expectation E (| t) becomes redundant in P, the dual stage-t constraint

usually still depends on t+1 , . . . , T , implying that the conditional expectation E (| t ) has

a non-trivial effect in D. Therefore, in hindsight we realise that the inclusion of conditional

expectation constraints in P was necessary to preserve the symmetry of the employed duality

scheme.

As in the case of ordinary linear programming, there exist weak and strong duality results for problems P and D [25]. In particular, the minimum of P is never smaller than the maximum of D (weak duality), and if some technical regularity conditions hold, then the minimum of P coincides with the maximum of D (strong duality).
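The weak and strong duality relations for ordinary linear programs, which the duality theory for P and D mirrors, can be checked numerically. The following sketch (our own toy data, assuming SciPy is available) solves a small primal min_x {cᵀx : Ax ≥ b, x ≥ 0} and its dual max_y {bᵀy : Aᵀy ≤ c, y ≥ 0} and confirms that their optimal values coincide:

```python
import numpy as np
from scipy.optimize import linprog

# Toy data (our own): primal  min c^T x  s.t.  A x >= b, x >= 0
#                     dual    max b^T y  s.t.  A^T y <= c, y >= 0
A = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([4.0, 6.0])
c = np.array([3.0, 5.0])

# linprog expects "<=" constraints, so we negate A x >= b.
primal = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)
# The dual maximisation is passed to linprog as  min -b^T y.
dual = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2)

print(primal.fun, -dual.fun)  # both optima coincide (strong duality)
```

For this instance both problems attain the value 11.6, as strong duality predicts for a feasible primal-dual pair.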

The symmetry between P and D enables us to solve D with the linear decision rule approach that was originally designed for P. Indeed, if we model the dual decision rule y_t(ξ^t) as a linear function of ξ^t, then it can be expressed as y_t(ξ^t) = Y_t ξ^t for some matrix Y_t ∈ R^{m_t × k^t}. Substituting these dual linear decision rules into D yields an approximate problem D^l, which can be shown to be equivalent to the following tractable linear program.

    maximise    Σ_{t=1}^T  Tr( P_t M P_tᵀ B_tᵀ Y_t )

    subject to  C_t P_t − Σ_{s=t}^T A_{st}ᵀ Y_s P_s M_t P_t = Λ_t W,  Λ_t h ≥ 0,  Λ_t ≥ 0       (D^l)
                Y_t P_t = Γ_t W,  Γ_t h ≥ 0,  Γ_t ≥ 0                        t = 1, …, T.

The decision variables in D^l are the entries of the matrices Y_t ∈ R^{m_t × k^t}, Λ_t ∈ R^{n_t × l} and Γ_t ∈ R^{m_t × l} for t = 1, …, T.

In analogy to the primal approximation error ε^u, the dual approximation error can be defined as ε^l = max D − max D^l. As D^l is a restriction of the maximisation problem D, ε^l is necessarily nonnegative. It quantifies the loss of optimality of the best linear dual decision rule with respect to the best general dual decision rule. Unfortunately, ε^l is usually unknown, as its computation would require the solution of the original dual decision problem D. However, the joint primal and dual approximation error δ = min P^u − max D^l is efficiently computable; it merely requires the solution of two tractable finite linear programs. Note that δ indeed constitutes an upper bound on both ε^u and ε^l since

    δ = min P^u − max D^l
      = (min P^u − min P) + (min P − max D) + (max D − max D^l)
      = ε^u + (min P − max D) + ε^l
      ≥ ε^u + ε^l,

where the inequality follows from weak duality.

We conclude that for any decision problem of the type P, the best primal and dual linear decision rules can be computed efficiently by solving tractable finite linear programs. The corresponding approximation errors are bounded by δ, which can also be computed efficiently. Moreover, the optimal values of the approximate problems P^u and D^l provide upper and lower bounds on the optimal value of the original problem P, respectively.

A large value of δ indicates that either the primal or the dual approximation (or both) is inadequate. If the loss of optimality of linear decision rules is unacceptably high, modellers will endeavour to find a less conservative (but typically more computationally demanding) approximation. Ideally, one would choose a richer class of decision rules over which to optimise.
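The loss of optimality of linear decision rules can be made concrete on a tiny instance of our own construction. Consider minimising E[x(ξ)] subject to x(ξ) ≥ ξ₂ and x(ξ) ≥ −ξ₂ for ξ₂ uniform on [−1, 1]: the best general rule is x(ξ) = |ξ₂| with expected cost 0.5, while the best linear rule costs 1.0, so ε^u = 0.5. The sketch below (assuming SciPy) computes the linear-rule bound by enforcing the robust constraints at the vertices of the support:

```python
import numpy as np
from scipy.optimize import linprog

# Toy instance (own construction): min E[x(xi)] s.t. x(xi) >= xi2,
# x(xi) >= -xi2, with xi2 ~ U[-1, 1].  The best general rule is
# x(xi) = |xi2|, with expected cost E[|xi2|] = 0.5.
#
# Linear rule x(xi) = X1 + X2*xi2: the constraints are linear in xi2,
# so it suffices to enforce them at the vertices xi2 in {-1, +1}.
# Objective: E[X1 + X2*xi2] = X1, since E[xi2] = 0.
c = np.array([1.0, 0.0])                 # minimise X1
A_ub = np.array([[-1.0, -1.0],           # -(X1 + X2) <= -1  (at xi2 = +1)
                 [-1.0,  1.0]])          # -(X1 - X2) <= -1  (at xi2 = -1)
b_ub = np.array([-1.0, -1.0])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 2)

best_linear = res.fun                    # = 1.0 (constant rule x = 1)
best_general = 0.5                       # E[|xi2|] for xi2 ~ U[-1, 1]
eps_u = best_linear - best_general       # primal approximation error
print(eps_u)                             # 0.5
```

Here the best linear rule is forced to dominate |ξ₂| on the whole support, which is exactly the kind of conservatism that richer (e.g. piecewise linear) decision rules alleviate.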

In this section we show that the techniques developed for linear decision rules can also

be used to optimise efficiently over more flexible classes of nonlinear decision rules. The

underlying theory has been developed in a series of recent publications [4,17,29,31]. We will

motivate the general approach first through an example.

As an example, consider a random vector ξ = (ξ₁, ξ₂) that is uniformly distributed on Ξ = {1} × [−1, 1]. This choice of Ξ satisfies the standard assumption that ξ₁ = 1 for all ξ ∈ Ξ. Any scalar linear decision rule is thus representable as x(ξ) = X₁ξ₁ + X₂ξ₂, where X₁ denotes a constant offset (since ξ₁ is equal to 1 with certainty), while X₂ characterises the sensitivity of the decision with respect to ξ₂; see Figure 4 (left). To improve flexibility, one may introduce a breakpoint at ξ₂ = 0 and consider piecewise linear continuous decision rules that are linear in ξ₂ on the subintervals [−1, 0] and [0, 1], respectively. These decision rules are representable as

    x(ξ) = X₁ξ₁ + X₂ min{ξ₂, 0} + X₃ max{ξ₂, 0},                    (4.5)

where X₁ again denotes a constant offset, while X₂ and X₃ characterise the sensitivities of the decision with respect to ξ₂ on the subintervals [−1, 0] and [0, 1], respectively; see Figure 4 (centre). We can now define a new set of random variables ξ′₁ = ξ₁, ξ′₂ = min{ξ₂, 0} and ξ′₃ = max{ξ₂, 0}, which are completely determined by ξ. In particular, the support of ξ′ = (ξ′₁, ξ′₂, ξ′₃) is given by Ξ′ = {ξ′ ∈ R³ : ξ′₁ = 1, ξ′₂ ∈ [−1, 0], ξ′₃ ∈ [0, 1], ξ′₂ ξ′₃ = 0}. Note also that ξ′ is uniformly distributed on Ξ′. We will henceforth refer to ξ′ as the lifted random vector.

Figure 4: Illustration of the linear and piecewise linear decision rules in the original and the lifted space. Note that the set Ξ′ = L(Ξ) is non-convex; it is represented by the thick line in the right diagram. Since the decision rule x′(ξ′) is a linear function of the random parameters ξ′, however, it is nonnegative over Ξ′ if and only if it is nonnegative over the convex hull of Ξ′, which is given by the dark shaded region.

One easily verifies that the piecewise linear decision rule (4.5) in the original space is equivalent to the linear decision rule x′(ξ′) = X₁ξ′₁ + X₂ξ′₂ + X₃ξ′₃ in the lifted space; see Figure 4 (right). Moreover, due to the linearity of x′(ξ′) in ξ′, the decision rule x′(ξ′) is nonnegative over the nonconvex set Ξ′ if and only if x′(ξ′) is nonnegative over the convex hull of Ξ′. We can therefore replace the nonconvex support Ξ′ in the lifted space with its (polyhedral) convex hull, which can be represented as an intersection of halfspaces as required in Section 2.1. Hence, all techniques developed for linear decision rules can also be used for piecewise linear decision rules of the form (4.5).
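The mechanics of this example can be verified directly. The sketch below implements the min/max lifting and its linear retraction, checks that a piecewise linear rule in ξ coincides with a linear rule in ξ′, and checks that every lifted point lies in the triangle that forms the convex hull of Ξ′ (the inequality description of the hull is our own reading of Figure 4):

```python
import numpy as np

# The min/max lifting from the example: xi = (1, xi2) with xi2 in [-1, 1].
def L(xi2):
    """Lifting operator: xi' = (1, min{xi2, 0}, max{xi2, 0})."""
    return np.array([1.0, min(xi2, 0.0), max(xi2, 0.0)])

def R(xi_lift):
    """Linear retraction: recovers xi2 = xi2' + xi3'."""
    return xi_lift[1] + xi_lift[2]

# A piecewise linear rule in xi equals a linear rule in the lifted xi'.
X1, X2, X3 = 0.5, -1.0, 1.0   # with this choice, x(xi) = 0.5 + |xi2|
for xi2 in np.linspace(-1.0, 1.0, 21):
    pwl = X1 + X2 * min(xi2, 0.0) + X3 * max(xi2, 0.0)
    lin = np.dot([X1, X2, X3], L(xi2))
    assert abs(pwl - lin) < 1e-12
    assert abs(R(L(xi2)) - xi2) < 1e-12   # the retraction inverts the lifting

# conv(Xi') is the triangle {xi2' in [-1,0], xi3' in [0,1], xi3' - xi2' <= 1}
# (with xi1' = 1); every lifted point satisfies these inequalities.
for xi2 in np.linspace(-1.0, 1.0, 21):
    _, l2, l3 = L(xi2)
    assert -1.0 <= l2 <= 0.0 and 0.0 <= l3 <= 1.0 and l3 - l2 <= 1.0
print("lifting checks passed")
```

The third inequality, ξ′₃ − ξ′₂ ≤ 1, is the one that cuts the unit square down to the dark shaded region of Figure 4.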

To solve a general decision problem of the type P in nonlinear decision rules, we define a lifting operator

    L(ξ) = (L₁(ξ₁), …, L_T(ξ_T)),

where each L_t represents a continuous function from R^{k_t} to R^{k′_t} for some k′_t ≥ k_t. Using the lifting operator, we can construct a lifted random vector ξ′ = (ξ′₁, …, ξ′_T) ∈ R^{k′}, where ξ′_t = L_t(ξ_t), t = 1, …, T, and k′ = k′₁ + ⋯ + k′_T. The distribution of the lifted random vector ξ′ is completely determined by that of the primitive random vector ξ, and the support of ξ′ can be defined as Ξ′ = L(Ξ). As for the primitive uncertainties, it proves useful to define observation histories ξ′^t = (ξ′₁, …, ξ′_t) ∈ R^{k′^t}, k′^t = k′₁ + ⋯ + k′_t, and truncation operators P′_t : R^{k′} → R^{k′^t} which map ξ′ to ξ′^t, respectively. Our goal is to solve problem P in nonlinear decision rules of the form x_t(ξ^t) = X_t P′_t L(ξ), which constitute linear combinations of the component functions of the lifting operator. The matrices X_t ∈ R^{n_t × k′^t} contain the coefficients of these linear combinations, while the truncation operators P′_t eliminate those components of L(ξ) that depend on the future uncertainties ξ_{t+1}, …, ξ_T, thereby ensuring non-anticipativity. By construction, the nonlinear decision rules x_t(ξ^t) = X_t P′_t L(ξ) depending on the primitive uncertainties ξ are equivalent to linear decision rules x′_t(ξ′^t) = X_t ξ′^t depending on the lifted uncertainties ξ′. Note that ordinary linear decision rules in the primitive uncertainties can be recovered by choosing the trivial lifting operator L(ξ) = ξ.

For the further argumentation we require that the lifting preserves the degeneracy of the first random parameter, that is, ξ′₁ = 1 for all ξ′ ∈ Ξ′. Moreover, we assume that there is a linear retraction operator

    R(ξ′) = (R₁(ξ′₁), …, R_T(ξ′_T)),

which allows us to express the primitive random vector ξ as a linear function of the lifted random vector ξ′. To this end, we assume that each R_t represents a linear function from R^{k′_t} to R^{k_t}. The best nonlinear decision rule can then be computed by solving the following optimisation problem.

    minimise    E( Σ_{t=1}^T  c_t(P_t R(ξ′))ᵀ X_t P′_t ξ′ )

    subject to  E( Σ_{s=1}^t  A_{ts} X_s P′_s ξ′ | ξ′^t ) ≥ b_t(P_t R(ξ′))           (P̂^u)
                X_t P′_t ξ′ ≥ 0                              ∀ξ′ ∈ Ξ′,  t = 1, …, T.

Note that the observation histories in the objective function and in the right-hand side coefficients have been expressed as ξ^t = P_t ξ = P_t R(ξ′). The optimisation variables in P̂^u are the entries of the matrices X_t ∈ R^{n_t × k′^t}. The key observation is that the approximate problems P^u and P̂^u have exactly the same structure. The only difference is that Ξ′ = L(Ξ) is typically not a polyhedron because of the nonlinearity of the lifting operator. In the following, we assume that an exact representation (or outer approximation) of the convex hull of Ξ′ is available in the form of l′ inequality constraints:

    Ξ̂ := { ξ′ ∈ R^{k′} : W′ξ′ ≥ h′ },

where Ξ̂ = conv Ξ′ (exact representation) or Ξ̂ ⊇ conv Ξ′ (outer approximation). Such representations can be determined efficiently for polyhedral supports Ξ, see [29]. The nonlinear decision rule problem P̂^u can then be transformed into a tractable linear program in the same way as the linear decision rule problem P^u was converted to a finite linear program in Section 3.

    minimise    Σ_{t=1}^T  Tr( P′_t M′ Rᵀ P_tᵀ C_tᵀ X_t )

    subject to  Σ_{s=1}^t A_{ts} X_s P′_s M′_t P′_t − B_t P_t R = Λ_t W′,  Λ_t h′ ≥ 0,  Λ_t ≥ 0
                X_t P′_t = Γ_t W′,  Γ_t h′ ≥ 0,  Γ_t ≥ 0                      t = 1, …, T.

Here, the matrix R ∈ R^{k × k′} is defined through Rξ′ = R(ξ′) for all ξ′ ∈ Ξ′, M′ = E(ξ′ξ′ᵀ) ∈ R^{k′ × k′} denotes the second-order moment matrix associated with ξ′, and the conditional expectations are assumed to satisfy E(ξ′ | ξ′^t) = M′_t ξ′^t for some matrices M′_t ∈ R^{k′ × k′^t} and all ξ′ ∈ Ξ′. The decision variables of this finite linear program are the entries of the matrices X_t ∈ R^{n_t × k′^t}, Λ_t ∈ R^{m_t × l′} and Γ_t ∈ R^{n_t × l′} for t = 1, …, T.

Similarly, we can measure the suboptimality of nonlinear decision rules by solving the

following dual approximate problem (see Section 3.2).

    maximise    Σ_{t=1}^T  Tr( P′_t M′ Rᵀ P_tᵀ B_tᵀ Y_t )

    subject to  C_t P_t R − Σ_{s=t}^T A_{st}ᵀ Y_s P′_s M′_t P′_t = Λ_t W′,  Λ_t h′ ≥ 0,  Λ_t ≥ 0
                Y_t P′_t = Γ_t W′,  Γ_t h′ ≥ 0,  Γ_t ≥ 0                      t = 1, …, T.

t

The decision variables of this problem are the entries of the matrices Yt Rmt k , t Rnt l

and t Rmt l for t = 1, . . . , T . One can show that the finite-dimensional primal approxi-

u u

mation P is equivalent to the semi-infinite primal problem P if coincides with conv ,

u u

and P provides an upper bound on the optimal value of P if is an outer approxima-

tion of conv . Likewise, the finite-dimensional dual approximation D l is equivalent to the

semi-infinite dual problem D l (not shown here) if coincides with conv , and D l provides

a lower bound on the optimal value of D l if is an outer approximation of conv . In par-

u

ticular, the finite-dimensional approximate problems P and D l still bracket the optimal

value of the problem P in nonlinear decision rules if we employ an outer approximation

of the convex hull of . The situation is illustrated in Figure 5.


Figure 5: Relationship between the primal bounds P̂_i^u and the dual bounds D̂_i^l for an exact representation of conv Ξ′ (i = 1) and an outer approximation of conv Ξ′ (i = 2).

Optimisation problems often involve decisions that are modelled through integer variables. These problems are still amenable to the decision rule techniques described in Sections 3 and 4 if the integer variables do not depend on the uncertain problem parameters ξ. In order to substantiate this claim, we consider a variant of problem P in which the right-hand side vectors of the constraints may depend on a vector of integer variables z ∈ Z^d.

    minimise    E( Σ_{t=1}^T  c_t(ξ^t)ᵀ x_t(ξ^t) )

    subject to  z ∈ Z
                E( Σ_{s=1}^t  A_{ts} x_s(ξ^s) | ξ^t ) ≥ b_t(z, ξ^t)                 (P_MI)
                x_t(ξ^t) ≥ 0                              ∀ξ ∈ Ξ,  t = 1, …, T.

As usual, we assume that b_t(z, ξ^t) depends linearly on ξ^t, that is, b_t(z, ξ^t) = B_t(z) ξ^t for some matrix B_t(z) ∈ R^{m_t × k^t}. We further assume that B_t(z) depends linearly on z and that Z ⊆ R^d results from the intersection of Z^d with a compact polytope.

In order to apply the decision rule techniques from Sections 3 and 4 to P_MI, we study the parametric program P(z), which is obtained from P_MI by fixing the integer variables z. By construction, P(z) is a decision problem of the type P, which is bounded above and below by the linear programs P^u(z) and D^l(z) associated with the primal and dual linear decision rule approximations, respectively. Thus, an upper bound on P_MI is obtained by minimising the optimal value of P^u(z) over all z ∈ Z. The resulting optimisation problem, which we denote by P^u_MI, represents a mixed-integer linear program (MILP).

    minimise    Σ_{t=1}^T  Tr( P_t M P_tᵀ C_tᵀ X_t )

    subject to  z ∈ Z
                Σ_{s=1}^t A_{ts} X_s P_s M_t P_t − B_t(z) P_t = Λ_t W,  Λ_t h ≥ 0,  Λ_t ≥ 0      (P^u_MI)
                X_t P_t = Γ_t W,  Γ_t h ≥ 0,  Γ_t ≥ 0                        t = 1, …, T.

Similarly, a lower bound on P_MI is obtained by minimising the optimal value of D^l(z) over all z ∈ Z. The resulting min-max problem has a bilinear objective function that is linear in the integer variables z and in the coefficients of the dual decision rules Y_t. In order to convert this problem to an MILP, we follow the exposition in [38] and dualise D^l(z). One can show that strong duality holds whenever the original problem P_MI is feasible. By construction, we thus obtain a lower bound on P_MI by minimising the dual of D^l(z) over all z ∈ Z. The resulting optimisation problem, which we denote by P^l_MI, again represents an MILP.

    minimise    Σ_{t=1}^T  Tr( P_t M P_tᵀ C_tᵀ X_t )

    subject to  z ∈ Z
                Σ_{s=1}^t A_{ts} X_s P_s M_t P_t + S_t P_t = B_t(z) P_t              (P^l_MI)
                (W − h e₁ᵀ) M P_tᵀ X_tᵀ ≥ 0                        t = 1, …, T.
                (W − h e₁ᵀ) M P_tᵀ S_tᵀ ≥ 0

The optimisation variables in P^l_MI are the matrices X_t ∈ R^{n_t × k^t} and S_t ∈ R^{m_t × k^t}, as well as the binary variables z ∈ Z. Problem P_MI is also amenable to the refined approximation methods based on piecewise linear decision rules as discussed in Section 4. Further details can be found in [29].
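When the integer feasible set Z is small, the bound min_{z ∈ Z} P^u(z) can be evaluated without a MILP solver by enumerating z and solving one linear program per candidate. The sketch below (own toy construction, assuming SciPy; the inner LP min cᵀx s.t. x ≥ B z + d is a hypothetical stand-in for P^u(z)) illustrates this decomposition:

```python
import itertools
import numpy as np
from scipy.optimize import linprog

# Toy illustration (own construction): evaluate min_{z in Z} P^u(z) by
# enumerating the binary vectors z and solving one inner LP per choice.
# Inner LP: min c^T x  s.t.  x >= b(z),  with b(z) = B z + d.
c = np.array([2.0, 1.0])
B = np.array([[1.0, 0.0], [0.0, 2.0]])
d = np.array([1.0, 1.0])
fixed_cost = np.array([0.5, 0.25])            # cost contribution of z itself

best = (np.inf, None)
for z in itertools.product([0, 1], repeat=2): # Z = {0, 1}^2
    z = np.array(z)
    lo = B @ z + d                            # lower bounds b(z)
    res = linprog(c, bounds=list(zip(lo, [None] * 2)))
    val = fixed_cost @ z + res.fun
    if val < best[0]:
        best = (val, z)

print(best)  # best overall value and its integer minimiser
```

For larger Z this enumeration is replaced by branch-and-bound inside the MILP solver, but the structure (integer outer decisions, LP-representable decision rule inner problem) is the same.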

Remark 5.1. We assumed that the integer decisions z in problem P_MI do not impact the coefficients c_t(·) and A_{ts} of the objective function and the constraints, respectively. It is straightforward to apply the techniques presented in this section to a generalised problem where the objective function and constraint coefficients depend linearly on z. Using a big-M reformulation, the resulting primal and dual approximate problems can again be cast as MILPs.


6 Case Studies

In the following, we apply the decision rule approach to two well known operations man-

agement problems, and we compare the method with alternative approaches to account for

data uncertainty. All of our numerical results are obtained using the IBM ILOG CPLEX 12

optimisation package on a dual-core 2.4GHz machine with 4GB RAM [1].

Our first case study concerns a medium-term production planning problem for a multiproduct

plant with uncertain customer demands and backlogging. We assume that the plant consists

of a single processing unit that is capable of manufacturing different products in a continuous

single-stage process. We first elaborate a formulation that disregards changeovers, and we

afterwards extend the model to sequence-dependent changeover times and costs.

We wish to determine a production plan that maximises the expected profit for a set of products I and a weekly planning horizon T = {1, …, T}. To this end, we denote by p_ti the amount of product i ∈ I that is produced during week t ∈ T \ {T}. The processing unit can manufacture r_i units of product i per hour, and it has an uptime of R hours per week. At the beginning of each week t ∈ T, we observe the demand ξ_ti that arises for product i during week t. We assume that the demands ξ_1i in the first week are deterministic, while the other demands ξ_ti, t > 1, are stochastic. Having observed the demands ξ_ti, we then decide on the quantity s_ti of product i that we sell during week t at a unit price P_ti. We also determine the orders b_ti for product i that we backlog during week t at a unit cost CB_ti. We assume that the sales s_ti in week t must be served from the stock produced in week t − 1 or before. Once the sales decisions s_ti for week t have been made, the inventory level I_ti for product i during week t is known. Each unit of product i held during period t leads to inventory holding costs CI_ti, and we require that the inventory levels I_ti satisfy the lower and upper inventory bounds I̲_ti and Ī_ti, respectively. Deterministic versions of this problem have been studied in [15, 41]. For literature surveys on related production planning problems, we refer to [27, 62]. The temporal structure of the problem is illustrated in Figure 6.

Figure 6: Temporal structure of the production planning model. The decisions with subscript t may depend on all demands realised in weeks 1, …, t.

" #

XX

maximise E Pti sti ( t ) CB ti bti ( t ) CI ti Iti ( t )

XtT iI

subject to pti ( t )/ri R

iI t T \ {T }

t

pti ( ) 0 i I

)

t t1

bti ( ) = bt1,i ( ) + ti sti ( t )

t T \ {1} , i I

Iti ( t ) = It1,i ( t1 ) + pt1,i ( t1 ) sti ( t )

b1i ( t ) = b0i + 1i s1i ( 1 ), I1i ( t ) = I0i s1i ( 1 )

bti ( t ) 0, sti ( t ) 0, I ti Iti ( t ) I ti t T , i I

We require the constraints to be satisfied for all realisations of the uncertain customer

demands. The parameters b0i and I0i specify the initial backlog and inventory, respectively.

One can easily show that the production planning problem is an instance of the problem

P studied in Section 2.1. The same applies to variations of the problem where the prices

and/or costs are uncertain, as well as variations where the product demand is only known

at the end of each week. For the sake of brevity, we disregard these variants here.

So far, our production planning problem does not account for changeovers between consecutively manufactured products. Frequent changeovers are undesirable as the involved clean-up, set-up and start-up activities result in both delays and costs. To incorporate changeovers, we follow the approach presented in [41] and introduce binary variables c_tij, t ∈ T \ {T} and i, j ∈ I, that indicate whether a changeover from product i to product j occurs in week t. Likewise, we introduce binary variables c̄_tij, t ∈ T \ {T} and i, j ∈ I, that

Figure 7: The definition of the changeover variables c_tij and c̄_tij is enforced through the auxiliary variables y_tj, o_tj and f_tj, l_tj.

indicate whether a changeover from product i to product j occurs between weeks t − 1 and t. We set c_tii = 0, that is, no changeover occurs during the manufacturing of product i, and c̄_1ij = 0 for i, j ∈ I, that is, no changeover is required for the first product in the first week.

Since c_tij and c̄_tij are binary, we must choose the changeovers for all weeks as here-and-now decisions in our production planning model (see Section 5). Note, however, that the actual production amounts p_ti, sales decisions s_ti and backlogged demands b_ti remain wait-and-see decisions that can adapt to the realisation of the uncertain customer demands ξ^t. Our interpretation of the changeover variables c_tij and c̄_tij is enforced as follows.

    Σ_{i∈I} c_tij = y_tj − f_tj,   Σ_{i∈I} c_tji = y_tj − l_tj            j ∈ I, t ∈ T \ {T}
    Σ_{i∈I} c̄_tij = f_tj,         Σ_{i∈I} c̄_tji = l_{t−1,j}             j ∈ I, t ∈ T \ {1, T}
    Σ_{i∈I} f_ti = 1,   Σ_{i∈I} l_ti = 1                                  t ∈ T \ {T}
    f_tj ≤ y_tj,   l_tj ≤ y_tj                                            j ∈ I, t ∈ T \ {T}
    o_tj ≤ M y_tj,   f_tj ≤ o_tj ≤ Σ_{i∈I} y_ti                           j ∈ I, t ∈ T \ {T}
    o_tj ≥ o_ti + 1 − M(1 − c_tij)                                        i, j ∈ I, t ∈ T \ {T}

Here, M denotes a sufficiently large constant. The binary variables ytj , ftj and ltj indicate

whether product j is manufactured, manufactured first and manufactured last in week t,

respectively, and the continuous variables otj determine the production order in week t. The

changeover variables and constraints are illustrated in Figure 7.
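The counting logic behind these constraints can be sanity-checked by deriving the auxiliary variables from a given production sequence. The helper below (a hypothetical stand-in of our own, mirroring the week-t sequence of Figure 7 with 0-indexed products) confirms that every manufactured product except the first is reached by exactly one within-week changeover, and every one except the last starts exactly one:

```python
# Derive the auxiliary variables of one week from a production sequence
# (own sketch; products are indexed from 0).
def week_variables(sequence, n_products):
    y = [0] * n_products      # y[j] = 1 if product j is manufactured
    f = [0] * n_products      # f[j] = 1 if j is manufactured first
    l = [0] * n_products      # l[j] = 1 if j is manufactured last
    o = [0] * n_products      # o[j] = position of j in the sequence
    c = [[0] * n_products for _ in range(n_products)]  # within-week changeovers
    for pos, j in enumerate(sequence):
        y[j] = 1
        o[j] = pos + 1
    f[sequence[0]] = 1
    l[sequence[-1]] = 1
    for i, j in zip(sequence, sequence[1:]):
        c[i][j] = 1
    return y, f, l, o, c

# Week t of Figure 7: products 3 -> 1 -> 4 (0-indexed: 2 -> 0 -> 3).
y, f, l, o, c = week_variables([2, 0, 3], 4)
for j in range(4):
    # sum_i c[i][j] = y[j] - f[j]: incoming changeovers per product.
    assert sum(c[i][j] for i in range(4)) == y[j] - f[j]
    # sum_i c[j][i] = y[j] - l[j]: outgoing changeovers per product.
    assert sum(c[j][i] for i in range(4)) == y[j] - l[j]
print(o)  # -> [2, 0, 1, 3]
```

The ordering variables o_tj produced this way also satisfy the big-M precedence constraint o_tj ≥ o_ti + 1 − M(1 − c_tij) for any sufficiently large M.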

" #

XX X X

E Pti sti ( t ) CB ti bti ( t ) CI ti Iti ( t ) CC ij ctij + ctij ,

tT iI tT \{T } i,jI


             Product 1   Product 2   Product 3   Product 4   Product 5
    d_i      10,000      25,000      30,000      30,000      30,000
    P_ti     0.25        0.40        0.65        0.55        0.45
    CB_ti    0.05        0.08        0.13        0.11        0.09
    r_i      800         900         1,000       1,000       1,200

Table 1: Parameters for the production planning instance. The product prices P_ti and backlogging costs CB_ti are assumed to be time-invariant.

where CC_ij represents the costs of a changeover from product i to product j (with CC_ii = 0). Likewise, we replace the first constraint of our production planning model with

    Σ_{i∈I} p_ti(ξ^t)/r_i + Σ_{i,j∈I} τ_ij ( c_tij + c̄_tij ) ≤ R        t ∈ T \ {T},

where τ_ij denotes the duration of a changeover from product i to product j (with τ_ii = 0). One readily verifies that the production planning problem with changeover constraints is an instance of the problem P_MI studied in Section 5.

We consider an instance of the production planning problem with 5 products. The nominal demand ζ_ti for product i in week t follows a cyclical pattern:

    ζ_ti = ( 1 + ½ sin( π(t − 2)/26 ) ) d_i        t ∈ T, i ∈ I,

where d_i denotes the long-term average demand for product i. We model the actual demands ξ_ti as independent and uniformly distributed random variables supported on the intervals [(1 − θ)ζ_ti, (1 + θ)ζ_ti], where θ denotes the level of uncertainty. We set the weekly uptime of the processing unit to R = 168h, and we select inventory holding costs CI_ti = 3.06 × 10⁻⁵ and inventory bounds (I̲_ti, Ī_ti) = (0, 10⁶) that are independent of the indices t and i. There are no initial inventories I_0i or backlogs b_0i, and the remaining problem parameters are defined in Tables 1 and 2. The time horizon T and the uncertainty level θ are kept flexible.
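The demand model is straightforward to simulate. The sketch below (our own helper functions, assuming a sine cycle with a 52-week period as described above) generates the nominal pattern and draws uniform actual demands around it:

```python
import math
import random

def nominal_demand(t, d_i):
    """Cyclical nominal demand with a 52-week period (t in weeks)."""
    return (1.0 + 0.5 * math.sin(math.pi * (t - 2) / 26.0)) * d_i

def sample_demand(t, d_i, theta, rng=random):
    """Uniform actual demand on (1 +/- theta) times the nominal demand."""
    z = nominal_demand(t, d_i)
    return rng.uniform((1.0 - theta) * z, (1.0 + theta) * z)

# The nominal demand equals the long-term average d_i in week 2 (sin(0) = 0)
# and peaks at 1.5 * d_i in week 15 (sin(pi/2) = 1).
print(nominal_demand(2, 10000), nominal_demand(15, 10000))  # 10000.0 15000.0
```

With θ = 0.2, for example, the realised week-2 demand for product 1 is uniform on [8,000, 12,000].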

Figure 8 compares the decision rule approximation with classical scenario-based stochastic programming. The first two graphs report the gaps between the primal and dual objective values

    (CC_ij, τ_ij)   Product 1     Product 2     Product 3     Product 4     Product 5
    Product 1       (0, 0.00)     (760, 2.00)   (760, 1.50)   (750, 1.00)   (760, 0.75)
    Product 2       (745, 1.00)   (0, 0.00)     (750, 2.00)   (770, 0.75)   (740, 0.50)
    Product 3       (770, 1.00)   (760, 1.25)   (0, 0.00)     (765, 1.50)   (765, 2.00)
    Product 4       (740, 0.50)   (740, 1.00)   (745, 2.00)   (0, 0.00)     (750, 1.75)
    Product 5       (740, 0.70)   (740, 1.75)   (750, 2.00)   (750, 1.50)   (0, 0.00)

Table 2: Changeover costs CC_ij and changeover durations τ_ij between the products.

if we employ linear and piecewise linear decision rules for various time horizons T. The gap amounts to less than 15% for linear decision rules, and it can be reduced to about 5% for piecewise linear decision rules with one breakpoint. The first graph also shows a box-and-whisker plot of the objective values reported by the scenario-based formulation for 100 statistically independent scenario trees with a branching factor of two. One can see that the scenario-based formulation underestimates the achievable profit, which is an artefact of the small branching factor. If we increased the branching factor of the scenario tree, we would expect the estimated objective values to move between the curves representing the primal and dual decision rule approximations. The third graph illustrates the runtimes required to solve the linear and piecewise linear decision rule problems, as well as the runtimes required by the scenario-based formulation with branching factors of two, three and four. The figure clearly demonstrates that scenario-based approaches become computationally intractable for problems with many stages and/or a high branching factor. We remark, however, that even a branching factor of four is very small for a problem with five random variables per time stage. Indeed, a branching factor of at least six is required to avoid perfect correlations among the product demands. The problem formulation using decision rules, on the other hand, can be solved for planning horizons of 52 weeks and more.

Consider now the production planning problem with changeover constraints. We want to compare our stochastic problem formulation with a deterministic model that replaces the uncertain customer demands ξ_ti with their expected values ζ_ti. To this end, we consider a time horizon of 52 weeks and solve both models in a rolling-horizon implementation over a reduced time horizon T = {1, …, 4}. At the beginning of each of the 52 weeks, we observe the random demands ξ_ti. We then solve the deterministic and the stochastic model over four weeks. In both models, we set the initial backlogs b_0i and the initial stocks I_0i to the corresponding values at the end of the previous week, and we adapt the models to account

Figure 8: Objective values and runtimes for decision rules and scenario-based stochastic programming (panels: linear decision rules, piecewise linear decision rules, average solution time; curves: primal approximation, dual approximation, SAA). All graphs are based on the uncertainty level θ = 0.2.

Figure 9: Comparison of piecewise linear decision rules (stochastic, using one breakpoint) with a deterministic production planning model in a backtest (cumulative objective values over 52 weeks, one panel per uncertainty level).

for the initial changeovers c̄_1ij. We implement the first-stage decisions of both models and repeat the process for the next set of demands ξ_{t+1,i}. Figure 9 shows the average cumulative profit achieved by the deterministic and the stochastic model over 100 repetitions of this backtest. The deterministic production planning model performs only slightly worse than the stochastic formulation if the uncertainty level θ is small. For larger values of θ, however, it becomes essential to properly account for the random nature of the customer demands.
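The rolling-horizon procedure itself is model-agnostic and can be sketched as a generic loop. In the skeleton below (our own sketch; `solve_model` and `observe_demand` are hypothetical stand-ins for the four-week planning model and the demand process), only the first-stage decisions of each solve are implemented before the horizon rolls forward:

```python
# Generic rolling-horizon backtest skeleton (own sketch).
def rolling_horizon(solve_model, observe_demand, weeks=52, lookahead=4):
    state = {"backlog": 0.0, "inventory": 0.0}
    profit = 0.0
    for week in range(1, weeks + 1):
        demand = observe_demand(week)              # demands become known
        # Solve the lookahead model from the current state ...
        plan = solve_model(state, demand, horizon=lookahead)
        # ... but implement only its first-stage decisions.
        first_stage = plan[0]
        profit += first_stage["profit"]
        state = first_stage["next_state"]          # roll the state forward
    return profit
```

Swapping `solve_model` between a deterministic and a stochastic formulation, while keeping the same demand stream, yields exactly the paired comparison reported in Figure 9.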

Our second case study investigates the design of multi-echelon supply chains under demand

uncertainty. The supply chain produces a single product, and it consists of one production

facility and multiple warehouses and distribution centres. The product is manufactured at

the production facility, and it is shipped first to the warehouses and then to the distribution

centres. Customer demands only arise at the distribution centres, and we do not allow


direct deliveries from the production facility to the distribution centres. We assume that the

locations of the production facility and the distribution centres are given, whereas there are

multiple candidate locations for the warehouses. We denote by I and J the index sets of

candidate warehouse locations and distribution centres, respectively. We want to build at

most K warehouses, 0 < K < |I|, such that the resulting supply chain minimises the sum

of investment and expected transportation costs.

We assume that the warehouses do not allow the accumulation of stocks. This simplification is motivated in [37, 55, 59], and it allows us to formulate the problem as a two-stage stochastic program. In the first stage of this problem, we determine the location and the capacities of the warehouses. To this end, we define binary variables x_i ∈ {0, 1}, i ∈ I, with the interpretation that x_i = 1 if a warehouse is built at candidate location i and x_i = 0 otherwise. Likewise, we denote the capacity of the warehouse at location i ∈ I by c_i. The capacity of each individual warehouse must not exceed c̄, and the overall capacity of the warehouses is bounded above by C. We assume that the construction of warehouse i incurs investment costs CI_i c_i that are proportional to its capacity c_i.

At the beginning of the second stage, we observe the customer demands ξ_j arising at each distribution centre j ∈ J. Since our stochastic program only consists of two stages, we notationally suppress the time stage at which the demands are realised. Once we have observed the customer demands, we select the shipments f_i from the production facility to each warehouse i ∈ I, as well as the shipments g_ij from the warehouses i ∈ I to the distribution centres j ∈ J. We assume that the shipments f_i and g_ij incur linear transportation costs CF_i f_i and CG_ij g_ij. Deterministic versions of this supply chain design problem have been studied in [55, 59]. The vast literature on supply chain design is surveyed, amongst others, in [20, 42-44]. Figure 10 illustrates the structure of the supply chain considered in our problem.


Figure 10: Structure of the supply chain. Shaded nodes represent the existing production site and

distribution centres, and unshaded nodes represent candidate locations for warehouses.

" #

X X XX

minimise CI i ci + E CF i fi () + CG ij gij ()

iI

X XiI iI jJ

subject to xi K, ci C

iI iI

ci c xi i I

X

fi () ci , gij () fi () i I

jJ

X

gij () j j J

iI

xi {0, 1} , ci , fi (), gij () 0 i I, j J

In this problem, we require the constraints to be satisfied for all realisations ξ ∈ Ξ of the uncertain customer demands. The objective function minimises the sum of investment and expected transportation costs. The first constraint restricts the total number of warehouses that can be built, as well as the overall capacity of the warehouses. The second constraint ensures that the individual warehouse capacity restrictions are met, and it imposes zero capacities at candidate locations without a warehouse. The third constraint guarantees that the product shipments satisfy flow conservation and the individual warehouse capacities. Finally, the penultimate constraint requires that the customer demands are satisfied at each distribution centre. One readily verifies that our supply chain design problem is an instance of the problem PMI studied in Section 5.
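For a fixed first-stage design and a single demand realisation, the second stage reduces to an ordinary transportation LP. The sketch below (toy data of our own, assuming SciPy) solves this wait-and-see problem for two warehouses and two distribution centres:

```python
import numpy as np
from scipy.optimize import linprog

# Second-stage (wait-and-see) problem for fixed warehouse capacities and one
# demand realisation: a small transportation LP with toy data of our own.
# Variables: f = [f_0, f_1] (factory -> warehouses),
#            g = [g_00, g_01, g_10, g_11] (warehouse i -> centre j).
CF = np.array([1.0, 2.0])               # factory -> warehouse unit costs
CG = np.array([1.0, 3.0, 2.0, 1.0])     # warehouse -> centre unit costs
cap = np.array([40.0, 40.0])            # installed warehouse capacities
xi = np.array([30.0, 30.0])             # realised demands at the two centres

c = np.concatenate([CF, CG])
# f_i <= cap_i   and   g_i0 + g_i1 - f_i <= 0  (flow conservation)
A_ub = np.array([
    [ 1,  0, 0, 0, 0, 0],
    [ 0,  1, 0, 0, 0, 0],
    [-1,  0, 1, 1, 0, 0],
    [ 0, -1, 0, 0, 1, 1],
])
b_ub = np.concatenate([cap, [0.0, 0.0]])
# Demand satisfaction: g_0j + g_1j >= xi_j  ->  -(g_0j + g_1j) <= -xi_j
A_dem = np.array([
    [0, 0, -1, 0, -1, 0],
    [0, 0, 0, -1, 0, -1],
])
res = linprog(c, A_ub=np.vstack([A_ub, A_dem]),
              b_ub=np.concatenate([b_ub, -xi]), bounds=[(0, None)] * 6)
print(res.fun)  # minimal transportation cost for this realisation
```

In the decision rule approximation, the wait-and-see variables f and g are replaced by (piecewise) linear functions of ξ rather than re-solved per realisation.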


6.4 Numerical Results

We consider an instance of the supply chain design problem where the production facility is

in London, while the distribution centres are located in the capitals of 40 European countries.

We want to build at most K = 5 warehouses in the capitals of the 10 most populous countries, and the warehouse capacities must not exceed the individual and cumulative bounds c̄ = 40 and C = 100, respectively. We disregard investment costs (CI_i = 0), and we set the transportation costs CF_i and CG_ij to 90% and 100% of the geodesic distances between the corresponding cities, respectively.

The expected demand ζ_j at distribution centre j ∈ J is defined as the quotient of the corresponding country's population and the overall population of the 40 countries. The actual demand ξ_j at distribution centre j is uncertain, and the demand vector ξ follows a uniform distribution with polyhedral support

    Ξ = { ξ ∈ R₊^{|J|} : ξ_j ∈ [(1 − Γ) 100 ζ_j, (1 + Γ) 100 ζ_j] ∀j ∈ J,  Σ_{j∈J} ξ_j = 100 },

where the parameter Γ, 0 ≤ Γ ≤ 1, represents the level of uncertainty. Our support definition expresses the view that the cumulative customer demands are known, but the geographical breakdown by country is uncertain. In particular, the demand arising at the capital of country j ∈ J can vary within ±Γ·100% of its expected value.

Figure 11 and Table 3 illustrate the optimal supply chains for Γ = 0 (deterministic demand), Γ = 0.5 and Γ = 1. The figure shows that in the deterministic problem, most of the distribution centres receive their stock from the closest warehouse in order to minimise transportation costs. If the geographic breakdown of the demand is uncertain, however, such an assignment is no longer feasible due to the limited warehouse capacities. Instead, each distribution centre potentially receives its stock from different warehouses, and the assignment of warehouses to distribution centres depends on the demand realisation ξ. This in turn has a profound impact on the design of the optimal supply chain. If the customer demands are deterministic (Γ = 0), then the product flows between warehouses and distribution centres are known. In this case, the optimal supply chain design places the warehouses close to the distribution centres in order to take advantage of the cheaper transportation costs CF_i between the production facility and the warehouses. If the geographical breakdown of the customer demands is uncertain, however, such a decentralised warehouse strategy would

    Location     Γ = 0     Γ = 0.5    Γ = 1
    Berlin       30.3731   0          0
    Ankara       11.3988   5.69941    0
    Paris        0         19.7618    20
    London       40        40         40
    Rome         11.3066   0          0
    Kiev         0         0          0
    Madrid       6.92146   0          0
    Warsaw       0         7.58782    0
    Bucharest    0         0          0
    Amsterdam    0         26.9509    40

Table 3: Installed warehouse capacities for the three uncertainty levels. The cities are ordered according to their expected product demands.

suffer from costly detours due to the limited warehouse capacities. To illustrate this, consider

the warehouse built in Ankara (for {0, 0.5}), which has the second largest expected prod-

uct demand. If the customer demands are subject to significant geographical uncertainty

( = 1), then there is a non-negligible chance that a large part of the demand is realised in

western European countries. Due to the overall capacity limitation C = 100, some products

would have to be shipped to Turkey before they are delivered to the distribution centres in

Western Europe, which would incur high transportation costs. To avoid such detours, the

optimal supply chain design adopts a centralised warehouse strategy if customer demands

are uncertain.

We solved both instances of the stochastic supply chain design problem ( {0.5, 1})

within 10 minutes using linear decision rules. The optimality gap was below 5% ( = 0.5)

and 13% ( = 1). Better results can be obtained by refining the decision rule approximation.

The deterministic problem was solved within a few seconds.
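An optimality gap of this kind is the relative difference between an upper bound on the minimal cost (here, the objective of the primal linear decision rule problem) and a lower bound (e.g. from a dual decision rule). A minimal sketch, with made-up bound values and one common convention for the denominator:

```python
def relative_gap(upper_bound, lower_bound):
    """Relative optimality gap for a minimisation problem, measured
    against the lower bound (conventions differ; this is one choice)."""
    return (upper_bound - lower_bound) / abs(lower_bound)

# Hypothetical bounds chosen to illustrate a 5% gap.
print(relative_gap(105.0, 100.0))  # prints 0.05
```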


References

mization/cplex-optimizer.

[2] Adida, E., and Perakis, G. A robust optimization approach to dynamic pricing and inventory control with no backorders. Mathematical Programming 107, 1 (2006), 97–129.

[4] Ben-Tal, A., El Ghaoui, L., and Nemirovski, A. Robust Optimization. Princeton University Press, 2009.

[5] Ben-Tal, A., Golany, B., Nemirovski, A., and Vial, J.-P. Retailer-supplier flexible commitments contracts: A robust optimization approach. Manufacturing & Service Operations Management 7, 3 (2005), 248–271.

[6] Bertsekas, D. P. Dynamic Programming and Optimal Control, 2nd ed. Athena Scientific, 2001.

[7] Bertsimas, D., and Goyal, V. On the power and limitations of affine policies in two-stage adaptive optimization. Mathematical Programming, available via Online First.

[8] Bertsimas, D., and Goyal, V. On the power of robust solutions in two-stage stochastic and adaptive optimization problems. Mathematics of Operations Research 35, 2 (2010), 284–305.

[9] Bertsimas, D., Iancu, D., and Parrilo, P. Optimality of affine policies in multi-stage robust optimization. Mathematics of Operations Research 35, 2 (2010), 363–394.

[10] Bertsimas, D., and Sim, M. Robust discrete optimization and network flows. Mathematical Programming 98, 1 (2003), 49–71.

[11] Bertsimas, D., and Thiele, A. Robust and data-driven optimization: Modern decision making under uncertainty. In INFORMS TutORials in Operations Research: Models, Methods, and Applications for Innovative Decision Making, M. P. Johnson, B. Norman, and N. Secomandi, Eds. INFORMS, 2006, ch. 4, pp. 95–122.


[12] Bertsimas, D., and Thiele, A. A robust optimization approach to inventory theory. Operations Research 54, 1 (2006), 150–168.

[13] Birge, J. R., and Louveaux, F. Introduction to Stochastic Programming. Springer, 1997.

[14] Campi, M. C., and Garatti, S. A sampling-and-discarding approach to chance-constrained optimization: Feasibility and optimality. Journal of Optimization Theory and Applications 148, 2 (2011), 257–280.

single-stage single-unit multiproduct plants using a hybrid discrete/continuous-time MILP model. Industrial & Engineering Chemistry Research 47, 6 (2008), 1925–1934.

[16] Chen, W., Sim, M., Sun, J., and Teo, C.-P. From CVaR to uncertainty set: Implications in joint chance constrained optimization. Operations Research 58, 2 (2010), 470–485.

[17] Chen, X., Sim, M., Sun, P., and Zhang, J. A linear decision-based approximation approach to stochastic programming. Operations Research 56, 2 (2008), 344–357.

[19] Cohen, I., Golany, B., and Shtub, A. The stochastic time–cost tradeoff problem: A robust optimization approach. Networks 49, 2 (2007), 175–188.

[20] Daskin, M. S., Snyder, L. V., and Berger, R. T. Facility location in supply chain design. In Logistics Systems: Design and Optimization, A. Langevin and D. Riopel, Eds. Springer, 2005, pp. 39–65.

book. Springer, 2002.

[22] Drezner, Z., and Hamacher, H. Facility Location: Applications and Theory. Springer, 2004.

[23] Dupacova, J., Consigli, G., and Wallace, S. Scenarios for multistage stochastic programs. Annals of Operations Research 100, 1 (2000), 25–53.

[24] Dyer, M., and Stougie, L. Computational complexity of stochastic programming problems. Mathematical Programming 106, 3 (2006), 423–432.


[25] Eisner, M., and Olsen, P. Duality for stochastic programming interpreted as L.P. in Lp-space. SIAM Journal on Applied Mathematics 28, 4 (1975), 779–792.

[26] Erdogan, E., and Iyengar, G. On two-stage convex chance constrained problems. Mathematical Methods of Operations Research 65, 1 (2007), 115–140.

[27] Floudas, C. A., and Lin, X. Mixed integer linear programming in process scheduling: Modeling, algorithms, and applications. Annals of Operations Research 139, 1 (2005), 131–162.

[28] Garstka, S. J., and Wets, R. J.-B. On decision rules in stochastic programming. Mathematical Programming 7, 1 (1974), 117–143.

[29] Georghiou, A., Wiesemann, W., and Kuhn, D. Generalized decision rule approximations for stochastic programming via liftings. Submitted for publication, available via Optimization Online.

[30] Gilboa, I., and Schmeidler, D. Maxmin expected utility with non-unique prior. Journal of Mathematical Economics 18, 2 (1989), 141–153.

[31] Goh, J., and Sim, M. Distributionally robust optimization and its tractable approximations. Operations Research 58, 4 (2010), 902–917.

[32] Gounaris, C. E., Wiesemann, W., and Floudas, C. A. The robust capacitated vehicle routing problem under demand uncertainty. Submitted for publication.

[33] Gupta, A., and Maranas, C. D. Managing demand uncertainty in supply chain planning. Computers & Chemical Engineering 27, 8-9 (2003), 1219–1227.

[34] Herroelen, W., and Leus, R. Project scheduling under uncertainty: Survey and research potentials. European Journal of Operational Research 165, 2 (2005), 289–306.

[35] Janak, S. L., Lin, X., and Floudas, C. A. A new robust optimization approach for scheduling under uncertainty: II. Uncertainty with known probability distribution. Computers & Chemical Engineering 31, 3 (2007), 171–195.

[37] Kuhn, D., Parpas, P., and Rustem, B. Stochastic optimization of investment planning problems in the electric power industry. In Process Systems Engineering: Volume 5: Energy Systems Engineering, M. Georgiadis, E. Kikkinides, and E. Pistikopoulos, Eds. Wiley-VCH, 2008, ch. 4, pp. 215–230.

[38] Kuhn, D., Wiesemann, W., and Georghiou, A. Primal and dual linear decision rules in stochastic and robust optimization. Mathematical Programming 130, 1 (2011), 177–209.

spectives at the beginning of the 21st century. Journal of Intelligent Manufacturing 18, 2 (2007), 273–284.

[40] Lin, X., Janak, S. L., and Floudas, C. A. A new robust optimization approach for scheduling under uncertainty: I. Bounded uncertainty. Computers & Chemical Engineering 28, 6-7 (2004), 1069–1085.

[41] Liu, S., Pinto, J. M., and Papageorgiou, L. G. A TSP-based MILP model for medium-term planning of single-stage continuous multiproduct plants. Industrial & Engineering Chemistry Research 47, 20 (2008), 7733–7743.

[42] Meixell, M. J., and Gargeya, V. B. Global supply chain design: A literature review and critique. Transportation Research Part E: Logistics and Transportation Review 41, 6 (2005), 531–550.

[43] Melo, M., Nickel, S., and da Gama, F. S. Facility location and supply chain management – A review. European Journal of Operational Research 196, 2 (2009), 401–412.

[44] Min, H., and Zhou, G. Supply chain modeling: past, present and future. Computers & Industrial Engineering 43, 1-2 (2002), 231–249.

[45] Mula, J., Poler, R., Garcia-Sabater, J., and Lario, F. Models for production planning under uncertainty: A review. International Journal of Production Economics 103, 1 (2006), 271–285.

[46] Natarajan, K., Pachamanova, D., and Sim, M. Constructing risk measures from uncertainty sets. Operations Research 57, 5 (2009), 1129–1147.

[47] Nemirovski, A., and Shapiro, A. Scenario approximations of chance constraints. In Probabilistic and Randomized Methods for Design under Uncertainty, G. Calafiore and F. Dabbene, Eds. Springer, 2005, ch. 1, pp. 3–47.


[48] Nemirovski, A., and Shapiro, A. Convex approximations of chance constrained programs. SIAM Journal on Optimization 17, 4 (2007), 969–996.

[49] Pagnoncelli, B. K., Ahmed, S., and Shapiro, A. Sample average approximation method for chance constrained programming: Theory and applications. Journal of Optimization Theory and Applications 142, 2 (2009), 399–416.

ity. Wiley-Blackwell, 2007.

nal of Risk 2, 3 (2000), 21–42.

[54] Ruszczynski, A., and Shapiro, A., Eds. Stochastic Programming. Elsevier, 2003.

[55] Santoso, T., Ahmed, S., Goetschalckx, M., and Shapiro, A. A stochastic programming approach for supply chain network design under uncertainty. European Journal of Operational Research 167, 1 (2005), 96–115.

[56] See, C.-T., and Sim, M. Robust approximation to multiperiod inventory management. Operations Research 58, 3 (2010), 583–594.

[57] Shapiro, A., and Nemirovski, A. On complexity of stochastic programming problems. Applied Optimization 99, 1 (2005), 111–146.

[58] Snyder, L. Facility location under uncertainty: A review. IIE Transactions 38, 7 (2006), 547–564.

[59] Tsiakis, P., Shah, N., and Pantelides, C. C. Design of multi-echelon supply chain networks under demand uncertainty. Industrial & Engineering Chemistry Research 40, 16 (2001), 3585–3604.

o.rug.nl/mally/spbib.html.

[61] Vayanos, P., Kuhn, D., and Rustem, B. A constraint sampling approach for multi-stage robust optimization. Automatica, available via Optimization Online (2011).


[62] Verderame, P. M., Elia, J. A., Li, J., and Floudas, C. A. Planning and scheduling under uncertainty: A review across multiple sectors. Industrial & Engineering Chemistry Research 49, 9 (2010), 3993–4017.

[63] Wiesemann, W., Kuhn, D., and Rustem, B. Robust resource allocations in temporal networks. Mathematical Programming, available via Online First.


Figure 11: Comparison of the supply chain networks for Γ = 0 (top chart), Γ = 0.5 (middle chart) and Γ = 1 (bottom chart). White lines represent the transportation links between the warehouses and the distribution centres for different realisations of the uncertain demand.
