
2010

Optimization Methods

This is the first time that I have included notes to go with the PowerPoint slides. I
hope that they are helpful.
Introductions
James Orlin
Professor of Operations Research at Sloan

Tarik Bourezgue
Professor of Applied Mathematics at MSRI

2
Required Materials

z Course notes

3
Grading Policy

z Midterm 1 (15%)
z Midterm 2 (15%)
z Final Exam (20%)
z Recitation quizzes (20%)
z Homework (except Excel) (15%)
z Excel Homework and Project (15%)
z Extra Credit (up to 5%)

4
Homework Policy

z Approximately 1 homework set per week.


– Students may work in groups of 2
– non-linear grading scheme per assignment
(similar to converting scores to A, B, C, D, F)
• 85% correct leads to a grade of 5/5
• < 50% correct leads to a grade of 0/5
• lowest score is dropped

z Excel Solver “Case study”


– around 8 assignments on a “diet problem.”
– illustrates many different concepts in 15.053
– individual work (but students can discuss with
others)

The Excel case study is new this year. In the past, we have had Excel as part of the
regular homework; because homework was done in teams, many students deferred
the Excel to their partner. Moreover, the Excel assignments were very time
consuming.

This year, we have a continuing case study. By using a single example throughout
the term, we can better illustrate the way that Excel can be used in practice to solve
optimization problems. In addition, having a running example makes it more
efficient for students to carry out the assignment.

Many former 15.053 students have told us that learning Excel Solver has been
very useful in their summer jobs and in their post-MIT jobs.
Class Requirements

z “Required” Lecture Attendance


– −4% points for missing more than 10 lectures
– −2% points for missing between 7 and 10 lectures
– +2% points for missing at most 2 lectures

z Required Recitation Attendance


– weekly quizzes (15% of the grade)
– attendance (5% of the grade)

– There is recitation this week.


– sections will be posted on line on Wednesday.

We have given the requirements of class attendance and recitation attendance a lot
of thought. First of all, we know that requiring lecture attendance goes against the
grain of undergraduate education at MIT and reminds some students of high school
requirements. Ironically, it is also typical of what is expected of MBA students, and
attendance is expected in almost any job. Undergraduate education is a relatively
short phase of one’s life in which attendance is usually optional.

Most Sloan classes expect students to keep up with the material by preparing in
advance of class and then participating in class discussions. In 15.053, we provide
incentives for students to keep up with the materials through required class and
recitation attendance and through quizzes. We have found that this is very
effective. Since requiring recitations, students have mastered the material far
better, and their grades have reflected it.

15.053 is not graded on a curve, but is graded based on what students learn.
Moreover, we have set the grading policy in the past, and we do not adjust it
upwards as students are learning more. So, if every student learns the material well,
it is possible for every student to get an A.
“Active Learning”

z Occasionally I will introduce a break in the lecture


for you to work on your own or with an in-class
partner. “Cognitive balancing.”

z Please identify your “partner” now.

z Those on aisle ends may be in a group of size 3.

Even very intelligent people have difficulty absorbing new information for an
extended period of time without being able to reflect on that material. The breaks in
class are sometimes used for reflection, and sometimes used for a mental break
between different subjects.
An optimization problem

z Given a collection of numbers, partition them into


two groups such that the difference in the sums
is as small as possible.

z Example: 7, 10, 13, 17, 20, 22


These numbers sum to 89

I can split them into {7, 10, 13, 17} sum is 47


{20, 22} sum is 42
Difference = 5.

Can we do better?

The amazing thing about this Excel example is that it typically leads to solutions
that are under 10, often 0 or 1. This would be extremely hard to achieve without
formal optimization.
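
For the six numbers on the slide, the question "Can we do better?" can be settled by brute force. The sketch below is only an illustration (it is not the Excel model used in class, and the function name best_partition is made up here): it tries every subset as the first group and keeps the split with the smallest difference.

```python
from itertools import combinations

def best_partition(numbers):
    """Try every subset as the first group and keep the smallest difference."""
    total = sum(numbers)
    best_diff, best_group = total, ()
    # Subsets of size up to n // 2 suffice: the complement covers the rest.
    for size in range(len(numbers) // 2 + 1):
        for group in combinations(numbers, size):
            diff = abs(total - 2 * sum(group))
            if diff < best_diff:
                best_diff, best_group = diff, group
    rest = list(numbers)
    for value in best_group:
        rest.remove(value)
    return best_diff, list(best_group), rest

diff, group_a, group_b = best_partition([7, 10, 13, 17, 20, 22])
print(diff, group_a, group_b)
```

Because the total, 89, is odd, a difference of 0 is impossible, so a difference of 1 (for example, sums of 44 and 45) is the best one can do. Brute force works for six numbers, but the number of subsets doubles with every number added.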
What is Operations Research?

What is Management Science?

z World War II : British military leaders asked scientists


and engineers to analyze several military problems
– Deployment of radar
– Management of convoy, bombing, antisubmarine, and
mining operations.
z The result was called Operations Research

z MIT was one of the birthplaces of OR


– Professor Morse at MIT was a pioneer in

the US

– Founded MIT OR Center and helped to

found ORSA

For a book on the history of OR, see Professor Morse’s autobiography In at the
Beginnings: A Physicist's Life. Also, there is a book by Gass and Assad entitled
“An Annotated Timeline of Operations Research: An Informal History.” (The only
event listed for 1993 was the awarding of the Lanchester Prize to the book Network
Flows, co-authored by Ravi Ahuja, Tom Magnanti, and Jim Orlin.)
What is Management Science
(Operations Research)?
Operations Research (O.R.) is the discipline of
applying advanced analytical methods to help
make better decisions.

10

There is a promotional activity in our society which says that OR is the “science of
better.” The hope of this promotion was that it would encourage a dialogue about
what OR really is, with the “science of better” being its starting point.
Unfortunately, it often leads to questions such as “If OR is really the science of
better, why couldn’t they come up with a better slogan?” And I do not have an
answer for this.

See http://www.scienceofbetter.org for interesting applications of Operations
Research.
Examples of OR/MS in Practice

Moldavian Airlines: airline fleet management
[Figure by MIT OCW, labeled Chisinau, Budapest, and Timisoara.]

Revenue management and pricing
[Image removed due to copyright restrictions.]
11

For lots of applications of OR, look at the website
http://www.lionhrtpub.com/ORMS.shtml, which is the website for OR/MS Today.
Then look under past issues for interesting applications.
Examples of OR/MS in Practice

Supply chain management
[Image removed due to copyright restrictions.]

Travelocity
[Image removed due to copyright restrictions.]

12
Some Skills for Operations Researchers

z Modeling Skills
– Take a real world situation, and model it using mathematics
[Image removed due to copyright restrictions.]

z Methodological Toolkit
– Optimization
– Probabilistic Models

Not this:
[Image removed due to copyright restrictions. Source: http://www.dominoartwork.com/prints.html]
13

The picture of Lincoln is composed of a very large number of dominos, as
illustrated in the picture below Lincoln. It is a model of Lincoln, but not the type of
model we talk about in 15.053.
Some of the themes of 15.053

z Optimization is everywhere
z Models, Models, Models
z The goal of models is “insight” not numbers
– paraphrase of Richard Hamming

z Algorithms, Algorithms, Algorithms

14

Check out the course information for more on these themes.


Optimization is Everywhere
z It is embedded in language, and part of the way
we think.
– firms want to maximize value to shareholders
– people want to make the best choices
– We want the highest quality at the lowest price
– When playing games, we want the best strategy
– When we have too much to do, we want to optimize
the use of our time
– etc.

Take 3 minutes with your partner to brainstorm on where


optimization might be used. (business, or sports, or
personal uses, or politics, or …)

15

Usually when people brainstorm, they come up with a wide range of places where optimization
occurs.

Within sports, anyone who has read “Moneyball” knows that baseball general managers try to
develop an optimum team by carefully analyzing each player’s statistics. And each manager is
trying to win each game, which is a form of optimization.

Usually several students have worked with firms that are interested in “supply chain management.”
This involves optimizing the obtaining of materials for manufacturing, and also involves the
optimum distribution of delivered products.

Within politics, optimizing the use of TV advertising is becoming a crucial skill.

At MIT, we want to optimize the use of our classrooms, our parking facilities, our labs, and more.
And everyone is concerned about the optimal use of his or her time.

Countless other applications of optimization can come to mind with more thought.
16

I left this page blank to write in places where optimization occurs.


From Google Search
z We searched Google for the number of pages
with the expression “optimal X”

z So, let us guess what was in the top 8 by playing


the 15.053 version of family feud.

17
On the internet, what is the last
word in the phrase “optimal _”.

Rank  Word         Hits (millions)
1     Performance  1.24
2     Control      1.12
3     Health       1.01
4     Solution     0.96
5     Design       0.88
6     Time         0.63
7     Choice       0.60
8     Growth       0.56

Totals are in millions.
18

It took me a long time to find out how to do the effects on this slide. Click on a
number and the answer is revealed. Click on a red rectangle and the number of
Google hits (in millions) is revealed. Click on “hide answers” and all the answers
are hidden again.
On 15.053 and Optimization Tools

z Some goals in 15.053:


– Present a variety of tools for optimization
– Illustrate applications in manufacturing, finance, e-
business, marketing and more.
– Prepare students to recognize opportunities for
mathematical optimization as they arise

19

Optimization is everywhere. But that does not mean that we can use the
methodologies taught in 15.053 everywhere. One of the goals of 15.053 is to help
students learn where the optimization methodologies can be used in real life.
Linear Programming

z minimize or maximize a linear objective

z subject to linear equalities and inequalities

Example. Max is in a pie eating contest that lasts 1


hour. Each torte that he eats takes 2 minutes. Each
apple pie that he eats takes 3 minutes. He receives
4 points for each torte and 5 points for each pie.
What should Max eat so as to get the most points?

Step 1. Determine the decision variables

z Let x be the number of tortes eaten by Max.

z Let y be the number of pies eaten by Max.

20

The decision variables are the variables whose specification describes the solution
for the problem. They typically comprise the set of decisions to be made.
Max’s linear program
Step 2. Determine the objective function
Step 3. Determine the constraints

Maximize z = 4x + 5y (objective function)

subject to 2x + 3y ≤ 60 (constraint)

x ≥ 0 ; y ≥ 0 (non-negativity constraints)

A feasible solution satisfies all of the constraints.


x = 10, y = 10 is feasible; x = 10, y = 15 is infeasible.
An optimal solution is the best feasible solution.
The optimal solution is x = 30, y = 0, z = 120.
21

The objective function for a linear program is a linear function that we want to
maximize or minimize. A linear function of x and y is of the form ax + by for some
real numbers a and b.

The constraints provide limits on what choices of decision variables are permissible.
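
The feasibility claims on the slide are easy to check by hand or in a few lines of code. The sketch below is only an illustration. It assumes the standard fact that a linear program with a bounded, non-empty feasible region attains its optimum at a corner point, so it compares the three corners of Max's feasible region; the function names are made up for this example.

```python
def is_feasible(x, y):
    """Check Max's constraints: 2x + 3y <= 60 minutes, x >= 0, y >= 0."""
    return 2 * x + 3 * y <= 60 and x >= 0 and y >= 0

def points(x, y):
    """Objective: 4 points per torte, 5 points per pie."""
    return 4 * x + 5 * y

# The two points tested on the slide.
print(is_feasible(10, 10), is_feasible(10, 15))

# Corner points of the feasible region: the origin and the two axis intercepts.
corners = [(0, 0), (30, 0), (0, 20)]
best = max(corners, key=lambda p: points(*p))
print(best, points(*best))
```

Eating only tortes gives 30 × 4 = 120 points, while eating only pies gives 20 × 5 = 100 points, which agrees with the optimal solution stated on the slide.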
Terminology
z Decision variables: e.g., x and y.
– In general, these are quantities you can control to improve
your objective which should completely describe the set
of decisions to be made.

z Constraints: e.g., 2x + 3y ≤ 24 , x ≥ 0 , y ≥ 0
– Limitations on the values of the decision variables.
z Objective Function. e.g., 4x + 5y
– Value measure used to rank alternatives
– Seek to maximize or minimize this objective
– examples: maximize NPV, minimize cost

22
A geometrical portrayal
[Figure: plot of the constraint 2x + 3y ≤ 24 in the (x, y) plane, with the feasible region and the optimal solution labeled.]

23

In order to get insight into properties of linear programs, it helps to consider the
geometry. In Lectures 3 and 4, we will consider geometrical properties of linear
programs in detail.
David’s Tool Corporation (DTC)

z Motto: “We may be no Goliath, but we think big.”

z Manufacturer of slingshot kits and stone shields.

24

David’s Tool Corporation is a fanciful and ahistorical application of linear
programming. But it is also representative of “product mix” applications of linear
programs, that is, finding a mix of products to manufacture that satisfies resource
constraints and maximizes net revenue.
Data for the DTC Problem

                       Slingshot Kits   Stone Shields   Resources
Stone gathering time   2 hours          3 hours         100 hours
Stone smoothing        1 hour           2 hours         60 hours
Delivery time          1 hour           1 hour          50 hours
Demand                 40               30
Profit                 3 shekels        5 shekels

25
Formulating the DTC Problem as an LP

Step 1: Determine Decision Variables

K = number of slingshot kits manufactured

S = number of stone shields manufactured

Step 2: Write the Objective Function as a linear function


of the decision variables
Maximize Profit =

Step 3: Write the constraints as linear functions of the


decision variables

subject to

26

When formulating a linear program, one typically chooses the decision variables
first. Then one writes the objectives and constraints. If it is too difficult to express
the constraints, then one may want to reconsider the choice of decision variables.

On this slide, we start to formulate the problem.


The Formulation Continued

Step 3: Determine Constraints

Stone gathering:
Smoothing:
Delivery:
Slingshot demand:
Shield demand:

We will show how to solve this in Lecture 3.


27

maximize 3K + 5S

2K + 3S <= 100
K + 2S <= 60
K + S <= 50
K <= 40
S <= 30
K >= 0 and S >= 0.
We ignore integrality constraints.
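
With only two decision variables, the DTC formulation can be checked by brute force before Lecture 3. The sketch below is not the solution method taught in 15.053. It assumes the standard fact that a bounded LP attains its optimum at a corner point, uses the shield demand of 30 from the data table, and intersects every pair of constraint lines to find the corners. The tuple layout and function names are made up for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a, b, c), meaning a*K + b*S <= c.
constraints = [
    (2, 3, 100),   # stone gathering hours
    (1, 2, 60),    # stone smoothing hours
    (1, 1, 50),    # delivery hours
    (1, 0, 40),    # slingshot kit demand
    (0, 1, 30),    # stone shield demand
    (-1, 0, 0),    # K >= 0
    (0, -1, 0),    # S >= 0
]

def corner_points(cons):
    """Intersect every pair of constraint lines and keep the feasible points."""
    corners = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines never meet in a single point
        k = Fraction(c1 * b2 - c2 * b1, det)
        s = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * k + b * s <= c for a, b, c in cons):
            corners.add((k, s))
    return corners

def profit(k, s):
    """Objective: 3 shekels per kit plus 5 shekels per shield."""
    return 3 * k + 5 * s

best_k, best_s = max(corner_points(constraints), key=lambda p: profit(*p))
print(best_k, best_s, profit(best_k, best_s))
```

The best corner is K = 20, S = 20 with a profit of 160 shekels, where the gathering and smoothing constraints both hold with equality. Enumerating corners is practical only for tiny problems, since the number of candidate intersections grows very quickly with the number of variables and constraints.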
Addressing managerial problems: A
management science framework

1. Determine the problem to be solved


2. Observe the system and gather data
3. Formulate a mathematical model of the problem
and any important subproblems
4. Verify the model and use the model for prediction
or analysis
5. Select a suitable alternative
6. Present the results to the organization
7. Implement and evaluate
28

This framework is one of many on how to solve a managerial problem using
Operations Research techniques.

Steps 3 and 5 are highlighted (underlined) because we will focus on them during the
semester. The other steps are enormously important in practice, but are not the
subject material for 15.053.
How problems get large,
and what to do.

z Suppose that there are 10,000 products and 100


raw materials and processes that lead to
constraints. Then use an algebraic description of
the problem, as described in the tutorial.

29

Problems presented in 15.053 typically are easy to describe and small. This is
because large problems require much more time to explain, and smaller problems
typically suffice to illustrate concepts and applications.

In practice, problems can get very large, much too large to consider for Excel
Solver. This slide illustrates how the simple DTC problem can get very large if one
has lots of products and raw materials. 10,000 products seems like a lot, but there
are more than 10,000 different types of screwdrivers. It is easy to imagine that there
can be lots of products for more complex applications.
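
As a rough idea of what an algebraic description of a large product-mix problem might look like, consider n products and m resources. The symbols below are assumptions chosen for this illustration and are not taken from the tutorial: p_j is the profit per unit of product j, a_ij is the amount of resource i used per unit of product j, b_i is the amount of resource i available, and d_j is the demand for product j.

```latex
\begin{aligned}
\text{maximize}\quad & \sum_{j=1}^{n} p_j x_j \\
\text{subject to}\quad & \sum_{j=1}^{n} a_{ij} x_j \le b_i && \text{for } i = 1, \dots, m \\
& 0 \le x_j \le d_j && \text{for } j = 1, \dots, n
\end{aligned}
```

The DTC problem is the special case with n = 2 products and m = 3 resources. The same three lines describe a problem with 10,000 products and 100 resources; only the data changes.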
Linear Programs
z A linear function is a function of the form:
f(x1, x2, . . . , xn) = c1x1 + c2x2 + . . . + cnxn
= ∑i=1 to n cixi
e.g., 3x1 + 4x2 - 3x4.

z A mathematical program is a linear program (LP) if the


objective is a linear function and the constraints are
linear equalities or inequalities.
e.g., 3x1 + 4x2 - 3x4 ≥ 7
x1 - 2x5 = 7

z Typically, an LP has non-negativity constraints.


30

We usually use cj to denote the cost coefficient for variable xj.

It really helps to understand abstract formulations if a similar notation is used from

problem to problem.

More on Linear Programs

z A linear program must have linear objectives and


linear equalities and inequalities to be considered
a linear program.

Maximize x1
subject to 3x1 + 4x2 ≥ 7
x1 - 2x5 = 7
|x1| ≥ 0
Not a linear program.

Maximize x^2
subject to x = 3
Not a linear program.

31

It’s also not a linear program if the inequality constraints are strict, as in x > 0.
A non-linear program is permitted to have a
non-linear objective and constraints.

z maximize f(x,y) = xy

z subject to x - y^2/2 ≤ 10
3x – 4y ≥ 2
x ≥ 0, y ≥ 0

Minimize x
subject to x ≥ 3
Both a linear and a non-linear program.

32

It’s weird that a special case of “nonlinear programming” is linear programming.
But this is because nonlinear programming permits a nonlinear objective and
nonlinear constraints, but it does not require them.
An integer program is a linear program plus
constraints that some or all of the variables
are integer valued.

z Maximize 3x1 + 4x2 - 3x3


3x1 + 2x2 - x3 ≥ 17
3x2 - x3 = 14
x1 ≥ 0, x2 ≥ 0, x3 ≥ 0 and
x1 , x2, x3 are all integers

33

Integer programming is amazingly useful in practice, far more useful than linear
programming. We will see how integer programming can model many different
types of constraints and objectives in lectures in the second half of 15.053. But one
needs to understand linear programming well before one can understand integer
programming. That is why we focus much more on linear programming in 15.053.
Preview of Some 15.053 Examples

Applying LP and
NLP to optimal
radiation therapy.

How to price in a fair


manner.

34

We will talk about radiation therapy in Lecture 2.

We will discuss pricing issues when we talk about linear programming duality.

Preview Continued

[Image removed due to copyright restrictions.]

Find the shortest path in a network
35


More preview

2-person zero sum


games.

How to solve
puzzles

36

More on these examples later.


Summary

z Answered the question: What is Operations


Research & Management Science? and provided
some historical perspective.

z Introduced the terminology of linear programming

z An example: David’s Tool Corporation

37
2010

More Linear and


Non-linear Programming Models
including some non-linear optimization
problems that can be made linear

plus applications to radiation therapy

1
Overview of Lecture

z Goals
– get practice in recognizing and modeling linear
constraints and objectives
– and non-linear objectives
– to see a broader use of models in practice

2
Announcements

z Required “tutorials” on line.


z Excel solver and help
– A goal: develop more Excel skills

3
Quotes for today

Reality is merely an illusion, albeit


a very persistent one.
Albert Einstein

Everything should be made as


simple as possible, but not one bit
simpler.
Albert Einstein, (attributed)

“Reality is an illusion” is interesting in the context of 15.053. One wants to model
“reality”, but all we can do is approximate reality. I could wax poetic on this
philosophical theme, but won’t do so here.

We try to keep models as simple as possible. Too much detail in a model makes it
extremely hard to get accurate information, and actually can lead to less accurate
models in practice. One wants to capture the most important aspects of reality in a
model, and not much more than that.

“Excel Tip of the Day” is a new feature this year. If you have any tips you want to
share, please send them to me.
Overview

z Scheduling Postal Workers (5 in 7)


– The model
– Practical enhancements or modifications
– Two non-linear objectives that can be made

linear

– A non-linear constraint that can be made linear

5
Scheduling Postal Workers
z Each postal worker works for 5 consecutive days,
followed by 2 days off, repeated weekly.

Day Mon Tues Wed Thurs Fri Sat Sun

Demand 17 13 15 19 14 16 11

z Minimize the number of postal workers (for the


time being, we will permit fractional workers on
each day.)

I like this problem because it was one of the first scheduling problems that I saw as
a graduate student, and it was the first one that I wrote a paper on. My paper,
written in 1977, was of no lasting significance in and of itself and was never
published. But it led me to an interest in optimization in which the requirements
repeat periodically. I later wrote more than 10 papers on this topic as well as my
Ph.D. dissertation.
Formulating as an LP

z Don’t look ahead.

z Let’s see if we can come up with what the


decision variables should be.

z Discuss with your neighbor how one might


formulate this problem as an LP.

7
The linear program
Day Mon Tues Wed Thurs Fri Sat Sun

Demand 17 13 15 19 14 16 11

Minimize z = x1 + x2 + x3 + x4 + x5 + x6 + x7

subject to x1 + x4 + x5 + x6 + x7 ≥ 17 Mon.
x1 + x2 + x5 + x6 + x7 ≥ 13 Tues.
x1 + x2 + x3 + x6 + x7 ≥ 15 Wed.
x1 + x2 + x3 + x4 + x7 ≥ 19 Thurs.
x1 + x2 + x3 + x4 + x5 ≥ 14 Fri.
x2 + x3 + x4 + x5 + x6 ≥ 16 Sat.
x3 + x4 + x5 + x6 + x7 ≥ 11 Sun.
xj ≥ 0 for j = 1 to 7
8

This is a really unusual model in that the constraint of “5 days on followed by 2
days off” is handled by choosing the decision variables carefully. In particular, the
first decision variable is the number of workers who start on Monday and work
through Friday. The next 6 variables are chosen similarly. In this way, any
selection of decision variables will automatically satisfy the constraint.

There are seven inequality constraints, one for each day of the week. The first
variable (the number of workers on the Monday to Friday shift) is part of each of the
constraints on Monday through Friday. We normally view a linear program in
terms of constraints and think of constraints one at a time. But sometimes it is useful
to think of a variable and think of what the coefficients will be for that variable as
we consider each constraint.
A Closer Look at the Constraint Matrix

x1 x2 x3 x4 x5 x6 x7

1 0 0 1 1 1 1 ≥ 17 Mon

1 1 0 0 1 1 1 ≥ 13 Tues

1 1 1 0 0 1 1 ≥ 15 Wed

1 1 1 1 0 0 1 ≥ 19 Thurs

1 1 1 1 1 0 0 ≥ 14 Fri

0 1 1 1 1 1 0 ≥ 16 Sat

0 0 1 1 1 1 1 ≥ 11 Sun

It is cyclically repeating in both rows and columns.


9

It will be standard for us to write the variables at the top, and to express the
constraints in terms of coefficients. So the above formulation is equivalent to the
one on the previous slide.

Each variable corresponds to five consecutive days on (assuming Sunday is adjacent
to Monday). In terms of the constraint matrix, this results in consecutive ones in
each column (assuming that the first and last rows are considered consecutive).

The fact that there are also consecutive ones in each row is due to the fact that the
workers on any day (for example, Monday) must come from one of five shifts, and
those shifts start on adjacent days.
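
The cyclic structure of the matrix can be generated directly from the description of the shifts. The sketch below is only an illustration. It builds the matrix from the rule "a shift starting on day j covers days j through j + 4, wrapping around the week" and then checks one candidate schedule against the demands. The schedule x is a hypothetical example chosen to show the feasibility check; it is not claimed to be the LP optimum.

```python
DAYS = ["Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"]
DEMAND = [17, 13, 15, 19, 14, 16, 11]

def constraint_matrix(days=7, shift_length=5):
    """Row = day, column = shift start day; entry is 1 if that shift covers the day."""
    matrix = [[0] * days for _ in range(days)]
    for start in range(days):
        for offset in range(shift_length):
            matrix[(start + offset) % days][start] = 1
    return matrix

def staffing(matrix, x):
    """Number of workers on duty each day for a vector x of shift sizes."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in matrix]

A = constraint_matrix()
for day, row in zip(DAYS, A):
    print(day, row)

# Hypothetical schedule, used only to illustrate the feasibility check.
x = [4, 4, 2, 6, 0, 4, 3]
on_duty = staffing(A, x)
print(on_duty, all(w >= d for w, d in zip(on_duty, DEMAND)), sum(x))
```

This schedule uses 23 workers and meets every day's demand. Since each worker covers exactly 5 days and the demands sum to 105, no schedule, fractional or not, can use fewer than 105/5 = 21 workers.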
On the selection of decision variables

z A choice of decision variables that doesn’t work


– Let yj be the number of workers on day j.
– No. of Workers on day j is at least dj. (easy to formulate)
– Each worker works 5 days on followed by 2 days off (hard).

z Conclusion: sometimes the decision variables


incorporate constraints of the problem.
– Hard to do this well, but worth keeping in mind
– We will see more of this in integer programming.

10

It is natural to try to start with a decision variable corresponding to the number of
workers on each day. Unfortunately, it would not be possible to add the constraint
that each worker works for five consecutive days under this choice of variables.
(It’s hard to prove that it isn’t possible to model it. But you can try for yourself and
see that it can’t be done. By the way, if you think you can do it, the odds are that
you are wrong.)

For most of the models in 15.053, the choice of the decision variables is more
straightforward. But in practice, problems can come up with complex constraints
that are extremely difficult to model as constraints but which can be handled
through a clever choice of decision variables.
Some Modifications of the Model

z Suppose that there was a pay differential. The cost of


workers who start work on day j is cj per worker.

z Suppose that one can hire part time workers (one day
at a time), and that the cost of a part time worker on
day j is pj.
What are the new decision variables?
What are the changes to the model?

11

When one develops a decision model in practice, it is often best to start with a
simple model that captures only a little of reality and then add in more realistic
constraints in subsequent steps. If one tries to capture the whole model in one step,
it is often too complex and too difficult to do.

In addition, we want to emphasize that there are a lot of constraints and costs that
can be handled using linear programming.
What are the new decision variables?

Minimize z=

subject to x1 + x4 + x5 + x6 + x7 ≥ 17
x1 + x2 + x5 + x6 + x7 ≥ 13
x1 + x2 + x3 + x6 + x7 ≥ 15
x1 + x2 + x3 + x4 + x7 ≥ 19
x1 + x2 + x3 + x4 + x5 ≥ 14
x2 + x3 + x4 + x5 + x6 ≥ 16
x3 + x4 + x5 + x6 + x7 ≥ 11
xj ≥ 0 for j = 1 to 7
12
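
One possible way to fill in the blanks on this slide is sketched below. It assumes a new variable y_j for the number of part-time workers hired on day j; the name y_j is chosen here for illustration and does not come from the slide. The costs c_j and p_j are those defined on the previous slide.

```latex
\begin{aligned}
\text{minimize}\quad & z = \sum_{j=1}^{7} c_j x_j + \sum_{j=1}^{7} p_j y_j \\
\text{subject to}\quad & x_1 + x_4 + x_5 + x_6 + x_7 + y_1 \ge 17 \quad \text{(Mon.)} \\
& \;\;\vdots \quad \text{(one such constraint per day, adding } y_j \text{ to the row for day } j\text{)} \\
& x_j \ge 0,\; y_j \ge 0 \quad \text{for } j = 1, \dots, 7
\end{aligned}
```

The model remains a linear program: the pay differential only changes the objective coefficients, and each part-time variable appears in exactly one constraint.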
Another Enhancement
z Suppose that the desirable number of workers on day j
is dj, but it is not required. Let sj be the “excess”
number of workers day j. sj > 0 if there are more
workers on day j than dj; otherwise sj ≤ 0.

z What is the minimum cost schedule, where the “cost”


of having too many workers on day j is fj(sj), which is a
non-linear function?

13

In practice, it is often difficult to tell the difference between constraints and goals.
A manager may say that she needs 17 workers on Monday, but we know that the
firm will function OK with 16, albeit perhaps not as well. So, it is not a “hard
constraint”, and we may refer to it as a “soft constraint”, that is, one that can be
violated. A hard constraint is one that can never be violated.

In this model, we treat the number of workers on each day as a goal for the day, and
penalize not reaching the goal exactly. Having too many workers may be inefficient
uses of labor. Having too few workers may make it difficult to handle the tasks for
the day.
What are the new decision variables?

What is the resulting non-linear model?

Minimize z =

subject to x1 + x4 + x5 + x6 + x7 d1
x1 + x2 + x5 + x6 + x7 d2
x1 + x2 + x3 + x6 + x7 d3
x1 + x2 + x3 + x4 + x7 d4
x1 + x2 + x3 + x4 + x5 d5
x2 + x3 + x4 + x5 + x6 d6
x3 + x4 + x5 + x6 + x7 d7

xj ≥ 0 for j = 1 to 7
14

Often students will try to come up with something more complex than minimizing
f1(s1) + … + f7(s7). Once it is pointed out that this is the correct objective, students
find it quite obvious.
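
Written out, the resulting model might look as follows. This is a sketch based on the definition of s_j as the number of workers on day j minus d_j, so each s_j is allowed to be negative.

```latex
\begin{aligned}
\text{minimize}\quad & z = \sum_{j=1}^{7} f_j(s_j) \\
\text{subject to}\quad
& x_1 + x_4 + x_5 + x_6 + x_7 - s_1 = d_1 \\
& x_1 + x_2 + x_5 + x_6 + x_7 - s_2 = d_2 \\
& x_1 + x_2 + x_3 + x_6 + x_7 - s_3 = d_3 \\
& x_1 + x_2 + x_3 + x_4 + x_7 - s_4 = d_4 \\
& x_1 + x_2 + x_3 + x_4 + x_5 - s_5 = d_5 \\
& x_2 + x_3 + x_4 + x_5 + x_6 - s_6 = d_6 \\
& x_3 + x_4 + x_5 + x_6 + x_7 - s_7 = d_7 \\
& x_j \ge 0,\; s_j \text{ unrestricted in sign} \quad \text{for } j = 1, \dots, 7
\end{aligned}
```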
On non-linear functions

z Occasionally a non-linear program can be


transformed into a linear program.

z Rare, but useful when it occurs

z In general, non-linear programming solvers can


work well on a minimization problem when the
objective function is convex

15

Non-linear objectives occur very frequently in practice. In linear programming, we
sometimes try to approximate non-linear functions using linear functions. This may
lead to a significant loss of accuracy. If a linear approximation is very inaccurate,
we can keep the non-linear objectives and use a non-linear programming solver.
Unfortunately, it is much harder to solve a non-linear program than a linear
program.

A great situation occurs when the non-linear program can be transformed into a
linear program without any loss of accuracy. We will soon give a couple of
examples of this very fortuitous situation.
Examples of Non-linear Functions


$\sum_{j=1}^{7} (s_j)^2$   sum of squared s’s

$\sum_{j=1}^{7} c_j (s_j)^2$   weighted sum of squared s’s

$\sum_{j=1}^{7} |s_j|$   sum of absolute values of s’s

$2\sum_{j=1}^{7} x_j - \sum_{j=1}^{7} s_j$   twice number of workers – sum of s’s

$\sum_{i=1}^{7} \sum_{j=1}^{7} s_i s_j$

An objective function is called separable if it can be


expressed as the sum of functions of 1 variable.
16

There are a lot of nice modeling properties of the sum of squares, and it is a
commonly used objective in practice. For example, in finance models, minimizing
risk corresponds to minimizing variance, which can be expressed as a quadratic
function.

It takes a lot longer to explain how minimizing risk in finance can sometimes be
represented as a non-linear program, and so I will only provide this “teaser” here.

Separable non-linear objectives are often much easier to deal with, and such
problems can typically be solved faster.
Convex functions of one variable

A function f(x) is convex if for all x and y, the
line segment joining (x, f(x)) to (y, f(y))
lies on or above the curve.

[Figure: plot of a convex function; horizontal axis from 0 to 10, vertical axis from 0 to 25.]
17

The convex shape of a function f(x) of a single variable is easy to recognize. But
the definition on this slide is the proper mathematical way of saying it.

More formally, we would write


f((1-q) x + q x’) <= (1-q) f(x) + q f(x’) for all q in [0,1]. (1)

x and x’ are two points on the x-axis.


q is a fraction between 0 and 1 (say 1/3)
x” = (1-q)x + q x’ is a mixture of x and x’.
If q = 1/3, then x” lies 1/3 of the way from x to x’.

f(x”) = f((1-q) x + q x’) is the value of the function at x”


(1-q) f(x) + q f(x’) is the value for x” on the red curve of the diagram.
Which functions are convex?

f(x) = x^2    f(x) = x^3 for x ≥ 0    f(x) = x^0.5

f(x) = |x| Step Function whatever

Yes No
Yes
18

If you guessed wrong for any of these curves, please review the definition of
convexity.
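
One way to build intuition for the definition is to test inequality (1) numerically. The sketch below is a numerical check on a grid, not a proof: passing the check does not establish convexity, but a single violation does show that a function is not convex. The function name looks_convex and the test intervals are assumptions made for this illustration.

```python
import math

def looks_convex(f, lo, hi, steps=40):
    """Test inequality (1) on a grid of x, x' and q; a numerical check, not a proof."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    qs = [i / 10 for i in range(11)]
    for x in xs:
        for x2 in xs:
            for q in qs:
                mix = (1 - q) * x + q * x2
                # A small tolerance absorbs floating-point rounding.
                if f(mix) > (1 - q) * f(x) + q * f(x2) + 1e-9:
                    return False
    return True

print(looks_convex(lambda x: x ** 2, -5, 5))              # x squared
print(looks_convex(lambda x: x ** 3, 0, 5))               # x cubed, x >= 0 only
print(looks_convex(math.sqrt, 0, 5))                      # square root
print(looks_convex(abs, -5, 5))                           # absolute value
print(looks_convex(lambda x: 0 if x < 0 else 1, -5, 5))   # a step function
```

The square, the cube restricted to x ≥ 0, and the absolute value pass the check. The square root and the step function fail it: for each, some chord falls below the curve.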
The max of several linear functions is convex.

f1(x) = x/3
f2(x) = 3
f3(x) = 5 – x/2
f(x) = max{f1(x), f2(x), f3(x)}

[Figure: graph of f(x), with the x-axis marked at 5, 10, and 15.]

19

The example gives the max of three linear functions of one variable. But a max of
linear functions of more than one variable is also convex. In fact, the max of
convex functions is convex.

We can see that the max of two convex functions is convex as follows:

Suppose that f( ) and g( ) are both convex functions. Let h(x) = max(f(x), g(x)).

We will show that h(x) is convex.

Suppose that q ∈ [0, 1]. Let x” = (1-q) x + q x’

Then h(x”) = max {f(x”), g(x”)}


<= max {(1-q) f(x) + q f(x’), (1-q) g(x) + q g(x’)}
<= (1-q) max {f(x), g(x)} + q max {f(x’), g(x’)}
= (1-q) h(x) + q h(x’), showing that h(x) is convex.
On friendly** objective functions

z We will say that f(x) is a friendly non-linear


function if it can be written as the max of one or
more linear functions.

z We say that an objective function for a
minimization problem is friendly if it is a friendly
non-linear function or the sum of friendly
non-linear functions.

z If a minimization problem P* has a friendly
objective function and its feasible region is that of an
LP, then P* can be expressed as a linear program.

** “Friendly” is a term used in 15.053. Prof. Orlin made it up.
20

OK. “Friendly” may seem like a weird term, but mathematics is filled with such
terms. And converting a non-linear program to a linear program is so fortuitous that
we should in some way acknowledge our gratefulness.

Note that minimizing a friendly objective function subject to linear constraints leads
to an LP. But this is not true for maximization. In fact, maximizing a friendly
objective function subject to linear constraints creates a very difficult problem to
solve. It is at least as hard as an integer programming problem, a topic discussed in
detail later in this semester.
On minimizing friendly objective functions

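Example 1. minimize z = max{f1(x), f2(x), f3(x)}
subject to 0 ≤ x ≤ 15

where f1(x) = x/3, f2(x) = 3, f3(x) = 5 – x/2

Equivalent LP:
minimize z
subject to z ≥ x/3
z ≥ 3
z ≥ 5 – x/2
0 ≤ x ≤ 15
(that is, z ≥ fi(x) for i = 1 to 3)

[Figure: graph of the three functions, with the x-axis marked at 5, 10, and 15.]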
21

This example illustrates a minimax program of one variable. But the same
transformation shows that minimizing any friendly objective function subject to
linear constraints induces a linear program.
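
The equivalence in Example 1 can be checked numerically. The sketch below is only an illustration and does not use an LP solver: for each x on a grid, the smallest z satisfying the LP constraints is exactly max{f1(x), f2(x), f3(x)}, so minimizing that quantity over x minimizes z. The grid spacing of 1/4 and the helper names are assumptions made for this example.

```python
from fractions import Fraction

# The three linear functions from Example 1.
def f1(x):
    return x / 3

def f2(x):
    return Fraction(3)

def f3(x):
    return 5 - x / 2

def worst(x):
    """Objective of the minimax problem at x."""
    return max(f1(x), f2(x), f3(x))

def lp_feasible(z, x):
    """Constraints of the equivalent LP: z >= fi(x) for each i, and 0 <= x <= 15."""
    return 0 <= x <= 15 and all(z >= f(x) for f in (f1, f2, f3))

# Scan x = 0, 1/4, 1/2, ..., 15 and take the smallest feasible z at each x.
grid = [Fraction(i, 4) for i in range(61)]
best_x = min(grid, key=worst)
best_z = worst(best_x)
print(best_x, best_z)
```

The scan reports z = 3, attained first at x = 4. Since f2(x) = 3 for every x, no feasible z can be below 3, so this is the true minimum and not just the best value on the grid. Every x between 4 and 9 attains it.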
Minimax Problem:
min z = max{f1(x), f2(x), f3(x)}
s.t. a ≤ x ≤ b

LP:
min z
s.t. z ≥ fi(x) for i = 1, 2, 3
a ≤ x ≤ b

Theorem. If (z*, x*) is optimal for the LP, then it


is also optimal for the minimax problem.

Proof. x* is feasible for the minimax problem.
If z* < max{f1(x*), f2(x*), f3(x*)}, then (z*, x*) is infeasible for the LP.
If z* > max{f1(x*), f2(x*), f3(x*)}, then (z*, x*) is not optimal for the LP.

[Figure: graph of the three functions with x* marked on the x-axis.]
22
Suppose that F is the set of feasible solutions
for some linear programming problem.
Let x be a vector of decision variables.
Let fi(x) be a linear function in x for i = 1 to K.

Minimax Problem:
min z = max { fi(x): 1 ≤ i ≤ K }
s.t. x ∈ F

LP:
min z
s.t. z ≥ fi(x) for i = 1, …, K
x ∈ F

Theorem. If (z*, x*) is optimal for the LP, then it


is also optimal for the minimax problem.

23

We prove the theorem next.

Suppose that (z*, x*) is optimal for the LP. First one can show that (z*, x*) is feasible
for the minimax problem. Because (z*, x*) is feasible for the LP, x* ∈ F and
z* >= max {fi(x*): 1 ≤ i ≤ K }. This alone doesn’t prove that z* = max {fi(x*): 1 ≤ i ≤ K }.
But if z* > max {fi(x*): 1 ≤ i ≤ K }, then (z*, x*) is not optimal for the LP since one
could replace z* by max {fi(x*): 1 ≤ i ≤ K } and get a lower objective. So
z* = max {fi(x*): 1 ≤ i ≤ K }, and it follows that (z*, x*) is feasible for the minimax problem.

Now we show that the minimax problem has no better solution. We do a proof by contradiction.

Suppose that (z’, x’) were feasible for the minimax problem and z’ < z*. Then (z’,
x’) would also be feasible for the linear program, contradicting that (z*, x*) is
optimal for the LP. So, we conclude that (z*, x*) is optimal for the minimax
problem.
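
The transformation in the theorem is mechanical, so it can be sketched as a small routine that writes down the LP data. The sketch below assumes each fi(x) = a·x + b and that F is given by rows g·x ≤ h; the function names and the data layout are illustrative choices, not part of the lecture. It only builds the constraint rows. Solving the resulting LP still requires an LP solver such as Excel Solver.

```python
from fractions import Fraction


def build_minimax_lp(pieces, region):
    """Build the LP  min z  s.t.  z >= a.x + b for each piece,  g.x <= h.

    pieces: list of (a, b), where f_i(x) = a . x + b and a is a list of n numbers.
    region: list of (g, h), one row g . x <= h per constraint describing F.
    Returns (cost, rows, rhs) over the variable vector (x_1, ..., x_n, z),
    with every constraint written as  row . (x, z) <= rhs.
    """
    n = len(pieces[0][0])
    cost = [0] * n + [1]              # the LP objective is simply z
    rows, rhs = [], []
    for a, b in pieces:               # a.x + b <= z  is the same as  a.x - z <= -b
        rows.append(list(a) + [-1])
        rhs.append(-b)
    for g, h in region:               # original constraints; z does not appear
        rows.append(list(g) + [0])
        rhs.append(h)
    return cost, rows, rhs


def is_feasible(rows, rhs, point):
    """Check row . point <= rhs for every constraint row."""
    return all(sum(r * p for r, p in zip(row, point)) <= bound
               for row, bound in zip(rows, rhs))


if __name__ == "__main__":
    # Example 1: f1 = x/3, f2 = 3, f3 = 5 - x/2 on 0 <= x <= 15.
    pieces = [([Fraction(1, 3)], 0), ([0], 3), ([Fraction(-1, 2)], 5)]
    region = [([1], 15), ([-1], 0)]
    cost, rows, rhs = build_minimax_lp(pieces, region)
    print(is_feasible(rows, rhs, [4, 3]))   # (x, z) = (4, 3)
    print(is_feasible(rows, rhs, [4, 2]))   # z = 2 is below f2(4) = 3
```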
Friendly objectives: Example 2
minimize the maximum number of excess workers
needed on any day: that is
Minimize z = max {s1, …, s7}, where sj ≥ 0 for each j.

subject to x1 + x4 + x5 + x6 + x7 - s1 = 17
x1 + x2 + x5 + x6 + x7 - s2 = 13
x1 + x2 + x3 + x6 + x7 - s3 = 15
x1 + x2 + x3 + x4 + x7 - s4 = 19
x1 + x2 + x3 + x4 + x5 - s5 = 14
x2 + x3 + x4 + x5 + x6 - s6 = 16
x3 + x4 + x5 + x6 + x7 - s7 = 11
xj ≥ 0 , sj ≥ 0 for j = 1 to 7
Minimize z = max {s1, …, s7}, where sj ≥ 0 for each j.

We can express it as an LP as follows:

minimize z
subject to x1 + x4 + x5 + x6 + x7 - s1 = 17
x1 + x2 + x5 + x6 + x7 - s2 = 13
x1 + x2 + x3 + x6 + x7 - s3 = 15
x1 + x2 + x3 + x4 + x7 - s4 = 19
x1 + x2 + x3 + x4 + x5 - s5 = 14
x2 + x3 + x4 + x5 + x6 - s6 = 16
x3 + x4 + x5 + x6 + x7 - s7 = 11
xj ≥ 0 , sj ≥ 0 for j = 1 to 7

and also the constraints:

z ≥ sj for j = 1 to 7
Friendly objectives: Example 3

Suppose the objective is minimize |s1| + … + |s7|.


How do we modify it to make it linear?

Note: |sj| = max{sj, -sj} for each j.

The objective is the sum of friendly functions.


26

We do almost the same trick for minimizing the sum of friendly non-linear
objective functions. But instead of creating one new variable z, we create a new
variable zj for each term in the sum. So, the objective function becomes

Minimize z1 + z2 + … + z7
And we need to append the constraints
z1 >= s1, z1 >= -s1
z2 >= s2, z2 >= -s2
…
z7 >= s7, z7 >= -s7

In any optimal solution, it will be true that zj = |sj| for j = 1 to 7.
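
To see why zj = |sj| at an optimum, the sketch below computes the deviations sj for a given schedule and then the smallest zj that satisfies both appended constraints. The demands are the right-hand sides of Example 2. The sample schedule of 3 workers per shift is made up for illustration, and the function names are not from the lecture.

```python
DEMAND = [17, 13, 15, 19, 14, 16, 11]   # right-hand sides from Example 2


def workers_on_duty(x):
    """x[j] = number of workers whose 5-day shift starts on day j (0-based).

    A shift that starts on day j covers days j, j+1, ..., j+4 (mod 7), which
    reproduces the left-hand sides of the seven constraints.
    """
    return [sum(x[(day - k) % 7] for k in range(5)) for day in range(7)]


def deviations(x):
    """s_j = workers on duty minus demand on day j (may be negative here)."""
    return [on - d for on, d in zip(workers_on_duty(x), DEMAND)]


def tightest_z(s_j):
    """Smallest z_j with z_j >= s_j and z_j >= -s_j, which is |s_j|."""
    return max(s_j, -s_j)


if __name__ == "__main__":
    schedule = [3] * 7                    # hypothetical schedule
    s = deviations(schedule)
    z = [tightest_z(s_j) for s_j in s]
    print(s, z, sum(z))
```

Since the objective pushes each zj down, an optimal solution never leaves zj above max{sj, -sj}, which is exactly |sj|.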


Minimize z1 + z2 + … + z7
subject to x1 + x4 + x5 + x6 + x7 - s1 = d1
x1 + x2 + x5 + x6 + x7 - s2 = d2
x1 + x2 + x3 + x6 + x7 - s3 = d3
x1 + x2 + x3 + x4 + x7 - s4 = d4
x1 + x2 + x3 + x4 + x5 - s5 = d5
x2 + x3 + x4 + x5 + x6 - s6 = d6
x3 + x4 + x5 + x6 + x7 - s7 = d7
zj ≥ sj , zj ≥ -sj for j = 1 to 7
xj ≥ 0 for j = 1 to 7
A ratio constraint: another non-linear
constraint that can be made linear
Suppose that we need to ensure that at least 30% of
the workers have Sunday off.

How do we model this?

(Workers with Sunday off) / (Total workers) ≥ .3

(x1 + x2) / (x1 + x2 + … + x7) ≥ .3

But this is non-linear.
Making it linear

(x1 + x2) / (x1 + x2 + … + x7) ≥ .3

Note: x1 + x2 + … + x7 > 0
Multiply both sides of the inequality by x1 + x2 + … + x7.

(x1 + x2 ) ≥ .3 (x1 + x2 + x3 + x4 + x5 + x6 + x7)

.7x1 + .7x2 - .3x3 - .3x4 - .3x5 - .3x6 - .3x7 ≥ 0

29

Note that this transformation assumes that the denominator x1 + … + x7 is strictly
greater than 0. The non-linear constraint is not well defined if x1 + … + x7 = 0. In
this particular problem, every feasible solution has x1 + … + x7 > 0, assuming that
there is a demand for workers on some day. Thus, we don’t need to worry about the
issue of dividing 0 by 0.
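
The equivalence of the ratio form and the linear form can be spot-checked numerically. The sketch below uses exact fractions so that the boundary case of exactly 30% is handled without rounding error. The function names and the sample schedules are made up for illustration.

```python
from fractions import Fraction

SHARE = Fraction(3, 10)   # at least 30% of workers have Sunday off


def ratio_form_holds(x):
    """(x1 + x2) / (x1 + ... + x7) >= .3; needs a strictly positive total."""
    total = sum(x)
    if total <= 0:
        raise ValueError("ratio is undefined when no workers are scheduled")
    return Fraction(x[0] + x[1]) / total >= SHARE


def linear_form_holds(x):
    """.7 x1 + .7 x2 - .3 x3 - ... - .3 x7 >= 0."""
    return (1 - SHARE) * (x[0] + x[1]) - SHARE * sum(x[2:]) >= 0


if __name__ == "__main__":
    for schedule in ([3, 3, 2, 2, 2, 2, 2],    # 6 of 16 workers: satisfied
                     [1, 1, 2, 2, 2, 2, 2],    # 2 of 12 workers: violated
                     [3, 0, 1, 2, 1, 2, 1]):   # 3 of 10 workers: exactly 30%
        print(schedule, ratio_form_holds(schedule), linear_form_holds(schedule))
```

The two tests agree on every schedule with a positive total, which is the only case in which the ratio is defined.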
Other enhancements

z Require that each shift has an integral number of workers
– integer program

z Consider longer term scheduling
– model 6 weeks at a time

z Consider shorter term scheduling
– model lunch breaks

z Model individual workers
– permit worker preferences

This personnel scheduling problem illustrates a number of useful aspects of modeling. And it can go
a long way towards usefully modeling personnel scheduling problems in practice. Of course, in
practical scheduling problems, one needs to address the problem at hand, whatever that problem may
be.

Permitting a fractional number of workers is totally unrealistic, but the fractional solution of the
model is usually easy to deal with in practice. Managers are often comfortable taking a fractional
solution and rounding off the fractional parts. It often leads to a solution that is almost as good in
terms of the objective function and still satisfies the constraints. However, it is better if one can solve
the integer program. And occasionally fractional solutions can’t be rounded to a good integer
solution.

Personnel scheduling typically deals with longer term scheduling than one week because these
models permit one to be fairer to the workers. For example, in the model above, most workers never
get a Friday or Saturday off. But in longer term models, one can ensure that each worker can get part
or all of a weekend off.

Also, scheduling lunch breaks can be quite a challenge.

And in some scheduling problems, one can make workers happier by giving them days off that they
request, such as to accommodate special days.

Most of these added complexities cannot be modeled easily using linear programming, and rely on
integer programming, which we will discuss later in this subject.
Time for a mental break

31

Cartoons are by Sidney Harris


http://www.sciencecartoonsplus.com/index.htm
Math Programming and Radiation Therapy
z An important application area for optimization

z Thanks to Rob Freund and Peng Sung for some


of the following slides

Cancer cells

Normal cells

Figure by MIT OCW.

32

This is one of my favorite applications of linear and non-linear programming, in
large part because it is so important, and in part because it is a great application to
medical technology.
Math Programming and Radiation Therapy

z High doses of radiation (energy/unit mass) can
kill cells and/or prevent them from growing and
dividing
– True for cancer cells and normal cells

z Radiation is attractive because the repair
mechanisms for cancer cells are less efficient than
for normal cells

This discussion applies to radiation therapy, where there is high energy radiation
beamed at a patient. It does not apply to proton beam techniques, which have
different physical properties than the usual radiation therapy.
Radiation Imaging
z Recent improvements in
imaging
– MRI
– CT Scan
– other

34

Imaging can reveal precise characteristics of tumors, and make radiation treatment
possible. In addition, the imaging is critical for surgeries. It can also help monitor
the progress of chemo-therapy.
Radiation Delivery

z Improvements of delivery of radiation

z New field: tomotherapy

IMRT

“Optimizing the Delivery of Radiation Therapy to Cancer patients,” by Shepard, Ferris,
Olivera, and Mackie, SIAM Review, Vol 41, pp 721-744, 1999.

The radiation therapy for brain tumors is delivered by a machine that can deliver
large doses of radiation beamed from different angles into the brain.
Use of Multi-leaf Collimators

z multi-leaf
collimator
– blocks radiation
– turns a large
beam into a
focused beam

36

For more precise beamlets, one uses a multi-leaf collimator. These metal fingers
can be adjusted to create a space for the beams to pass through. It turns a very
large beam into a collection of very focused beamlets.
Conventional Radiotherapy

Relative Intensity of Dose Delivered

37

As radiation passes through the brain (or other parts of the body for other types of
cancer), the radiation dose decreases as radiation is absorbed. The largest dose is
where the beam enters the body. The smallest dose is where it leaves.
Conventional Radiotherapy

Relative Intensity of Dose Delivered

38

This very simple example shows the significance of aiming beams from multiple
directions. Note that the largest dose is now in the center of the brain.
Conventional Radiotherapy

z In conventional radiotherapy
– 3 to 7 beams of radiation
– radiation oncologist and physicist work
together to determine a set of beam angles and
beam intensities
– determined by manual “trial-and-error”
process

39

The problem of determining optimal beamlets is absurdly difficult to solve by hand,
especially when there are thousands of possible choices. Radiation oncologists are
quite pleased to have optimization procedures help them out.
Goal: maximize the dose to the tumor while
minimizing dose to the critical area

Critical Area

Tumor area

With a small number of beams, it is difficult


to achieve these goals.
40

In the brain, every non-cancerous cell is critical. In other parts of the body, some
cells are more critical to functioning of an organ than others.

We say that we want to maximize the dose to the tumor while minimizing the dose
to the critical area. But this is a figure of speech to indicate that there are two
competing objectives. In reality the minimum possible dose to the critical region is
0, and this is easy to achieve by avoiding radiation therapy. The highest possible
dose to the tumor is extremely high, enough to cause fatality. There is no possible
way of achieving the lowest possible dose to the critical region while
simultaneously achieving the highest possible dose to the tumor.
Radiation Therapy: Problem Statement

z For a given tumor and given critical areas


z For a given set of possible beamlet origins and
angles
z Determine the weight of each beamlet such that:
– dosage over the tumor area will be at least a
target level γL .
– dosage over the critical area will be at most a
target level γU.

41

A natural way to try to balance the competing objectives is to put a lower limit on
the amount of radiation delivered to the cancer cells and an upper limit on the
amount of radiation delivered to critical cells. This is not the only way of trying to
balance the competing objectives, as we shall see.
Display of radiation levels

42

Here is a nice picture of what can be achieved using linear programming. The
orange and red regions indicate the parts of the figure with the highest radiation
levels.
Linear Programming Model

z First, discretize the space


– Divide up region into a 2D (or 3D) grid of pixels

43

To model this problem using linear programming, the decision variables will
correspond to how much radiation is delivered from each possible beamlet.

In order to determine whether a cell gets too much or too little radiation, we
aggregate the cells into “pixels.” If we didn’t divide up the region into pixels, we
could end up with one constraint for each neuron, leading to 100s of billions of
constraints.
More on the LP

z Create the beamlet data for each of
p = 1, ..., n possible beamlets.
z $D^p$ is the matrix of unit doses delivered by beam p.

$D^p_{ij}$ = unit dose delivered to pixel (i, j) by beamlet p

We estimate the unit dose delivered to pixel (i, j) by beamlet p. This approximation
is pretty close to the exact amount of radiation delivered in the therapy.
Linear Program

z Decision variables w = (w1, ..., wn)

z wp = intensity weight assigned to beamlet p
for p = 1 to n;

z $D_{ij}$ = dosage delivered to pixel (i, j)

$D_{ij} = \sum_{p=1}^{n} D^p_{ij} w_p$

The primary place where linear programming equality constraints are used is here.
The total dosage delivered to a pixel is the sum of the doses delivered by all the
beamlets to that pixel.
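
The equality constraint is just a weighted sum of the unit-dose grids. A minimal sketch follows, with a made-up 2 × 2 pixel grid and two hypothetical beamlets. The numbers and the function name are illustrative only and do not come from the paper.

```python
def total_dose(unit_dose, weights):
    """Return D with D[i][j] = sum over p of unit_dose[p][i][j] * weights[p].

    unit_dose[p] is the grid of unit doses for beamlet p (the matrix D^p);
    weights[p] is the nonnegative intensity weight w_p of beamlet p.
    """
    rows = len(unit_dose[0])
    cols = len(unit_dose[0][0])
    dose = [[0.0] * cols for _ in range(rows)]
    for grid, w in zip(unit_dose, weights):
        for i in range(rows):
            for j in range(cols):
                dose[i][j] += grid[i][j] * w
    return dose


if __name__ == "__main__":
    unit_dose = [
        [[1, 0], [1, 0]],   # hypothetical beamlet 1: hits the left column
        [[0, 0], [2, 1]],   # hypothetical beamlet 2: hits the bottom row
    ]
    weights = [3, 2]
    print(total_dose(unit_dose, weights))
```

Because each pixel dose is linear in the weights wp, the limits on the tumor pixels and the critical pixels are linear constraints on w.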
An LP model

minimize $\sum_{(i,j)} D_{ij}$

$D_{ij} = \sum_{p=1}^{n} D^p_{ij} w_p$
$D_{ij} \ge \gamma_L$ for $(i,j) \in T$
$D_{ij} \le \gamma_U$ for $(i,j) \in C$
$w_p \ge 0$ for all p

The LP took 4 minutes to solve in 1999.

In an example reported in the paper, there were
more than 63,000 variables, and more than 94,000
constraints (excluding upper/lower bounds)

This linear program is pretty simple conceptually. It says that we need a
sufficiently large dose at tumor cells and a sufficiently small dose at critical areas.
What happens if the model is infeasible?

Allow the constraint for pixel (i, j) to be violated by
an amount $y_{ij}$, and then minimize the violations.

minimize $\sum_{(i,j)} y_{ij}$

$D_{ij} = \sum_{p=1}^{n} D^p_{ij} w_p$
$D_{ij} + y_{ij} \ge \gamma_L$ for $(i,j) \in T$
$D_{ij} - y_{ij} \le \gamma_U$ for $(i,j) \in C$
$w_p \ge 0$ for all p
$y_{ij} \ge 0$ for all (i, j)

A doctor will naturally try to set the threshold for cancer cells as high as he or she
can achieve while setting the threshold for critical cells as low as possible. Even
with lots of experience in what is achievable, there is a natural tendency for the
doctor to set these parameters so that there is no feasible solution. In this case,
the doctor would keep adjusting the parameters until eventually getting a feasible
solution. This can be a time-consuming process.
An even better model

minimize $\sum_{(i,j)} (y_{ij})^2$

Minimize the sum of squared violations (least squares).

$D_{ij} = \sum_{p=1}^{n} D^p_{ij} w_p$
$D_{ij} + y_{ij} \ge \gamma_L$ for $(i,j) \in T$
$D_{ij} - y_{ij} \le \gamma_U$ for $(i,j) \in C$
$w_p \ge 0$ for all p
$y_{ij} \ge 0$ for all (i, j)

This is a nonlinear program (NLP). This one can
be solved efficiently.

It is far easier to set the parameters conservatively (as far as the patient is
concerned) while allowing some violation. If there is no feasible solution to the
original LP, there will definitely be a feasible solution to this
non-linear program since one can choose yij to guarantee feasibility.

The objective function would try to set all of the y’s to 0. Failing that, it would set
all of the y’s as small as possible.

The quadratic function is used instead of a linear function in recognition that it is
better to have a slightly higher dose for a large number of critical cells than an
incredibly high dose for a smaller number of cells. The latter situation would result in
killing more critical cells.

Similarly, we don’t want to miss cancer cells with the radiation because those cells
would be the most likely to grow and create more cancer. So, we would prefer a
slightly lower dose for a large number of cancer cells than having a much lower
dose for some cancer cells.
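
A tiny numerical comparison shows why the squared objective prefers spreading violations out. The two violation patterns below are made up for illustration; they have the same total violation but very different sums of squares.

```python
def linear_penalty(violations):
    """Objective of the LP with violations: the sum of the y values."""
    return sum(violations)


def squared_penalty(violations):
    """Objective of the NLP: the sum of the squared y values."""
    return sum(y * y for y in violations)


if __name__ == "__main__":
    spread = [1] * 10                # ten pixels, each off by 1 unit
    concentrated = [10] + [0] * 9    # one pixel off by 10 units
    for name, y in (("spread", spread), ("concentrated", concentrated)):
        print(name, linear_penalty(y), squared_penalty(y))
```

The linear objective scores both patterns as 10 and cannot tell them apart. The squared objective scores them as 10 and 100, so it strongly prefers the pattern in which no single pixel is far from its target.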
Optimal Solution for the LP

49

Here is the solution that we have already seen before.


An Optimal Solution to an NLP

50

The NLP worked much better. I suspect that this is true in general, but one example
is not enough to prove it.
Is that the end of the story on modeling?

z Other issues:
– Delivering the radiation doses quicker by
setting the multi-leaf collimators optimally.
– trying to keep average radiation levels low
over part of the critical region.
– how do we trade off the doses to the critical
area with doses to the tumor?

51

The doctors are very excited about using optimization to set beams.

But it still opens lots of other questions.

If the optimization can set the multi-leaf collimators optimally, then we can speed
up the delivery of the radiation, and this permits more patients to use the machine.
Optimizing the use of the multi-leaf collimators is a challenging problem, often
relying on results in network optimization.

Also doctors are concerned with radiation delivery over larger areas than single
cells, and want solutions that keep average radiation levels low over the critical
region. If these constraints are modeled in a straightforward manner, it leads to
problems that are extremely hard to solve.

There is often a balance in practice of figuring out the right level of fidelity needed
for a model. Too much fidelity may make the model too hard to solve, and often
too hard to validate. Too little fidelity can mean that we are solving a model that is
too different from reality to be of much use.

2010

z The Geometry of Linear Programs


– the geometry of LPs illustrated

1
Quotes of the day

You don't understand anything


until you learn it more than one
way.
Marvin Minsky

One finds limits by pushing

them.

Herbert Simon

2
Goal of this Lecture

z Present the Geometry of Linear Programs


– A key way of looking at LPs
• Others are algebraic and economic
– Some basic concepts
– 2-dimensional (2 variable) linear programs
– 3-dimensional (3 variable) linear programs
– Properties of the set of feasible solutions and of optimal
solutions
• generalizable to all linear programs

3
A Two Variable Linear Program

(a variant of the DTC example)

z = 3x + 5y objective

2x + 3y ≤ 10 (1)

x + 2y ≤ 6 (2)

x + y ≤ 5 (3)

x ≤ 4 (4)

y ≤ 3 (5)
x, y ≥ 0 (6)
4

We could have used the original variable names of K and S, but it is simpler to use x
and y since we usually think of the two axes as the x and y axis.
Finding an optimal solution

z Introduce yourself to your partner

z Try to find an optimal solution to the linear


program, without looking ahead.

Finding an optimal solution to a 2-variable LP can be challenging until you have
seen the theory. Finding an optimal solution to an LP with more than a few
variables and constraints is very hard to do by hand (or at least prone to errors) and
we typically use a computer.
Some Basic Concepts

A point is represented as a pair (x, y). For example, (2, 3).

Sometimes, we will call (x, y) a vector. In that case, it is
often represented with a line segment directed from the origin.

[Figure: x-y coordinate axes]

We go through this review pretty quickly


Lines
Every pair of (distinct) points determines a
unique line.

p1 = (1, 5)
p2 = (4, 2)
L: x + y = 6.

Alternative representation of the line: (1-λ)p1 + λp2
for -∞ < λ < ∞.

L = (1, 5) + λ(3, -3)
for -∞ < λ < ∞.

[Figure: the line through p1 and p2]

The alternative representation is really important, as we shall see on the next few
slides.
Rays
Every pair of (distinct) points determines a
unique ray beginning at the first point.

p1 = (1, 5)
p2 = (4, 2)

Ray: (1-λ)p1 + λp2
for 0 ≤ λ < ∞.

= (1, 5) + λ(3, -3)
for 0 ≤ λ < ∞.

[Figure: the ray from p1 through p2]
Line segments
Every pair of points determines a unique
line segment.

p1 = (1, 5)
p2 = (4, 2)

Segment: (1-λ)p1 + λp2
for 0 ≤ λ ≤ 1.

= (1, 5) + λ(3, -3)
for 0 ≤ λ ≤ 1.

[Figure: the line segment from p1 to p2]

We keep seeing (1-λ)p1 + λp2 as the formula. But the representation
of the line segment is the most useful for our purposes.
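
The three representations differ only in the range allowed for λ. A minimal sketch using the slide's points p1 = (1, 5) and p2 = (4, 2) follows; the function names are illustrative.

```python
from fractions import Fraction


def point_on_line(p1, p2, lam):
    """Return (1 - lam) * p1 + lam * p2 for 2-D points given as (x, y) tuples."""
    return tuple((1 - lam) * a + lam * b for a, b in zip(p1, p2))


def classify(lam):
    """Smallest of the three sets that contains the point with parameter lam."""
    if 0 <= lam <= 1:
        return "segment"   # also on the ray and on the line
    if lam > 1:
        return "ray"       # beyond p2, still on the ray from p1
    return "line"          # lam < 0: on the line only


if __name__ == "__main__":
    p1, p2 = (1, 5), (4, 2)
    for lam in (-1, 0, Fraction(1, 2), 1, 2):
        x, y = point_on_line(p1, p2, lam)
        print(lam, (x, y), classify(lam), x + y)
```

Every point produced this way satisfies x + y = 6, the equation of the line L.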
Inequalities
An inequality with two variables determines a unique half-plane.

x + 2y ≤ 6

[Figure: the half-plane x + 2y ≤ 6]

A half plane contains the line as well as all points on one side of the line.
Graphing the Feasible Region

We will construct and shade the feasible region
one or two constraints at a time.
Graph the Constraints:
2x + 3y ≤ 10 (1)
x ≥ 0 , y ≥ 0. (6)

[Figure: region bounded by the axes and the line 2x + 3y = 10]

OK. This is really three constraints despite what was said on the last slide.
Add the Constraint:
x + 2y ≤ 6 (2)

[Figure: feasible region after adding the line x + 2y = 6]
Add the Constraint:
x + y ≤ 5 (3)

A constraint is called redundant if deleting the
constraint does not increase the size of the
feasible region.

“x + y ≤ 5” is redundant.

[Figure: feasible region with the line x + y = 5]

We don’t concern ourselves much with redundant constraints in 15.053. In principle,
we could delete a redundant constraint because it might make the problem easier to
solve. But in reality, it doesn’t help much. Still, it is a widely used concept.
Add the Constraints:
x ≤ 4; y ≤ 3

We have now graphed the feasible region.

[Figure: the feasible region]
How do we maximize z = 3x + 5y ?

Let’s avoid adding a 3rd dimension.

Find a feasible solution such that 3x + 5y = p
for different values of p.

Choose p as large as possible.

If you can do so, try to maximize the objective function before looking ahead.
How do we maximize 3x + 5y ?

Is there a feasible solution such that 3x + 5y = 8?
Is there a feasible solution such that 3x + 5y = 11?

[Figure: feasible region with the lines 3x + 5y = 8 and 3x + 5y = 11]

The lines such as “3x + 5y = 8” are often called “isoquant lines” or “isoprofit lines”
or “isocost lines”. Note that if the objective function is linear, then all isoprofit
lines are parallel to each other.
Find the maximum value p such that there is a
feasible solution with 3x + 5y = p.

Move the line with profit p parallel as much as possible.
This is called the geometric method for optimizing in 2D.

The optimal solution occurs at a corner point.

[Figure: feasible region with the lines 3x + 5y = 8, 3x + 5y = 11, and 3x + 5y = 16]

The geometric method for solving a 2-variable LP is to move the isoprofit line so
that
(1) there is still at least one feasible point on the line
(2) moving it any further would make all points on the line infeasible.

If there is exactly one feasible point on the line it will be a “corner point,” which is
defined on the next slide.
Corner Points
z A corner point of the feasible region is a point
that is not the midpoint of two other points of the
feasible region.

Where are the corner points of this feasible region?

Note that this is a very simple definition of a corner point. It also leads to some
unintuitive aspects.

First of all, a corner point only makes sense if the feasible region is convex, that is,
if two points p1 and p2 are feasible, then every point on the line segment [p1, p2] is
also feasible. We will discuss convexity later in this lecture.

Second, if the feasible region is a disk, then every point on the boundary of the disk is
a corner point even though there are no “corners” to a disk.
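Solving for the Corner Point

In two dimensions, a corner point lies at
the intersection of two lines.

x + 2y = 6
2x + 3y = 10

Multiply the first equation by 2 and subtract the second:
2x + 4y = 12
2x + 3y = 10
y = 2

Substituting y = 2 into x + 2y = 6 gives x = 2, so the corner point is (2, 2).

[Figure: feasible region with the lines x + 2y = 6 and 2x + 3y = 10]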

It’s very useful that the corner point lies at the intersection of two lines. Then
solving a system of equations with two variables and two equations will give the
value of the corner point.
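
For a two-variable LP this idea can be turned into a brute-force procedure: intersect every pair of constraint lines, keep the intersections that are feasible, and compare objective values. The sketch below does this for the two-variable LP of this lecture using exact fractions. It is an illustration under the assumption of a small problem with a bounded feasible region, not the way LPs are solved in practice, and the function names are made up.

```python
from fractions import Fraction
from itertools import combinations

# Constraints a*x + b*y <= c of the two-variable LP, including x >= 0 and y >= 0.
CONSTRAINTS = [
    (2, 3, 10),   # (1)
    (1, 2, 6),    # (2)
    (1, 1, 5),    # (3)
    (1, 0, 4),    # (4)
    (0, 1, 3),    # (5)
    (-1, 0, 0),   # (6) x >= 0
    (0, -1, 0),   # (6) y >= 0
]


def intersection(c1, c2):
    """Solve the 2x2 system of the two boundary lines by Cramer's rule."""
    a1, b1, r1 = c1
    a2, b2, r2 = c2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None  # parallel lines have no unique intersection
    return Fraction(r1 * b2 - r2 * b1, det), Fraction(a1 * r2 - a2 * r1, det)


def is_feasible(point):
    x, y = point
    return all(a * x + b * y <= c for a, b, c in CONSTRAINTS)


def corner_points():
    """Feasible intersections of two boundary lines."""
    corners = set()
    for c1, c2 in combinations(CONSTRAINTS, 2):
        p = intersection(c1, c2)
        if p is not None and is_feasible(p):
            corners.add(p)
    return corners


def best_corner():
    """Corner point that maximizes z = 3x + 5y."""
    return max(corner_points(), key=lambda p: 3 * p[0] + 5 * p[1])


if __name__ == "__main__":
    x, y = best_corner()
    print(f"x = {x}, y = {y}, z = {3 * x + 5 * y}")
```

The best corner is (2, 2) with z = 16, which matches the isoprofit line 3x + 5y = 16 found by the geometric method.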
Solving for Corner Points
In three dimensions, a corner point is the
intersection of three constraints. (3 planes)

0 ≤ x ≤ 2
0 ≤ y ≤ 2
0 ≤ z ≤ 2
x - y + z ≤ 3

What is the red corner point?

[Figure: three-dimensional feasible region with one corner point marked in red]

The red corner point is the intersection of three planes
x = 2
z = 2
x - y + z = 3

The unique solution is x = 2, y = 1, z = 2.

An important difference between the
geometry and the algebra

Usually, corner points can be described in a unique way as an
intersection of two lines (constraints). But not always.

2x + y ≤ 9
x + 2y ≤ 9
0 ≤ x ≤ 3
0 ≤ y ≤ 3

The point (3, 3) can be written as the
intersection of two lines in 6 ways.

[Figure: feasible region with the lines 2x + y = 9 and x + 2y = 9]

This turns out to be very important in later lectures when we consider the simplex
algorithm, which is the algebraic technique for solving a linear program. Within the
simplex algorithm, there may be many structural descriptions (bases) that
correspond to the same solution. This leads to a number of technical issues that
need to be resolved.

We’ll return to this slide later in the subject when we discuss degeneracy.
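
The count of 6 can be checked directly: four boundary lines pass through (3, 3), and any two of them determine the point. A minimal sketch follows; the dictionary of lines is transcribed from the slide's constraints and the function names are illustrative.

```python
from itertools import combinations

# Boundary lines a*x + b*y = c of the constraints on the slide.
LINES = {
    "2x + y = 9": (2, 1, 9),
    "x + 2y = 9": (1, 2, 9),
    "x = 0": (1, 0, 0),
    "x = 3": (1, 0, 3),
    "y = 0": (0, 1, 0),
    "y = 3": (0, 1, 3),
}


def tight_lines(point):
    """Names of the boundary lines that pass through the point."""
    x, y = point
    return [name for name, (a, b, c) in LINES.items() if a * x + b * y == c]


def descriptions(point):
    """Pairs of non-parallel tight lines; each pair pins down the point."""
    pairs = []
    for n1, n2 in combinations(tight_lines(point), 2):
        a1, b1, _ = LINES[n1]
        a2, b2, _ = LINES[n2]
        if a1 * b2 - a2 * b1 != 0:
            pairs.append((n1, n2))
    return pairs


if __name__ == "__main__":
    print(len(descriptions((3, 3))))   # the corner point from the slide
    print(len(descriptions((0, 0))))   # an ordinary corner point
```

An ordinary corner such as (0, 0) has exactly one description, while (3, 3) has 4 choose 2 = 6.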
Remainder of the lecture: Corner Points.

Does every linear program with a feasible solution
have a corner point?

No. Consider the LP:

Maximize y
subject to 0 ≤ y ≤ 1
no constraints on x

Even a person who has studied linear programming for a long time can forget that
not all linear programs have corner points. Fortunately, all linear programs with
non-negativity constraints do have corner points.

We will generally deal with linear programs that have non-negativity constraints.
Theorem. If there is a feasible solution, and if there is
no feasible line (that is, no line lies entirely inside the
feasible region), then there is a corner point.

Corollary. Any LP in which each variable is
non-negative has a corner point.
If there is an optimal solution, and if there is a
corner point, is there always an optimal
solution that is a corner point?

YES!
It is true even if there is an entire
line segment that is optimal.

It is also possible that the entire feasible region is optimal, but this can only happen
when the objective is to maximize (or minimize) 0x + 0y. In this case, all corner
points are optimal.
z If there is an optimal solution, then there is an
optimal solution that is a corner point.

[Figure: the set of points s.t. profit = a]

In three dimensions, the isoprofit points form a plane. For example, we may have
an objective of 2x + 3y + w =.

We’ll try to avoid having z as one of the variables since z usually denotes the value
of the objective function.
What types of Linear Programs are there?

There is no feasible solution.
max x
s.t. x + 2y ≤ -1
x ≥ 0, y ≥ 0

There is a feasible solution and an optimal solution.
max x
s.t. x + 2y ≤ 1
x ≥ 0, y ≥ 0

There is a feasible solution and the objective value is
unbounded from above.
max
s.t. x - 2y ≤ 1
x ≥ 0, y ≥ 0

Case 1 unfortunately happens too frequently, often because of an error.
Sometimes it comes about because the decision maker is trying to
accomplish too much with too little. For example, in the application of radiation
therapy to destroying brain tumors, a doctor may require a very large dose of
radiation to the tumor while requiring a very low dose to non-tumor cells. But this
may lead to a linear program with no feasible solution.

Case 2 will always occur when the feasible region is non-empty and bounded.
However, it can also occur when the feasible region is unbounded. For example, min
{x : x >= 1}.

Case 3 can only happen when the feasible region is unbounded.


Any other types

z Is it possible to have an LP such that the feasible
region is bounded, and such that there is no
optimal solution?

No. But it could happen if we permitted strict inequality constraints.

Maximize x
subject to 0 < x < 1

It might seem obvious that if an LP feasible region is bounded, then it must have an
optimal solution. But this relies on the fact that the inequalities are not strict, as the
example above shows.

From a mathematical perspective, an LP feasible region is closed; that is, if a
sequence of feasible points converges then it converges to a point that is feasible.
But this is a digression, and is not used elsewhere in 15.053.
Convex Sets
A set S is convex if for every two points in the
set, the line segment joining the points is also in
the set; that is,
if p1, p2 ∈ S, then so is (1-λ)p1 + λp2 for λ ∈ [0,1]

Theorem. The feasible region of a linear program is convex.

[Figure: feasible region containing the line segment from p1 to p2]

Convexity is of some use for linear programming. It is critically important in
non-linear programming. Non-linear programs are extremely hard to solve in general
(impossible may be a better word); however, when we are minimizing a convex
objective function over a convex feasible region, they are often tractable.

We will use properties of convexity a number of times in the remainder of this
lecture and in the next lecture.

Notice that we are using the same description of a line segment as earlier.
More on Convexity
Which of the following are convex ? or not ?

[Figure: several numbered shapes drawn on x-y axes]

The doughnut (object 5) is not convex because it contains a hole in it.

The 8 points (object 8) is not convex because it only contains those 8 points and not

points in between.

Which shapes are convex?
Which regions are LP feasible regions?
Which are the corner points of each shape?

Not convex: heart and moon

Not LP: heart and moon and circle. (The circle cannot be expressed as a finite
number of linear inequalities.)

Corner points: those that are not the midpoints of two other points of the object.

The top of the heart is not a corner point since it is the midpoint of two other points
of the heart. The points on the outside curve are corner points except for the points
below the top two curves.

All outside points on the circle are corner points, despite the fact that in common
usage we would not think of a circle as having any corners.

The cube has 8 corner points.

The outside points on the left of the moon are corner points. The outside on the
right side are not.

The infinite region has 4 corner points.

On corner points

z Corner points make more sense if the region is


convex.

z We are only concerned about corner points of


linear programs.

32

When we use corner points in linear programming, it will make intuitive sense.
It is not very useful to talk of corner points of non-convex regions.
A Theorem on Corner Points

z Theorem. Every corner point of an LP is an


optimal solution for some linear objective.

www.mathematica.com

33

All one has to do in two dimensions is to find a line that goes through the corner
point but doesn’t touch any other feasible point.

In three dimensions, one needs to find a plane that goes through a corner point and
doesn’t touch any other feasible point.

Intuitively, this is quite easy to do. However, it is not so easy to prove that it is
always possible, even though it is.

In fact, in an n-variable LP, every corner point is an optimal solution for some linear
objective.

Theorem: The feasible region of an LP is convex.

Proof illustrated. Let p1 = (1, 2). Let p2 = (3, 1).

Suppose that both points satisfy one of the constraints:
say ax + by ≤ c.
Then 1a + 2b ≤ c and 3a + 1b ≤ c.

Suppose that p3 = (1-λ)p1 + λp2
= (1 - λ)(1, 2) + λ(3, 1) = (1 + 2λ, 2 - λ)
Claim: p3 satisfies the inequality.

1a + 2b ≤ c (multiply by 1-λ)
3a + 1b ≤ c (multiply by λ)
Adding the two: (1 + 2λ)a + (2 - λ)b ≤ c

[Figure: the points p1, p2, and p3 on the line segment joining them]

To prove that a set is convex, one goes back to the definition. One shows that if p1
and p2 are both in the set and if p3 is on the line segment joining p1 and p2, then p3
must also be in the set.

In the case of a linear programming feasible region, one needs to show that p3
satisfies each of the linear equalities and inequalities. The above is an outline of the
proof of why this is true with two variables. Indeed, the proof idea can be generalized
to show that it is true regardless of the number of variables.
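
The argument can be spot-checked numerically. The sketch below assumes the constraints of the two-variable LP from earlier in this lecture (the slide itself uses a generic constraint ax + by ≤ c) and verifies that sample points on the segment from p1 = (1, 2) to p2 = (3, 1) are feasible. It is a sanity check on a finite sample of λ values, not a proof, and the function names are made up.

```python
from fractions import Fraction

# Assumed constraints a*x + b*y <= c: the two-variable LP from this lecture.
CONSTRAINTS = [(2, 3, 10), (1, 2, 6), (1, 1, 5), (1, 0, 4), (0, 1, 3),
               (-1, 0, 0), (0, -1, 0)]


def is_feasible(point):
    x, y = point
    return all(a * x + b * y <= c for a, b, c in CONSTRAINTS)


def convex_combination(p1, p2, lam):
    """Return p3 = (1 - lam) * p1 + lam * p2."""
    return tuple((1 - lam) * u + lam * v for u, v in zip(p1, p2))


def segment_is_feasible(p1, p2, steps=10):
    """Check feasibility at lam = 0, 1/steps, 2/steps, ..., 1."""
    return all(is_feasible(convex_combination(p1, p2, Fraction(k, steps)))
               for k in range(steps + 1))


if __name__ == "__main__":
    p1, p2 = (1, 2), (3, 1)
    print(is_feasible(p1), is_feasible(p2), segment_is_feasible(p1, p2))
```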
The proof continued

z Every inequality is satisfied by p3. So, p3 is


feasible.

z Equalities are also satisfied.

35
And now, it’s time for …..

36

“Who wants a piece of candy” is not stored on the web.


Summary: 2D Geometry helps guide
the intuition

z 2D visualization
z infeasibility and unboundedness
z Corner Points and their significance
z Convexity of the feasible region

37
2010

z The Geometry of Linear Programs


– The simplex algorithm
– More properties of linear programs

Pentagonal prism 1
Overview of Lecture
z Review of Geometry

z The Simplex Algorithm

z More on convexity

z RHS Sensitivity Analysis

2
Quotes of the Day

Geometry is not true, it is


advantageous.
Jules H. Poincare

I've always been passionate about


geometry and the study of three-
dimensional forms.
Erno Rubik
3
Review of Geometry
A set S is convex if for every two points in the
set, the line segment joining the points is also in
the set; that is,
if p1, p2 ∈ S, then so is (1-λ)p1 + λp2 for λ ∈ [0,1]

Theorem. The feasible region of a linear program is convex.

[Figure: feasible region containing the line segment from p1 to p2]
Corner Points

z A corner point of the feasible region is a point that


is not the midpoint of two other points of the
feasible region.
z All feasible LPs with non-negativity constraints
have at least one corner point.

If an LP is feasible, has non-


negativity constraints, and has an
optimal solution, then there is a
corner point that is optimal.

5
Solving for Corner Points

z In two dimensions, a corner point is the
intersection of two equality constraints.
z In three dimensions, a corner point is the
intersection of three constraints. (3 planes)

0 ≤ x ≤ 2
0 ≤ y ≤ 2
0 ≤ z ≤ 2
x - y + z ≤ 3

The red corner point is the intersection of three planes
x = 2
z = 2
x - y + z = 3

The unique solution is x = 2, y = 1, z = 2.

There are 3 Types of Linear Programs

Those with an optimal solution
Those with no feasible solution.
Those whose objective value is unbounded

[Figure: one sketch for each type; the unbounded case shows an isoprofit line]
The Simplex Method in Two Dimensions
Start at any feasible corner point.
Move to an adjacent corner point with better
objective value. Move along an edge of the feasible
region.
Continue until no adjacent corner point has a better
objective value.

Max z = 3 x + 5 y

[Figure: feasible region in the x-y plane with the isoprofit line 3 x + 5 y = 19]
8
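The geometric procedure can be imitated directly in two dimensions. The sketch below is not the tableau-based simplex method and is not taken from the slides: it enumerates the corner points, orders them around the polygon so that neighbours in the list are adjacent corners, and then walks to a better neighbour until none exists. The feasible region is assumed to be that of the revised DTC example later in this lecture.

```python
from fractions import Fraction
from itertools import combinations
from math import atan2

# Constraints a*x + b*y <= c (assumed: the revised DTC example).
CONSTRAINTS = [(2, 3, 10), (1, 2, 6), (1, 1, 5), (1, 0, 4), (0, 1, 3),
               (-1, 0, 0), (0, -1, 0)]
OBJECTIVE = (3, 5)  # maximize z = 3x + 5y


def corner_points(constraints):
    """Return the feasible intersections of pairs of constraint lines."""
    corners = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines do not meet in a single point
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in constraints):
            corners.add((x, y))
    return corners


def value(p):
    """Objective value at the point p."""
    return OBJECTIVE[0] * p[0] + OBJECTIVE[1] * p[1]


def walk_corners(start):
    """Move to a better adjacent corner until none exists; return the path."""
    pts = list(corner_points(CONSTRAINTS))
    cx = sum(p[0] for p in pts) / len(pts)
    cy = sum(p[1] for p in pts) / len(pts)
    # Sorting by angle around an interior point makes list neighbours adjacent.
    pts.sort(key=lambda p: atan2(p[1] - cy, p[0] - cx))
    i = pts.index(start)
    path = [start]
    while True:
        neighbours = [pts[(i - 1) % len(pts)], pts[(i + 1) % len(pts)]]
        best = max(neighbours, key=value)
        if value(best) <= value(pts[i]):
            return path
        i = pts.index(best)
        path.append(best)


path = walk_corners((0, 0))
print([(float(x), float(y)) for x, y in path], float(value(path[-1])))
```

Which neighbour is chosen when both are better is a design choice; here the walk always takes the neighbour with the larger objective value.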
The Simplex Method Again
Start at any feasible corner point.
Move to an adjacent corner point with better
objective value. Move along an edge of the feasible
region.
Continue until no adjacent corner point has a better
objective value.

[Figure: feasible region with axes labeled x and y]
9

The Simplex Method in 3 Dimensions


Start at any feasible corner point.
Move to an adjacent corner point with better objective value.
Move along an edge of the feasible region.
Continue until no adjacent corner point has a better
objective value.

Note: in three dimensions, the
“edges” are the intersections of
two constraints. The corner
points are the intersection of
three constraints.

[Figure: pentagonal prism]
10
An Example of the Simplex Method in 3 Dimensions
Maximize y

The number of iterations depends upon which edge is
chosen at each iteration.

[Figure: a three-dimensional feasible region whose corner points are labeled y = 0, .1, .2, .3, 1, 1.5, 1.7, and 2]

z In practice, the simplex method is very efficient,
even on very complex large scale LPs.
11

This is a twisted cube.


Notice how the simplex method starting at the origin could move to the optimum in
1 step or pivot.
It is also possible for the simplex method to take 7 pivots, thus visiting each corner
point.

Klee and Minty developed a very similar example with n variables. It is
possible that the simplex method would take 2^n – 1 pivots on these examples, thus
showing that the simplex method can take exponential time in the worst case.
(For the twisted cube, n = 3 and 2^3 – 1 = 7.)

In practice, there may be many different edges that the simplex method can select at
a given iteration. The speed at which the simplex method moves to the optimum
depends on the choice of the edge.
The Simplex Method on Unbounded LPs

Maximize x

If the objective is unbounded from above, then the
simplex method will move infinitely far along
an edge.

[Figure: a feasible region in the x-y plane that is unbounded in the x direction]
12
Interior Point Algorithms for LP
[Figure: feasible region in the x-y plane with the line 3x + 5y = 16]
13

There are a variety of algorithms that move through the interior of the feasible
region of a linear program. These algorithms typically take far fewer iterations than
the simplex algorithm and far more time per iteration for large problems. Sometimes
interior point algorithms obtain answers more quickly than the simplex algorithm.
Often they are slower.

Interior point algorithms were popularized by Karmarkar in 1984, who proved that
the number of iterations is bounded by a polynomial in the dimension of the
problem and in the number of bits needed to describe the coefficients.

While interior point algorithms are ingenious and have practical import, they are
also beyond the scope of 15.053, and will not be covered further.
Comments on Optimality Conditions

z Linear programming produces both the optimal
solution and the proof of optimality. (This is true
for any number of variables, and even if many of
the constraints are equality constraints.)
– special among optimization problems
– very valuable
– “The Gold Standard” for optimization

z For other optimization problems in the subject,
we will settle for bounds on the distance from optimality
– e.g., we will be happy if we can guarantee a solution
within 10% of optimal

14
Convex Combinations

Suppose that p1, p2, …, pk are all vectors (or points).

Let pk+1 = λ1 p1 + λ2 p2 + ... + λk pk.

We say that pk+1 is a convex combination of p1, …, pk
if the following are true:

λ1 + λ2 + ... + λk = 1
and λi ≥ 0 for i = 1 to k.

Suppose k = 2. What points
are convex combinations of
p1 and p2?

[Figure: two points p1 and p2]
15
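The definition translates almost word for word into code. In this sketch the function name and the two sample points are hypothetical; exact fractions are used so that the test "the weights sum to 1" is not disturbed by rounding.

```python
from fractions import Fraction


def convex_combination(points, weights):
    """Return sum(w_i * p_i) after checking w_i >= 0 and sum(w_i) == 1."""
    if len(points) != len(weights):
        raise ValueError("need exactly one weight per point")
    if any(w < 0 for w in weights) or sum(weights) != 1:
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(points[0])
    return tuple(sum(w * p[i] for w, p in zip(weights, points))
                 for i in range(dim))


# Two sample points (hypothetical values chosen for the demonstration).
p1, p2 = (1, 3), (4, 1)
half = Fraction(1, 2)
midpoint = convex_combination([p1, p2], [half, half])
quarter_way = convex_combination([p1, p2], [Fraction(3, 4), Fraction(1, 4)])
print(midpoint, quarter_way)
```

For k = 2 the weights are (1-λ, λ), so every result lies on the line segment joining p1 and p2, which matches the definition of convexity in the Review of Geometry.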
More on Convex Combinations
What points can be represented as the convex
combination of (0,0), (0, 4), and (3, 0)?

λ1 (0, 0)
+ λ2 (0, 4)
+ λ3 (3, 0)

[Figure: the points (0,0), (0, 4), and (3, 0) in the plane]
16
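One way to explore the question is to recover the weights for a given point. Since λ1 (0, 0) contributes nothing, a point (x, y) needs x = 3 λ3 and y = 4 λ2, and then λ1 = 1 - λ2 - λ3; the point is a convex combination exactly when all three weights are non-negative. The helper names below are ours.

```python
from fractions import Fraction


def weights_for(point):
    """Return (l1, l2, l3) with point = l1*(0,0) + l2*(0,4) + l3*(3,0), l1 + l2 + l3 = 1."""
    x, y = point
    lam3 = Fraction(x) / 3
    lam2 = Fraction(y) / 4
    lam1 = 1 - lam2 - lam3
    return lam1, lam2, lam3


def is_convex_combination(point):
    """True when all three weights are non-negative."""
    return all(w >= 0 for w in weights_for(point))


print(weights_for((1, 1)), is_convex_combination((1, 1)))
print(weights_for((3, 4)), is_convex_combination((3, 4)))
```

The points that pass the test form the triangle with the three given points as corners.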
Convex Combinations and Convex Hulls

The convex hull of points p1, …, pk is the smallest
convex region containing all of the points. It is
also the set of all points that can be expressed as
convex combinations of p1 to pk.

Note that the convex hull of points in 2 dimensions looks
like an LP feasible region.

[Figure: the convex hull of a set of points in the plane]
17
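For points in the plane, the corner points of the convex hull can be computed directly. The sketch below uses Andrew's monotone chain, a standard technique that is not part of the lecture; the sample points are the triangle from the previous slide plus one interior point.

```python
def cross(o, a, b):
    """Cross product of the vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Return the corner points of the convex hull in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a right turn or lie on the same line.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


hull = convex_hull([(0, 0), (0, 4), (3, 0), (1, 1)])
print(hull)
```

The interior point (1, 1) does not appear in the result, because it is already a convex combination of the other three points.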
Convex Hulls in 3 dimensions

[Figure: convex hull of a set of points in three dimensions]

18

In Slide show mode, the figure is revealed as a cube that is partially cut off.
Representation Theorem

z Theorem. Every bounded polyhedron (linear programming
feasible region) can be represented as the convex hull of its
corner points.

z Theorem. The convex hull of a finite set of points is a bounded
linear programming feasible region.

z Usually, we prefer to represent a linear program in terms of
constraints. But there are times when it is useful to
represent its feasible region as the set of convex combinations
of its corner points.

19
Mental Break

What are the odds?

20
Sensitivity Analysis

z Sensitivity analysis: Determining the marginal
effect on the optimal objective value if we
make small changes in the data.

z In LP, we focus on two types of sensitivity
analysis that are very useful and very easy for an
LP package to compute.

21
The revised DTC example

Maximize z = 3x + 5y

2x + 3y ≤ 10     Gathering time
 x + 2y ≤ 6      Smoothing time
 x +  y ≤ 5      Delivery time
 x      ≤ 4      Demand: kits
      y ≤ 3      Demand: shields
   x, y ≥ 0      Non-negativity

22

We could have used the original variable names of K and S, but it is simpler to use x
and y since we usually think of the two axes as the x and y axes.
The optimal solution

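In two dimensions, a corner point lies at
the intersection of two lines.

x + 2y = 6       smoothing time
2x + 3y = 10     gathering time

Doubling the first equation gives 2x + 4y = 12. Subtracting
2x + 3y = 10 leaves y = 2, and then x = 2 and z = 3x + 5y = 16.

[Figure: the smoothing time line x + 2y = 6 and the gathering time line 2x + 3y = 10 meeting at the optimal corner point]
23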

It’s very useful that the corner point lies at the intersection of two lines. Then
solving a system of equations with two variables and two equations will give the
value of the corner point.
Varying the RHS
Suppose that we consider the problem in which
gathering time is parameterized by G.

Let z(G) be the optimal objective value when gathering time is G,
and all other data is unchanged.

z(10) = 16

What is z’(10)? (derivative)

[Figure: the smoothing time line x + 2y = 6 and the gathering time line 2x + 3y = G, drawn for G = 10]
24

Computing the derivative

z'(10) = lim_{Δ→0} [z(10 + Δ) − z(10)] / Δ

Key observation: if Δ is small, then the optimum
corner point of the problem will be the
intersection of the smoothing time constraint
and the gathering constraint.

That is, the constraints that define the corner
point will not change.

25
On the new corner point

The solution value changes, but the
“structure” of the solution does not
change.

[Figure: the smoothing time line x + 2y = 6 and the gathering time line 2x + 3y = G]
26
Computing the New Corner Point

x + 2y = 6           smoothing time
2x + 3y = 10 + Δ     gathering time

2x + 4y = 12
2x + 3y = 10 + Δ

y = 2 - Δ
x = 2 + 2Δ
z = 3x + 5y = 16 + Δ

[Figure: the smoothing time line x + 2y = 6 and the gathering time line 2x + 3y = 10 + Δ]
27
The Shadow Price

z'(10) = lim_{Δ→0} [z(10 + Δ) − z(10)] / Δ

z'(10) = lim_{Δ→0} [(16 + Δ) − 16] / Δ = 1

Note that z(10 + Δ) is linear in Δ.

This derivative is the shadow price of the gathering constraint.

Note. We only needed that the corner point was the
intersection of the gathering and smoothing constraints.

28
More on Shadow Prices

The shadow price of a constraint is the increase in
the optimal objective value per unit increase in the
RHS of the constraint. It is also a derivative.

Let p denote the shadow price.

If the RHS of gathering increases from 10 to 10 + Δ,
then the objective value increases from 16 to 16 + pΔ,
that is, it increases by pΔ (for Δ small enough that the
binding constraints do not change).

29
Exercises

z z(10) = 16; z’(10) = 1 (The shadow price is 1)

z Fact: z(11) = 16 + 1 = 17.

z What is z(10.2)?

z What is z(9.7)?

z What is z(0)? (trick question)

30
Computing the Shadow Price of a Constraint

z Step 1. Determine the binding
constraints that determine the
corner point.
      x + 2y = 6
      2x + 3y = 10

z Step 2. Add Δ to the RHS of the
constraint whose shadow price we
are computing.
      x + 2y = 6
      2x + 3y = 10 + Δ

z Step 3. Solve the system of
equations.
      x = 2 + 2Δ
      y = 2 - Δ

z Step 4. Compute the “increase” in
z when Δ increases from 0 to 1
(i.e., compute the derivative).
      z(10 + Δ) = 16 + Δ
      Shadow price = 1.

31
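The four steps can be sketched in a few lines for the two-variable case. The function names are ours, and the binding constraints are passed in by hand rather than detected automatically. Because z is linear in Δ, the increase from Δ = 0 to Δ = 1 equals the derivative.

```python
from fractions import Fraction


def solve_2x2(eq1, eq2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    (a1, b1, c1), (a2, b2, c2) = eq1, eq2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the two lines do not meet in a single point")
    return Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det)


def shadow_price(binding, objective, which):
    """Change in the objective when the RHS of binding[which] increases by 1."""
    def objective_at(delta):
        # Step 2: add delta to the RHS of the chosen binding constraint.
        eqs = [(a, b, c + (delta if i == which else 0))
               for i, (a, b, c) in enumerate(binding)]
        # Step 3: solve for the corner point.
        x, y = solve_2x2(*eqs)
        return objective[0] * x + objective[1] * y
    # Step 4: increase in z when delta goes from 0 to 1.
    return objective_at(1) - objective_at(0)


# Step 1: binding constraints at the optimum (smoothing, then gathering).
binding = [(1, 2, 6), (2, 3, 10)]
price = shadow_price(binding, objective=(3, 5), which=1)
print(price)
```

The result is only meaningful while the same constraints stay binding, which is the subject of the bounds on Δ.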
Bounds on RHS coefficients in Sensitivity Analysis
z Recall that the optimum solution is a corner
point, which in 2 dimensions is the solution of 2
equations in 2 variables, and the equations are
the binding constraints.

z Compute the largest changes in the RHS
coefficient so that all constraints remain satisfied.

32
What happens if the RHS changes by a lot?

[Figure: the gathering time line 2x + 3y = G in the x-y plane]
33
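To see what happens, z(G) can be evaluated for several values of G. The sketch below finds the optimum of the revised DTC example by brute force: it checks every feasible intersection of two constraint lines, which works here because the feasible region is bounded. This is an illustration only, not how an LP package computes sensitivity information.

```python
from fractions import Fraction
from itertools import combinations

OBJECTIVE = (3, 5)  # maximize z = 3x + 5y


def constraints_for(G):
    """Constraints a*x + b*y <= c with the gathering-time RHS set to G."""
    return [(2, 3, G), (1, 2, 6), (1, 1, 5), (1, 0, 4), (0, 1, 3),
            (-1, 0, 0), (0, -1, 0)]


def z(G):
    """Optimal objective value for gathering time G (None if infeasible)."""
    cons = constraints_for(Fraction(G))
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel lines do not define a corner point
        x = (c1 * b2 - c2 * b1) / Fraction(det)
        y = (a1 * c2 - a2 * c1) / Fraction(det)
        if all(a * x + b * y <= c for a, b, c in cons):
            value = OBJECTIVE[0] * x + OBJECTIVE[1] * y
            if best is None or value > best:
                best = value
    return best


for G in (8, 9, 10, 11, 12):
    print(G, z(G))

# The slope is 1 between G = 10 and G = 11 but 0 between G = 11 and G = 12.
print(z(11) - z(10), z(12) - z(11))
```

The slope of z(G) matches the shadow price only while the same constraints remain binding; outside that range a different corner point becomes optimal and the slope changes.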
More Sensitivity Analysis: Determining the Interval
                                          Constraint after substituting
                                          x = 2 + 2Δ; y = 2 - Δ

Maximize profit:   z = 3x + 5y (in 10s)   z = 16 + Δ
Gathering time:    2x + 3y ≤ 10 + Δ       10 + Δ ≤ 10 + Δ
Smoothing time:    x + 2y ≤ 6             6 ≤ 6
Delivery time:     x + y ≤ 5              4 + Δ ≤ 5
Slingshot demand:  x ≤ 4                  2 + 2Δ ≤ 4
Shield demand:     y ≤ 3                  2 - Δ ≤ 3
Non-negativity:    x, y ≥ 0               2 + 2Δ ≥ 0
                                          2 - Δ ≥ 0

So, -1 ≤ Δ ≤ 1
34
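The interval can also be computed mechanically. After substituting x = 2 + 2Δ and y = 2 - Δ, each remaining constraint a x + b y ≤ c becomes (2a + 2b) + (2a - b)Δ ≤ c, which bounds Δ from above or from below depending on the sign of 2a - b. A minimal sketch (the names are ours):

```python
from fractions import Fraction

# Non-binding constraints a*x + b*y <= c of the revised DTC example:
# delivery, slingshot demand, shield demand, x >= 0, y >= 0.
OTHER_CONSTRAINTS = [(1, 1, 5), (1, 0, 4), (0, 1, 3), (-1, 0, 0), (0, -1, 0)]

# Corner point as a function of delta: x = 2 + 2*delta, y = 2 - delta.
X0, DX = 2, 2
Y0, DY = 2, -1


def delta_interval():
    """Return (lowest, highest) delta for which the corner point stays feasible."""
    lo, hi = float("-inf"), float("inf")
    for a, b, c in OTHER_CONSTRAINTS:
        const = a * X0 + b * Y0   # left-hand side at delta = 0
        slope = a * DX + b * DY   # change in left-hand side per unit of delta
        if slope == 0:
            continue              # the constraint does not depend on delta
        bound = Fraction(c - const, slope)
        if slope > 0:
            hi = min(hi, bound)   # const + slope*delta <= c gives delta <= bound
        else:
            lo = max(lo, bound)   # dividing by a negative slope flips the sign
    return lo, hi


print(delta_interval())
```

The two binding constraints are left out because they hold with equality for every Δ by construction.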
Summary for changes in RHS coefficients
z Determine the binding constraints
z Determine the change in the “corner point
solution” as a function of Δ.
z Compute the largest and smallest values of Δ so
that the solution stays feasible.
z The shadow price is valid so long as the “corner
point solution” remains optimal, which is so long
as it is feasible.

35
And now, it’s time for …..

36

“Who wants a piece of candy” is not stored on the web.
