15.060 Data, Models, Decisions


Final Review
December 15 2007 DMD Fall 07 Final Review 1
Robert Freund, David Gamarnik, and Andreas Schulz, course materials for 15.060 Data, Models, and Decisions, Fall 2007.
MIT OpenCourseWare (http://ocw.mit.edu), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].
Final Exam
Date: Monday, December 17
Time: 9am-12pm
Place: See MIT Server (come early!)
Closed book exam
No laptops or communication devices
You can bring a calculator
Formula Sheet will be provided
BUT get a good night's sleep!
Table of Contents
Topic 1: Decision Analysis
Topic 2: Discrete Random Variables
Topic 3: Covariance and Correlation
Topic 4: Continuous Random Variables
Topic 5: Statistical Sampling
Topic 6: Simulation
Topic 7: Regression
Topic 8: Linear Optimization
Topic 9: Nonlinear Optimization
Topic 10: Discrete Optimization
TOPIC 1:
Decision Analysis
You are NOT responsible for:
Conditional Probabilities
TOPIC 2:
Discrete Random Variables

Discrete Random Variables
A probability distribution for a discrete random variable X consists of
(i) possible values $x_1, x_2, \dots, x_n$,
(ii) corresponding probabilities $p_1, p_2, \dots, p_n$,
so that: $P(X = x_1) = p_1,\ P(X = x_2) = p_2,\ \dots,\ P(X = x_n) = p_n$.
A histogram is a display of probabilities as a bar chart.
[Figure: histogram of P(Y=y) for y = 0, 1, ..., 6]
Probabilities are non-negative and must sum to 1.
The possible values are mutually exclusive and collectively exhaustive (describe all the possibilities that can happen).

3 important measures
1. Expected Value or Mean: (measured in units of X)
Average outcome: a measure of central tendency
$E(X) = \mu_X = \sum_i P(X = x_i)\, x_i = \sum_i p_i x_i$
2. Variance: (in units of X squared)
Squared deviation around the mean: a measure of spread
$\mathrm{Var}(X) = \sigma_X^2 = \sum_i P(X = x_i)\,(x_i - \mu_X)^2 = \sum_i p_i (x_i - \mu_X)^2$
3. Standard Deviation: (in units of X)
Measure of spread
$\sigma_X = \sqrt{\mathrm{Var}(X)}$
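As a minimal sketch of the three measures of a discrete random variable (mean, variance, standard deviation), the following Python computes them directly from their definitions. The function name `mean_var_sd` and the example distribution are hypothetical and are not taken from the course materials.

```python
import math

def mean_var_sd(values, probs):
    """Return (mean, variance, standard deviation) of a discrete random variable."""
    # Probabilities must be non-negative and sum to 1.
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must be non-negative and sum to 1")
    mean = sum(p * x for x, p in zip(values, probs))               # E(X) = sum p_i x_i
    var = sum(p * (x - mean) ** 2 for x, p in zip(values, probs))  # Var(X) = sum p_i (x_i - mu)^2
    return mean, var, math.sqrt(var)                               # sigma = sqrt(Var(X))

# Hypothetical distribution, chosen only for illustration.
values = [0, 1, 2, 3]
probs = [0.1, 0.2, 0.3, 0.4]
mu, var, sd = mean_var_sd(values, probs)
print(mu, var, sd)
```

Note that the mean and standard deviation come out in the units of X, while the variance is in squared units.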
You are NOT responsible for:
The Binomial distribution
TOPIC 3:
Covariance and Correlation

Covariance: COV(X, Y)
$\mathrm{Cov}(X,Y) = E[(X - \mu_X)(Y - \mu_Y)] = \sum_{i,j} P(X = x_i; Y = y_j)\,(x_i - \mu_X)(y_j - \mu_Y)$
Measures the extent to which two random variables vary together.
Correlation: CORR(X, Y)
$\mathrm{CORR}(X,Y) = \dfrac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$
CORR(X, Y) is always between -1 and 1.
Correlation is unitless.
Working with joint distributions
Suppose X and Y are two random variables with joint distribution $P(X = x_i; Y = y_k)$:
Joint distribution of X and Y: $P(X = x_i; Y = y_k)$
Marginal distribution of X: $P(X = x_i) = \sum_k P(X = x_i; Y = y_k)$
$E[X] = \mu_X = \sum_i x_i\, P(X = x_i)$
$\mathrm{Var}(X) = \sigma_X^2 = \sum_i (x_i - \mu_X)^2\, P(X = x_i)$
Sums of random variables
Mean of a sum:
$E(aX + bY + c) = aE(X) + bE(Y) + c$
Variance of a sum:
$\mathrm{Var}(aX + bY + c) = a^2\,\mathrm{Var}(X) + b^2\,\mathrm{Var}(Y) + 2ab\,\mathrm{COV}(X,Y) = a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\,\sigma_X \sigma_Y\,\mathrm{CORR}(X,Y)$
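The covariance, correlation, and variance-of-a-sum formulas can be checked numerically. The sketch below uses a small joint distribution that is purely hypothetical (the dictionary `joint`, the helper `expect`, and the constants `a`, `b`, `c` are assumptions for illustration) and compares the variance of aX + bY + c computed directly with the value given by the formula.

```python
import math

# Hypothetical joint distribution: joint[(x, y)] = P(X = x; Y = y)
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def expect(g):
    """Expected value of g(X, Y) under the joint distribution."""
    return sum(p * g(x, y) for (x, y), p in joint.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)
var_x = expect(lambda x, y: (x - mu_x) ** 2)
var_y = expect(lambda x, y: (y - mu_y) ** 2)
cov = expect(lambda x, y: (x - mu_x) * (y - mu_y))
corr = cov / math.sqrt(var_x * var_y)   # unitless, between -1 and 1

# Variance of aX + bY + c: computed directly, then via the formula.
a, b, c = 2.0, 3.0, 5.0
mu_w = expect(lambda x, y: a * x + b * y + c)
var_direct = expect(lambda x, y: (a * x + b * y + c - mu_w) ** 2)
var_formula = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov

print(cov, corr, var_direct, var_formula)
```

The constant c shifts the mean but does not appear in the variance, which is why it drops out of `var_formula`.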
TOPIC 4:
Continuous random variables
Continuous random variables
A continuous r.v. can take any value in some interval.
Example: W = time spent waiting in line at Au Bon Pain!
There are an infinite number of possible values that the random variable can assume.
For a continuous random variable, questions are phrased in terms of a range of values.
NOTE:
You would never say: Probability to wait exactly 10.5 minutes!
P(W=10.5)=0
But: Probability to wait:
Less than 10 minutes: P(W<10);
More than 20 minutes: P(W>20);
Between 10 and 15 minutes: P(10<W<15).

Density functions
Probability density function:
Denoted f(t): gives a picture of the distribution (think of a smoothed histogram)
Area under the curve between 2 values a and b: $P(a \le X \le b)$
Total area under the curve = 1 (total probability)
Cumulative distribution function:
$F(t) = P(X \le t)$
$P(X > t) = 1 - F(t)$
$P(a \le X \le b) = P(X \le b) - P(X \le a) = F(b) - F(a)$
[Figures: plot of a density function f(t) and plot of the corresponding cumulative distribution function F(t)]
The normal distribution
[Figure: bell-shaped curve]
Computing probabilities with the Normal distribution:
You want: $P(a \le X \le b)$ where X is $N(\mu, \sigma)$
1. Define $Z = \dfrac{X - \mu}{\sigma}$: Z is N(0,1)
2. Use the standard normal probability table (Z table):
$P(a \le X \le b) = P\left(\dfrac{a - \mu}{\sigma} \le Z \le \dfrac{b - \mu}{\sigma}\right) = P\left(Z \le \dfrac{b - \mu}{\sigma}\right) - P\left(Z \le \dfrac{a - \mu}{\sigma}\right)$
[Figure: normal curve with axis marks from $\mu - 3\sigma$ to $\mu + 3\sigma$; 68.3% of the area lies within $\mu \pm \sigma$ and 95.4% within $\mu \pm 2\sigma$; cumulative probabilities .0228, .1587, .5, .8413, .9772 at $\mu - 2\sigma$, $\mu - \sigma$, $\mu$, $\mu + \sigma$, $\mu + 2\sigma$]
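The two-step recipe for normal probabilities (standardize, then look up the standard normal table) can be sketched in Python. Here the table lookup is replaced by the error function `math.erf` from the standard library, which is a substitution made for this illustration; the function names and the example parameters are hypothetical.

```python
import math

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1); plays the role of the Z table."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_prob(a, b, mu, sigma):
    """P(a <= X <= b) for X normal with mean mu and standard deviation sigma."""
    za = (a - mu) / sigma            # step 1: standardize both endpoints
    zb = (b - mu) / sigma
    return std_normal_cdf(zb) - std_normal_cdf(za)   # step 2: difference of table values

# Hypothetical example: X has mean 10 and standard deviation 2.
print(normal_prob(8, 12, 10, 2))    # within one standard deviation of the mean
print(normal_prob(6, 14, 10, 2))    # within two standard deviations of the mean
```

The two printed values should be close to the 68.3% and 95.4% figures quoted for the normal curve.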
Sum of i.i.d. random variables: Central Limit Theorem
$X_1, X_2, \dots, X_n$ independent identically distributed random variables:
$E[X_i] = \mu$, $\mathrm{Var}(X_i) = \sigma^2$
For n>30, $S_n = X_1 + X_2 + \dots + X_n$ is approximately normal with mean $n\mu$ and variance $n\sigma^2$ (standard deviation $\sigma\sqrt{n}$)
For n>30, $M_n = \dfrac{X_1 + X_2 + \dots + X_n}{n}$ is approximately normal with mean $\mu$ and variance $\sigma^2/n$ (standard deviation $\sigma/\sqrt{n}$)
The probability distribution of $X_i$ does not matter;
n does not have to be very large (30 is good enough);
CLT requires only 2 pieces of information: the mean and SD of $X_i$
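A short sketch of how the Central Limit Theorem is used in practice: given only the mean and SD of each X_i, approximate probabilities for the sum S_n and the average M_n with a normal distribution. The function names and the numbers (n = 50, mean 10, SD 3) are hypothetical, and `math.erf` stands in for the Z table.

```python
import math

def std_normal_cdf(z):
    """P(Z <= z) for Z ~ N(0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def clt_sum_tail(n, mu, sigma, threshold):
    """Approximate P(S_n > threshold): S_n ~ normal, mean n*mu, SD sigma*sqrt(n)."""
    mean = n * mu
    sd = sigma * math.sqrt(n)
    return 1.0 - std_normal_cdf((threshold - mean) / sd)

def clt_mean_cdf(n, mu, sigma, threshold):
    """Approximate P(M_n <= threshold): M_n ~ normal, mean mu, SD sigma/sqrt(n)."""
    sd = sigma / math.sqrt(n)
    return std_normal_cdf((threshold - mu) / sd)

# Hypothetical example: 50 i.i.d. quantities, each with mean 10 and SD 3.
p_sum = clt_sum_tail(50, 10, 3, 530)     # chance the total exceeds 530
p_mean = clt_mean_cdf(50, 10, 3, 9.5)    # chance the average is at most 9.5
print(p_sum, p_mean)
```

Nothing about the shape of the distribution of X_i enters the calculation, which is the point of the theorem.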
TOPIC 5:
Statistical Sampling
Sample mean of a population
Estimator of the mean of a population ($\mu$): the sample mean
$\bar{X} = \dfrac{X_1 + \dots + X_n}{n}$
where $X_1, \dots, X_n$ are n R.V.s following the population distribution (unknown mean $\mu$, unknown std dev $\sigma$)
[Figure: population of size N; random sample 1 gives sample mean $\bar{x}_1$, random sample 2 gives sample mean $\bar{x}_2$]
$\bar{X}$ is a random variable!
By the Central Limit Theorem, if n>30, then $\bar{X}$ is approximately normal, with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$
Sample standard deviation
Estimator of the standard deviation of a population ($\sigma$): the sample standard deviation S:
$S = \sqrt{\dfrac{\sum_i (X_i - \bar{X})^2}{n - 1}}$
[Figure: population of size N; random sample 1 gives sample std dev $s_1$, random sample 2 gives sample std dev $s_2$]
$S^2$ is an unbiased estimator of the variance, i.e. $E[S^2] = \sigma^2$
S is a random variable!
Confidence interval for sample mean
How confident are we that $\bar{X}$ is a good estimate of the true mean of the population?
The realized values of $\bar{X}$ and S in the sample of size n are: $\bar{x}$ and s
If n>30, then a $\beta$% confidence interval for the mean is:
$\left[\bar{x} - \dfrac{c\,s}{\sqrt{n}},\ \bar{x} + \dfrac{c\,s}{\sqrt{n}}\right]$
c is such that $P(-c < Z < c) = \beta$%, where Z~N(0,1):
$\beta$ = 90: c = 1.645, $\beta$ = 95: c = 1.960, $\beta$ = 99: c = 2.576.
What sample size do we need to be sure that the $\beta$% confidence interval is within +/- L of the true mean?
The required sample size is:
$n = \dfrac{c^2 s^2}{L^2}$
3 types of CI problems
There are 3 main types of confidence interval problems you should know how to do:
1. Given $\bar{x}$, s, n, $\beta$% -> find c -> find the Confidence Interval $\left[\bar{x} - \dfrac{c\,s}{\sqrt{n}},\ \bar{x} + \dfrac{c\,s}{\sqrt{n}}\right]$
2. Given $\bar{x}$, s, n, L (or the interval itself $[\bar{x} - L,\ \bar{x} + L]$) -> find c -> find the $\beta$% confidence level
3. Design Problem: given $\beta$%, s, L -> find c -> find the required sample size n
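The three types of confidence interval problems can be sketched as three small Python functions. This is an illustration under stated assumptions: `statistics.NormalDist` (Python 3.8+) replaces the Z table, the function names are hypothetical, and rounding the sample size up to an integer is a choice made here rather than something stated on the slides.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal, stands in for the Z table

def critical_value(beta_pct):
    """c such that P(-c < Z < c) = beta_pct / 100."""
    return Z.inv_cdf(0.5 + beta_pct / 200.0)

def mean_ci(xbar, s, n, beta_pct):
    """Type 1: given xbar, s, n and the level, find the interval."""
    half = critical_value(beta_pct) * s / math.sqrt(n)
    return xbar - half, xbar + half

def confidence_level(s, n, L):
    """Type 2: given s, n and half-width L, find the confidence level in percent."""
    c = L * math.sqrt(n) / s
    return 100.0 * (Z.cdf(c) - Z.cdf(-c))

def required_sample_size(s, L, beta_pct):
    """Type 3: n = c^2 s^2 / L^2, rounded up (the rounding is an assumption)."""
    c = critical_value(beta_pct)
    return math.ceil((c * s / L) ** 2)

# Hypothetical numbers: xbar = 50, s = 10, n = 100.
print(mean_ci(50, 10, 100, 95))
print(confidence_level(10, 100, 1.96))
print(required_sample_size(10, 1.0, 95))
```

All three functions rely on the n>30 condition, since they use the normal approximation for the sample mean.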
Confidence interval for proportion
Let X = number of observations in a sample of size n that have a certain characteristic,
p = the actual proportion of the population that has that characteristic.
The sample proportion $\hat{p} = \dfrac{X}{n}$ is approximately normally distributed with mean p, standard deviation $\sqrt{\dfrac{p(1 - p)}{n}}$
A $\beta$% confidence interval for p is:
$\left[\hat{p} - c\sqrt{\dfrac{p(1 - p)}{n}}\ ;\ \hat{p} + c\sqrt{\dfrac{p(1 - p)}{n}}\right]$
where c is that number for which: $P(-c < Z < c) = \beta$%, Z~N(0,1)
Note: p is unknown:
Option 1: replace it by its estimate $\hat{p}$
Option 2: p = 1/2 (worst case) because $p(1 - p) \le 1/4$ for all p
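A minimal sketch of the confidence interval for a proportion, covering both options for the unknown p (plug in the estimate, or use the worst case p = 1/2). The function name `proportion_ci`, the `worst_case` flag, and the sample counts are hypothetical; `statistics.NormalDist` is used in place of the Z table.

```python
import math
from statistics import NormalDist

def proportion_ci(x, n, beta_pct, worst_case=False):
    """beta_pct% confidence interval for a population proportion.

    x: number of sample observations with the characteristic
    n: sample size
    worst_case: if True use p = 1/2 in the standard deviation (Option 2),
                otherwise plug in the sample proportion (Option 1).
    """
    p_hat = x / n
    c = NormalDist().inv_cdf(0.5 + beta_pct / 200.0)
    p_for_sd = 0.5 if worst_case else p_hat
    half = c * math.sqrt(p_for_sd * (1.0 - p_for_sd) / n)
    return p_hat - half, p_hat + half

# Hypothetical sample: 120 of 400 observations have the characteristic.
print(proportion_ci(120, 400, 95))
print(proportion_ci(120, 400, 95, worst_case=True))
```

The worst-case interval is never narrower than the plug-in interval, because p(1 - p) is largest at p = 1/2.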
TOPIC 6:
Simulation
Some lessons on simulation
1. Provides more info than average case analysis and simple formulas.
2. You generate random variables that obey a variety of discrete and continuous probability distributions (e.g. uniform, binomial, etc.).
3. The results are not precise, due to the inherent randomness in a simulation. We typically obtain estimates of the distributions of particular quantities of interest, and of the means and standard deviations of these distributions. From these distributions, one can derive confidence intervals and other inferences of statistical sampling.
4. The question of how many trials or runs of a simulation to perform can become a complex statistical issue. Fortunately, with today's computing power, this is not a paramount issue for most problems.
5. In practice, one should recognize that gaining managerial confidence in a simulation model will depend on at least three factors:
(i) a good understanding of the underlying management problem,
(ii) one's ability to use the concepts of probability and statistics correctly,
(iii) one's ability to communicate these concepts effectively.
So what happens on the final exam? You may get Sampling questions!
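To make the link between simulation and sampling concrete, here is a small Monte Carlo sketch. The model is entirely hypothetical (a profit equal to a discrete uniform demand times a continuous uniform unit margin); it generates random variables, estimates the mean and standard deviation of the outcome, and attaches a 95% confidence interval to the estimated mean using c = 1.960.

```python
import math
import random
import statistics

def simulate_profit(rng):
    """One trial of a hypothetical model: profit = demand * unit margin."""
    demand = rng.randint(50, 150)         # discrete uniform on 50..150
    unit_margin = rng.uniform(1.0, 3.0)   # continuous uniform on [1, 3]
    return demand * unit_margin

def run_simulation(trials, seed=0):
    """Return the estimated mean, estimated SD, and a 95% CI for the mean."""
    rng = random.Random(seed)             # fixed seed so the run is repeatable
    outcomes = [simulate_profit(rng) for _ in range(trials)]
    mean = statistics.mean(outcomes)
    sd = statistics.stdev(outcomes)       # sample standard deviation (n - 1 divisor)
    half = 1.960 * sd / math.sqrt(trials)
    return mean, sd, (mean - half, mean + half)

mean, sd, ci = run_simulation(10_000)
print(mean, sd, ci)
```

The estimates change with the seed, which reflects the inherent randomness noted in lesson 3; more trials shrink the confidence interval at the rate 1/sqrt(trials).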
TOPIC 7:
Regression
Multiple regression
Explanatory variables: $X_1, X_2, \dots, X_k$ taking values $x_{1i}, x_{2i}, \dots, x_{ki}$ (i = 1, ..., n)
Dependent variable: Y taking values $y_i$ (i = 1, ..., n)
Model: $Y_i = \beta_0 + \beta_1 x_{1i} + \dots + \beta_k x_{ki} + \varepsilon_i$
$\varepsilon_1, \varepsilon_2, \dots, \varepsilon_n$ are iid random variables, $N(0, \sigma)$
Goal: Choose $b_0, b_1, \dots, b_k$ to minimize the residual sum of squares
$\hat{y}_i = b_0 + b_1 x_{1i} + \dots + b_k x_{ki}$, $e_i = y_i - \hat{y}_i$
Minimize $\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
Regression Output
1) Regression coefficients: $b_0, b_1, \dots, b_k$
sample estimates of $\beta_0, \beta_1, \dots, \beta_k$
2) Standard error: estimate of $\sigma$
a measure of the amount of noise in the model
3) Standard errors of the coefficients: $s_{b_0}, s_{b_1}, \dots, s_{b_k}$
same role as the estimate of the standard deviation of the sample mean in sampling
Prior to observing $b_m$ and $s_{b_m}$, $\dfrac{b_m - \beta_m}{s_{b_m}}$ has a t-distribution with (n - k - 1) d.o.f.
Degrees of freedom: n - (k + 1) = n - k - 1
n pieces of data;
used up (k + 1) degrees of freedom to estimate $b_0, b_1, \dots, b_k$
used to test the existence of a linear relationship between Y and $x_m$:
+ What is the 95% confidence interval for $\beta_m$?
+ Does the interval contain 0?
4) Significance test: Is $\beta_m$ significantly different from zero?
The $\beta$% confidence interval for $\beta_m$ is:
$(b_m - c\,s_{b_m},\ b_m + c\,s_{b_m})$,
where c is such that: $P(-c < T < c) = \beta/100$.
Steps to finding the Confidence Interval:
1) d.o.f. = n - k - 1
2) using $\beta$% and d.o.f., find c on the t-table
3) using c, $b_m$, $s_{b_m}$, write the interval above.
If zero does not lie in the confidence interval, we are confident at the $\beta$% level that $\beta_m$ is different from 0.
If zero lies in the confidence interval, then $\beta_m$ is not significantly different from zero:
we should be skeptical that Y depends linearly on $x_m$ and we might want to eliminate $x_m$ from the model.
6) Coefficient of determination:
$R^2 = 1 - \dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$ ($\bar{y}$ is the sample mean of the $y_i$'s.)
$= 1 - \dfrac{\text{Variation not accounted for by x variables}}{\text{Total variation}} = \dfrac{\text{Variation that is accounted for by x variables}}{\text{Total variation}}$
$R^2$ takes values between 0 and 1:
[Figure: two scatter plots of Y against X. Left: $R^2 = 1$; x values completely account for Y values. Right: $R^2 = 0$; x values account for none of the variation in the Y values.]
A good value of $R^2$ depends on the situation
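The regression output items (coefficients, standard error, standard error of a coefficient, confidence interval, and R squared) can be illustrated for the simplest case of one explanatory variable (k = 1). The sketch below is an assumption-laden example, not the course's software: the data are made up, the function name `fit_line` is hypothetical, and the critical value c = 3.182 is read from a t-table for 3 degrees of freedom at the 95% level.

```python
import math

def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1*x; returns the main regression outputs."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx                        # slope estimate
    b0 = y_bar - b1 * x_bar               # intercept estimate
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))  # residual sum of squares
    sst = sum((y - y_bar) ** 2 for y in ys)                      # total variation
    dof = n - 2                           # n - k - 1 with k = 1
    s = math.sqrt(sse / dof)              # standard error of the regression
    s_b1 = s / math.sqrt(sxx)             # standard error of the slope
    r2 = 1.0 - sse / sst                  # coefficient of determination
    return b0, b1, s, s_b1, r2, dof

# Hypothetical data with n = 5 observations.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, s, s_b1, r2, dof = fit_line(xs, ys)

c = 3.182                                 # t-table value: 95% level, 3 d.o.f.
ci = (b1 - c * s_b1, b1 + c * s_b1)
significant = not (ci[0] <= 0.0 <= ci[1]) # does the interval exclude zero?
print(b0, b1, r2, ci, significant)
```

With more than one explanatory variable the same outputs are read from the regression software; only the degrees of freedom (n - k - 1) and the t-table row change.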
Checklist for evaluating linear regression models
Linearity: If there is only one explanatory variable, construct a scatter-plot of the data to check for linearity. Otherwise, use common sense to decide if a linear relationship is reasonable. (Rule of thumb for choosing the number of factors: n > 5(k + 2))
Significance tests: Check if the regression coeffs are significantly different from zero.
Signs of Regression Coefficients: Check to see that the signs make intuitive sense.
$R^2$: Check if the value of $R^2$ is reasonably high.
Normality: Check that the residuals are approximately Normally distributed by constructing a histogram of residuals.
Heteroscedasticity: Do error terms have constant standard deviation? Plot the residuals against the observed values of each of the explanatory variables.
Autocorrelation: Are error terms independent? If data are time-dependent, plot the residuals over time to check for any apparent patterns.
Multicollinearity: Are two explanatory variables correlated?
Signs: if regression coeffs have the wrong sign, or we find a high $R^2$ but one or more of the regression coeffs is not significantly different from 0.
Look at the correlation matrix. Large positive or negative correlations between the explanatory variables are bad.
Heteroscedasticity
[Figure: two residual plots against Advertising Expenditures. Left: No Evidence of Heteroscedasticity. Right: Evidence of Heteroscedasticity.]
Autocorrelation
[Figure: two residual plots against the observation index i. Left: No evidence of Autocorrelation. Right: Evidence of Autocorrelation.]
Important regression issues you should know
Know how to interpret the regression output. Explain in English what the coefficients mean and give intuition about how they affect the dependent variable.
Know how to build the confidence intervals for the coefficients using the t-table.
Know how to read and interpret the regression graph and the output residual graphs (histogram, autocorrelation, heteroscedasticity).
Know how to improve your model:
Check the signs, significance, correlation, etc. to decide which variables to add and drop (explaining why).
Check linearity: if it fails, can you modify your data to make a better model? Example: make a polynomial.
Dummy variables: you need to know how to model categorical data. Example: beer bottles, red vs. green.
TOPIC 8:
Linear Optimization
Optimization terminology
Decision Variable: Describes a decision that needs to be made, e.g. how many items to produce.
Objective Function: An expression (in terms of the variables) that needs to be minimized or maximized.
Constraint: An expression that restricts the values of the variables.
Steps in formulation
1. Define the decision variables.
2. Write the objective as a function of these vars. Determine whether max or min.
3. Write the constraints as functions of these vars. Either $\le$, $\ge$, or $=$.
4. Determine the variable restrictions, e.g. non-negative, integer.
Be careful of units!
A Fundamental Point
[Figure: three plots of feasible regions in the (x, y) plane]
If an optimal solution exists, there is always a corner point optimal solution!
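The corner point property suggests a brute-force way to solve a tiny two-variable linear program: list every corner of the feasible region and pick the best one. The sketch below does this for a hypothetical problem (maximize 3x + 2y subject to 2x + y <= 8, x + 2y <= 8, x >= 0, y >= 0). It is meant only to illustrate the idea; it is not how practical LP solvers work, and it assumes the feasible region is bounded and non-empty.

```python
from itertools import combinations

def corner_points(constraints, tol=1e-9):
    """Corner points of {(x, y): a*x + b*y <= r for every (a, b, r)}."""
    points = []
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel boundary lines: no single intersection point
        # Intersection of the two boundary lines (Cramer's rule).
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        # Keep the point only if it satisfies every constraint.
        if all(a * x + b * y <= r + tol for a, b, r in constraints):
            points.append((x, y))
    return points

# Hypothetical LP; non-negativity is written as -x <= 0 and -y <= 0.
constraints = [(2, 1, 8), (1, 2, 8), (-1, 0, 0), (0, -1, 0)]
objective = lambda p: 3 * p[0] + 2 * p[1]

corners = corner_points(constraints)
best = max(corners, key=objective)
print(best, objective(best))
```

Forgetting the two non-negativity rows changes the feasible region, which is one of the frequent formulation mistakes.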
About Shadow Prices
Associated with each constraint is a shadow price (= 0 for non-binding constraints).
The shadow price is the change in the objective value per unit change in the right hand side, given all other data remain the same.
Associated with each shadow price is a range over which this shadow price holds.
If r.h.s. changes within range: current solution remains optimal, shadow price tells us rate of change in the optimal objective function value;
If r.h.s. changes outside range: current solution is not optimal anymore; we need to solve the optimization problem again!
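One way to see what a shadow price means is to re-solve a small problem after increasing one right hand side by a unit and compare the optimal objective values. The sketch below does this for a hypothetical two-variable LP, using a brute-force corner-point search as the solver (an assumption for illustration; it requires a bounded, non-empty feasible region). In real work the shadow prices are read from the solver's sensitivity report, and the re-solve trick is valid only while the change stays within the allowable range.

```python
from itertools import combinations

def solve_lp(objective, constraints, tol=1e-9):
    """Maximize cx*x + cy*y subject to a*x + b*y <= r, by checking corner points."""
    cx, cy = objective
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel boundary lines
        x = (r1 * b2 - r2 * b1) / det
        y = (a1 * r2 - a2 * r1) / det
        if all(a * x + b * y <= r + tol for a, b, r in constraints):
            value = cx * x + cy * y
            if best is None or value > best:
                best = value
    return best

def shadow_price(objective, constraints, index, delta=1.0):
    """Change in the optimal value per unit increase of one right hand side."""
    bumped = list(constraints)
    a, b, r = bumped[index]
    bumped[index] = (a, b, r + delta)
    return (solve_lp(objective, bumped) - solve_lp(objective, constraints)) / delta

# Hypothetical LP: max 3x + 2y s.t. 2x + y <= 8, x + 2y <= 8, x <= 10, x, y >= 0.
objective = (3, 2)
constraints = [(2, 1, 8), (1, 2, 8), (1, 0, 10), (-1, 0, 0), (0, -1, 0)]

prices = [shadow_price(objective, constraints, i) for i in range(3)]
print(prices)
```

The third constraint (x <= 10) is not binding at the optimum, so its shadow price is zero, in line with the rule for non-binding constraints.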
Avoid frequent mistakes!
Forgetting the non-negativity restrictions
Confusing Maximizing with Minimizing
Inconsistent and/or incorrect units
Reversing the signs of the constraints
Wrong interpretation of the shadow prices
Change in R.H.S. outside the allowable range
TOPIC 9:
Nonlinear Optimization

Some possible cases
[Figure: four panels, each showing a feasible region, objective function level curves, and the optimal solution(s). Panel titles: linear objective, nonlinear constraints; nonlinear objective, linear constraints; nonlinear objective, nonlinear constraints; nonlinear objective, linear constraints. Labels in the panels include: optimal solution, multiple optimal solutions, corner solution.]
Local vs global solutions
Optimal solution: A feasible solution that optimizes the objective value among all feasible points.
Local optimal solution: A feasible solution that optimizes the objective value among all feasible points near it.
Computer software for NLP can efficiently find a local optimum,
BUT this solution will not necessarily be the global optimum.
Example:
Minimization in one variable over 2 <= x <= 7
x = 2 is a local optimal solution.
x = 3.5 is a local optimal solution.
x = 5 is the global optimal solution.
[Figure: plot of f(x) against x for x from 2 to 7]
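The difference between a local and a global optimum can be illustrated with a small sketch. The function `f` below is a hypothetical stand-in (it is not the curve plotted on the slide), and `local_search` is a deliberately naive descent, not what commercial NLP software does. It only shows that the answer of a local method depends on the starting point.

```python
def f(x):
    # Hypothetical objective with two valleys on [2, 7] (an assumption for illustration).
    return (x - 3) ** 2 * (x - 6) ** 2 - x


def local_search(func, lo, hi, start, step=1e-3):
    """Move to a neighbouring point while that lowers func; stop at a local minimum."""
    x = start
    while True:
        candidates = [c for c in (x - step, x + step) if lo <= c <= hi]
        best = min(candidates, key=func)
        if func(best) >= func(x):
            return x
        x = best


def grid_search(func, lo, hi, step=1e-3):
    """Brute-force scan of the whole interval (only practical in one dimension)."""
    n = int(round((hi - lo) / step))
    return min((lo + k * step for k in range(n + 1)), key=func)


x_left = local_search(f, 2.0, 7.0, start=2.5)   # ends in the left valley
x_right = local_search(f, 2.0, 7.0, start=6.5)  # ends in the right valley
x_global = grid_search(f, 2.0, 7.0)
print(x_left, f(x_left))
print(x_right, f(x_right))
print(x_global, f(x_global))
```

Starting from 2.5 the search stops in the shallower valley, while starting from 6.5 it reaches the deeper one, which the grid scan confirms is the global minimum for this particular function.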
Shadow prices in NLP
Review:
Shadow price of a constraint for LP:
Incremental change in the optimal objective function value per unit increase in the right-hand side (RHS) of the constraint.
Shadow price of a constraint for NLP (Lagrange multiplier):
Approximate incremental change in the optimal objective function value with a small change in the RHS.
Binding constraint: a constraint that is satisfied as an equality at the optimum.
For nonbinding constraints, shadow prices are zero!
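A tiny worked case may help make "approximate" and "zero when nonbinding" concrete. The problem below (minimize (x - 5)^2 subject to x <= b) is a made-up example, not one from the course; its optimal value is known in closed form, so the shadow price can be estimated by a finite difference.

```python
def optimal_value(b):
    """Optimal value of: minimize (x - 5)^2 subject to x <= b (hypothetical toy NLP)."""
    x_star = min(b, 5.0)  # the constraint binds only when b < 5
    return (x_star - 5.0) ** 2


def shadow_price(b, h=1e-6):
    """Finite-difference estimate of the change in optimal value per unit of RHS."""
    return (optimal_value(b + h) - optimal_value(b)) / h


price_binding = shadow_price(3.0)      # constraint x <= 3 is binding
price_nonbinding = shadow_price(6.0)   # constraint x <= 6 is not binding

# For a larger RHS change the shadow price is only an approximation:
predicted_change = price_binding * 0.5
actual_change = optimal_value(3.5) - optimal_value(3.0)
print(price_binding, price_nonbinding, predicted_change, actual_change)
```

With b = 3 the estimate is close to -4, while the actual change for an increase of 0.5 is -1.75 rather than the predicted -2, which is why the NLP shadow price is described as approximate.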
TOPIC 10:
Discrete Optimization
Discrete optimization
Feasible region is a set of discrete points.
Can't be assured a corner point or even boundary solution.
Not as easy to solve as LP.
Solving it as an LP provides a relaxation and a bound on the solution.
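The relaxation bound can be seen on a small knapsack problem. The data below are invented for illustration. The 0/1 version is solved by enumeration, and its LP relaxation (each item may be taken fractionally) is solved by the greedy value-per-weight rule, which is known to be optimal for the fractional knapsack.

```python
from itertools import product

# Hypothetical data: item values, item weights, and knapsack capacity.
values = [60, 100, 120]
weights = [10, 20, 30]
capacity = 50


def best_integer(values, weights, capacity):
    """Enumerate all 0/1 choices and keep the best feasible total value."""
    best = 0
    for choice in product((0, 1), repeat=len(values)):
        weight = sum(c * w for c, w in zip(choice, weights))
        if weight <= capacity:
            best = max(best, sum(c * v for c, v in zip(choice, values)))
    return best


def best_relaxed(values, weights, capacity):
    """LP relaxation: allow fractions, fill by decreasing value/weight ratio."""
    remaining = capacity
    total = 0.0
    items = sorted(zip(values, weights), key=lambda vw: vw[0] / vw[1], reverse=True)
    for value, weight in items:
        take = min(1.0, remaining / weight)  # fraction of this item that still fits
        total += take * value
        remaining -= take * weight
        if remaining <= 0:
            break
    return total


integer_opt = best_integer(values, weights, capacity)
relaxed_opt = best_relaxed(values, weights, capacity)
print(integer_opt, relaxed_opt)
```

For a maximization problem the relaxed value is an upper bound on the integer optimum; here the relaxation takes two thirds of the last item, which the integer problem cannot do.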
Modeling issues
Decision variables are restricted to take only integer values.
Great modeling flexibility using binary variables:
$x_i = 1$, if event i occurs
$x_i = 0$, otherwise
Strategic planning (number of people to hire)
Allocation of resources (which project to fund)
Determination of productivity and distribution
More on modeling issues
If $x_1 = 0$ then $x_2 = 0$: $x_2 \le x_1$
If $x_1 = 1$ then $x_2 = 1$: $x_2 \ge x_1$
If $x_1 = 1$ then $x_2 = 1$ and vice versa: $x_2 = x_1$
If $x_1 = 1$ then $x_2 = 1$ or $x_3 = 1$: $x_1 \le x_2 + x_3$
Invest in at most 2 projects: $\sum_{i=1}^{10} x_i \le 2$
Select 5 out of 10 projects: $\sum_{i=1}^{10} x_i = 5$
Key concept: Analyze the logical implication of the constraint in all possible cases.
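The "all possible cases" check is easy to mechanize for binary variables. The sketch below enumerates every 0/1 assignment and confirms that each linear constraint holds exactly when the corresponding logical statement does. The dictionary keys and helper names are illustrative only.

```python
from itertools import product


def implies(a, b):
    """Logical implication: 'if a then b'."""
    return (not a) or b


# Each entry pairs a logical statement with the linear constraint meant to encode it.
checks = {
    "x1=0 => x2=0": (lambda x1, x2, x3: implies(x1 == 0, x2 == 0),
                     lambda x1, x2, x3: x2 <= x1),
    "x1=1 => x2=1": (lambda x1, x2, x3: implies(x1 == 1, x2 == 1),
                     lambda x1, x2, x3: x2 >= x1),
    "x1=1 <=> x2=1": (lambda x1, x2, x3: implies(x1 == 1, x2 == 1) and implies(x2 == 1, x1 == 1),
                      lambda x1, x2, x3: x2 == x1),
    "x1=1 => x2=1 or x3=1": (lambda x1, x2, x3: implies(x1 == 1, x2 == 1 or x3 == 1),
                             lambda x1, x2, x3: x1 <= x2 + x3),
}


def equivalent(logic, constraint):
    """True if the two agree on every assignment of three binary variables."""
    return all(logic(*x) == constraint(*x) for x in product((0, 1), repeat=3))


results = {name: equivalent(logic, constraint) for name, (logic, constraint) in checks.items()}
for name, ok in results.items():
    print(name, "matches its constraint:", ok)
```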
Partial taxonomy of optimization
Linear Optimization: objective and constraints are linear expressions
Nonlinear Optimization: objective and/or constraints are non-linear expressions
Integer Optimization: variables are restricted to discrete (integer) values
Mixed-Integer Optimization: some variables are continuous, some are discrete

Problems from 2005 final
Problem 1: True or False
(a) If the 95% confidence interval for the sample mean extends from 4 to
14 based on a random sample of size 60, then the sample mean
was 9.
TRUE
The interval is centered around the sample mean:
$\bar{x} - L = 4$, $\bar{x} + L = 14$
Midpoint: $\bar{x} = (4 + 14)/2 = 9$
(b) If $R^2 = 0$, it means that all the data points in a y-vs-x regression
model must fall along the horizontal line.
FALSE
$R^2 = 0$ only says that the fitted line explains none of the variation in y; the points can still scatter around a horizontal line.
Problem 1: True or False
(c) A resident of Boston is chosen at random. Consider the 2 events:
I. The person selected is a lawyer;
II. The person selected is a lawyer and an environmental activist.
The probability of event II can never exceed that of event I.
TRUE
[Figure: Venn diagram of lawyers and environmental activists; the region "lawyers and environmental activists" is the overlap of the two sets, so it lies inside "lawyers".]
Problem 1: True or False
(d) If X has mean 1, standard deviation 2 and Y has mean 1, standard
deviation 4, then the standard deviation of Z = X + Y cannot exceed 6.
TRUE
$Var(Z) = Var(X) + Var(Y) + 2\,\sigma_X\,\sigma_Y\,CORR(X,Y)$
The maximum occurs when CORR(X,Y) = 1: Var(Z) = 4 + 16 + 16 = 36, so $\sigma_Z = 6$.
(e) Mendel asks a random number generator to create 10,000
independent selections from a N(0,1) distribution. The 10,000
selections turn out to have a sample mean of 0.08.
Assuming the random generator to work properly, the chance would
be less than 1% that the sample mean would fall at least as far as it
did from the true mean.
TRUE
n = 10,000, $\bar{x}$ = 0.08. By the CLT, $\bar{X} \sim N(0, 1/n)$ (approximately), so the standard deviation of $\bar{X}$ is 1/100.
$P(\bar{X} \ge 0.08) = P(Z \ge (0.08 - 0)/(1/100)) = P(Z \ge 8) \approx 0$ (8 standard
deviations from the mean)
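Both answers can be checked numerically. The sketch below is only a sanity check using the standard library; the helper names are made up, and the normal tail is computed from `math.erfc` rather than read from a Z-table.

```python
import math


def sd_of_sum(sd_x, sd_y, corr):
    """Standard deviation of X + Y for a given correlation."""
    return math.sqrt(sd_x ** 2 + sd_y ** 2 + 2 * sd_x * sd_y * corr)


def upper_tail(z):
    """P(Z >= z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# (d): the standard deviation of Z is largest when the correlation is +1.
largest_sd = sd_of_sum(2, 4, 1.0)
smallest_sd = sd_of_sum(2, 4, -1.0)

# (e): the sample mean of n draws has standard deviation 1/sqrt(n).
n = 10_000
z_score = 0.08 / (1 / math.sqrt(n))
p_two_sided = 2 * upper_tail(z_score)  # chance of being at least this far, either side
print(largest_sd, smallest_sd, z_score, p_two_sided)
```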
Problem 2 (a)
John has not been feeling well recently and he believes he has a bacterial
infection with probability 0.6.
He takes a test that is 99% reliable:
The probability that the test is positive given that he has an infection is
99%;
The probability that the test is negative given that he does not have an
infection is 99%.
If the test result is positive, what is the probability that he has an infection?
P(INF) = 0.6, P(!INF) = 0.4
P(test+ | INF) = 0.99, P(test- | !INF) = 0.99
We want: P(INF | test+) = P(INF and test+) / P(test+)
P(test+ | INF) = P(INF and test+) / P(INF), so P(INF and test+) = 0.99 * 0.6 = 0.594
P(test+ | !INF) = P(!INF and test+) / P(!INF), so P(!INF and test+) = 0.01 * 0.4 = 0.004
P(test+) = P(INF and test+) + P(!INF and test+)
= 0.594 + 0.004 = 0.598
P(INF | test+) = 0.594/0.598 = 0.99
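The same Bayes calculation can be written as a short function, which makes it easy to try other priors or test reliabilities. The function and argument names are illustrative.

```python
def posterior_infection(prior, sensitivity, specificity):
    """P(INF | test+) by Bayes' rule."""
    true_positive = sensitivity * prior                # P(INF and test+)
    false_positive = (1 - specificity) * (1 - prior)   # P(!INF and test+)
    return true_positive / (true_positive + false_positive)


posterior = posterior_infection(prior=0.6, sensitivity=0.99, specificity=0.99)
print(round(posterior, 4))
```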
Problem 2 (b)
Statistics show that the number of years a CEO spends in office is
normally distributed with mean 5.5 and standard deviation 1.2.
Given that a CEO has been in office for exactly 5 years so far, what is
the probability that she will still be in office 2 years from now?
X: # years in office, X ~ N(5.5, 1.2) (mean 5.5, standard deviation 1.2)
Office tenure: starts at t = 0, now t = 5, two years from now t = 7
We want: $P(X \ge 7 \mid X \ge 5)$
$P(X \ge 7 \mid X \ge 5) = \frac{P(X \ge 7 \text{ and } X \ge 5)}{P(X \ge 5)} = \frac{P(X \ge 7)}{P(X \ge 5)} = \frac{1 - P(X \le 7)}{1 - P(X \le 5)}$
Z ~ N(0,1): look up in Z-table!
$P(X \le 7) = P(Z \le (7 - 5.5)/1.2) = P(Z \le 1.25) = 0.8944$
$P(X \le 5) = P(Z \le -0.417) = 0.3372$
$P(X \ge 7 \mid X \ge 5) = (1 - 0.8944)/(1 - 0.3372) = 0.1593$
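As a cross-check on the table lookup, the conditional probability can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). The result differs slightly from 0.1593 in the later decimals because the Z-table rounds z to two decimals.

```python
from statistics import NormalDist

tenure = NormalDist(mu=5.5, sigma=1.2)  # years in office, as stated in the problem

p_at_least_7 = 1 - tenure.cdf(7)
p_at_least_5 = 1 - tenure.cdf(5)

# P(X >= 7 | X >= 5): the event {X >= 7} is contained in {X >= 5}.
p_conditional = p_at_least_7 / p_at_least_5
print(round(p_conditional, 3))
```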
Problem 3
In a random poll of 100 randomly-selected business leaders, 77% say
that they support Bernanke as new chairman of the Fed.
(a) What is the 99% confidence interval for the percentage of all
business leaders who support Bernanke?
n = 100, $\hat{p}$ = 77%
99% confidence interval based on the sample proportion:
$\left[\hat{p} - c\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\ ;\ \hat{p} + c\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$
where c is that number for which P(-c < Z < c) = 99%, Z ~ N(0,1),
i.e. c = 2.576
99% confidence interval: [0.66; 0.88]
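The interval can be reproduced with a few lines of Python. This is a sketch of the formula above; `proportion_ci` is an illustrative name, and the critical value comes from `NormalDist().inv_cdf` instead of a table.

```python
import math
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence):
    """Normal-approximation confidence interval for a proportion."""
    c = NormalDist().inv_cdf(0.5 + confidence / 2)  # e.g. 2.576 for 99%
    half_width = c * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


low, high = proportion_ci(p_hat=0.77, n=100, confidence=0.99)
print(round(low, 2), round(high, 2))
```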
Problem 3
(b) Ezekiel, who has not seen the results of the poll, wants to find a
95% confidence interval for the percentage of business leaders who
support Bernanke. He also wants the interval to extend no more
than one percentage point in each direction around its midpoint.
Make a sensible estimate of the number of business leaders he
should poll.
$n \ge \frac{c^2}{4L^2}$
(this uses the worst case $p(1-p) \le 1/4$, since the poll results are unknown)
where L = 1%, and c is that number for which P(-c < Z < c) = 95%, Z ~ N(0,1),
i.e. c = 1.960
n = 9,604 (round up non-integer values!)
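The sample-size bound is a one-line computation. The sketch below uses exact fractions so that rounding up is not disturbed by floating-point noise; the function name is illustrative and c = 1.960 is the table value quoted above.

```python
import math
from fractions import Fraction


def worst_case_sample_size(c, half_width):
    """Smallest n with c * sqrt(0.25 / n) <= half_width, i.e. n >= c^2 / (4 L^2)."""
    bound = Fraction(c) ** 2 / (4 * Fraction(half_width) ** 2)
    return math.ceil(bound)  # round up non-integer values


n_required = worst_case_sample_size("1.960", "0.01")
print(n_required)
```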
Problem 4
Mendel performs a linear regression analysis on the unemployment
rate in Massachusetts (UM) versus the current wholesale price of fuel
oil per gallon (P) in Massachusetts in inflation-adjusted dollars.
Using monthly data for a recent six-year period (i.e., using 72
observations), he reaches the least squares equation:
UM = 2.10 + 3.00P (P is in dollars and UM in per cent.)
The R^2 value for the regression is 0.66, and the upper end for the
95% confidence interval for the slope of P is 5.00. The sample
standard deviation of the monthly Massachusetts unemployment
rates over the six-year period studied was 1.00 percent.
Problem 4 (a)-(c)
(a) If fuel oil is projected to cost $1.30 in a forthcoming month, what
is the estimate of the Massachusetts unemployment rate for that
month based on the regression result?
UM = 2.10 + 3.00 * (1.30) = 6%
(b) Does the 95% confidence interval for the slope of P include 0?
NO: the CI is symmetric around the slope estimate 3.00 and its upper bound is 5.00, so the interval is [1, 5].
(c) What is the sum of squared residuals of the 72 data points
around the regression line?
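The slide's worked answer to (c) did not survive extraction. The sketch below is one way to obtain it from the data given in the problem, assuming the usual definitions R^2 = 1 - SSR/SST and SST = (n - 1) s^2, where s is the sample standard deviation of UM. It also reproduces (a) and (b). All function names are illustrative.

```python
def predict_um(fuel_price):
    """Fitted regression line from the problem: UM = 2.10 + 3.00 P."""
    return 2.10 + 3.00 * fuel_price


def lower_bound_from_upper(estimate, upper):
    """A symmetric confidence interval extends equally far below the estimate."""
    return estimate - (upper - estimate)


def sum_of_squared_residuals(r_squared, sample_sd, n):
    """SSR = (1 - R^2) * SST, with SST = (n - 1) * s^2 (assumed definitions)."""
    total_sum_of_squares = (n - 1) * sample_sd ** 2
    return (1 - r_squared) * total_sum_of_squares


um_estimate = predict_um(1.30)                      # part (a)
slope_lower = lower_bound_from_upper(3.00, 5.00)    # part (b)
ssr = sum_of_squared_residuals(0.66, 1.00, 72)      # part (c)
print(um_estimate, slope_lower, ssr)
```

Under these assumptions the total sum of squares is 71 and about 34% of it is left unexplained, which gives a sum of squared residuals of roughly 24.14.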
Problem 4 (d)
(d) Consider one at a time the following possible patterns among the
residuals for this regression analysis. Briefly explain for each pattern
whether, by itself, it would substantially reduce your confidence in the
regression analysis:
I. The heavy majority of the residuals in the first three years studied
were positive, while the heavy majority of those in the second three
years were negative.
Autocorrelation: the residuals are not random but follow a time-based
pattern.
(Another acceptable answer would be that the relationship might be
nonlinear.)
II. The residuals are consistently larger in the months when the fuel
prices are high than in those in which prices are low.
Heteroscedasticity: the residuals consistently get larger with larger
values of the independent variable P.
Problem 4 (cont'd)
Fearful of an omitted variable in the regression above, Mendel
performs another linear regression on the same data. For each
month, the dependent variable is still UM, while the variables on the
right are P and UN, the average unemployment rate in the other 49
American states. He reaches the revised regression equation:
UM = 1.50 + 2.00P + 0.50UN
R^2 for the revised regression was 0.75, while the upper ends of the
95% confidence intervals are 6.00 for the slope of P and 1.10 for the
slope of UN.
Problem 4 (e)-(f)
(e) Do the regression results provide statistically convincing evidence
that UN really belongs in the regression model? Briefly discuss.
NO: both 95% CIs contain 0: P: [-2, 6] and UN: [-0.1, 1.1]
(f) Suppose that UN and P exhibited strong positive correlation over
the six years studied. What general problem in regression analysis
might result from that circumstance? How might that problem have
affected the regression results?
Multicollinearity: the independent variables are highly correlated
among themselves. This may negatively affect the statistical
significance of both variables (as in this case).
Problem 5
Recall the Filatoi Riuniti case and linear optimization model, where the firm
would like to determine its monthly outsourcing strategy for spun yarn among six
other spinning mills as well as their own internal production strategy for spun
yarn.
The objective function is to minimize the variable cost (including transportation
cost) for meeting demand for the four spun yarn sizes (Extrafine, Fine, Medium,
and Coarse).
There are four types of constraints in the model:
1. Filatoi must meet monthly demand for each of the four spun yarn sizes.
2. None of the seven mills can exceed their monthly production capacity.
3. Neither Ambrosi nor De Blasi can produce Extrafine yarn.
4. All decision variables must be nonnegative.
Suppose that demand for spun yarns is the same as in the original case, as are
the production capacities and machine hour requirements.
Problem 5
Suppose, however, that over time the variable production and transportation
costs have changed, and that the current data for Filatoi Riuniti's production
problem for the coming month of January are shown in Table 1 below.
Problem 5
Roberto Cominetti has re-run the linear optimization model using this
new data, resulting in the optimal solution shown in Table 2 along with the
Sensitivity Report shown in Table 3. Please answer the following questions
based on the linear optimization model solution and Sensitivity Report.
Problem 5 (a)-(b)
(a) What are binding constraints in the model? In the optimal plan for the coming
month, which spinning mills would use all of their spinning capacity to produce
spun yarn for Filatoi Riuniti?
All the constraints are binding, except for Capacity at Giuliani. This is the only
mill whose capacity is not fully used under the optimal strategy.
(b) What would be the cost impact of increasing the required production
of Extrafine yarn from 25,000 kg to 27,000 kg? What can you say, if anything,
about the cost impact of increasing the required production of Extrafine yarn
from 25,000 kg to 29,000 kg?
Shadow Price = 18.397 ($/kg). Max increment allowed: +3,197.5 kg
Additional Costs = +2,000 kg * $18.397/kg = $36,794 / month
Max Additional Costs = +3,197.5 kg * $18.397/kg = $58,824.4 (for the additional
3,197.5 kg). Nothing can be said for the remaining 802.5 kg except that they
would cost at least $18.397/kg.
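The reasoning in (b) follows a simple rule: inside the allowable increase the shadow price gives the exact cost change, and beyond it only a lower bound (for a cost-minimization LP). A minimal sketch of that rule, with illustrative names and the figures quoted above:

```python
def cost_impact(shadow_price, rhs_increase, allowable_increase):
    """Return (cost change, is_exact) for raising a demand RHS in a min-cost LP.

    Within the allowable increase the shadow price applies exactly; past it,
    shadow_price * rhs_increase is only a lower bound on the extra cost.
    """
    change = shadow_price * rhs_increase
    return change, rhs_increase <= allowable_increase


exact_cost, exact_flag = cost_impact(18.397, 2_000, 3_197.5)   # 25,000 -> 27,000 kg
bound_cost, bound_flag = cost_impact(18.397, 4_000, 3_197.5)   # 25,000 -> 29,000 kg
print(exact_cost, exact_flag)
print(bound_cost, bound_flag)
```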
Problem 5 (c)-(d)
(c) Another local spinning mill by the name of Havarti has informed Filatoi
that they can produce Fine spun yarn for Filatoi for a delivered cost of
$14.25/kg. Should Filatoi consider entering into an agreement with Havarti to
produce Fine spun yarn at this price?
NO: The shadow price for the demand of Fine is $14.018/kg. Hence, if we were to
produce less Fine yarn with the current mills and outsource it to Havarti, we
would save $14.018 per kg while the extra cost would be $14.25 per kg, so it is not
worth it.
(d) According to the model's data, monthly capacity at De Blasi is 2,600
spinning machine hours. However, Filatoi Riuniti has just received an email from
the outsourcing manager at De Blasi indicating that capacity for the coming
month will be curtailed to 2,200 spinning machine hours due to some
unanticipated machine maintenance. How much will this change the total
variable cost of producing and/or outsourcing spun yarn in the coming month?
Shadow Price = -0.086 ($/hour)
Additional costs = (-400 hours) * (-$0.086/hour) = $34.4
(provided the 400-hour decrease is within the allowable range in the Sensitivity Report)
Problem 5 (e)
(e) How much do you think Giuliani would have to reduce the price they charge
Filatoi Riuniti for Fine spun yarn in order for Filatoi to want to discuss
outsourcing production of Fine spun yarn to them?
The shadow price for Fine yarn is $14.02. Giuliani would have to reduce their
price below this level.
Problem 6
Forest Capital (FC) has decided to appoint Sarah Edwards as the new portfolio
manager of its portfolio of technology and utility stocks in emerging markets,
which is currently composed of various amounts in ten different companies.
Table 4 below shows the current portfolio weights, the latest annualized
expected return and standard deviation estimates, and the classifications of
each of the ten companies.
Problem 6
The estimated correlations among the returns of the ten companies are shown
in Table 5. Note in Table 5 that FC assumes for simplicity that returns among
stocks are uncorrelated except among stocks A, B, and C.
Problem 6
Sarah has decided to use an optimization model to select the new weights of
the portfolio for the coming month. She would like to maximize the expected
return of the portfolio subject to the following constraints:
1. The standard deviation of the resulting portfolio should be at most 8%.
2. The amount of turnover of the portfolio should be at most 30%. As an
example of how turnover is calculated, if prior to trading a portfolio has 70%
of its funds in Stock 1 and 30% in Stock 2, and after the trade the weights are
60% for Stock 1 and 40% for Stock 2, the turnover of the portfolio is
(|70-60| + |30-40|) = 20%.
3. Last month, the total portfolio weight in technology stocks was
0.08 + 0.07 + 0.17 + 0.09 + 0.08 = 0.49 = 49%. Sarah would like to maintain the
character of the portfolio as a balanced portfolio between technology and
utility stocks. For this reason, she would like the total weight of the portfolio
in technology stocks to be between 45% and 55%.
4. All portfolio weights need to be nonnegative. That is, short positions are not
allowed in the portfolio.
Problem 6 (a)
(a) Write down a formulation of a nonlinear optimization model to determine the
new weights of the portfolio.
$w_i$ = fraction of the resulting portfolio invested in stock i
Objective: MAX $0.08w_1 + 0.12w_2 + 0.15w_3 + 0.11w_4 + \cdots + 0.08w_9 + 0.05w_{10}$
Subject to:
$w_1 + w_2 + w_3 + w_4 + \cdots + w_9 + w_{10} = 1$ (fractions)
$\left[(0.13)^2 w_1^2 + (0.25)^2 w_2^2 + (0.35)^2 w_3^2 + \cdots + (0.07)^2 w_{10}^2 + 2(0.13)(0.25)(0.4)\,w_1 w_2 + 2(0.13)(0.35)(-0.1)\,w_1 w_3 + 2(0.25)(0.35)(0.1)\,w_2 w_3\right]^{1/2} \le 0.08$
$|w_1 - 0.12| + |w_2 - 0.08| + |w_3 - 0.07| + \cdots + |w_{10} - 0.08| \le 0.3$
$w_2 + w_3 + w_5 + w_7 + w_8 \le 0.55$
$w_2 + w_3 + w_5 + w_7 + w_8 \ge 0.45$
$w_i \ge 0$ (for each i from 1 to 10)
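The two nonlinear pieces of this model, the portfolio standard deviation and the turnover, are easy to evaluate for a candidate set of weights. The sketch below uses only the three correlated stocks A, B, and C with the standard deviations and correlations that appear in the formulation; the function names and the trial weights are illustrative, and the full ten-stock data (Tables 4 and 5) is not reproduced here.

```python
import math


def portfolio_std(weights, sds, corr):
    """Standard deviation of a portfolio: sqrt(sum_ij w_i w_j sd_i sd_j corr_ij)."""
    n = len(weights)
    variance = 0.0
    for i in range(n):
        for j in range(n):
            variance += weights[i] * weights[j] * sds[i] * sds[j] * corr[i][j]
    return math.sqrt(variance)


def turnover(old_weights, new_weights):
    """Sum of absolute weight changes, as in the 70/30 -> 60/40 example."""
    return sum(abs(new - old) for old, new in zip(old_weights, new_weights))


# Stocks A, B, C: standard deviations and correlations taken from the formulation.
sds = [0.13, 0.25, 0.35]
corr = [
    [1.0, 0.4, -0.1],
    [0.4, 1.0, 0.1],
    [-0.1, 0.1, 1.0],
]

only_a = portfolio_std([1.0, 0.0, 0.0], sds, corr)       # all funds in stock A
mixed = portfolio_std([0.5, 0.3, 0.2], sds, corr)        # hypothetical trial weights
example_turnover = turnover([0.70, 0.30], [0.60, 0.40])  # the example from the problem
print(only_a, mixed, example_turnover)
```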
Problem 6 (b)
(b) Suppose that in order to trim the rather excessive transaction costs in
emerging markets, Sarah would like to limit her portfolio to stocks in only six
different companies. How would you augment your formulation of the model
using binary variables to incorporate this requirement into the model?
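The slide's answer did not survive extraction. One standard way to model the requirement, offered here as a sketch and not as the official solution, introduces a hypothetical binary variable $y_i$ that equals 1 when stock i is held. Since every weight lies between 0 and 1, the link $w_i \le y_i$ forces $w_i = 0$ whenever $y_i = 0$. The sketch reads "only six" as "at most six"; use an equality if exactly six companies are required.

```latex
y_i \in \{0, 1\} \quad \text{for } i = 1, \dots, 10
\qquad
w_i \le y_i \quad \text{for } i = 1, \dots, 10
\qquad
\sum_{i=1}^{10} y_i \le 6
```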
Problem 6 (c)
(c) Suppose that Sarah would like to limit the number of trades to at most
seven of the ten companies. (Note that if a stock's weight does not change, it
does not produce a trade.) By defining binary variables, describe how you
would augment your model to incorporate this additional requirement as well.
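Again the slide's answer is missing from the extracted text, so the following is only one possible formulation. It assumes a hypothetical binary variable $z_i$ that equals 1 when stock i is traded, and writes $w_i^0$ for the current (pre-trade) weight of stock i from Table 4. Because all weights lie between 0 and 1, a weight change can never exceed 1, so the two linear inequalities force $w_i = w_i^0$ whenever $z_i = 0$.

```latex
z_i \in \{0, 1\} \quad \text{for } i = 1, \dots, 10
\qquad
w_i - w_i^0 \le z_i, \quad w_i^0 - w_i \le z_i \quad \text{for } i = 1, \dots, 10
\qquad
\sum_{i=1}^{10} z_i \le 7
```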
Good luck!
There are things MBAs can't solve.
For everything else, there is DMD!
Additional Practice Problems
TOPIC 2:
Discrete Random Variables
Beer and Coke daily sales at a soccer stadium

Probability p_i    X: # of Beer Cans x_i    Y: # of Coke Cans y_i
0.15               35                       41
0.27               78                       10
0.15               81                       0
0.26               30                       13
0.17               16                       42

$p_i = P(X = x_i \text{ and } Y = y_i)$
[Figure: Scatter Plot of Daily Sales of Beer and Coke; x-axis: Beer Sales (0 to 100), y-axis: Coke Sales (0 to 50)]
The Beer and Coke Example
What is the expected number of beer cans sold? of coke cans sold?
What is the standard deviation of beer cans sold? of coke cans?
What is the covariance and the correlation of beer and coke cans sold?
What is the expected daily revenue?
What is the standard deviation of the daily revenue?
Some Questions

The expected number of beer cans sold is
\[ \mu_X = E(X) = \sum_i P(X = x_i)\, x_i \]
The variance of beer cans sold is
\[ \sigma_X^2 = \mathrm{VAR}(X) = \sum_i P(X = x_i)\,(x_i - \mu_X)^2 \]
The standard deviation of beer cans sold is
\[ \sigma_X = \sqrt{\mathrm{VAR}(X)} \]
Here it turns out that \( P(X = x_i) = p_i \), but usually:
\[ P(X = x_i) = \sum_j P(X = x_i,\; Y = y_j) \]

Summary of Daily Beer Sales
Prob. p_i   # Beer Cans x_i   p_i x_i   p_i (x_i - E(X))^2
0.15        35                 5.25      29.32
0.27        78                21.06     227.38
0.15        81                12.15     153.79
0.26        30                 7.80      93.66
0.17        16                 2.72     184.91
E(X) = 48.98   VAR(X) = 689.06   StdDev(X) = 26.25

Summary of Daily Coke Sales
Prob. p_i   # Coke Cans y_i   p_i y_i   p_i (y_i - E(Y))^2
0.15        41                 6.15      70.18
0.27        10                 2.70      23.71
0.15         0                 0.00      56.28
0.26        13                 3.38      10.55
0.17        42                 7.14      87.06
E(Y) = 19.37   VAR(Y) = 247.77   StdDev(Y) = 15.74

Some Questions
The covariance of beer and coke cans sold is
\[ \mathrm{COV}(X,Y) = \sum_i p_i\,(x_i - \mu_X)(y_i - \mu_Y) \]
The correlation of beer and coke cans sold is
\[ \mathrm{CORR}(X,Y) = \mathrm{COV}(X,Y) / (\sigma_X \sigma_Y) \]

[Figure: Scatter Plot of Daily Sales of Beer and Coke; x-axis: Beer Sales, y-axis: Coke Sales]

Summary of Daily Beer and Coke Sales
Prob. p_i   Number of Beer Cans x_i   Number of Coke Cans y_i   p_i (x_i - E(X)) (y_i - E(Y))
0.15        35                        41                         -45.36
0.27        78                        10                         -73.42
0.15        81                         0                         -93.03
0.26        30                        13                          31.43
0.17        16                        42                        -126.88
E(X) = 48.98   E(Y) = 19.37   StdDev(X) = 26.25   StdDev(Y) = 15.74
COV(X,Y) = -307.25
Correlation = -307.25 / (26.25 * 15.74) = -0.74
Some More Questions About Beer and Coke
X = number of cans of beer sold; Y = number of cans of coke sold
Revenues: $3 per can of beer, $2 per can of coke
Daily revenue (in $) = 3X + 2Y
Given: E(X) = 48.98, E(Y) = 19.37, StdDev(X) = 26.25, StdDev(Y) = 15.74, COV(X,Y) = -307.25

What is the expected daily revenue?
E(3X + 2Y) = 3 E(X) + 2 E(Y)
           = $3 * 48.98 + $2 * 19.37
           = $185.68

What is the standard deviation of the daily revenue?
VAR(3X + 2Y) = 3^2 * VAR(X) + 2^2 * VAR(Y) + 2 * 3 * 2 * COV(X,Y)
             = 9 * 689 + 4 * 248 + 12 * (-307)
             = 3509
STD DEV(3X + 2Y) = sqrt(3509) = $59.24
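The beer and coke calculations can be reproduced with a short script. This is a minimal sketch, not part of the course materials: the variable names are illustrative, and the joint distribution is taken from the five (p_i, x_i, y_i) rows of the example. Because it carries unrounded intermediate values, the revenue standard deviation comes out near 59.21, slightly below the figure obtained from rounded inputs.

```python
from math import sqrt

# Joint distribution from the beer and coke example: outcome i occurs with
# probability p[i] and yields x[i] beer cans and y[i] coke cans.
p = [0.15, 0.27, 0.15, 0.26, 0.17]
x = [35, 78, 81, 30, 16]
y = [41, 10, 0, 13, 42]

def expectation(values):
    """Probability-weighted mean of a list of outcomes."""
    return sum(pi * vi for pi, vi in zip(p, values))

mu_x = expectation(x)
mu_y = expectation(y)
var_x = expectation([(xi - mu_x) ** 2 for xi in x])
var_y = expectation([(yi - mu_y) ** 2 for yi in y])
cov_xy = expectation([(xi - mu_x) * (yi - mu_y) for xi, yi in zip(x, y)])
corr_xy = cov_xy / sqrt(var_x * var_y)

# Daily revenue 3X + 2Y: linearity of expectation, and the variance rule
# VAR(aX + bY) = a^2 VAR(X) + b^2 VAR(Y) + 2ab COV(X, Y).
rev_mean = 3 * mu_x + 2 * mu_y
rev_sd = sqrt(9 * var_x + 4 * var_y + 12 * cov_xy)

print(f"E(X)={mu_x:.2f}  E(Y)={mu_y:.2f}")
print(f"VAR(X)={var_x:.2f}  VAR(Y)={var_y:.2f}")
print(f"COV={cov_xy:.2f}  CORR={corr_xy:.2f}")
print(f"E(revenue)={rev_mean:.2f}  SD(revenue)={rev_sd:.2f}")
```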
TOPIC 4:
Continuous Random Variables
The amazon.com example
The time X, in minutes, spent surfing amazon.com last month by people in this auditorium is normally distributed with mean 170 and standard deviation 10:
\( X \sim N(170, 10) \)

What if we triple the time spent of a randomly chosen student?
\( Y = 3X \)
\( \mu_Y = 3\mu_X = 3(170) = 510 \)
\( \sigma_Y^2 = \mathrm{Var}(3X) = 3^2 \sigma_X^2 = 9(100) = 900, \quad \sigma_Y = 30 \)

Let's take 3 independent students at random and combine the time they spent on amazon.com last month.
\( Y = X_1 + X_2 + X_3 \)
\( \mu_Y = 3\mu_X = 3(170) = 510 \)
\( \sigma_Y^2 = \mathrm{Var}(X_1 + X_2 + X_3) = 3(100) = 300, \quad \sigma_Y = 17.32 \)
What is the probability that a randomly selected student has spent between 160 and 180 minutes on amazon.com last month?
\( X \sim N(170, 10) \)
\[ P(160 < X < 180) = P\left( \frac{160 - 170}{10} < \frac{X - \mu}{\sigma} < \frac{180 - 170}{10} \right) \]
\[ = P(-1 < Z < 1) = F(1) - F(-1) = 0.8413 - 0.1587 = 0.6826 \]

What is the probability that three independent students together have spent more than 460 minutes?
\( Y = X_1 + X_2 + X_3 \sim N(510, 17.32) \)
\[ P(Y > 460) = P\left( \frac{Y - 510}{17.32} > \frac{460 - 510}{17.32} \right) = P(Z > -2.89) \]
\[ = 1 - P(Z < -2.89) = 1 - 0.0019 = 0.9981 \]
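Both normal probabilities can be checked numerically instead of reading a Z-table. The sketch below assumes Python 3.8 or later, where the standard library provides `statistics.NormalDist`; the variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

# One student: X ~ N(mean 170, standard deviation 10).
single = NormalDist(mu=170, sigma=10)
p_between = single.cdf(180) - single.cdf(160)

# Sum of three independent students: means add and variances add,
# so the standard deviation is sqrt(3 * 10^2), about 17.32.
total = NormalDist(mu=3 * 170, sigma=sqrt(3 * 10 ** 2))
p_more_than_460 = 1 - total.cdf(460)

print(f"P(160 < X < 180) = {p_between:.4f}")
print(f"P(X1 + X2 + X3 > 460) = {p_more_than_460:.4f}")
```

The results agree with the table-based answers up to rounding of the Z-values.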
TOPIC 5:
Statistical Sampling
Annual income example
After having managed to successfully survey 100 families, we have found that the observed sample mean of the annual income is $19,763, while the observed sample standard deviation is $4,000.
a) What is the distribution of the sample mean (including the form of the distribution, its mean and standard deviation)?
The sample mean \( \bar{X} \) follows (approximately, since n = 100 is large) a normal distribution with mean \( \mu \) and standard deviation \( \sigma/\sqrt{n} \):
\[ \bar{X} \sim N(\mu,\; \sigma/\sqrt{n}) \]
b) What is the probability that the sample mean will be within $784 of the population mean?
\[ P(-784 < \bar{X} - \mu < 784) \]
\[ = P\left( \frac{-784}{\sigma/\sqrt{n}} < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < \frac{784}{\sigma/\sqrt{n}} \right) \]
\[ \approx P\left( \frac{-784}{s/\sqrt{n}} < Z < \frac{784}{s/\sqrt{n}} \right) \]
\[ = P\left( \frac{-784}{4000/\sqrt{100}} < Z < \frac{784}{4000/\sqrt{100}} \right) \]
\[ = P(-1.96 < Z < 1.96) \]
\[ = P(Z < 1.96) - P(Z < -1.96) \]
\[ = 0.975 - 0.025 = 0.95 \]
c) What is a number L such that the probability that the sample mean is within L of the population mean is 99%?
A 99% confidence interval for the population mean is given by:
\[ [\,\bar{x} - c\,s/\sqrt{n},\;\; \bar{x} + c\,s/\sqrt{n}\,] \]
(where c = 2.576 for a 99% confidence level, s = 4000, n = 100, and \( \bar{x} \) = 19,763)
Therefore, \( L = c\,s/\sqrt{n} \) = 2.576 * 4000/sqrt(100) = 1030.4
So the 99% confidence interval is given by:
[19,763 - 2.576 * 4000/sqrt(100), 19,763 + 2.576 * 4000/sqrt(100)]
= [18,732.6, 20,793.4]
d) How many families should we successfully survey so that the
probability that the sample mean is within $200 of the
population mean is 95% ?
To construct a 95% interval that is within $200 of the population mean, the required sample size n is given by:
\[ n = c^2 s^2 / L^2 = 1.96^2 \cdot 4000^2 / 200^2 = 1536.64 \approx 1537 \]
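Parts b), c) and d) of the annual income example all follow from the standard error s/sqrt(n). The sketch below is one way to check them, assuming Python 3.8 or later for `statistics.NormalDist`; it computes the critical values with `inv_cdf` rather than using the rounded table values 1.96 and 2.576, so L differs from 1030.4 in the first decimal. All names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()            # standard normal
n, x_bar, s = 100, 19763, 4000
std_err = s / sqrt(n)       # estimated standard deviation of the sample mean

# b) probability that the sample mean is within $784 of the population mean
p_within = Z.cdf(784 / std_err) - Z.cdf(-784 / std_err)

# c) half-width L of the 99% interval and the interval itself
c99 = Z.inv_cdf(0.995)      # two-sided 99% critical value, about 2.576
L = c99 * std_err
ci99 = (x_bar - L, x_bar + L)

# d) sample size so that the 95% half-width is at most $200
c95 = Z.inv_cdf(0.975)      # about 1.96
n_required = ceil((c95 * s / 200) ** 2)

print(f"b) {p_within:.3f}")
print(f"c) L = {L:.1f}, interval = ({ci99[0]:.1f}, {ci99[1]:.1f})")
print(f"d) n = {n_required}")
```

Rounding n up rather than to the nearest integer guarantees that the half-width does not exceed the target.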
The Room Service Example
A hotel manager would like to find out the mean time guests have to wait for room service. For a sample of 45 guests, the observed sample mean turned out to be 32 minutes, while the observed standard deviation was 11 minutes.
What is the 95% confidence interval for the mean time guests have to wait for room service?
We assume the sample mean of the time guests have to wait for room service is approximately Normal.
A 95% confidence interval for the mean time guests have to wait is given by:
\[ [\,\bar{x} - c\,s/\sqrt{n},\;\; \bar{x} + c\,s/\sqrt{n}\,] \]
where c = 1.96 for a 95% confidence level, s = 11, n = 45, and \( \bar{x} \) = 32
So the 95% confidence interval is given by:
[32 - 1.96 * 11/sqrt(45), 32 + 1.96 * 11/sqrt(45)] = [28.79, 35.21]
TOPIC 7:
Regression
An Ice Cream Example
The fat content in a gallon of chocolate ice cream is believed to depend on Cream, Chocolate and Sugar according to:
Fat = A + B*Cream + C*Chocolate + D*Sugar
A multiple regression was run on data from 20 different batches of chocolate ice cream:
R Square: 0.8433
Standard Error: 13.73
Observations: 20

df
Regression: 3
Residual: 16
Total: 19

                 Coefficients   Standard Error   t-Stat.   Lower 95%   Upper 95%
Intercept        -8.94          19.95            -0.45     -51.24      33.35
Cream (ounces)    0.93           0.12             7.80       0.67       1.18
Choc. (ounces)    2.07           0.60             ???        ???        ???
Sugar (ounces)    2.47           1.33             1.86      -0.34       5.29
Correlation between different variables:
                 Fat (gm)   Cream (ounces)   Choc. (ounces)   Sugar (ounces)
Fat (gm)         1
Cream (ounces)   0.769      1
Choc. (ounces)   0.486      0.025            1
Sugar (ounces)   0.280      -0.099           0.409            1

Compute the 95% CI for the Choc. coefficient
Critique model
Compute the 95% CI for the Choc. Coefficient
Coefficients Standard Error t-Stat. Lower 95% Upper 95%
Intercept -8.94 19.95 -0.45 -51.24 33.35
Cream (ounces) 0.93 0.12 7.80 0.67 1.18
Choc. (ounces) 2.07 0.60 ??? ??? ???
Sugar (ounces) 2.47 1.33 1.86 - 0.34 5.29
The 95% confidence interval for the Choc. coefficient, using c = 2.120 from the t-table (16 residual degrees of freedom), will be:
[2.07 - 2.120 * 0.60, 2.07 + 2.120 * 0.60]
= [0.798, 3.342]
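The missing entries for the Choc. row can be filled in from the coefficient and its standard error. The sketch below takes the critical value c = 2.120 as given from the t-table (it is not computed here, since the Python standard library has no t-distribution), and the variable names are illustrative.

```python
# Regression output for the Choc. coefficient (from the table above).
coef = 2.07
std_error = 0.60
t_crit = 2.120   # t-table value for 95% confidence, 16 degrees of freedom (taken as given)

# t-statistic: the coefficient measured in units of its standard error.
t_stat = coef / std_error

# 95% confidence interval: coefficient +/- t_crit * standard error.
lower = coef - t_crit * std_error
upper = coef + t_crit * std_error

# A coefficient is significant at the 95% level when 0 lies outside its interval.
choc_significant = not (lower <= 0 <= upper)
sugar_ci = (-0.34, 5.29)   # Sugar row of the same table
sugar_significant = not (sugar_ci[0] <= 0 <= sugar_ci[1])

print(f"t-stat = {t_stat:.2f}, CI = [{lower:.3f}, {upper:.3f}]")
print(f"Choc. significant: {choc_significant}, Sugar significant: {sugar_significant}")
```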
Critique model

Signs of Regression Coefficients:
                 Coefficients   Standard Error   t-Stat.   Lower 95%   Upper 95%
Intercept        -8.94          19.95            -0.45     -51.24      33.35
Cream (ounces)    0.93           0.12             7.80       0.67       1.18
Choc. (ounces)    2.07           0.60             3.45       0.80       3.34
Sugar (ounces)    2.47           1.33             1.86      -0.34       5.29
The coefficients for Cream, Choc. and Sugar are all positive, which appears to make sense.

Significance test:
0 is in the 95% confidence interval for the Sugar coefficient, so the Sugar coefficient is not significantly different from zero and Sugar should be excluded from the regression.
R^2:
The value of R^2 is 0.8433, so the regression explains about 84% of the variation in fat content, which indicates that the model has a high level of prediction.
Multicollinearity:
                 Fat (gm)   Cream (ounces)   Choc. (ounces)   Sugar (ounces)
Fat (gm)         1
Cream (ounces)   0.769      1
Choc. (ounces)   0.486      0.025            1
Sugar (ounces)   0.280      -0.099           0.409            1
There is a high correlation between chocolate and sugar (0.409 > 0.4), hence we should eliminate one of these variables: sugar, because of its low t-statistic.
Heteroscedasticity:
[Figure: Cream Residual Plot; x-axis: Cream (ounces), y-axis: Residuals]
[Figure: Chocolate Residual Plot; x-axis: Chocolate (ounces), y-axis: Residuals]
There appears to be no heteroscedasticity.
Autocorrelation:
[Figure: Residuals vs. Sample Number; x-axis: Sample Number, y-axis: Residual Value]
There appears to be no autocorrelation.

Residual Distribution:
[Figure: Residual Frequency; x-axis: Residual, y-axis: Frequency]
The residuals appear to be normally distributed.
TOPIC 8:
Linear Optimization
Different Modes of Driving Example
Different Modes of Driving (DMD) is a manufacturer of cars & trucks.
Vehicles are processed in the paint and body shops.
Painting trucks takes 1.5 times as much time as painting cars. If the paint shop only paints trucks, then it paints 40 trucks/day. If it only paints cars, then 60 cars/day.
Body work on cars and trucks takes the same amount of time. If the body shop only produces trucks, then 50/day. If it only produces cars, then 50/day.
Trucks contribute $500 and cars contribute $400 to profit.
Determine the daily production schedule to maximize profits.

Decision Variables: C = # cars, T = # trucks
Objective Function: Max 400 C + 500 T
Constraints:
  Paint Shop: T/40 + C/60 <= 1 day
  Body Shop:  T/50 + C/50 <= 1 day
  T, C >= 0 vehicles
Different Modes of Driving Example...
Which are the Binding Constraints?
[Figure: Feasible Region in the (Cars, Trucks) plane, bounded by T/40 + C/60 <= 1, T/50 + C/50 <= 1, C >= 0 and T >= 0, with profit lines at $5,000/Day, $10,000/Day and $22,000/Day]
Optimal Solution: Cars 30, Trucks 20
Total Profit: $22,000/Day
At this solution both the paint shop and the body shop constraints hold with equality (20/40 + 30/60 = 1 and 20/50 + 30/50 = 1), so both are binding.
Adjustable Cells
Cell   Name                        Final Value   Reduced Cost   Objective Coefficient   Allowable Increase   Allowable Decrease
$B$2   Cars Decision Variables     30            0              400                     100                  66.66666667
$B$3   Trucks Decision Variables   20            0              500                     100                  100

Constraints
Cell   Name                     Final Value   Shadow Price   Constraint R.H. Side   Allowable Increase   Allowable Decrease
$B$7   Paint Shop Constraints   1             12000          1                      0.25                 0.166666667
$B$8   Body Shop Constraints    1             10000          1                      0.2                  0.2
Economic Interpretation
An outside contractor offers to paint 8 more trucks (or 12 more cars) per day for $2,000. Should we accept the offer?
Yes. Based on the shadow prices, this expansion is worth:
$12,000 * 8/40 - $2,000 = $400
and the increased capacity of 8/40, or 0.2, is within the allowable increase of 0.25.
If the DMD company was given extra labor to increase productivity in the body shop by 5 cars (or trucks), what would DMD's profits become?
Increased profit is $10,000 * 5/50 = $1,000, so profits become $22,000 + $1,000 = $23,000/Day,
and the increased capacity of 5/50, or 0.1, is within the allowable increase of 0.2.
[Figure: Value of Opt. Obj. as a function of the Paint Shop RHS; RHS axis marked at 0, 0.83, 1.00 and 1.25; objective axis marked at 0 K, 20 K, 22 K and 25 K; Slope = 24,000 below 0.83, Slope = 12,000 between 0.83 and 1.25, Slope = 0 above 1.25]
Careful of the range!
In this range, every unit change in the RHS results in a $12,000 per-unit change in the objective function.
This value is called the shadow price of the constraint over this range.
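The optimal solution and the shadow prices can be checked without a spreadsheet solver. The sketch below is an illustration, not the method used in the course: it enumerates the corner points of the two-variable feasible region and re-solves with a perturbed right-hand side. The function name `solve_dmd` and its parameters are hypothetical.

```python
from itertools import combinations

def solve_dmd(paint_rhs=1.0, body_rhs=1.0):
    """Maximize 400*C + 500*T by checking every corner point.

    Each constraint is stored as (a, b, rhs), meaning a*C + b*T <= rhs.
    Returns (profit, cars, trucks) for the best feasible corner.
    """
    constraints = [
        (1 / 60, 1 / 40, paint_rhs),  # paint shop: C/60 + T/40 <= rhs
        (1 / 50, 1 / 50, body_rhs),   # body shop:  C/50 + T/50 <= rhs
        (-1.0, 0.0, 0.0),             # C >= 0
        (0.0, -1.0, 0.0),             # T >= 0
    ]
    best = None
    for (a1, b1, r1), (a2, b2, r2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < 1e-12:
            continue  # parallel boundary lines have no intersection
        cars = (r1 * b2 - r2 * b1) / det      # Cramer's rule
        trucks = (a1 * r2 - a2 * r1) / det
        feasible = all(a * cars + b * trucks <= r + 1e-9
                       for a, b, r in constraints)
        if feasible:
            profit = 400 * cars + 500 * trucks
            if best is None or profit > best[0]:
                best = (profit, cars, trucks)
    return best

base_profit, cars, trucks = solve_dmd()
# Shadow price = change in optimal profit per unit change in the RHS.
paint_shadow = (solve_dmd(paint_rhs=1.2)[0] - base_profit) / 0.2
body_shadow = (solve_dmd(body_rhs=1.1)[0] - base_profit) / 0.1
# Beyond the allowable increase (RHS above 1.25) extra paint capacity adds nothing.
profit_at_1_5 = solve_dmd(paint_rhs=1.5)[0]

print(base_profit, cars, trucks)
print(paint_shadow, body_shadow, profit_at_1_5)
```

Corner enumeration is only practical for very small problems; it is used here because it makes the piecewise-linear behavior of the optimal value easy to see.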
TOPIC 9:
Nonlinear Optimization
A Production Example
You are producing three products A, B and C. You need to satisfy production limits and resource availability constraints: (1) You can produce at most 1000, 800 and 700 units of A, B and C respectively; (2) The data for resource availability is as follows:

            A     B     C     Resources (in hours)
machine 1   2     1     3.5   1000
machine 2   0.2   0.8   1.2   350

Production levels influence market price of each product:
\[ P_A = 200 - X_A + 0.5 X_B, \quad P_B = 100 - 2 X_B + 0.25 X_A, \quad P_C = 500 - X_C \]
We want to maximize revenue.
Formulation
Decision Variables
\( X_{1A} \) = product A to be produced by machine 1
\( X_{1B} \) = product B to be produced by machine 1
\( X_{1C} \) = product C to be produced by machine 1
\( X_{2A} \) = product A to be produced by machine 2
\( X_{2B} \) = product B to be produced by machine 2
\( X_{2C} \) = product C to be produced by machine 2
Objective Function
\[ \text{Max } P_A (X_{1A} + X_{2A}) + P_B (X_{1B} + X_{2B}) + P_C (X_{1C} + X_{2C}) \]
More Formulation...
Subject to:
Price:
\[ P_A = 200 - (X_{1A} + X_{2A}) + 0.5 (X_{1B} + X_{2B}) \]
\[ P_B = 100 - 2 (X_{1B} + X_{2B}) + 0.25 (X_{1A} + X_{2A}) \]
\[ P_C = 500 - (X_{1C} + X_{2C}) \]
Resource:
Machine 1: \( 2 X_{1A} + X_{1B} + 3.5 X_{1C} \le 1000 \)
Machine 2: \( 0.2 X_{2A} + 0.8 X_{2B} + 1.2 X_{2C} \le 350 \)
Production Limit:
A: \( X_{1A} + X_{2A} \le 1000 \)
B: \( X_{1B} + X_{2B} \le 800 \)
C: \( X_{1C} + X_{2C} \le 700 \)
Non-negativity: \( X_{1A}, X_{2A}, X_{1B}, X_{2B}, X_{1C}, X_{2C}, P_A, P_B, P_C \ge 0 \)
Excel Solution
Objective Function (MAX, maximize revenues): $76,617.65

Decision Variables (units):
            A       B       C
Machine 1   58.82   23.53   125.00
Machine 2   58.82   23.53   125.00

Price A: $105.88
Price B: $35.29
Price C: $250.00

hours/unit               A      B      C      Machine Limit
Machine 1                2      1      3.5    1000 hours
Machine 2                0.2    0.8    1.2    350 hours
Capacity Limit (units)   1000   800    700

Constraints:         LHS          RHS
machine 1 capacity   578.68   <=  1000
machine 2 capacity   180.59   <=  350
product A limit      117.65   <=  1000
product B limit       47.06   <=  800
product C limit      250.00   <=  700
Sensitivity Report
Microsoft Excel 10.0 Sensitivity Report
Worksheet: [Book1]Products
Report Created: 12/9/2004 10:19:06 PM

Adjustable Cells
Cell   Name    Final Value   Reduced Gradient
$B$8   units    58.82        0
$C$8   units    23.53        0
$D$8   units   125.00        0
$B$9   units    58.82        0
$C$9   units    23.53        0
$D$9   units   125.00        0

Constraints
Cell    Name                     Final Value   Lagrange Multiplier
$B$19   machine 1 capacity LHS   578.68        0
$B$20   machine 2 capacity LHS   180.59        0
$B$21   product A limit LHS      117.65        0
$B$22   product B limit LHS       47.06        0
$B$23   product C limit LHS      250.00        0
$B$11   Price A units            $105.88       $ -
$B$12   Price B units            $35.29        $ -
$B$13   Price C units            $250.00       $ -

All Lagrange Multipliers are zero!
All constraints are non-binding around the close proximity of the optimal solution.
Optimal solution occurs in the interior of the feasible region.
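Because no constraint is binding, the optimum coincides with the unconstrained maximizer of the revenue function, which can be found by setting its partial derivatives to zero. The sketch below works with the total quantities a, b, c of each product (summed over both machines), since revenue depends only on those totals; splitting each total evenly across the two machines, as in the spreadsheet solution, is just one feasible allocation. All names are illustrative.

```python
def revenue(a, b, c):
    """Revenue for total quantities a, b, c of products A, B, C."""
    price_a = 200 - a + 0.5 * b
    price_b = 100 - 2 * b + 0.25 * a
    price_c = 500 - c
    return price_a * a + price_b * b + price_c * c

# First-order conditions of the concave quadratic revenue function:
#   dR/da = 200 - 2a + 0.75b = 0
#   dR/db = 100 + 0.75a - 4b = 0
#   dR/dc = 500 - 2c         = 0
# The first two form the linear system  -2a + 0.75b = -200,  0.75a - 4b = -100,
# solved here with Cramer's rule.
det = (-2) * (-4) - 0.75 * 0.75
a = ((-200) * (-4) - 0.75 * (-100)) / det
b = ((-2) * (-100) - (-200) * 0.75) / det
c = 500 / 2

# Feasibility check, assuming each total is split evenly across the machines.
machine_1_hours = 2 * (a / 2) + 1 * (b / 2) + 3.5 * (c / 2)
machine_2_hours = 0.2 * (a / 2) + 0.8 * (b / 2) + 1.2 * (c / 2)
feasible = (machine_1_hours <= 1000 and machine_2_hours <= 350
            and a <= 1000 and b <= 800 and c <= 700)

print(f"A={a:.2f}  B={b:.2f}  C={c:.2f}  revenue={revenue(a, b, c):.2f}")
print(f"machine 1: {machine_1_hours:.2f} h, machine 2: {machine_2_hours:.2f} h, feasible: {feasible}")
```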
TOPIC 10:
Discrete Optimization
Let's play!
An electrical utility company each day is deciding which generators to start up.
It has three generators (see below).
There are two periods in a day, and the number of megawatts needed in the first period is 2900. The second period requires 3900 megawatts. Unused electricity left over from period 1 can be used in period 2.
It wants to minimize total cost.
Formulate and solve as MIP!

Generator   Fixed costs per period ($)   Cost per period per megawatt used ($)   Max capacity in each period (MW)
A           3000                         5                                       2100
B           2000                         4                                       1800
C           1000                         7                                       3000
More Formulation...
Objective Function
\[ \text{minimize } 3000 (X_{A1} + X_{A2}) + 2000 (X_{B1} + X_{B2}) + 1000 (X_{C1} + X_{C2}) + 5 (Y_{A1} + Y_{A2}) + 4 (Y_{B1} + Y_{B2}) + 7 (Y_{C1} + Y_{C2}) \]
Subject to:
Capacity:
Machine A: \( Y_{A1} \le 2100 X_{A1} \); \( Y_{A2} \le 2100 X_{A2} \)
Machine B: \( Y_{B1} \le 1800 X_{B1} \); \( Y_{B2} \le 1800 X_{B2} \)
Machine C: \( Y_{C1} \le 3000 X_{C1} \); \( Y_{C2} \le 3000 X_{C2} \)
Demand:
Period 1: \( Y_{A1} + Y_{B1} + Y_{C1} \ge 2900 \)
Period 2: \( Y_{A2} + Y_{B2} + Y_{C2} + Y_{A1} + Y_{B1} + Y_{C1} - 2900 \ge 3900 \)
Binary: \( X_{A1}, X_{A2}, X_{B1}, X_{B2}, X_{C1}, X_{C2} \) = {1 if the generator is used in that period; 0 otherwise}
Non-negativity (megawatts produced by each generator in each period):
\( Y_{A1}, Y_{A2}, Y_{B1}, Y_{B2}, Y_{C1}, Y_{C2} \ge 0 \)
Excel Solution
Objective Function:
Fixed + Variable = Total
10000 + 30400 = 40400

Cost:                   A      B      C
Fixed Cost per period   3000   2000   1000
Cost per megawatt       5      4      7

Decision Variables and Constraints:
           Generator   Xij (0 or 1)   Yij         Limit * Xij   Limit
Period 1   A           1              2100   <=   2100          2100
           B           1              1800   <=   1800          1800
           C           0                 0   <=      0          3000
Period 2   A           1              1100   <=   2100          2100
           B           1              1800   <=   1800          1800
           C           0                 0   <=      0          3000

Demand:
Period 1: Total 3900 >= 2900
Period 2: Total 2900 + 1000 = 3900 >= 3900

Other constraints:
Xij binary
Xij, Yij >= 0
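The generator problem is small enough to check by brute force instead of a MIP solver. The sketch below is an illustration under stated assumptions, not the course's solution method: it tries all 64 on/off patterns, and for each pattern scans the period 1 output in 100 MW steps (an assumption that loses nothing here, because every capacity and demand is a multiple of 100). Within a period the cheapest switched-on generators are loaded first. Total output is fixed at 2900 + 3900 MW because producing more than is demanded only adds cost. All function and variable names are hypothetical.

```python
from itertools import product

GENERATORS = ("A", "B", "C")
FIXED = {"A": 3000, "B": 2000, "C": 1000}   # $ per period when switched on
VARIABLE = {"A": 5, "B": 4, "C": 7}         # $ per megawatt produced
CAPACITY = {"A": 2100, "B": 1800, "C": 3000}  # MW per period
DEMAND_1, DEMAND_2 = 2900, 3900

def dispatch_cost(on, load):
    """Variable cost of producing `load` MW with the generators in `on`,
    loading the cheapest first; None if their capacity is insufficient."""
    cost, remaining = 0, load
    for g in sorted(on, key=VARIABLE.get):
        output = min(CAPACITY[g], remaining)
        cost += VARIABLE[g] * output
        remaining -= output
    return cost if remaining == 0 else None

def solve():
    """Return (total, fixed, variable, on_period_1, on_period_2, output_period_1)."""
    best = None
    total_demand = DEMAND_1 + DEMAND_2
    patterns = list(product((0, 1), repeat=len(GENERATORS)))
    for x1, x2 in product(patterns, repeat=2):
        on1 = [g for g, x in zip(GENERATORS, x1) if x]
        on2 = [g for g, x in zip(GENERATORS, x2) if x]
        fixed = sum(FIXED[g] for g in on1 + on2)
        # Period 1 must cover its own demand; any surplus carries into period 2.
        for out1 in range(DEMAND_1, total_demand + 1, 100):
            cost1 = dispatch_cost(on1, out1)
            cost2 = dispatch_cost(on2, total_demand - out1)
            if cost1 is None or cost2 is None:
                continue
            total = fixed + cost1 + cost2
            if best is None or total < best[0]:
                best = (total, fixed, cost1 + cost2, on1, on2, out1)
    return best

best = solve()
print(best)
```

Several period 1 output levels give the same minimum cost, so the reported split between periods may differ from the spreadsheet while the total cost agrees.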