Probability Review
Experiment
I
I
Event: subset of
I
Occurrence
I
I
I
Events
I
Set operations
Empty set: ∅
Union: A ∪ B = {x : x ∈ A or x ∈ B}
Intersection: A ∩ B = {x : x ∈ A and x ∈ B}
Complement: Aᶜ = {x : x ∉ A}
Inclusion: A ⊂ B
A ∪ Aᶜ = Ω
⋃_{n=1}^∞ Aₙ = A₁ ∪ A₂ ∪ . . . = {x : x ∈ Aₙ for some n}
⋂_{n=1}^∞ Aₙ = A₁ ∩ A₂ ∩ . . . = {x : x ∈ Aₙ for all n}
Algebra of Events
I
Algebra
A ∪ B = B ∪ A
A ∩ B = B ∩ A
A ∪ (B ∪ C) = (A ∪ B) ∪ C
A ∩ (B ∩ C) = (A ∩ B) ∩ C
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C)
A ∩ Ω = A
A ∩ Aᶜ = ∅
A ∩ ∅ = ∅
A ∪ ∅ = A
De Morgan's Laws
(A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
(A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
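De Morgan's laws can be sanity-checked on concrete finite sets; the universe Ω and the sets A, B below are arbitrary examples.

```python
# Check De Morgan's laws on a small finite universe (example sets are arbitrary).
omega = set(range(10))   # universe Ω
A = {1, 2, 3, 4}
B = {3, 4, 5, 6}

def comp(S):
    """Complement of S relative to Ω."""
    return omega - S

# (A ∪ B)ᶜ = Aᶜ ∩ Bᶜ
assert comp(A | B) == comp(A) & comp(B)
# (A ∩ B)ᶜ = Aᶜ ∪ Bᶜ
assert comp(A & B) == comp(A) | comp(B)
print("De Morgan's laws hold on this example")
```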
More Details...
Sample space
I
I
Probability Law
I
Probability law P
I
I
3 axioms for P:
Nonnegativity: P[A] ≥ 0
Normalization: P[Ω] = 1
Countable additivity: if A₁, A₂, . . . are disjoint, then
P[⋃_{n=1}^∞ Aₙ] = Σ_{n=1}^∞ P[Aₙ]
Not so simple...
Consequences
I
P[∅] = 0
Examples
Conditional Probability
I
I
0 ≤ P[A|B] ≤ 1
P[Ω|B] = 1
If the events A₁, A₂, . . . are disjoint, then
P[⋃_{n=1}^∞ Aₙ | B] = Σ_{n=1}^∞ P[Aₙ | B]
Conditional Probability
I
Example
I
I
I
I
I
Proof...
Example
Independence
I
I
I
Conditional Independence
I
I
I
I
I
I
Two Tools
I
Bayes' rule
P[A|B] = P[A] P[B|A] / P[B]
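A quick numerical instance of Bayes' rule, with the total-probability step made explicit; all probabilities below are invented for illustration.

```python
# Bayes' rule on a hypothetical scenario (all numbers invented for illustration).
p_A = 0.01             # P[A]: prior
p_B_given_A = 0.95     # P[B|A]
p_B_given_notA = 0.10  # P[B|Aᶜ]

# Total probability: P[B] = P[A]P[B|A] + P[Aᶜ]P[B|Aᶜ]
p_B = p_A * p_B_given_A + (1 - p_A) * p_B_given_notA
# Bayes' rule: P[A|B] = P[A] P[B|A] / P[B]
p_A_given_B = p_A * p_B_given_A / p_B
print(round(p_A_given_B, 4))
```

Note how a small prior P[A] keeps the posterior small even when P[B|A] is large.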
Random Variables
I
Example
I
I
I
I
Example
I
I
Geometric PMF
Example
I
I
I
I
PMF
I
Binomial
I
I
Poisson
pX(x) = e^{−λ} λˣ / x!,  x = 0, 1, 2, . . .
Bernoulli
I
Uniform
I
on {1, 2, . . . , n}
pX (x) = 1/n, x = 1, 2, . . . , n
Expectation
I
Definition
EX = Σᵢ xᵢ pX(xᵢ)
Examples
I
I
X is uniform on {0, 1, 2, . . . , n}
X is geometric
Expectation
I
Theorem
EY = Σₓ g(x) pX(x),  where Y = g(X)
I
Examples
I
I
nth moment of X: EXⁿ
nth central moment of X: E[(X − EX)ⁿ]
IMPORTANT: In general
E[g(X)] ≠ g(EX)
Expectation
I
Properties
I
I
Standard deviation: σ_X = √Var(X)
Small variance ⇒ mass closer to EX
0 variance ⇒ all mass at EX
If b is a constant: Eb = b
If a is a constant: E[aX] = a EX
Expectation is linear:
E[g(X) + h(X)] = E[g(X)] + E[h(X)]
Example
E[(X − EX)²] = E[X² − 2X·EX + (EX)²]
             = E[X²] − 2(EX)² + (EX)² = E[X²] − (EX)²
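The shortcut Var(X) = E[X²] − (EX)² can be verified against the definition on any small PMF; the distribution below is an arbitrary example.

```python
# Verify Var(X) = E[X²] − (EX)² for a small PMF (values chosen for illustration).
pmf = {0: 0.2, 1: 0.5, 2: 0.3}   # hypothetical pX

EX  = sum(x * p for x, p in pmf.items())
EX2 = sum(x**2 * p for x, p in pmf.items())
var_def      = sum((x - EX)**2 * p for x, p in pmf.items())  # E[(X − EX)²]
var_shortcut = EX2 - EX**2

assert abs(var_def - var_shortcut) < 1e-12
print(round(EX, 6), round(var_def, 6))
```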
Example
I
Thus
ET · EV ≠ d = E[TV]
Joint PMF
I
Normalization
Σ_{x,y} pX,Y(x, y) = 1
Marginals
pX(x) = Σ_y pX,Y(x, y)
pY(y) = Σₓ pX,Y(x, y)
Amount of information: the joint PMF pX,Y(x, y) carries more information than the marginals
Joint PMF
I
c
P[X > Y ]
P[X = Y ]
Expectation
E[g(X, Y)] = Σₓ Σ_y g(x, y) pX,Y(x, y)
In general
E[g(X, Y)] ≠ g(EX, EY)
Expectation
Expectation is linear:
E[g (X ) + h(Y )] = E[g (X )] + E[h(Y )]
Example
I
I
I
I
n people throw their hats in a box and then pick one at random
X = # of people who get their own hats
EX = ?
Var(X ) = ?
Conditioning
I
Conditional PMF
pX|A(x) = P[X = x | A] = P[{X = x} ∩ A] / P[A]
Conditional expectation
E[g(X) | A] = Σₓ g(x) pX|A(x)
Example
I
I
I
I
Conditioning
I
Conditional PMF
pX|Y(x|y) = P[X = x | Y = y] = P[X = x, Y = y] / P[Y = y]
Conditional expectation
E[X | Y = y] = Σₓ x pX|Y(x|y) = h(y)   (a number)
E[X | Y] = h(Y)   (random variable)
Double conditioning
E[E[X|Y]] = Σ_y E[X | Y = y] pY(y)
          = Σ_y Σₓ x pX|Y(x|y) pY(y) = Σₓ Σ_y x pX,Y(x, y) = EX
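The identity E[E[X|Y]] = EX can be checked on a small joint PMF; the table below is a made-up example.

```python
# Check the tower property E[E[X|Y]] = EX on a hypothetical joint PMF pX,Y.
from collections import defaultdict

p = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}  # pX,Y(x, y), illustrative

pY = defaultdict(float)
for (x, y), q in p.items():
    pY[y] += q

def E_X_given_Y(y):
    """h(y) = E[X | Y = y] = Σₓ x pX,Y(x, y) / pY(y)."""
    return sum(x * q for (x, yy), q in p.items() if yy == y) / pY[y]

EX = sum(x * q for (x, y), q in p.items())
tower = sum(E_X_given_Y(y) * pY[y] for y in pY)   # Σ_y E[X|Y=y] pY(y)
assert abs(EX - tower) < 1e-12
print(round(EX, 3))
```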
Conditioning
I
Example
I
I
I
Example
S = Σ_{i=1}^N Xᵢ
I
Independence
pX,Y(x, y) = pX(x) pY(y) for all x, y
E[XY] = EX · EY
Var(X + Y) = Var(X) + Var(Y)
fX(x) ≥ 0
∫ fX(x) dx = 1
P[x ≤ X ≤ x + δ] = ∫ₓ^{x+δ} fX(u) du ≈ fX(x) δ
PDF
I
I
∫ fX(x) dx = ∫₀¹ 1/(2√x) dx = 1
EX = ∫ x fX(x) dx
E[g(X)] = ∫ g(x) fX(x) dx
Var(X) = σ_X² = ∫ (x − EX)² fX(x) dx
Uniform on [a, b]:
EX = (a + b)/2
Var(X) = ∫ₐᵇ (x − (a + b)/2)² / (b − a) dx = (b − a)²/12
CDF:
FX(x) = P[X ≤ x]
Discrete: FX(x) = Σ_{u≤x} pX(u)
Continuous: FX(x) = ∫_{−∞}^x fX(u) du
fX(x) = dFX(x)/dx
Examples
I
I
I
I
I
fX(x) = (1/√(2πσ²)) e^{−(x−μ)²/(2σ²)}
normalization: ∫ fX(x) dx = 1
mean = μ
variance = σ²
standard normal r.v.: μ = 0, σ = 1
Moments
I
I
I
FX X
Example: exponential X
Joint PDF
I
Interpretation
P[x ≤ X ≤ x + δ, y ≤ Y ≤ y + δ] ≈ fX,Y(x, y) δ²
Properties
fX,Y(x, y) ≥ 0 for all x and y
∬ fX,Y(x, y) dx dy = 1
Joint PDF
I
Expectation
E[g(X, Y)] = ∬ g(x, y) fX,Y(x, y) dx dy
Marginal densities
fX(x) = ∫ fX,Y(x, y) dy
fY(y) = ∫ fX,Y(x, y) dx
Joint CDF
I
FX,Y(x, y) = P[X ≤ x, Y ≤ y] = ∫_{−∞}^x ∫_{−∞}^y fX,Y(u, v) du dv
Then
∂²FX,Y(x, y)/∂x∂y = fX,Y(x, y)
... and
FX(x) = FX,Y(x, ∞)
FY(y) = FX,Y(∞, y)
Independence
I
fX,Y(x, y) = fX(x) fY(y) for all x, y
Conditional density
I
I
I
P[x ≤ X ≤ x + δ, y ≤ Y ≤ y + δ] ≈ fX,Y(x, y) δ²
P[x ≤ X ≤ x + δ] ≈ fX(x) δ
Would like to have
P[x ≤ X ≤ x + δ | Y = y] ≈ fX|Y(x|y) δ
Thus
fX|Y(x|y) = fX,Y(x, y) / fY(y)
Derived Distributions
I
Z = g (X ) or Z = g (X , Y )
Distribution of Z
X , Y discrete:
P[Z = z] = Σ_{(x,y): g(x,y)=z} pX,Y(x, y)
X , Y continuous:
I
I
Examples
I
I
Sums of R.V.s
I
Independent X and Y
Discrete X and Y
P[X + Y = z] = Σₓ P[X + Y = z | X = x] P[X = x] = Σₓ P[Y = z − x] P[X = x]
Example: Poisson r.v.s
Continuous X and Y
fX+Y(z) = ∫ fX+Y|X(z|x) fX(x) dx = ∫ fY(z − x) fX(x) dx
Example: uniform r.v.s
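The discrete convolution formula can be checked on the Poisson example: if X ~ Poisson(λ₁) and Y ~ Poisson(λ₂) are independent, X + Y should be Poisson(λ₁ + λ₂). The rates below are arbitrary.

```python
# P[X+Y = z] = Σₓ pX(x) pY(z − x) for independent Poisson X, Y;
# the result should match the Poisson(λ1 + λ2) PMF directly.
import math

def poisson_pmf(lam, k):
    return math.exp(-lam) * lam**k / math.factorial(k)

l1, l2 = 1.5, 2.0   # illustrative rates
z = 3
conv = sum(poisson_pmf(l1, x) * poisson_pmf(l2, z - x) for x in range(z + 1))
direct = poisson_pmf(l1 + l2, z)
assert abs(conv - direct) < 1e-12
print(round(direct, 6))
```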
Example
Z = X + BY
Poisson Process
Model
I
I
Nt 0
Nt is integer valued
Nt is non-decreasing
Applications:
I
I
I
I
I
I
Transportation
Manufacturing
Service industry
Supply chains
Healthcare
...
Data
[Figures: arrival rate vs. time of day over 24 hrs (May 1959); % arrivals vs. time of day over 24 hrs (Dec 1995, Help Desk Institute)]
Bernoulli Process
I
Xn = 1: arrival at time n
Xn = 0: no arrival at time n
3 Quantities of Interest
I
geometric
i.i.d.
Definition
Independent increments
The numbers of arrivals in disjoint time intervals are
independent
Stationary increments
The distribution of the number of arrivals in any interval of
time depends only on the length of the interval
Poisson Process
Most relevant
Definitions
I
N0 = 0
stationary and independent increments
for δ → 0:
P[N_δ = k] ≈ 1 − λδ, k = 0;  λδ, k = 1;  0, k ≥ 2
N0 = 0
stationary and independent increments
P[Nt = k] = (λt)ᵏ e^{−λt} / k!
N0 = 0
interarrival times are i.i.d. exponential with rate
Bernoulli/Poisson Relationship
Bernoulli process with p = λδ per slot: pm = λt
... as δ → 0
(n choose k) pᵏ (1 − p)^{n−k} → (λt)ᵏ e^{−λt} / k!
Poisson Process
I
Arrival rate λ:
ENt = Σ_{i=0}^∞ i (λt)ⁱ e^{−λt} / i! = λt e^{−λt} Σ_{i=1}^∞ (λt)^{i−1} / (i − 1)!
    = λt e^{−λt} Σ_{i=0}^∞ (λt)ⁱ / i! = λt
Interarrival times: Tᵢ
P[Tᵢ > t] = P[no arrivals in (Y_{i−1}, Y_{i−1} + t]] = P[Nt = 0] = e^{−λt}
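Both facts above can be checked by simulation: building the process from i.i.d. exponential interarrival times, the empirical mean of Nt should come out near λt. The rate, horizon, and number of runs are arbitrary choices.

```python
# Simulate a Poisson process via i.i.d. exponential interarrival times
# and check ENt ≈ λt empirically (parameters are illustrative).
import random

random.seed(0)
lam, t, runs = 2.0, 10.0, 20000

def count_arrivals(lam, t):
    """Number of arrivals in [0, t] built from exponential interarrivals."""
    n, clock = 0, random.expovariate(lam)
    while clock <= t:
        n += 1
        clock += random.expovariate(lam)
    return n

avg = sum(count_arrivals(lam, t) for _ in range(runs)) / runs
print(avg)   # should be close to λt = 20
```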
Exponential Distribution
fX(x) = λ e^{−λx}, x ≥ 0
Tail probability:
P[X > x] = e^{−λx}
Memorylessness:
P[X > t + s] / P[X > t] = e^{−λs} = P[X > s]
Example
Arrival Times
I
Approach one:
{Y_k ≤ t} = {Nt ≥ k}
P[Y_k ≤ t] = P[Nt ≥ k] = Σ_{i=k}^∞ (λt)ⁱ e^{−λt} / i!
Approach two:
Approach three:
f_{Y_k}(y) = λᵏ y^{k−1} e^{−λy} / (k − 1)! = λ (λy)^{k−1} e^{−λy} / (k − 1)!
P[T₁⁽¹⁾ > t, T₁⁽²⁾ > t] = P[T₁⁽¹⁾ > t] P[T₁⁽²⁾ > t] = e^{−λ₁t} e^{−λ₂t}
Examples
Bulbs
I
I
2 × 2 switch
I
I
Exponential weights
MaxWeight scheduling
Justification
P[Nt⁽¹⁾ = i, Nt⁽²⁾ = j]
T = Σ_{i=1}^K Tᵢ
Examples
Infinite-server model
I
I
Highway encounters
I
I
I
I
Conditional Arrivals
I
Suppose {Nt = n}
Simulation
Infinite-server model
nth model
I
I
ET = 1/μ
n sources
a source characterized by T⁽ⁿ⁾ = nT, arrival rate 1/ET⁽ⁿ⁾
Markov Chains
Stochastic Process
I
I
Discrete time: n = 0, 1, . . .
Simplest example
I
I
I
I
I
Incorporate memory
Example
I
I
I
Daily temperatures
70, 72, 71, 74, 75, ?
Simplest way to take into account the past: tomorrow depends
only on today
Markov Chain
I
Definition
I
I
I
I
I
Transition Matrix
I
P is a Markov matrix if
P(i, j) ≥ 0 for any i, j ∈ E
Σ_{j∈E} P(i, j) = 1 for each i
Example
I
I
I
I
Example
I
E = {0, 1, 2, . . .}
Markov property
P[N_{n+1} = j | N₀ = 0, N₁ = i₁, . . . , Nₙ = i] = ?
Matrix P
P(i, j) = P[N_{n+1} = j | Nₙ = i] = ?
Memory
Markov chain X on E
X0 , X1 , X2 , X3 , X4 , . . . , Xn , . . .
Define Yn = (Xn+1 , Xn )
Y is a MC
r_{i,j}(2) = Σₖ P[X₂ = j, X₁ = k | X₀ = i]
In general
r_{i,j}(n) = Σₖ r_{i,k}(n − 1) P(k, j)
Example
Red or blue
A ball is randomly chosen and replaced by a new ball
I
I
Example
Find
P[X4 = 2, X3 2, X2 2, X1 2 | X0 = 1]
Example
E = {1, 2},
P = [ 0.5 0.5
      0.2 0.8 ]
Initial condition: X₀ = 1

n     r_{1,1}(n)                        r_{1,2}(n)
0     1                                 0
1     0.5                               0.5
2     0.5·0.5 + 0.5·0.2 = 0.35          0.5·0.5 + 0.5·0.8 = 0.65
3     0.305                             0.695
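These n-step probabilities can be reproduced by iterating r(n) = r(n − 1) P, with state 1 as index 0 of the row vector.

```python
# n-step probabilities r_{1,j}(n) for P = [0.5 0.5; 0.2 0.8], starting at state 1.
P = [[0.5, 0.5],
     [0.2, 0.8]]

def step(r, P):
    """One application of r_{i,j}(n) = Σₖ r_{i,k}(n−1) P(k, j)."""
    return [sum(r[k] * P[k][j] for k in range(len(P))) for j in range(len(P))]

r = [1.0, 0.0]        # X₀ = 1 with probability 1
for n in range(3):
    r = step(r, P)
print(r)              # the n = 3 row: [0.305, 0.695]
```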
Examples
I
Example 1
P = [ 0   1   0
      0.5 0   0.5
      0   1   0   ]
Example 2
P = [ 1   0   0   0
      0.3 0.4 0.3 0
      0   0   0.5 0.5
      0   0   0.5 0.5 ]
I
Classification of States
Recurrent States
I
State i is
I
I
recurrent if Σₙ Pⁿ(i, i) = ∞
transient if Σₙ Pⁿ(i, i) < ∞
Example
Recurrent states:
Transient states:
Recurrent classes:
P[X1 = 2, X2 = 6, X3 = 7|X0 = 1] =
P[X4 = 7|X0 = 2] =
Classification
I
I
Transient
Recurrent
I
I
Finite MC
Periodic States
I
Example
5
6
1
4
3
I
8
9
Steady-State Probabilities
I
Yes, if
I
I
I
irreducible chain
positive recurrent
no periodicity (aperiodic)
Start with
r_{i,j}(n) = Σₖ r_{i,k}(n − 1) p_{k,j}
Take n → ∞: πⱼ = Σₖ πₖ p_{k,j}, or
π = πP
where π = [π₁, π₂, . . .]
Additional equation
Σₖ πₖ = 1
Example
P = [ 0.5 0.5
      0.2 0.8 ]
Then
π₁ = 0.5π₁ + 0.2π₂
π₂ = 0.5π₁ + 0.8π₂
π₁ + π₂ = 1
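One of the two balance equations is redundant, so the system is solved by substituting π₂ = 1 − π₁ into the first equation; a sketch:

```python
# Solve π₁ = 0.5π₁ + 0.2π₂ with π₁ + π₂ = 1 (the second balance equation is redundant).
# Substituting π₂ = 1 − π₁:  π₁ = 0.5π₁ + 0.2(1 − π₁)  ⇒  0.7π₁ = 0.2
pi1 = 0.2 / 0.7
pi2 = 1 - pi1

# sanity check against the original system
assert abs(pi1 - (0.5 * pi1 + 0.2 * pi2)) < 1e-12
assert abs(pi2 - (0.5 * pi1 + 0.8 * pi2)) < 1e-12
print(pi1, pi2)   # 2/7 and 5/7
```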
Interpretations
π = πP
π is a left eigenvector of P (eigenvalue 1)
one column of P is redundant
need an extra equation to find π
frequency interpretation
πₖ : frequency of being in k
πₖ p_{k,j} : frequency of transitioning from k to j
Σₖ πₖ p_{k,j} : frequency of entering j
Birth-Death Processes
I
Special structure of P
[Diagram: states 0, 1, . . . , m in a line, with up-probabilities p₀, . . . , p_{m−1} and down-probabilities q₁, . . . , q_m]
π = πP works
Local balance:
πᵢ pᵢ = π_{i+1} q_{i+1}
Example: pᵢ = p, qᵢ = q, p + q < 1:
πᵢ = π₀ (p/q)ⁱ
πᵢ = (p/q)ⁱ / Σ_{j=0}^m (p/q)ʲ
Example
I
Example
Theorem
Suppose {Xᵢ} is irreducible and aperiodic. Then all states are positive recurrent if and only if the system of linear equations
πⱼ = Σ_{i∈E} πᵢ P(i, j),  j ∈ E
Σ_{i∈E} πᵢ = 1
has a solution.
Examples
Random walk
I
I
E = {0, 1, 2, . . .}
Transition matrix
P = [ q  p  0  0  . . .
      q  0  p  0  . . .
      0  q  0  p  . . .
      . . .            ]
Absorption Probabilities
I
Markov chain
Then
a_s = 1
aᵢ = 0 for all absorbing i ≠ s
aᵢ = Σⱼ aⱼ P(i, j) for all transient i
j
Minimal solution
Let
μᵢ = E[min{n ≥ 0 : Xₙ is recurrent} | X₀ = i]
Then
μᵢ = 0 for all recurrent i
Minimal solution
Let
"
S(i, j) = E
#
1{Xk =j} X0 = i
k=0
Then
S = (I Q)1
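For a concrete instance, take a hypothetical absorbing chain: gambler's ruin on {0, 1, 2, 3} with p = 1/2, whose transient states are 1 and 2. Q is then the 2×2 block of transitions among the transient states, and S = (I − Q)⁻¹ gives expected visit counts.

```python
# Expected visits S = (I − Q)⁻¹ for gambler's ruin on {0,1,2,3}, p = 1/2
# (an illustrative example; transient states are 1 and 2).
Q = [[0.0, 0.5],
     [0.5, 0.0]]

# Form I − Q and invert the 2×2 matrix by the cofactor formula.
a, b = 1 - Q[0][0], -Q[0][1]
c, d = -Q[1][0], 1 - Q[1][1]
det = a * d - b * c
S = [[ d / det, -b / det],
     [-c / det,  a / det]]
print(S)   # entries 4/3 on the diagonal, 2/3 off the diagonal
```

So starting from state 1, the chain visits state 1 on average 4/3 times and state 2 on average 2/3 times before absorption.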
Example
CTMC
I
Definition
I
I
I
Transition Function
I
I
Pt (i, j)
Properties
I
I
I
Pt(i, j) ≥ 0
Σ_{j∈E} Pt(i, j) = 1
Σ_{k∈E} Pt(i, k) Ps(k, j) = P_{t+s}(i, j)
Example
I
I
Transition function
Pt(i, j) = { 0,                               j < i
           { (λt)^{j−i} e^{−λt} / (j − i)!,   j ≥ i
Then
P[Wt > u + v | Xt = i] = P[Wt > u, Wt+u > v | Xt = i]
= P[Wt > u | Xt = i] P[Wt+u > v | Xt+u = i]
Hence
P[Wt > u | Xt = i] = e^{−λᵢ u}
for u ≥ 0 and some λᵢ ∈ [0, ∞].
Two components
I
I
A vector of rates
A stochastic matrix Q with Q(i, i) = 0
Example
M/M/1
Single server
Poisson arrivals
In state i:
I
I
I
I
R generator matrix
Example: M/M/1
R(i, j) = 0
Transition Functions
I
Start with
P_{t+h}(i, j) = Σₖ Pt(i, k) P_h(k, j)
Consider h → 0 to obtain
(d/dt) Pt = Pt R = R Pt
Solution:
Pt = e^{tR}
where
e^{tR} = Σ_{n=0}^∞ (tⁿ/n!) Rⁿ
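The series for e^{tR} converges fast enough to evaluate by direct truncation for a small generator; the two-state R below is a made-up example (rates 1 and 2), and the rows of the result must sum to 1.

```python
# Compute Pt = e^{tR} by truncating Σₙ (tⁿ/n!) Rⁿ for a hypothetical
# two-state CTMC generator R (rates chosen for illustration).
import math

R = [[-1.0,  1.0],
     [ 2.0, -2.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(R, t, terms=50):
    P = [[1.0, 0.0], [0.0, 1.0]]      # n = 0 term: identity
    term = [[1.0, 0.0], [0.0, 1.0]]   # running Rⁿ
    for n in range(1, terms):
        term = matmul(term, R)
        for i in range(2):
            for j in range(2):
                P[i][j] += t**n / math.factorial(n) * term[i][j]
    return P

Pt = expm(R, 0.5)
for row in Pt:                        # rows of a transition function sum to 1
    assert abs(sum(row) - 1.0) < 1e-9
print(Pt)
```

For this R the answer is known in closed form, P_t(0,0) = 2/3 + (1/3)e^{−3t}, which the truncated series matches.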
Steady State
I
dPt /dt = 0
Interpretations
I
Limit as t :
j = lim P[X (t) = j | X (0) = i]
t
I
I
I
I
Examples
M/M/1
M/M/
Queueing Theory
Queueing Theory
I
Queue = line
I
I
I
I
I
I
I
Notation 1/2/3/4/5
M/M/1/n
I
M/M/1/n
I
CTMC
Transition rates
I
I
R(i, i + 1) = λ
R(i, i − 1) = μ
M/M/1/n
I
Let ρ = λ/μ
Then πᵢ = π₀ ρⁱ and
π₀ = ( Σ_{i=0}^n ρⁱ )⁻¹
I
Little's Theorem
I
Let
λ(t): number of arrivals in [0, t]
Tᵢ: time in system of customer i
T̄_t = (1/λ(t)) Σ_{i=1}^{λ(t)} Tᵢ → ET
Little's Theorem
Theorem: EN = λ ET
Compare Σ_{i=1}^{λ(t)} Tᵢ with ∫₀ᵗ N(u) du and divide by t:
(λ(t)/t) · (1/λ(t)) Σ_{i=1}^{λ(t)} Tᵢ  vs  (1/t) ∫₀ᵗ N(u) du
As t → ∞
λ ET ≤ EN ≤ λ ET
M/M/1
I
Balance equations
λπ₀ = μπ₁
λπ₁ = μπ₂
...
Stability: Σ_{i=0}^∞ πᵢ < ∞
πᵢ = ρⁱ π₀ and Σ_{i=0}^∞ πᵢ = 1  ⇒  πᵢ = (1 − ρ) ρⁱ
PASTA
Back to M/M/1
EN = Σ_{i=0}^∞ i πᵢ = ρ/(1 − ρ)
EK = Σ_{i=1}^∞ (i − 1) πᵢ = EN − (1 − π₀) = ρ²/(1 − ρ)
ET = (1/μ) Σ_{i=0}^∞ (i + 1) πᵢ = 1/(μ − λ)
Back to M/M/1
EW = (1/μ) Σ_{i=0}^∞ i πᵢ = ρ/(μ − λ)
Little's law
EN = λ ET
EK = λ EW
1 − π₀ = λ/μ
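The closed-form M/M/1 quantities above are consistent with Little's law, which is easy to confirm numerically; the rates below are arbitrary with λ < μ.

```python
# M/M/1 closed-form quantities and a check of Little's law (illustrative rates, λ < μ).
lam, mu = 2.0, 5.0
rho = lam / mu

EN = rho / (1 - rho)       # mean number in system
ET = 1 / (mu - lam)        # mean time in system
EW = rho / (mu - lam)      # mean waiting time in queue
EK = rho**2 / (1 - rho)    # mean number waiting in queue

assert abs(EN - lam * ET) < 1e-12   # Little: EN = λ ET
assert abs(EK - lam * EW) < 1e-12   # Little applied to the queue: EK = λ EW
print(EN, ET)
```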
Multiple Servers
I
M/M/2
I
I
I
Analysis
I
I
I
I
I
FCFS
2 identical servers (service rate )
work-conserving
Similar to M/M/1
Consider the number of customers in the system
Transition rates
Balance equations
System stable only if < 2
Example
I
System
I
I
I
Analysis
I
I
I
I
Example
I
I
Analysis
I
I
I
I
k stages of service
k nodes
Customer routing:
I
I
I
I
probabilistic routing
move from node i to node j with probability pij
after node i, depart the network with probability 1 − Σⱼ pij
every customer eventually leaves the system
Example
Traffic equations: λᵢ = γᵢ + Σⱼ λⱼ pⱼᵢ
Stability:
ρᵢ = λᵢ/μᵢ < 1
Theorem:
lim_{t→∞} P[N₁(t) = n₁, . . . , Nₖ(t) = nₖ] = Π_{i=1}^k (1 − ρᵢ) ρᵢ^{nᵢ}
Economies of Scale
M/M/1
Dynamic Programming
Dynamic Programming
Dynamic program
I
I
I
I
Shortest Path
I
Classify the cities: All cities that can be reached in n days are
stage n cities
In general
ft(i) = min_{j: j is a t+1 city} [ c(i, j) + f_{t+1}(j) ]
Why?
I
Computational efficiency
destination at stage 6
5 choices at each of stages 1-5
5⁵ paths in total
Computational cost: 5⁶ additions and (5⁵ − 1) comparisons
Dynamic programming
I
I
Four Applications
Shortest path
Resource allocation
Equipment replacement
Single item
Planning period of T periods
I
I
Demand in period t (t = 1, . . . , T ): dt
Cost of producing x units in period t: ct (x)
I
Examples
I
I
NLP formulation
min  Σ_{t=1}^T ct(xt) + Σ_{t=1}^T ht(It)
s.t. I_{t−1} + xt = dt + It,  t = 1, . . . , T
     0 ≤ It ≤ B,  t = 1, . . . , T
     0 ≤ xt ≤ C,  t = 1, . . . , T
Dynamic programming
I
I
I
Stages: t = 1, . . . , T (time)
Recursion
I
I
I
Recursion
I
ft(I) = minimal cost for periods t, . . . , T given inventory I at the start of period t
Demand met:
I + x − dt ≥ 0
Setup
I
I
I
I
T = 4 periods
Demands: 1,4,2,3
Inventory holding cost: $0.50 per unit
Production costs:
I
I
I
I
Initialization: T = 4
I
I
d4 = 3
Cost at the beginning of stage T = 4:
f4 (I ) = c4 (d4 I )
Next stage: t = 3
I
I
d3 = 2
Cost at the beginning of stage t = 3:
f3(I) = min_{max(0, 2−I) ≤ x ≤ min(5, 6−I)} { c3(x) + 0.5 (I + x − 2) + f4(I + x − 2) }
I = 0:
f3(0) = min_{2 ≤ x ≤ 5} { c3(x) + 0.5 (x − 2) + f4(x − 2) }   (x = 5)
I = 2:
f3(2) = min_{0 ≤ x ≤ 4} { c3(x) + 0.5 x + f4(x) }   (x = 3)
Network representation
Arcs: decisions x
Resource Allocation
I
I
I
NLP formulation:
max  Σ_{i=1}^n rᵢ(dᵢ)
s.t. Σ_{i=1}^n dᵢ ≤ B
     dᵢ ∈ {0, 1, . . .}
Resource Allocation
Example
I
I
I
I
n = 3, B = 6
r₁(d₁) = 7d₁ + 2 for d₁ > 0, r₁(0) = 0
r₂(d₂) = 3d₂ + 7 for d₂ > 0, r₂(0) = 0
r₃(d₃) = 4d₃ + 5 for d₃ > 0, r₃(0) = 0
DP formulation
I
I
I
I
I
Resource Allocation
I
I
I
I
f2 (0) = 0
f2 (1) = max{r2 (0) + f3 (1), r2 (1) + f3 (0)} = 10
f2 (2) = max{r2 (0) + f3 (2), r2 (1) + f3 (1), r2 (2) + f3 (0)} = 19
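The full backward recursion for this example can be sketched as below, with the remaining budget b as the state and fᵢ(b) = max_d { rᵢ(d) + f_{i+1}(b − d) }; it reproduces f₂(1) = 10 and f₂(2) = 19.

```python
# DP for the resource-allocation example: stages i = 1..3, state = remaining budget b.
B = 6

def r(i, d):
    """Stage rewards from the example; rᵢ(0) = 0."""
    if d == 0:
        return 0
    return {1: 7 * d + 2, 2: 3 * d + 7, 3: 4 * d + 5}[i]

f = {(4, b): 0 for b in range(B + 1)}   # boundary: no stages left
for i in (3, 2, 1):                     # backward over stages
    for b in range(B + 1):
        f[(i, b)] = max(r(i, d) + f[(i + 1, b - d)] for d in range(b + 1))

print(f[(2, 1)], f[(2, 2)], f[(1, B)])
```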
Network representation
Arcs: decisions d
Example
I
I
I
DP formulation
I
I
I
I
I
I
I
Stages: t = 0, 1, . . . , 5 (time)
States: y = 0, 1, 2, 3 (age of machine)
Decisions: d = 0, 1 (keep or trade-in)
Recursion: ft (y ) (minimal net cost after period t)
Goal: f0 (y )
Secretary Problem
I
Objective:
Example
I
The owner of a lake must decide how many bass to catch and
sell each year.
P[M = 2] = 0.2.
Example
I
Each gallon is sold at the chain's three stores for $2 per gallon.
Unsold milk at the end of the day can be sold back to the dairy at $0.50 per gallon.
P[D1 = 1] = 0.6
P[D2 = 1] = 0.5
P[D3 = 1] = 0.4
No terminal stage
What to do?
Cost/Reward
I
I
Discounting
Averaging
MDP
I
I
State space S
Decision set D(i)
I
Transition probabilities
I
Expected rewards
I
Policy
I
I
I
I
Discounting
I
Discounting factor α
Either
V(i) = max_π V_π(i)
or
V(i) = min_π V_π(i)
Policy Iteration
I
Stationary policy
One-step analysis:
V_π(i) = r_{i,π(i)} + α Σⱼ p_{i,j}(π(i)) V_π(j)
Policy improvement:
V_π(i) ≤ max_{d∈D(i)} [ r_{i,d} + α Σⱼ p_{i,j}(d) V_π(j) ]
Value Iteration
V(t, i) = max_{d∈D(i)} [ r_{i,d} + α Σⱼ p_{i,j}(d) V(t − 1, j) ]
V(0, i) = 0
Let t → ∞
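A minimal value-iteration sketch on a tiny MDP; the two states, two actions, rewards, transition probabilities, and discount α = 0.9 are all invented for illustration.

```python
# Value iteration V(t,i) = max_d [ r_{i,d} + α Σⱼ p_{i,j}(d) V(t−1,j) ]
# on a hypothetical 2-state, 2-action MDP (all numbers made up).
alpha = 0.9
states, actions = [0, 1], [0, 1]
# p[i][d][j] = transition probability, rwd[i][d] = expected one-step reward
p = {0: {0: [1.0, 0.0], 1: [0.5, 0.5]},
     1: {0: [0.0, 1.0], 1: [0.5, 0.5]}}
rwd = {0: {0: 1.0, 1: 0.0},
       1: {0: 2.0, 1: 3.0}}

V = [0.0, 0.0]                 # V(0, i) = 0
for t in range(200):           # "let t → ∞": iterate until (numerically) converged
    V = [max(rwd[i][d] + alpha * sum(p[i][d][j] * V[j] for j in states)
             for d in actions)
         for i in states]
print([round(v, 3) for v in V])
```

Because the update is an α-contraction, 200 iterations are far more than enough for convergence here; the fixed point can also be checked by hand (V(1) = 2/(1 − α) = 20 under the action that stays in state 1).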