
15-251: Great Theoretical Ideas in Computer Science Lecture 18

Random Walks

image: David B. Wilson

A day in the life of me

[Figure: a three-state chain on Work, Surf, and Email. From Work: stay at Work 40%, go to Surf 60%. From Surf: stay at Surf 60%, go to Work 10%, go to Email 30%. From Email: stay at Email 50%, go to Work 50%. Successive slides step through one walk on this chain at times 9:00, 9:01, 9:02, 9:03, 9:04.]

Markov Chain Definition


Directed graph, self-loops OK.
Always assumed strongly connected in 251.
Each edge labeled by a positive probability.
At each node (state), the probabilities on outgoing edges sum up to 1.

Markov Chain Example

[Figure: a four-state chain on states 1, 2, 3, 4; its transition matrix is the matrix K shown below.]

Markov Chain Notation

Suppose there are n states.

n x n transition matrix K:  K[i,j] = Pr[i goes to j in 1 step]

For the example above:

        [ 0   .2  .7  .1 ]
K =     [ 0   .6  .4   0 ]
        [ 0   .1   0  .9 ]
        [ 1    0   0   0 ]

Rows sum to 1 (stochastic matrix).
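As a quick sanity check (my own sketch, not from the lecture), the rows of the example matrix above do sum to 1; in Python/NumPy:

import numpy as np

# Transition matrix for the four-state example (rows = current state, columns = next state)
K = np.array([
    [0.0, 0.2, 0.7, 0.1],
    [0.0, 0.6, 0.4, 0.0],
    [0.0, 0.1, 0.0, 0.9],
    [1.0, 0.0, 0.0, 0.0],
])

# Every row of a stochastic matrix sums to 1
assert np.allclose(K.sum(axis=1), 1.0)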

Markov Chain Notation

For time t = 0, 1, 2, 3, ..., Xt denotes the state (node) at time t.
Somebody decides on X0. Then X1, X2, X3, ... are random variables.
Example run on the Work/Surf/Email chain: X0 = W, X1 = S, X2 = E, X3 = W.

Transition matrix (rows and columns ordered W, S, E):

          W   S   E
      W [ .4  .6   0 ]
K =   S [ .1  .6  .3 ]
      E [ .5   0  .5 ]

Pr[X1 = S | X0 = W] = .6
Pr[X1 = S | X0 = E] = 0
Pr[X6 = W | X5 = S] = .1
In general, Pr[Xt+1 = j | Xt = i] = K[i,j].

Pr[X2 = W | X0 = S]
  =   Pr[X1 = W | X0 = S] Pr[X2 = W | X1 = W, X0 = S]
    + Pr[X1 = S | X0 = S] Pr[X2 = W | X1 = S, X0 = S]
    + Pr[X1 = E | X0 = S] Pr[X2 = W | X1 = E, X0 = S]        (by Law of Total Prob)
  =   Pr[X1 = W | X0 = S] Pr[X2 = W | X1 = W]
    + Pr[X1 = S | X0 = S] Pr[X2 = W | X1 = S]
    + Pr[X1 = E | X0 = S] Pr[X2 = W | X1 = E]
  =   .1 * .4  +  .6 * .1  +  .3 * .5   =   .25

In general, what is Pr[X2 = j | X0 = i]?  Conditioning on X1, using Law of Total Prob, gives the (i,j) entry of K*K.
In general, what is Pr[X3 = j | X0 = i]?  Conditioning on X2, using Law of Total Prob, gives the (i,j) entry of K^2*K.
In general, Pr[Xt = j | X0 = i] = K^t[i,j].
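A small sketch (not from the lecture) verifying the two-step computation above and the K^t formula, with the Work/Surf/Email states ordered W, S, E:

import numpy as np

# Work/Surf/Email chain (rows = current state, columns = next state), order W, S, E
K = np.array([
    [0.4, 0.6, 0.0],
    [0.1, 0.6, 0.3],
    [0.5, 0.0, 0.5],
])

K2 = np.linalg.matrix_power(K, 2)
# Pr[X2 = W | X0 = S] is the (S, W) entry of K^2
print(K2[1, 0])    # 0.25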

A random initial state

Often assume the initial state X0 is also chosen randomly in some way:
X0 ~ π0, where π0 is a distribution vector (nonnegative, adds to 1).
The distribution vector for X0 is usually denoted π0.

                      W    S    E
e.g.,  X0 ~ π0 = (  50%, 20%, 30% )

Pr[X1 = W] = .5 * .4 + .2 * .1 + .3 * .5 = .37

In general, if X0 ~ π0, what is Pr[X1 = j]?  Conditioning on X0, using Law of Total Prob.

The Invariant Distribution
(AKA the Stationary Distribution)

The distribution vector for X1 is π1 = π0 K.
And the distribution vector for Xt is πt = π0 K^t.

Recall: K^t[i,j] = Pr[i goes to j in exactly t steps].
When t is large, the distribution vector for Xt hardly depends on the initial distribution π0. What's up with that?

Invariant Distribution calculation

Raising K to a large power is annoying.
The limiting row of K^t (assuming the limit exists) is called the invariant distribution π.
In the long run, 29.4% of the time I'm working, 44.1% of the time I'm surfing, 26.5% of the time I'm on email.
π is invariant: if you start in this distribution and you take one more step, you're still in the distribution, i.e.,

    π = π K

For fixed K, this yields a system of equations. For the Work/Surf/Email chain:

    π[W] = .4 π[W] + .1 π[S] + .5 π[E]
    π[S] = .6 π[W] + .6 π[S] +  0 π[E]
    π[E] =  0 π[W] + .3 π[S] + .5 π[E]

and you can add π[W] + π[S] + π[E] = 1.

Solution: (π[W], π[S], π[E]) = (10/34, 15/34, 9/34) ≈ (.294, .441, .265).
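A minimal NumPy sketch (my own, states ordered W, S, E) of both views: raising K to a large power, and treating π = πK as a left-eigenvector problem.

import numpy as np

K = np.array([
    [0.4, 0.6, 0.0],
    [0.1, 0.6, 0.3],
    [0.5, 0.0, 0.5],
])

# View 1: every row of K^t approaches the invariant distribution
print(np.linalg.matrix_power(K, 50)[0])      # approx [0.294, 0.441, 0.265]

# View 2: pi = pi K means pi is a left eigenvector of K with eigenvalue 1
vals, vecs = np.linalg.eig(K.T)
pi = np.real(vecs[:, np.argmin(np.abs(vals - 1))])
pi = pi / pi.sum()                           # normalize to a distribution
print(pi)                                    # approx [0.294, 0.441, 0.265]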

Fundamental Theorem

Given a (finite, strongly connected) Markov Chain with transition matrix K, there is a unique invariant distribution π satisfying π = π K (and π[i] > 0 for all i).

π is also the limiting row of K^t as t goes to infinity, unless the chain has some stupid periodicity.
[Example: two states 1 and 2, with 1 to 2 and 2 to 1 each taken with probability 100%. There is no limiting distribution, but π = (1/2, 1/2) is still invariant.]

Expected Time from u to u

In a Markov Chain with invariant distribution π, suppose π[u] = p.
If you walked for N steps, you would expect to be at state u about pN times.
So the average time between successive visits to u would be about N / (pN) = 1/p.
Not hard to turn this into a theorem.

Mean First Recurrence Thm

In a Markov Chain with invariant distribution π,
Muu = E[# steps to hit u again if starting from u] = 1/π[u].
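As an illustration (a quick Monte Carlo sketch, not part of the lecture), the mean recurrence time of W in the Work/Surf/Email chain should come out near 1/π[W] = 3.4:

import random

# Work/Surf/Email chain: outgoing probabilities from each state
K = {'W': {'W': 0.4, 'S': 0.6},
     'S': {'W': 0.1, 'S': 0.6, 'E': 0.3},
     'E': {'W': 0.5, 'E': 0.5}}

def step(state):
    # take one random step according to the outgoing probabilities
    nxt = list(K[state])
    return random.choices(nxt, weights=[K[state][x] for x in nxt])[0]

trials, total = 100_000, 0
for _ in range(trials):
    state, steps = step('W'), 1          # leave W, then count steps until we return
    while state != 'W':
        state, steps = step(state), steps + 1
    total += steps
print(total / trials)                    # approx 3.4 = 1 / pi[W]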

Markov Chain Summary


K[i,j] = Pr[i goes to j in 1 step]
K^t[i,j] = Pr[i goes to j in exactly t steps]
If πt is the distribution at time t, then πt = π0 K^t
∃ a unique invariant distribution π s.t. π = π K
E[# steps to go from u to u] = 1/π[u]

Interlude: PageRank

1997: Web search was horrible. You search for CMU, it finds all the pages containing CMU & sorts by # occurrences.

[Images; captions: "$20 Billionaires", "Nevanlinna Prize"]

Measure importance with the Random Surfer model:
Follows a random outgoing link with prob. α.
Jumps to a completely random page with prob. 1 - α.
α is a parameter (≈ 85%).

PageRank: compute the invariant distribution π, rank pages u by highest π[u] value!
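A minimal sketch of this computation (my own illustration; the tiny link graph and the variable names are made up), using power iteration on the random-surfer chain:

import numpy as np

# Tiny made-up web graph: links[i] = pages that page i links to
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n = 4
alpha = 0.85                          # probability of following a random outgoing link

# Random-surfer transition matrix
K = np.zeros((n, n))
for i, outs in links.items():
    for j in outs:
        K[i, j] = alpha / len(outs)
    K[i, :] += (1 - alpha) / n        # jump to a uniformly random page

pi = np.full(n, 1.0 / n)
for _ in range(100):                  # power iteration: pi <- pi K
    pi = pi @ K
print(np.argsort(-pi))                # pages ranked by PageRank score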

Random walks on undirected graphs

Connected undirected graph. Each step: go to a random neighbor.

[Figure: a 4-node undirected graph; from a node of degree d, each incident edge is taken with probability 1/d (the labels 1/2 and 1/3 in the picture).]

Notation: n nodes, m edges, degree of node i is di.

What is the transition matrix K?

Adjacency matrix A: A[i,j] = 1 if {i,j} is an edge, else 0 (symmetric).
Transition matrix K: divide row i of A by di, so K[i,j] = 1/di for each neighbor j of i
(not symmetric unless all degrees are the same).

What is the invariant distribution π?

Assuming no stupid periodicity, it's the same as the limiting distribution.
(periodicity iff bipartite, actually)

Higher degree, higher limiting prob? Could π[u] just be proportional to the degree du?

Theorem:
In the random walk on an undirected (connected) graph G, the invariant distribution is π[u] = du / 2m.

Proof:
π is a distribution, since the sum of the di equals 2m.
And π K = π: for any node u,
    (π K)[u] = sum over neighbors v of u of π[v] * (1/dv) = sum over neighbors v of u of 1/2m = du / 2m = π[u].
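A small sketch checking the theorem numerically; the 4-node graph here is my own example, not necessarily the one pictured on the slide:

import numpy as np

# Undirected 4-node graph given by its adjacency matrix (symmetric)
A = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
], dtype=float)

d = A.sum(axis=1)              # degrees
K = A / d[:, None]             # random-walk transition matrix: K[i,j] = A[i,j] / d_i
pi = d / d.sum()               # claimed invariant distribution: pi[u] = d_u / (2m)

assert np.allclose(pi @ K, pi)     # pi is indeed invariant: pi K = pi
print(pi)                          # [0.2, 0.3, 0.3, 0.2]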

Corollary:
In the random walk on an undirected (connected) graph G,
Mvv = E[# steps to hit v again if starting from v] = 2m / dv.

Proof: Mean first recurrence theorem.

Examples

Pn+1, the path on n+1 nodes (so m = n edges):
    π:    ( 1/2n,  1/n, ...,  1/n,  1/2n )
    Mvv:  (  2n,    n,  ...,   n,    2n  )
For instance, on the path with 3 nodes: π = ( 1/4, 1/2, 1/4 ), Mvv = ( 4, 2, 4 ).

The clique on n nodes:
    π = ( 1/n, 1/n, ..., 1/n ),   Mvv = n.

The lollipop on n nodes: an n/2-node path attached to an n/2-node clique (so m ≈ n^2/8).
    For a node v on the path (degree 2):          Mvv ≈ n^2/8.
    For a node v in the clique (degree ≈ n/2):    Mvv ≈ n/2.

Proposition:
Let (u0, v0) be an edge in G. Then Mu0v0 = E[# steps to hit v0 starting from u0] ≤ 2m - 1 ≤ 2m.

Proof:
Suppose v0 is connected to u0, u1, u2, ..., uk.
By the corollary, 2m/dv0 = Mv0v0 = 1 + (1/dv0)(Mu0v0 + Mu1v0 + ... + Mukv0), where the sum runs over the dv0 neighbors of v0.
So Mu0v0 + Mu1v0 + ... + Mukv0 = 2m - dv0, and in particular Mu0v0 ≤ 2m - dv0 ≤ 2m - 1.

Theorem:
Let G be a connected graph and let u and v be any two vertices. Then
Muv = E[# steps to hit v starting from u] ≤ 2mn ≤ n^3.

Proof:
Pick a path u, w1, w2, ..., wr, v in G; it has at most n nodes.
E[# of steps to go from u to v] ≤ E[# of steps to go u, w1, w2, ..., wr, v in order]
                                = E[# u to w1] + E[# w1 to w2] + ... + E[# wr to v]
                                ≤ 2m + 2m + ... + 2m ≤ 2mn.

Examples

Pn+1, the path on n+1 nodes, u and v the two endpoints:
    E[# steps to hit v starting from u] ≤ 2mn = 2n(n+1) = O(n^2).
    You'll see (hmwk or recitation): it's indeed Θ(n^2).

The clique on n nodes:
    Thm: E[# steps to hit v starting from u] ≤ 2mn ≤ n^3.
    Actually: # steps to hit v starting from u ~ Geom(1/(n-1)), so the expectation is n - 1.

The lollipop on n nodes (u in the clique, v the far end of the path):
    Thm: E[# steps to hit v starting from u] ≤ n^3.
    Actually: the expectation really is Θ(n^3)!
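A quick simulation sketch (my own) matching the Θ(n^2) claim for the path: the endpoint-to-endpoint hitting time on a path with n edges averages about n^2.

import random

def hit_time_on_path(n):
    # random walk on the path 0 - 1 - ... - n, started at 0, stopped on reaching n
    pos, steps = 0, 0
    while pos != n:
        if pos == 0:
            pos = 1                           # the endpoint has only one neighbor
        else:
            pos += random.choice((-1, 1))     # interior node: two neighbors
        steps += 1
    return steps

n, trials = 30, 2000
print(sum(hit_time_on_path(n) for _ in range(trials)) / trials)   # approx n*n = 900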

An application

CONN problem:
Given a graph G, possibly disconnected, and two vertices u and v.
YES/NO: are u and v connected?

Easily solved in O(m) time using DFS/BFS.
But this requires marking nodes, hence n bits of memory need to be allocated.
(Assume the input is read-only.)

Say you're in a labyrinth, and you have very little memory. You can't even keep track of where you've been!

Difficulty: Do it without allocating any memory. You can only use a constant number of integer variables.
(Assume a variable can hold a number between 1 and n.)

A randomized algorithm for CONN:
[Aleliunas, Karp, Lipton, Lovász, Rackoff '79]

z := u
for t = 1 ... 1000n^3
    z := random-neighbor(z)
    if z = v, return YES
end for
return NO

z is one variable; the loop counter takes a couple more variables.
(A variable only holds a number between 1 and n, so the loop "for t = 1 ... 1000n^3" is really four nested loops: for t0 = 1...1000, for t1 = 1...n, for t2 = 1...n, for t3 = 1...n.)

True answer is NO:  the algorithm always says NO.
True answer is YES: the algorithm says YES w/prob ≥ 99.9%.  Why?

Suppose u and v are indeed in the same connected component. Say we do a random walk from u until we hit v. Let T = # steps it takes, a random variable.
E[T] ≤ n^3, by our theorem.
Pr[T > 1000n^3] ≤ 1/1000, by Markov's Inequality.
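A runnable sketch of this algorithm in Python (my own illustration; representing the read-only graph as an adjacency-list dict is an assumption):

import random

def conn(adj, u, v):
    # adj: dict mapping each vertex to the list of its neighbors (read-only input)
    n = len(adj)
    z = u                                    # the single "current position" variable
    for _ in range(1000 * n**3):
        if not adj[z]:                       # isolated vertex: cannot reach v
            break
        z = random.choice(adj[z])            # z := random-neighbor(z)
        if z == v:
            return "YES"
    return "NO"

# Example: components {0, 1, 2} and {3, 4}
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4], 4: [3]}
print(conn(adj, 0, 2))    # YES (with probability at least 99.9%)
print(conn(adj, 0, 3))    # always NO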


For 25 years, this was one of the most famous examples of a problem with a known randomized solution, but no known deterministic solution. In 2004, Omer Reingold gave a deterministic solution! You can escape a labyrinth using O(1) memory and no random coins!

Study Guide

Definitions: Markov Chains, Transition matrix, Distribution vectors, Invariant distribution
Theorems: Fundamental theorem, Mean first recurrence, Inv. dist. in undirected graphs, 2mn bound for u to v
Skills: Finding invariant distributions, Analyzing random walks
