
CS154 Homework Set 3

Aleksandr Palatnik
Completed: November 28, 2010

1 Bayesian Networks
For reference, we'll refer to the following rules for active paths, where O is the set of observed nodes:

1. X → Y → Z is active if Y ∉ O

2. X ← Y ← Z is active if Y ∉ O

3. X ← Y → Z is active if Y ∉ O

4. X → Y ← Z is active if Y ∈ O or Descendants(Y) ∩ O ≠ ∅
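These four rules are what "Bayes-ball" reachability encodes; a generic sketch follows (the helper and variable names are my own, and the toy graphs are illustrative, not the homework's network):

```python
from collections import deque

def d_connected(edges, src, dst, observed):
    """True iff an active path exists from src to dst given the observed set O."""
    parents, children = {}, {}
    for a, b in edges:
        children.setdefault(a, set()).add(b)
        parents.setdefault(b, set()).add(a)
    # Rule 4 needs "Y in O or some descendant of Y in O", i.e. Y is in O
    # or is an ancestor of a node in O.
    opens_collider = set(observed)
    stack = list(observed)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in opens_collider:
                opens_collider.add(p)
                stack.append(p)
    # States are (node, arrived-from-child?); start as if coming from a child.
    seen, queue = set(), deque([(src, True)])
    while queue:
        node, from_child = queue.popleft()
        if node == dst:
            return True
        if (node, from_child) in seen:
            continue
        seen.add((node, from_child))
        if from_child and node not in observed:
            # Rules 2 and 3: an unobserved non-collider passes in any direction.
            queue.extend((p, True) for p in parents.get(node, ()))
            queue.extend((c, False) for c in children.get(node, ()))
        elif not from_child:
            if node not in observed:       # rule 1: the chain continues downward
                queue.extend((c, False) for c in children.get(node, ()))
            if node in opens_collider:     # rule 4: the collider is opened
                queue.extend((p, True) for p in parents.get(node, ()))
    return False

chain = [("X", "Y"), ("Y", "Z")]
print(d_connected(chain, "X", "Z", set()))     # True  (rule 1, Y unobserved)
print(d_connected(chain, "X", "Z", {"Y"}))     # False
collider = [("X", "Y"), ("Z", "Y"), ("Y", "W")]
print(d_connected(collider, "X", "Z", set()))  # False
print(d_connected(collider, "X", "Z", {"W"}))  # True (rule 4, descendant observed)
```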

Now looking at the problem:

1. False, by rule 1 for a → b → f .

2. True. Any path connecting a and g has to pass through d or f, and each appears on such a path as a collider. Since neither d nor f (nor any descendant) is observed, rule 4 yields no active path through them.

3. False. Consider the path i–g–e–f–b:

(a) Rule 3 active: i ← g → e
(b) Rule 1 active: g → e → f
(c) Rule 4 active: e → f ← b

4. True. Considering the path from j to d, we look at the segment from j to e. This segment has to pass through g and possibly h. Along the path j–i–g–e, g creates an inactive triple since it is observed (by rule 3). Along the path j–i–h–g–e, h being observed opens the collider by rule 4, but g again creates an inactive triple by rule 3. Therefore, there is no active path from j to d.

5. True. Any path between b and i has to pass through d or f. Because neither d nor f is observed, rule 4 cannot be applied to create an active path through those nodes.

6. False. Consider the path j–i–g–e–d:

(a) Rule 2 active: j ← i ← g
(b) Rule 3 active: i ← g → e
(c) Rule 1 active: g → e → d

7. False. Consider the path i–h–g–e–f–b–c:

(a) Rule 4 active: i → h ← g
(b) Rule 3 active: h ← g → e
(c) Rule 1 active: g → e → f
(d) Rule 4 active: e → f ← b
(e) Rule 3 active: f ← b → c

2 Markov Decision Processes
1. This problem can be considered to have only three states, ⟨2, 1, 0⟩, each representing the distance between P1 and P2. The state changes are dictated by the following probabilities:

State change   Probability
2 → 2          p
2 → 1          1 − p
1 → 1          p
1 → 0          1 − p
with no other possible transitions.


The expected reward of being in a state is a combination of the expected value on the
transition and the expected value of being in the following state. This can be represented
recursively as such:
E2 = p(1 + γE2) + (1 − p)(0 + γE1)
E1 = p(1 + γE1) + (1 − p)(γE0 − 10)
E0 = 0

where γ is the discount factor reducing the value of subsequent rewards. We can then rearrange the equations as such:

E2 = p(1 + γE2) + (1 − p)(0 + γE1)
⇒ E2(1 − γp) = p + (1 − p)γE1
⇒ E2 = (p + (1 − p)γE1) / (1 − γp)

E1 = p(1 + γE1) + (1 − p)(γE0 − 10)
⇒ E1 = (p + (1 − p)(−10)) / (1 − γp)

Combining these two equations, we get:

E2 = (p + (1 − p)γ · (p + (1 − p)(−10)) / (1 − γp)) / (1 − γp)
   = p/(1 − γp) + γ(p − p² − 10(1 − p)²)/(1 − γp)²
Using γ = .95 and p = .9, the long-term future reward with the starting configuration described is 5.755.
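As a sanity check, the closed forms can be evaluated numerically and cross-checked against the recursive definitions (a quick sketch, not part of the original solution):

```python
gamma, p = 0.95, 0.9

# Closed forms from the rearranged equations (E0 = 0):
E1 = (p + (1 - p) * (-10)) / (1 - gamma * p)
E2 = (p + (1 - p) * gamma * E1) / (1 - gamma * p)

# Cross-check by iterating the recursive definitions to their fixed point
# (the map is a contraction since gamma * p < 1).
e1 = e2 = 0.0
for _ in range(2000):
    e1 = p * (1 + gamma * e1) + (1 - p) * (gamma * 0 - 10)
    e2 = p * (1 + gamma * e2) + (1 - p) * (0 + gamma * e1)

print(round(E2, 3))  # 5.755
```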
2. Since moving right is the same as above, there is no need to recalculate the expected value
for that. For the down direction, we can calculate the expected value as such:
E = p(1/2 + γE) + (1 − p)(0 + γE)
⇒ E(1 − pγ − (1 − p)γ) = p/2
⇒ E = p / (2(1 − pγ − (1 − p)γ)) = p / (2(1 − γ))
which for γ = .95 and p = .9 produces an expected value of 9, easily making the down
direction the better choice.
By setting the two expected-value functions equal to each other and fixing γ = .95, we get the cut-off point (via the Solve function of Mathematica) to be p ≈ .93.
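The cut-off can also be reproduced without Mathematica by bisecting on the difference of the two expected values (a sketch; Solve is what the write-up actually used):

```python
gamma = 0.95

def e_right(p):
    """Closed form from part 1 for moving right."""
    e1 = (p + (1 - p) * (-10)) / (1 - gamma * p)
    return (p + (1 - p) * gamma * e1) / (1 - gamma * p)

def e_down(p):
    """Closed form for moving down: p / (2 (1 - gamma))."""
    return p / (2 * (1 - gamma))

# Down wins at p = .9 and right wins near p = 1, so bisect the crossover.
lo, hi = 0.9, 0.99
for _ in range(50):
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if e_right(mid) < e_down(mid) else (lo, mid)
print(round((lo + hi) / 2, 2))  # 0.93
```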

3 Information Gathering
1. Given the table, we can estimate the following probabilities:

x   P(S = x)        i   P(Yi = 1)
1   1/4             1   1/2
2   1/6             2   1/2
3   1/3             3   1/2
4   1/4             4   7/12

We can use this information to calculate the entropy prior to observation, H(S), using Shannon entropy:

H(S) = −Σ_x P(S = x) Log2 P(S = x)
     = 1/4 Log2(4) + 1/6 Log2(6) + 1/3 Log2(3) + 1/4 Log2(4)
     = 1/2 + .4308 + .5283 + 1/2
     = 1.9591
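This value is easy to check numerically (a minimal sketch):

```python
from math import log2

prior = [1/4, 1/6, 1/3, 1/4]                 # P(S = x) for x = 1..4
H_S = -sum(q * log2(q) for q in prior)       # Shannon entropy
print(round(H_S, 4))  # 1.9591
```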
The next step is to compute the entropy after observing each Yi:

H(S|Y1 = 1) = 1/3 Log2(3) + 0 + 1/3 Log2(3) + 1/3 Log2(3) = Log2(3) = 1.585
H(S|Y1 = 0) = 1/6 Log2(6) + 1/3 Log2(3) + 1/3 Log2(3) + 1/6 Log2(6) = 1/3 Log2(6) + 2/3 Log2(3) = 1.9182
H(S|Y2 = 1) = 1/2 Log2(2) + 1/2 Log2(2) = Log2(2) = 1
H(S|Y2 = 0) = 1/3 Log2(3) + 1/6 Log2(6) + 1/2 Log2(2) = .5283 + .4308 + .5 = 1.4591
H(S|Y3 = 1) = 1/2 Log2(2) + 1/2 Log2(2) = Log2(2) = 1
H(S|Y3 = 0) = 1/2 Log2(2) + 1/3 Log2(3) + 1/6 Log2(6) = 1.4591
H(S|Y4 = 1) = 1/7 Log2(7) + 1/7 Log2(7) + 2/7 Log2(7/2) + 3/7 Log2(7/3) = .8021 + .5164 + .5239 = 1.8424
H(S|Y4 = 0) = 2/5 Log2(5/2) + 1/5 Log2(5) + 2/5 Log2(5/2) = 1.0575 + .4644 = 1.5219
Now that we have the post-observation entropies, we can calculate the conditional entropies:

H(S|Y1) = Σ_{y1} P(y1) H(S|Y1 = y1) = 1/2(1.585) + 1/2(1.9182) = 1.7516
H(S|Y2) = 1/2(1) + 1/2(1.4591) = 1.2296
H(S|Y3) = 1/2(1) + 1/2(1.4591) = 1.2296
H(S|Y4) = 7/12(1.8424) + 5/12(1.5219) = 1.7089

This produces the following vector for information gain:

I(S; Y) = H(S) − H(S|Y) = 1.9591 − H(S|Y) = ⟨.2075, .7295, .7295, .2502⟩
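The conditional entropies and the gain vector follow mechanically from the numbers above (a check using the values already computed):

```python
H_S = 1.9591
p_one = [1/2, 1/2, 1/2, 7/12]                 # P(Yi = 1) from the table
H_post = [(1.585, 1.9182), (1.0, 1.4591),     # (H(S|Yi = 1), H(S|Yi = 0))
          (1.0, 1.4591), (1.8424, 1.5219)]

# I(S; Yi) = H(S) - [P(Yi=1) H(S|Yi=1) + P(Yi=0) H(S|Yi=0)]
gains = [H_S - (p * h1 + (1 - p) * h0)
         for p, (h1, h0) in zip(p_one, H_post)]
print([round(g, 4) for g in gains])
```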

This implies that questions 2 and 3 are the most informative, since they have the highest information gain. Suppose we start with question 2 in building the decision tree. We then need a number of additional calculations to find which of the remaining questions is most informative:

Case Y2 = 1:
H(S|Y2 = 1) = H(S′1) = 1 (as calculated before; S′1 denotes S conditioned on Y2 = 1)
H(S′1|Y1 = 1) = 2/3 Log2(3/2) + 1/3 Log2(3) = .9183
H(S′1|Y1 = 0) = 1/3 Log2(3) + 2/3 Log2(3/2) = .9183
H(S′1|Y3 = 1) = −1 Log2(1) = 0
H(S′1|Y3 = 0) = 3/4 Log2(4/3) + 1/4 Log2(4) = .8113
H(S′1|Y4 = 1) = 1/2 Log2(2) + 1/2 Log2(2) = 1
H(S′1|Y4 = 0) = 1/2 Log2(2) + 1/2 Log2(2) = 1
H(S′1|Y1) = 1/2(.9183) + 1/2(.9183) = .9183
H(S′1|Y3) = 1/3(0) + 2/3(.8113) = .5409
H(S′1|Y4) = 1/3(1) + 2/3(1) = 1
I(S′1; Y) = ⟨.0817, .4591, 0⟩ over (Y1, Y3, Y4)
This implies that after question 2, we should ask question 3. At this point we consider the cases for question 3. If it returns 1, the result is species 3; otherwise it could be species 1 or 3, so we do the math for that case:
H(S′1|Y3 = 0) = H(S″13) = .8113
H(S″13|Y1 = 1) = −1 Log2(1) = 0
H(S″13|Y1 = 0) = 1/2 Log2(2) + 1/2 Log2(2) = 1
H(S″13|Y4 = 1) = 1/2 Log2(2) + 1/2 Log2(2) = 1
H(S″13|Y4 = 0) = −1 Log2(1) = 0
H(S″13|Y1) = 1/2(0) + 1/2(1) = .5
H(S″13|Y4) = 1/2(1) + 1/2(0) = .5
I(S″13; Y) = ⟨.5, .5⟩ over (Y1, Y4)
This indicates that either question is equally useful at this point. Suppose we choose question 1: if its result is 1, then the species is 1; otherwise we ask question 4. If that returns 1, it's species 3, otherwise species 1.
Case Y2 = 0:
H(S|Y2 = 0) = H(S′0) = 1.4591 (as calculated before; S′0 denotes S conditioned on Y2 = 0)
H(S′0|Y1 = 1) = 1/3 Log2(3) + 2/3 Log2(3/2) = .9183
H(S′0|Y1 = 0) = 2/3 Log2(3/2) + 1/3 Log2(3) = .9183
H(S′0|Y3 = 1) = 1/4 Log2(4) + 3/4 Log2(4/3) = .8113
H(S′0|Y3 = 0) = −1 Log2(1) = 0
H(S′0|Y4 = 1) = 1/5 Log2(5) + 1/5 Log2(5) + 3/5 Log2(5/3) = 1.3710
H(S′0|Y4 = 0) = −1 Log2(1) = 0
H(S′0|Y1) = 1/2(.9183) + 1/2(.9183) = .9183
H(S′0|Y3) = 2/3(.8113) + 1/3(0) = .5409
H(S′0|Y4) = 5/6(1.3710) + 1/6(0) = 1.1425
I(S′0; Y) = ⟨.5408, .9182, .3166⟩ over (Y1, Y3, Y4)

This means that question 3 is the next most informative. If the result of question 3 is 0, the answer is species 2. The dataset at this point contains no results where question 4 is 0, so that question is no longer useful, making question 1 the next question. If the result of question 1 is 0, the species is species 4; otherwise it is inconclusive between species 3 and 4.

These results produce the following decision tree:

Question 2
├─ 1 → Question 3
│       ├─ 1 → Species 3
│       └─ 0 → Question 1
│               ├─ 1 → Species 1
│               └─ 0 → Question 4
│                       ├─ 1 → Species 3
│                       └─ 0 → Species 1
└─ 0 → Question 3
        ├─ 1 → Question 1
        │       ├─ 1 → Species 3 or 4
        │       └─ 0 → Species 4
        └─ 0 → Species 2

In practice, you can stop expanding the tree as soon as one result has a much higher
probability of being correct than any of the others since the cost of expanding further
may be greater than the cost of giving the wrong result.
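The derived tree can be encoded and exercised directly; a small sketch (the encoding and names are my own):

```python
# Decision tree from above, encoded as nested tuples:
# (question_index, subtree_if_answer_is_1, subtree_if_answer_is_0);
# leaves are strings.
TREE = (2, (3, "Species 3",
               (1, "Species 1",
                   (4, "Species 3", "Species 1"))),
           (3, (1, "Species 3 or 4", "Species 4"),
               "Species 2"))

def classify(answers, node=TREE):
    """answers maps question index -> 0/1; walk the tree to a leaf."""
    while not isinstance(node, str):
        q, if_yes, if_no = node
        node = if_yes if answers[q] == 1 else if_no
    return node

print(classify({2: 1, 3: 1}))            # Species 3
print(classify({2: 0, 3: 1, 1: 0}))      # Species 4
```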

2. Looking at each single question:

Question 1:
E1 = max(max_a Σ_x P(x|Y1 = 1) U(x, a), max_a Σ_x P(x|Y1 = 0) U(x, a))
   = max(max_a (1/3 U(1, a) + 1/3 U(3, a) + 1/3 U(4, a)),
         max_a (1/6 U(1, a) + 1/3 U(2, a) + 1/3 U(3, a) + 1/6 U(4, a)))
E1 = −1

Question 2: (Repeating the same calculation)

Mathematica code:
Max[Max[Table[(1/2) U[1, i] + (1/2) U[3, i], {i, 0, 4}]],
    Max[Table[(1/3) U[2, i] + (1/6) U[3, i] + (1/2) U[4, i], {i, 0, 4}]]]
E2 = −1

Question 3:

Mathematica code:
Max[Max[Table[(1/2) U[3, i] + (1/2) U[4, i], {i, 0, 4}]],
    Max[Table[(1/2) U[1, i] + (1/3) U[2, i] + (1/6) U[3, i], {i, 0, 4}]]]
E3 = −1

Question 4:

Mathematica code:
Max[Max[Table[(1/7) U[1, i] + (1/7) U[2, i] + (2/7) U[3, i] + (3/7) U[4, i], {i, 0, 4}]],
    Max[Table[(2/5) U[1, i] + (1/5) U[2, i] + (2/5) U[3, i], {i, 0, 4}]]]
E4 = −1

Questions 1, 4:

E14 = max(max_a Σ_x P(x|Y1 = 1, Y4 = 1) U(x, a), max_a Σ_x P(x|Y1 = 1, Y4 = 0) U(x, a),
          max_a Σ_x P(x|Y1 = 0, Y4 = 1) U(x, a), max_a Σ_x P(x|Y1 = 0, Y4 = 0) U(x, a))

Mathematica code:
Max[Max[Table[(1/4) U[1, i] + (1/4) U[3, i] + (1/2) U[4, i], {i, 0, 4}]],
    Max[Table[(1/2) U[1, i] + (1/2) U[4, i], {i, 0, 4}]],
    Max[Table[(1/3) U[2, i] + (1/3) U[3, i] + (1/3) U[4, i], {i, 0, 4}]],
    Max[Table[(1/3) U[1, i] + (1/3) U[2, i] + (1/3) U[3, i], {i, 0, 4}]]]
E14 = −1

Questions 2, 3:

E23 = max(max_a Σ_x P(x|Y2 = 1, Y3 = 1) U(x, a), max_a Σ_x P(x|Y2 = 1, Y3 = 0) U(x, a),
          max_a Σ_x P(x|Y2 = 0, Y3 = 1) U(x, a), max_a Σ_x P(x|Y2 = 0, Y3 = 0) U(x, a))

Mathematica code:
Max[Max[Table[(1) U[3, i], {i, 0, 4}]],
    Max[Table[(3/4) U[1, i] + (1/4) U[3, i], {i, 0, 4}]],
    Max[Table[(1/4) U[3, i] + (3/4) U[4, i], {i, 0, 4}]],
    Max[Table[(1) U[2, i], {i, 0, 4}]]]
E23 = 1

Each question alone is worth −1, yet questions 2 and 3 together are worth 1: the marginal value of asking question 3 is larger once question 2 has been asked than it is on its own. Since submodularity requires diminishing marginal returns, this implies that the value of information is not submodular.
