Andrew Rosenberg
March 4, 2010
Today
[Figure: directed graphical model over nodes x0, x1, x2, x3, x4, x5]
$$\theta(x_i, \pi_i) = \frac{m(x_i, \pi_i)}{m(\pi_i)}$$
Pass messages (small tables) around the graph. The messages are small functions that propagate potentials around an undirected graphical model. The inference technique is the Junction Tree Algorithm.
Junction Tree Algorithm
1. Moralization
2. Introduce Evidence
3. Triangulate
4. Construct Junction Tree
5. Propagate Probabilities
Moralization
Converts a directed graph to an undirected graph. Moralization marries the parents.
Insert an undirected edge between two nodes that have a child in common. Replace all directed edges with undirected edges.
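The two moralization rules above can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `moralize` and the parent-list representation are assumptions, and the example graph is the one whose factorization is p(x0)p(x1|x0)p(x2|x0)p(x3|x1)p(x4|x2)p(x5|x1,x4).

```python
from itertools import combinations

def moralize(parents):
    """Return the undirected edge set of the moral graph.

    `parents` (a hypothetical representation) maps each node to the list
    of its parents in the directed graph.
    """
    edges = set()
    for child, pars in parents.items():
        # Replace each directed edge parent -> child with an undirected edge.
        for p in pars:
            edges.add(frozenset((p, child)))
        # "Marry" every pair of parents that share this child.
        for p, q in combinations(pars, 2):
            edges.add(frozenset((p, q)))
    return edges

# Directed graph for p(x0)p(x1|x0)p(x2|x0)p(x3|x1)p(x4|x2)p(x5|x1,x4).
parents = {
    "x0": [], "x1": ["x0"], "x2": ["x0"],
    "x3": ["x1"], "x4": ["x2"], "x5": ["x1", "x4"],
}
moral_edges = moralize(parents)
```

Here the only married pair is x1 and x4, the two parents of x5, so the moral graph has the six original edges plus the edge x1-x4.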
[Figure: the directed graph over x0, ..., x5 (left) and its moral graph (right)]
Moralization
[Figure: the directed graph over x0, ..., x5 and its moral graph]
$$p(x_0)\,p(x_1 \mid x_0)\,p(x_2 \mid x_0)\,p(x_3 \mid x_1)\,p(x_4 \mid x_2)\,p(x_5 \mid x_1, x_4)$$
Introduce Evidence
Given a moral graph, identify the observed nodes $x_E$. Reduce the probability functions, since the values of the observed nodes are fixed, and keep only probability functions over the remaining nodes $x_F$.
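$$\frac{1}{Z}\,\psi(x_1, x_2, x_3, x_4)\,\psi(x_4, x_5)\,\psi(x_4, x_6)\,\psi(x_4, x_7)$$
$$\frac{1}{Z}\,\psi(x_1, x_2, x_3 = \bar{x}_3, x_4 = \bar{x}_4)\,\psi(x_4 = \bar{x}_4, x_5)\,\psi(x_4 = \bar{x}_4, x_6)\,\psi(x_4 = \bar{x}_4, x_7)$$
$$\frac{1}{Z}\,\psi(x_1, x_2)\,\psi(x_5)\,\psi(x_6)\,\psi(x_7)$$
[Table: example entries of a reduced potential: .4, .12, .1, .15]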
Normalization calculation: avoid it until the end, when we want to calculate an individual marginal.
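Introducing evidence amounts to slicing each potential table at the observed values and keeping the result unnormalized. The sketch below assumes potentials are stored as plain dictionaries keyed by value tuples; the helper name `reduce_potential` and the table entries are illustrative assumptions, not taken from the lecture.

```python
from itertools import product

def reduce_potential(scope, table, evidence):
    """Slice a potential table at the observed values.

    scope: tuple of variable names; table: dict mapping value tuples to
    non-negative numbers; evidence: dict of observed variable -> value.
    Returns the reduced (scope, table) over the unobserved variables only.
    """
    keep = [i for i, v in enumerate(scope) if v not in evidence]
    new_scope = tuple(scope[i] for i in keep)
    new_table = {}
    for values, weight in table.items():
        # Keep only the entries that agree with every observed value.
        if all(values[i] == evidence[v]
               for i, v in enumerate(scope) if v in evidence):
            new_table[tuple(values[i] for i in keep)] = weight
    return new_scope, new_table

# Hypothetical binary potential psi(x4, x5); the entries are made up.
psi_45 = dict(zip(product((0, 1), repeat=2), (0.4, 0.12, 0.1, 0.15)))
scope, reduced = reduce_potential(("x4", "x5"), psi_45, {"x4": 1})
```

The reduced table is a function of x5 alone and is deliberately left unnormalized, in line with postponing the normalization until a marginal is needed.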
Junction Trees
Triangulation
We need to guarantee that the junction graph, made up of the cliques and separators of an undirected graph, is a tree. To do this, we make triangles (3-node cliques), i.e. we add edges to eliminate any chordless cycle of 4 or more nodes.
There are potentially many choices for which edge to add.
We'd like to keep the largest clique size small, which means small tables. However, finding the triangulation that minimizes the largest clique size is NP-hard. A suboptimal triangulation is acceptable (it can be found in polynomial time) and generally doesn't introduce too many extra dimensions.
Triangulation Examples
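One common polynomial-time heuristic is greedy elimination with a min-fill rule; the lecture does not name a specific heuristic, so this choice is an assumption. The sketch below repeatedly eliminates the node whose remaining neighbours need the fewest new edges and records every edge it adds. The function name `triangulate` is hypothetical.

```python
from itertools import combinations

def triangulate(nodes, edges):
    """Greedy min-fill triangulation; returns the set of added (fill-in) edges."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    fill = set()
    remaining = set(nodes)

    def missing(v):
        # Pairs of not-yet-eliminated neighbours of v that are not connected.
        nbrs = adj[v] & remaining
        return [(a, b) for a, b in combinations(sorted(nbrs), 2)
                if b not in adj[a]]

    while remaining:
        # Eliminate the node whose neighbours need the fewest new edges.
        v = min(sorted(remaining), key=lambda n: len(missing(n)))
        for a, b in missing(v):
            adj[a].add(b)
            adj[b].add(a)
            fill.add(frozenset((a, b)))
        remaining.remove(v)
    return fill

# A chordless 4-cycle a-b-c-d-a needs exactly one chord.
fill = triangulate("abcd", [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")])
```

On the 4-cycle, eliminating `a` first connects its neighbours `b` and `d`, which splits the cycle into two triangles. Choosing a different elimination order would add the chord a-c instead, which illustrates that there are several valid choices of edge.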
Construct Junction Tree
Also: a junction tree has the largest total separator cardinality, e.g. $|\phi(B, C)| + |\phi(C, D)| > |\phi(C, D)| + |\phi(D)|$.
Given a set of cliques, we must connect them such that the Running Intersection Property holds: every clique on the path between two cliques contains their intersection.
A valid junction tree maximizes the total cardinality of the separators.
Kruskal's algorithm
1. Initialize a tree with no edges.
2. Calculate the size of the separators between all pairs of cliques.
3. Connect the two cliques with the largest separator cardinality (that doesn't create a loop).
Repeat step 3 until all cliques are connected.
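The steps above can be sketched as a maximum-weight spanning tree over the cliques, with the separator size as the edge weight. The function name `build_junction_tree`, the union-find loop check, and the three example cliques are assumptions made for illustration; pairs of cliques with an empty intersection are skipped, so a disconnected graph would yield a forest.

```python
def build_junction_tree(cliques):
    """Maximum-weight spanning tree over cliques, weight = separator size."""
    # Candidate edges between all pairs of cliques, weighted by |intersection|.
    candidates = []
    for i in range(len(cliques)):
        for j in range(i + 1, len(cliques)):
            sep = cliques[i] & cliques[j]
            if sep:
                candidates.append((len(sep), i, j, sep))
    candidates.sort(key=lambda e: -e[0])  # largest separators first (stable)

    parent = list(range(len(cliques)))  # union-find forest for loop detection

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    tree = []
    for _, i, j, sep in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a loop
            parent[ri] = rj
            tree.append((i, j, sep))
    return tree

# Hypothetical cliques of a triangulated graph.
cliques = [frozenset("ABD"), frozenset("BCD"), frozenset("CDE")]
tree = build_junction_tree(cliques)
```

For these cliques the tree links ABD to BCD through {B, D} and BCD to CDE through {C, D}, for a total separator cardinality of 4; linking ABD to CDE through {D} would give a smaller total and would be rejected.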
Now what?
We have a valid junction tree! What can we do with it? (Or... who cares?)
Probabilities in junction trees:
$$p(X) = \frac{1}{Z} \prod_{C} \psi(x_C)$$
$$p(X) = \frac{1}{Z} \frac{\prod_{C} \psi(x_C)}{\prod_{S} \phi(x_S)}$$
The second form is equivalent to de-absorbing smaller cliques from maximal cliques. It doesn't change anything; it is just a less compact description.
We can represent CPTs as clique and separator potential functions (with a normalization term).
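Example: cliques $\{A, B, D\}$ and $\{B, C, D\}$ with separator $\{B, D\}$. At consistency:
$$\sum_{A} p(A, B, D) = p(B, D), \qquad \sum_{C} p(B, C, D) = p(B, D)$$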
The Junction Tree Algorithm sends messages between cliques and separators until consistency is reached.
Send a message from each clique to its separator. The message is what the clique thinks the separator marginal should be. Normalize each clique by the messages from its separators so that clique and separator agree.
If not....
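Message from clique $V$ to clique $W$ through separator $S$:
$$\phi_S^{*} = \sum_{V \setminus S} \psi_V, \qquad \psi_W^{*} = \frac{\phi_S^{*}}{\phi_S}\,\psi_W, \qquad \psi_V^{*} = \psi_V$$
Message back from $W$ to $V$:
$$\phi_S^{**} = \sum_{W \setminus S} \psi_W^{*}, \qquad \psi_V^{**} = \frac{\phi_S^{**}}{\phi_S^{*}}\,\psi_V^{*}, \qquad \psi_W^{**} = \psi_W^{*}$$
After both messages the cliques agree on the separator:
$$\sum_{V \setminus S} \psi_V^{**} = \frac{\phi_S^{**}}{\phi_S^{*}} \sum_{V \setminus S} \psi_V^{*} = \frac{\phi_S^{**}}{\phi_S^{*}}\,\phi_S^{*} = \phi_S^{**} = \sum_{W \setminus S} \psi_W^{**}$$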
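When convergence is reached, clique potentials are marginals and separator potentials are submarginals. $p(x)$ never changes because of this message passing:
$$p(x) = \frac{1}{Z} \frac{\psi_V\,\psi_W}{\phi_S} = \frac{1}{Z} \frac{\psi_V\,\frac{\phi_S^{*}}{\phi_S}\,\psi_W}{\phi_S^{*}} = \frac{1}{Z} \frac{\psi_V^{*}\,\psi_W^{*}}{\phi_S^{*}}$$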
This implies that, so long as p(x) is correctly represented in the potential functions, the junction tree algorithm can be used to make each potential correspond to an appropriate marginal without changing the overall probability function.
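A small numerical check can make this concrete. The sketch below assumes a hypothetical three-node chain A -> B -> C with made-up CPT values, giving cliques V = {A, B} and W = {B, C} with separator S = {B}. It sends one message in each direction and compares the joint before and after.

```python
vals = (0, 1)
# Hypothetical CPTs for a chain A -> B -> C (the numbers are made up).
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}  # key (a, b)
p_c_given_b = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}  # key (b, c)

# Initialization: clique tables from the CPTs, separator set to 1.
psi_v = {(a, b): p_a[a] * p_b_given_a[(a, b)] for a in vals for b in vals}
psi_w = dict(p_c_given_b)
phi_s = {b: 1.0 for b in vals}

def joint(a, b, c):
    # Product of clique potentials divided by the separator potential (Z = 1).
    return psi_v[(a, b)] * psi_w[(b, c)] / phi_s[b]

before = {(a, b, c): joint(a, b, c) for a in vals for b in vals for c in vals}

# Message V -> W: marginalize V onto S, then rescale W by new/old separator.
phi_new = {b: sum(psi_v[(a, b)] for a in vals) for b in vals}
psi_w = {(b, c): psi_w[(b, c)] * phi_new[b] / phi_s[b] for b in vals for c in vals}
phi_s = phi_new

# Message W -> V: marginalize W onto S, then rescale V by new/old separator.
phi_new = {b: sum(psi_w[(b, c)] for c in vals) for b in vals}
psi_v = {(a, b): psi_v[(a, b)] * phi_new[b] / phi_s[b] for a in vals for b in vals}
phi_s = phi_new

after = {(a, b, c): joint(a, b, c) for a in vals for b in vals for c in vals}
```

After the two messages, `psi_v` holds p(A, B), `psi_w` holds p(B, C), and `phi_s` holds p(B), while `before` and `after` agree entry by entry up to floating-point error.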
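Convert the directed graph to the junction tree. Initialize the separators to 1 and the clique tables to the CPTs.
$$p(X) = p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_2)\,p(x_4 \mid x_3)\,p(x_5 \mid x_3)\,p(x_6 \mid x_5)\,p(x_7 \mid x_5)$$
$$p(X) = \frac{1}{Z} \frac{\prod_{C} \psi(X_C)}{\prod_{S} \phi(X_S)} = \frac{1}{1} \cdot \frac{p(x_1, x_2)\,p(x_3 \mid x_2)\,p(x_4 \mid x_3)\,p(x_5 \mid x_3)\,p(x_6 \mid x_5)\,p(x_7 \mid x_5)}{1 \cdot 1 \cdot 1 \cdot 1 \cdot 1}$$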
Conditionals
[Equations: introducing the evidence $A = 1$ into the clique potential $\psi_{AB}$ and computing the conditional $p(B, C \mid A = 1)$]
Moralization: polynomial in the number of nodes (variables).
Introduce Evidence: polynomial in the number of nodes (variables).
Triangulate: a suboptimal triangulation is polynomial; the optimal one is NP-hard.
Propagate Probabilities: polynomial in the number of cliques, exponential in the size of the cliques.
Bye
Next
Clustering Preview.