Vous êtes sur la page 1sur 38

Lecture 10: Junction Tree Algorithm CSCI 780 - Machine Learning

Andrew Rosenberg

March 4, 2010

1/1

Today

Junction Tree Algorithm


Ecient calculation of marginals in a graphical model

2/1

Example of Graphical Model.

x0

x1 x2

x3 x5 x4

(xi , i ) =

m(xi , i ) m(i )

3/1

Ecient Computation of Marginals

b a e c

Pass messages (small tables) around the graph. The messages will be small functions that propagate potentials around an undirected graphical model. The inference technique is the Junction Tree Algorithm

4/1

Junction Tree Algorithm

Junction Tree Algorithm Moralization Introduce Evidence Triangulate Construct Junction Tree Propagate Probabilities

5/1

Junction Tree Algorithm

Junction Tree Algorithm Moralization Introduce Evidence Triangulate Construct Junction Tree Propagate Probabilities

6/1

Moralization
Converts a directed graph to an undirected graph. Moralization marries the parents.
Insert an undirected edge between two nodes that have a child in common. Replace all directed edges with undirected edges.

x0

x1 x2 x1 x2

x3 x5 x4 x3 x5 x4
7/1

x0

Moralization

x3 x1 x0 x2 x4 x5

p(x0 )p(x1 |x0 )p(x2 |x0 )p(x3 |x1 )p(x4 |x2 )p(x5 |x1 , x4 ) x3 x1 x0 x2 x4 x5

1 (x0 , x1 )(x0 , x2 )(x1 , x3 )(x2 , x4 )(x1 , x4 , x5 ) Z

8/1

Another Moralization Example

9/1

Junction Tree Algorithm

Junction Tree Algorithm Moralization Introduce Evidence Triangulate Construct Junction Tree Propagate Probabilities

10 / 1

Introduce Evidence
Given a moral graph. Identify what is observed xE . Reduce Probability functions since we know that some values are xed. So only keep probability functions over remaining nodes xF
1 (x1 , x2 , x3 , x4 )(x4 , x5 )(x4 , x6 )(x4 , x7 ) Z 1 (x1 , x2 , x3 = x3 , x4 = x4 )(x4 = x4 , x5 )(x4 = x4 , x6 )(x4 = x4 , x7 ) Z 1 (x1 , x2 )(x5 )(x6 )(x7 ) Z .4 .12 .1 .15

p(X ) p(XF |XE )

Replace potential functions with slices Requires a dierent normalization term

11 / 1

Introduce Evidence

Observing xE separates nodes.

Normalization Calculation Avoid it until the end, when we want to calculate an individual marginal.

12 / 1

Junction Tree Algorithm

Junction Tree Algorithm Moralization Introduce Evidence Triangulate Construct Junction Tree Propagate Probabilities

13 / 1

Junction Trees

Ultimately we want to construct Junction Trees.


Each node is a clique of variables in a modal graph. Edges connect cliques There is a unique path from a node to the root Between each connected clique node there is a separator node. Separators contain intersections of variables

14 / 1

Triangulation

How do we construct a Junction Tree?

Need to guarantee that a Junction Graph, made up of the cliques and separators of an undirected graph is a Tree To do this, we make triangles (3 node cliques)
I.e. eliminate any (chordless) cycles of 4 or more nodes.

15 / 1

Triangulation
There are potentially many choices for which edge to add.

Wed like to keep the largest clique size small Small tables. However, Triangulation that minimizes the largest clique size is NP-complete. Suboptimal triangulation is acceptable (poly-time) and generally doesnt introduce too many extra dimensions.
16 / 1

Triangulation
There are potentially many choices for which edge to add.

Wed like to keep the largest clique size small Small tables. However, Triangulation that minimizes the largest clique size is NP-complete. Suboptimal triangulation is acceptable (poly-time) and generally doesnt introduce too many extra dimensions.
17 / 1

Triangulation Examples

18 / 1

Triangulation Examples

19 / 1

Triangulation Examples

20 / 1

Triangulation Examples

21 / 1

Triangulation Examples

22 / 1

Triangulation Examples

23 / 1

Junction Tree Algorithm

Junction Tree Algorithm Moralization Introduce Evidence Triangulate Construct Junction Tree Propagate Probabilities

24 / 1

Constructing Junction Trees


Multiple trees can be constructed from the same graph.

Junction Trees must satisfy the Running Intersection Property


On the path connecting clique node V to clique node W , all other clique nodes must include the nodes in V W .

25 / 1

Constructing Junction Trees


Multiple trees can be constructed from the same graph.

Junction Trees must satisfy the Running Intersection Property


On the path connecting clique node V to clique node W , all other clique nodes must include the nodes in V W .

Also: A Junction Tree has the largest total separator cardinality. |(B, C )| + |(C , D)| > |(C , D)| + |(D)|
26 / 1

Forming a Junction Tree

Given a set of cliques, must connect the nodes, such that the Running Intersection Property holds.
A valid Junction Tree maximizes the cardinality of the separators

Kruskals algorithm
1 2 3

Initialize a tree with no edges Calculate the size of separators between all pairs Connect two cliques with the largest separator cardinality (that doesnt create a loop Repeat 3 until all nodes are connected.

27 / 1

Junction Tree Algorithm

Junction Tree Algorithm Moralization Introduce Evidence Triangulate Construct Junction Tree Propagate Probabilities

28 / 1

Now what?
We have a valid Junction Tree! What can we do with it. (or...who cares?) Probabilities in Junction Trees. p(X ) = p(X ) = 1 Z 1 Z (xC )
C C

(xC ) S (xS )

This is equivalent to de-absorbing smaller cliques from maximal cliques. Doesnt change anything, just a less compact description.

29 / 1

Conversion from Directed Graph


Example Conversion. p(X ) = 1 Z
C

(xC ) S (xS )

We can represent CPTs as clique and separator potential functions (with a normalization term).

30 / 1

Junction Tree Algorithm


Need to make marginals consistent.
(A, B, D) (B, D) (B, C , D) p(A, B, D) p(B, D) p(B, C , D)

X
A

p(A, B, D) p(B, D)

p(B, D)

X
D

p(B, C , D)

p(B, D)

The Junction Tree Algorithm sends messages between cliques and separators until consistency is reached.
31 / 1

Junction Tree Algorithm

Send a message from each clique to its separator. The message is what the clique thinks the marginal should be Normalize the clique by each message from its separators such that it agrees.

If they agree, nished! V = S = p(S) = S =


V \S W \S

If not....

32 / 1

Junction Tree Algorithm Message Passing

S
W V

= = =

X
V \S

S
V W

= = =

X
W \S

S W S V X
V \S V

S V S
W

= = =

X S V S
V \S

X S V S V \S X W = S
W \S

33 / 1

Junction Tree algorithm

When convergence is reached clique potentials are marginals and separator potentials are submarginals p(x) never changes because of this message passing.
S 1 V S W 1 V W 1 V W = = p(x) = Z S Z S Z S

This implies that, so long as p(x) is correctly represented in the potential functions, the junction tree algorithm can be used to make each potential correspond to an appropriate marginal without changing the overall probability function.

34 / 1

Converting From DAG to Junction Tree

Convert the Directed Graph to the Junction Tree Initialize separators to 1, and the clique tables to CPTs.
p(X ) = p(x1 )p(x2 |x1 )p(x3 |x2 )p(x4 |x3 )p(x5 |x3 )p(x6 |x5 )p(x7 |x5 ) p(X ) = = Q 1 C (Xc ) Q Z S (XS ) 1 p(x1 , x2 )p(x3 |x2 )p(x4 |x3 )p(x5 |x3 )p(x6 |x5 )p(x7 |x5 ) 1 11111
35 / 1

Run JTA to set potential functions to marginals.

Evidence in Junction Tree


Initialize the same way. AB BC B = = = p(A, B) p(C |B) 1

Update with a slice instead of the whole table. B


BC AB

= = =

X
A B

AB (A = 1) =

X
A

p(A, B)(A = 1) = p(A = 1, B)

p(A = 1, B) BC = p(C |B) = p(A = 1, B, C ) B 1 AB = p(A = 1, B)


BC B,C BC

Conditionals

p(B, C |A = 1) = P

36 / 1

Eciency of The Junction Tree Algorithm


All steps are ecient. 1 Construct CPTs
Polynomial in # of data points
2

Moralization
Polynomial in # of nodes (variables)

Introduce Evidence
Polynomial in # of nodes (variables)

Triangulate
Suboptimal=Polynomial, Optimal=NP

Construct Junction Tree


Polynomial in the number of cliques Identifying Cliques = Polynomial in the number of nodes.

Propagate Probabilities
Polynomial in number of cliques, Exponential in size of cliques

37 / 1

Bye

Next
Clustering Preview.

38 / 1

Vous aimerez peut-être aussi