Lei Shi
Department of Computer Science and
Engineering
State University of New York at Buffalo
Outline
Introduction
Apriori-based Subgraph Mining
Pattern Growth Subgraph Mining
Summary
Graph Classification
Graph clustering
Important node identification
Bridge and hub identification
[Figure: three example graphs (1)-(3) with vertex labels A, B, and C, illustrating a subgraph and its support]
Graph isomorphism
Outline
Introduction and Background
Apriori-based Subgraph Mining
Pattern Growth Subgraph Mining
Summary
Apriori-based Approach
Apriori-based method
Apriori Property
Candidate Generation
Create a set of candidates of size k+1
- from two given frequent k-subgraphs
- containing the same (k-1)-subgraph
- may result in several candidates of size k+1
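The join step above can be sketched as follows. This is a simplified illustration, not the method of any particular paper: it assumes every vertex has a unique identifier, so a subgraph is just a set of edges and "containing the same (k-1)-subgraph" reduces to sharing k-1 edges. With real labeled graphs the shared core must be found up to isomorphism, which is why one pair can yield several candidates. The name `join_candidates` and the sample edges are hypothetical.

```python
from itertools import combinations


def join_candidates(frequent_k):
    """Join pairs of frequent k-edge subgraphs that share k-1 edges.

    Assumption: each subgraph is a frozenset of edges over uniquely
    named vertices, and all subgraphs in `frequent_k` have k edges.
    """
    candidates = set()
    for g1, g2 in combinations(frequent_k, 2):
        # The common (k-1)-edge core is the intersection of the edge sets.
        if len(g1 & g2) == len(g1) - 1:
            # The union has exactly k+1 edges.
            candidates.add(g1 | g2)
    return candidates


# Hypothetical frequent 2-edge subgraphs over vertices a, b, c, d.
ab, bc, cd, bd = ("a", "b"), ("b", "c"), ("c", "d"), ("b", "d")
frequent_2 = {
    frozenset({ab, bc}),
    frozenset({bc, cd}),
    frozenset({ab, bd}),
}
candidates_3 = join_candidates(frequent_2)
```

Each candidate still has to be counted against the graph dataset; the join only limits which (k+1)-subgraphs are worth counting.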
Apriori-based method
Graph candidate generated Example
Apriori-based method
FlowChart
Apriori-based method
Experiment Result
- Chemical compound dataset containing 340 compounds and 24 different atoms (vertices)
Outline
Introduction
Apriori-based Subgraph Mining
Pattern Growth Subgraph Mining
Summary
Motivation of gSpan
DFS code
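An edge is represented by a 5-tuple:
(i, j, l_i, l_(i,j), l_j)
where i and j are the DFS discovery indices of the two endpoints, l_i and l_j are their vertex labels, and l_(i,j) is the edge label.
Example: (0, 1, X, a, Y)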
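A minimal sketch of how one DFS traversal turns a labeled graph into a sequence of such 5-tuples. It assumes a simple undirected graph given as adjacency lists and emits, for each newly discovered vertex, its backward edges before its forward edges. The function name `dfs_code`, the vertex names p, q, r, and the sample labels are hypothetical. The result is one DFS code of the graph, not necessarily the minimum one.

```python
def dfs_code(adj, vlabel, elabel, start):
    """Return one DFS code (list of 5-tuples) for a connected simple graph.

    adj:    vertex -> list of neighbours (order decides the traversal)
    vlabel: vertex -> vertex label
    elabel: frozenset({u, w}) -> edge label
    """
    index = {start: 0}   # DFS discovery index of each visited vertex
    used = set()         # edges already written to the code
    code = []

    def edge_tuple(u, w):
        return (index[u], index[w], vlabel[u],
                elabel[frozenset((u, w))], vlabel[w])

    def visit(u):
        # Backward edges: to already discovered vertices, smallest index first.
        back = sorted(
            (w for w in adj[u]
             if w in index and frozenset((u, w)) not in used),
            key=index.get,
        )
        for w in back:
            used.add(frozenset((u, w)))
            code.append(edge_tuple(u, w))
        # Forward edges: discover new vertices and recurse.
        for w in adj[u]:
            if w not in index:
                index[w] = len(index)
                used.add(frozenset((u, w)))
                code.append(edge_tuple(u, w))
                visit(w)

    visit(start)
    return code


# Hypothetical triangle: p (label X), q (label Y), r (label X).
adj = {"p": ["q", "r"], "q": ["p", "r"], "r": ["q", "p"]}
vlabel = {"p": "X", "q": "Y", "r": "X"}
elabel = {
    frozenset(("p", "q")): "a",
    frozenset(("q", "r")): "b",
    frozenset(("r", "p")): "a",
}
code = dfs_code(adj, vlabel, elabel, "p")
# code == [(0, 1, 'X', 'a', 'Y'), (1, 2, 'Y', 'b', 'X'), (2, 0, 'X', 'a', 'X')]
```

Starting from a different vertex or visiting neighbours in a different order gives a different code for the same graph, which is why a canonical (minimum) code is needed.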
DFS code
Second Step: DFS Lexicographic Order
gSpan
gSpan
Summary
Graph representation
Flattened representation vs. DFS code
Pattern-Growth Approach
freq(g)
where freq(g) is the percentage of graphs in D that contain g.
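As an illustration of this definition, the sketch below computes freq(g) by brute force. It assumes small, undirected, labeled graphs and tests containment as label-preserving (non-induced) subgraph isomorphism by trying every injective vertex mapping, so it is only practical for toy inputs. The helper names `make_graph`, `contains`, and `freq`, and the sample dataset, are hypothetical.

```python
from itertools import permutations


def make_graph(vlabels, edges):
    """Build a graph as (vertex -> label, frozenset({u, v}) -> edge label)."""
    return vlabels, {frozenset((u, v)): lab for u, v, lab in edges}


def contains(big, small):
    """True if `small` maps into `big` preserving vertex and edge labels."""
    big_v, big_e = big
    small_v, small_e = small
    nodes = list(small_v)
    # Try every injective assignment of small's vertices to big's vertices.
    for image in permutations(big_v, len(nodes)):
        m = dict(zip(nodes, image))
        if any(small_v[v] != big_v[m[v]] for v in nodes):
            continue
        if all(big_e.get(frozenset(m[x] for x in e)) == lab
               for e, lab in small_e.items()):
            return True
    return False


def freq(g, D):
    """Percentage of graphs in D that contain g."""
    return 100.0 * sum(contains(G, g) for G in D) / len(D)


# Hypothetical dataset of three graphs with vertex labels A, B, C.
D = [
    make_graph({0: "A", 1: "A", 2: "C"}, [(0, 1, "-"), (1, 2, "-")]),
    make_graph({0: "A", 1: "C", 2: "B"}, [(0, 1, "-"), (1, 2, "-")]),
    make_graph({0: "A", 1: "B"}, [(0, 1, "-")]),
]
g = make_graph({0: "A", 1: "C"}, [(0, 1, "-")])
support = freq(g, D)  # the A-C edge occurs in two of the three graphs
```

The containment test is the expensive part of counting, which is why the mining algorithms in this deck try to generate as few candidates as possible.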
Apriori-based approach
AGM/AcGM: Inokuchi et al. (PKDD'00)
FSG: Kuramochi and Karypis (ICDM'01)
PATH#: Vanetik and Gudes (ICDM'02, ICDM'04)
FFSM: Huan et al. (ICDM'03) and SPIN: Huan et al. (KDD'04)
FTOSM: Horvath et al. (KDD'06)
Search Order
breadth vs. depth
complete vs. incomplete
Generation of Candidate Patterns
apriori vs. pattern growth
Discovery Order of Patterns
DFS order
path -> tree -> graph
Elimination of Duplicate Subgraphs
passive vs. active
Support Calculation
store embeddings or not
Frequent Subgraph
Examples:
Example (cont.)
Outline
Introduction and Background
DFS code
Yan, X. and Han, J. 2002. gSpan: Graph-Based Substructure
Pattern Mining. In Proceedings of the 2002 IEEE International
Conference on Data Mining (ICDM '02) (December 09-12, 2002).
ICDM. IEEE Computer Society, Washington, DC, 721