Overview
• Review
• Index classification
• Cost estimation
Faloutsos 2
CMU SCS 15-415
Files
• A: Indices, like:
[Figure: index file over a data file]
Directory: 2, 2.5, 3, 3.5
Data entries (index file): 1.2* 1.7* 1.8* 1.9* 2.2* 2.4* 2.7* 2.7* 2.9* 3.2* 3.3* 3.3* 3.6* 3.8* 3.9* 4.0*
Data records (data file)
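A minimal sketch of how the directory in the figure could steer a search toward the right group of data entries. The directory keys and data entries are copied from the slide; the page layout (page i holds keys from one directory key up to, but not including, the next) and all function names are assumptions made for illustration, not something the slides specify.

```python
from bisect import bisect_left, bisect_right

# Values copied from the slide.
DIRECTORY = [2.0, 2.5, 3.0, 3.5]
DATA_ENTRIES = [1.2, 1.7, 1.8, 1.9, 2.2, 2.4, 2.7, 2.7,
                2.9, 3.2, 3.3, 3.3, 3.6, 3.8, 3.9, 4.0]


def build_leaf_pages(entries, directory):
    """Assumed layout: page i holds keys k with directory[i-1] <= k < directory[i]."""
    pages = [[] for _ in range(len(directory) + 1)]
    for key in entries:  # entries are already sorted, so each page stays sorted
        pages[bisect_right(directory, key)].append(key)
    return pages


def equality_search(pages, directory, key):
    """Use the directory to pick one page, then binary-search inside it."""
    page = pages[bisect_right(directory, key)]
    return page[bisect_left(page, key):bisect_right(page, key)]


def range_search(pages, directory, low, high):
    """Visit only the pages whose key range can overlap [low, high]."""
    first = bisect_right(directory, low)
    last = bisect_right(directory, high)
    return [k for page in pages[first:last + 1] for k in page if low <= k <= high]


PAGES = build_leaf_pages(DATA_ENTRIES, DIRECTORY)
print(equality_search(PAGES, DIRECTORY, 2.7))
print(range_search(PAGES, DIRECTORY, 2.4, 3.2))
```

Because the data entries are kept sorted, a range search touches a contiguous run of pages rather than the whole index file.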
Indexes
Overview
• Review
• Index classification
– Representation of data entries in index
– Clustered vs. Unclustered
– Primary vs. Secondary
– Dense vs. Sparse
– Single Key vs. Composite
– Indexing technique
• Cost estimation
Details
[Figure: index entries labeled >=123 and >=456]
Indexing - non-clustered
[Figure: CLUSTERED vs. UNCLUSTERED index. Index entries direct the search for data entries.]
– “Lexicographic” order.
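A small sketch of what lexicographic order means for a composite search key, using Python tuples (which compare lexicographically). The (age, sal) entries and the function names are hypothetical and do not come from the slides; ages are assumed to be integers.

```python
from bisect import bisect_left

# Hypothetical (age, sal) data entries; sorting tuples orders them by age first,
# then by sal within equal ages.
ENTRIES = sorted([(12, 20), (11, 80), (13, 75), (12, 10)])


def search_by_age(entries, age):
    """A prefix of the composite key can use the sort order (binary search)."""
    lo = bisect_left(entries, (age,))      # (age,) sorts before every (age, sal)
    hi = bisect_left(entries, (age + 1,))  # first entry with a larger age
    return entries[lo:hi]


def search_by_sal(entries, sal):
    """A non-prefix field cannot use the order; every entry must be checked."""
    return [e for e in entries if e[1] == sal]


print(ENTRIES)
print(search_by_age(ENTRIES, 12))
print(search_by_sal(ENTRIES, 75))
```

The design point is that an index on (age, sal) helps queries on age, or on age and sal together, but not queries on sal alone.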
Overview
• Review
• Index classification
– Representation
– …
• Cost estimation
Cost estimation
Methods:
• Heap file
• Sorted
• Clustered
• Unclustered tree index
• Unclustered hash index
Operations: (?)
Cost estimation
Methods Operations
Cost estimation
Assume that:
• Clustered index spans 1.5B pages (due to empty space)
• Data entry = 1/10 the size of a data record
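The two assumptions above translate into simple size estimates. The sketch below is only arithmetic on B, the number of pages in the data file; the function names are made up for illustration, and the data-entry estimate assumes the entries are packed into full pages, which the slide does not state.

```python
def clustered_file_pages(num_data_pages, expansion=1.5):
    """Pages spanned by a clustered file: 1.5B on the slide, due to empty space."""
    return expansion * num_data_pages


def data_entry_pages(num_data_pages, entry_ratio=0.1):
    """Pages needed for data entries that are 1/10 the size of a data record.

    Assumes fully packed pages (an assumption of this sketch).
    """
    return entry_ratio * num_data_pages


B = 1000  # hypothetical number of data pages
print(clustered_file_pages(B))
print(data_entry_pages(B))
```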
Cost estimation
Secondary key, clustering index: cost?
– levels of index (h) +
– blocks with qualifying tuples
Total: h + #qual-pages
[Figure: index of height h over data blocks #1, #2, …, #B]
Cost estimation
Secondary key, non-clustering index: cost?
– levels of index (h’) +
– blocks with qualifying tuples
Total: h’ + #qual-records (actually, a bit less...)
[Figure: index of height h’ over data blocks #1, #2, …, #B]
Cost estimation
m: # of qualifying pages
m’: # of qualifying records
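The two cost formulas from the preceding slides, h + #qual-pages and h’ + #qual-records, can be compared with a few lines of arithmetic. This is a rough sketch: the function names, the records-per-page parameter, and the sample numbers are assumptions for illustration, and the non-clustered figure is an upper bound (the slides note the real cost is a bit less).

```python
from math import ceil


def qualifying_pages(m_prime, records_per_page):
    """Estimate m from m', assuming qualifying records are stored contiguously."""
    return ceil(m_prime / records_per_page)


def clustered_cost(h, m):
    """Clustering index: descend h levels, then read the m qualifying pages."""
    return h + m


def unclustered_cost(h_prime, m_prime):
    """Non-clustering index: up to one page read per qualifying record."""
    return h_prime + m_prime


# Hypothetical numbers: index height 3, 1000 qualifying records, 50 records per page.
m = qualifying_pages(1000, 50)
print(clustered_cost(3, m))
print(unclustered_cost(3, 1000))
```

With these sample numbers the clustered index reads 23 pages, while the non-clustered index may read up to 1003, which is why clustering matters most for selections that return many records.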
Summary
• To speed up selection queries: index.
• Terminology:
– Clustered / non-clustered index
– Primary / secondary index
• Typically, B-tree index
• Hashing is only good for equality search
• At most one clustered index per table
– many non-clustered ones are possible
– composite indexes are possible
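To illustrate why hashing only helps equality search, the sketch below contrasts a hash index (a Python dict) with a sorted index (a sorted list plus binary search, standing in for a B-tree). The records, record ids, and function names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

# Hypothetical (key, record id) pairs.
RECORDS = [(30, "r1"), (44, "r2"), (50, "r3"), (44, "r4")]

# Hash index: key -> list of record ids.
HASH_INDEX = defaultdict(list)
for key, rid in RECORDS:
    HASH_INDEX[key].append(rid)

# Sorted index: stand-in for the leaf level of a B-tree.
SORTED_INDEX = sorted(RECORDS)
SORTED_KEYS = [key for key, _ in SORTED_INDEX]


def hash_equality(key):
    """One probe, regardless of how many keys exist."""
    return HASH_INDEX.get(key, [])


def hash_range(low, high):
    """No order to exploit: every key in the hash index must be examined."""
    return [(k, rid) for k, rids in HASH_INDEX.items()
            if low <= k <= high for rid in rids]


def tree_range(low, high):
    """Binary search for both ends, then read the contiguous run in between."""
    lo = bisect_left(SORTED_KEYS, low)
    hi = bisect_right(SORTED_KEYS, high)
    return SORTED_INDEX[lo:hi]


print(hash_equality(44))
print(tree_range(40, 50))
```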