Académique Documents
Professionnel Documents
Culture Documents
Excerpt from
Chapter8, Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
3/4/2015
Outline
What is an Index?
Leaf
Pages
(Sorted by search key)
Leaf pages contain data entries, and are chained (prev & next)
Non-leaf pages have index entries; only used to direct searches:
index entry
P0
K 1
P1
K 2
P 2
K m Pm
B+ Tree
Note how data entries
in leaf level are sorted
Root
17
Entries <= 17
5
2*
3*
Entries > 17
27
13
5*
7* 8*
14* 16*
22* 24*
30
27* 29*
Alternative 1:
Alternatives 2 and 3:
10
11
To build clustered index, first sort the Heap file (with some
free space on each page for future inserts).
Overflow pages may be needed for inserts. (Thus, order of
data recs is `close to, but not identical to, the sort order.)
CLUSTERED
Index entries
direct search for
data entries
Data entries
UNCLUSTERED
Data entries
(Index File)
(Data file)
Data Records
Data Records
12
Outline
13
14
15
16
Sorted Files:
Indexes:
Alt (2), (3): data entry size = 10% size of record
Tree: 67% occupancy (this is typical).
Implies file size = 1.5 data size
17
Assumptions (contd)
Scans:
Leaf levels of a tree-index are chained.
Index data-entries plus actual file scanned for
unclustered indexes.
Range searches:
We use tree indexes to restrict the set of data
records fetched, but ignore hash indexes.
18
Cost of Operations
(a) Scan
(b) Equality
(c ) Range
(1) Heap
BD
0.5BD
BD
2D
(2) Sorted
BD
Dlog 2B
D(log 2 B +
# pgs with
match recs)
(3)
1.5BD
Dlog F 1.5B D(log F 1.5B
Clustered
+ # pgs w.
match recs)
(4) Unclust. BD(R+0.15)
D(1 +
D(log F 0.15B
Tree index
log F 0.15B) + # pgs w.
match recs)
(5) Unclust. BD(R+0.125) 2D
BD
Hash index
Search
+ BD
Search
+D
Search
+BD
Search
+D
Search
+D
Search
+ 2D
Search
+ 2D
Search
+ 2D
Search
+ 2D
19
20
Outline
21
22
Choice of Indexes
23
24
25
SELECT E.dno
FROM Emp E
WHERE E.age>40
SELECT E.dno
FROM Emp E
WHERE E.hobby=Stamps
26
11,80
11
12,10
12
12,20
13,75
<age, sal>
10,12
20,12
75,13
12
bob 12
10
13
cal 11
80
joe 12
20
sue 13
75
<age>
10
Data records
sorted by name
80,11
<sal, age>
20
75
80
<sal>
Data entries
sorted by <sal>
27
28
Index-Only Plans
29
30
Summary
Many alternative file organizations exist, each
appropriate in some situation.
If selection queries are frequent, sorting the
file or building an index is important.
31
Summary (contd)
32
Summary (contd)
33