Vous êtes sur la page 1sur 71

Balanced search trees: 2‐3‐4 trees.

Example 2‐3‐4 tree


2‐3‐4 (or 2‐4) trees improve the efficiency of insertItem and deleteItem methods
4
of 2‐3 trees, because they are performed on the path from the root to the leaf.
However, they require more memory for storing 3 data items and 4 pointers in
2 6 8
each node.
Definition: A 2‐3‐4 tree is a general tree which satisfies the following properties: 1 3 5 7 9 10 11
1 Each node may store three data items.
Class Node234tree {
2 Each node may have four children.
3 The second and third data items in any node may be empty, in which case Node234tree firstChild;
sentinel value emptyFlag is stored there (assume emptyFlag := 0). If they are Node234tree secondChild;
not empty, the first data item precedes the second one according to the Node234tree thirdChild;
specified ordering relationship, the second data item precedes the third data Node234tree fourthChild;
item. Node234tree parent;
p ;
4. For each node, data in the first child precedes the first data item in the node; int firstItem;
data in the second child follows the first data item, but precedes the second; int secondItem;
data in the third child follows the second data item, but precedes the third; int thirdItem;
data in the fourth child follows the third data item.
.... class methods follow }
5 All leaf nodes are on the same level.

Search in 2‐3‐4 trees Insertion in 2‐3‐4 trees


The search algorithm is similar to that in 2‐3 trees and binary search Step 1 Search for the item to be inserted (same as in 2‐3 trees).
Step 2 Insert at the leaf level. The following cases are possible:
trees. In the example 2‐3‐4 tree, the search for 10 is carried out as follows:
• The termination node is a 2‐node. Then, make it a 3‐node, and insert the new item
1 Compare 10 to the only item in the root.
1. root 10 > 4,
4 continue the search appropriately.
in the second child. • The termination node is a 3‐node. Then, make it a 4‐node, and insert the new item
appropriately.
2. 10 > 6 and 10 > 8, continue the search in the third child.
• The termination node is a 4 node. Split is, pass the middle to the parent, and insert
3. 10 > 9, 10 = 10. Stop. the new item appropriately.
General rules for inserting new nodes in 2‐3‐4 trees:
As in 2‐3 trees, the efficiency of the search operation is guaranteed to be
Rule 1: During the search step, every time a 2‐node connected to a 4‐node
O(log n). On average, it will be better that the search efficiency in 2‐3
is encountered, transform it into a 3‐node connected to two 2‐nodes.
trees because the height of a 2‐3‐4 tree might be less than the height of the
trees, Rule
l 2: During the
h search h step, every time a 3‐node
d connected
d to a 4‐node
d
2‐3 tree with the same data. is encountered, transform it into a 4‐node connected to two 2‐nodes.
Note that two 2‐nodes resulting from these transformations have the same number
of children as the original 4‐node. This is why the split of a 4‐node does not affect
any nodes below the level where the split occurs.

1
Efficiency of search and insert operations Deletion in 2‐3‐4 tree
Consider our example tree
Result 1: Search in a 2‐3‐4 tree with N nodes takes at most O(log N) time. This
is in case if all nodes are 2 nodes. If there are 3‐nodes and 4‐nodes on the tree,
4
the search will take less than (log N) time.

2 6 8
Result 2: Insertion into a 2‐3‐4 tree takes less than O(log N) time, and on
average requires less than 1 node split.
1 3 5 7 9 10 11

The following special cases (with multiple sub‐cases each) are possible:

Case 1 (three sub‐cases): The item is deleted from a leaf node (a node with external children), which
currently contains 2 or 3 items. Easy sub‐cases – delete the item transforming a 4‐node into a 3
node, or a 3 node into a 2 node. No other nodes are affected. Example: delete 9 – the existing 4
node, containing 9, 10, and 11 is transformed into a 3 node, containing 10 and 11. Deleting from a 2‐
node (the third sub‐case) requires an item from the parent node to be drawn, which in turn must be
replaced by an item from the sibling note (if the sibling node is NOT a 2‐node as well). See case 2.

Deletion in 2‐3‐4 tree (contd.)


Case 2 (with several more sub‐cases) Delete from a node that has non‐external children.
For example, delete 8. This case can be reduced to case 1 by finding the item that
precedes the one to be deleted in in‐order traversal (7, in our example) and exchanging
the two items. If 7 were part of a 3‐ or 4‐ node, 8 would have been deleted easily.
However, since 8 is now the only item in the node, we have a case of underflow. This
requires that an item from the parent node be transferred to the underflow node, and
substituted in the parent node by an item from the sibling node.
In our example, 7 will be transferred back to where it was, and 9 will move to the parent
node to fill the gap.
However, if the sibling node is also a 2‐node, the so‐called fusing takes place. That is, the
two 2‐node siblings are “fused” in a single 3‐node, after an item is transferred from the
parent node
node. The later suggests that the parent can now handle one less child
child, and it
indeed has one child less after two of its former children are fused.
The last sub‐case suggests that a parent node is also a 2‐node. Then, it must in turn
borrow from its parent, etc., resulting in the 2‐3‐4 tree becoming one level shorter.

2
Efficiency of 2-3 Tree
• As for any search tree, the efficiency depends
on the tree’s height
• A 2-3 tree of height h with the smallest number
of keys is a full tree of 2-nodes
2-3-(4) Trees level items
0 1

1 2
h−1 2h−1

n ≥ 1 + 2 + ... + 2h −1
• So
h ≤ log 2 ( n + 1)

2-3 Tree Visualization


• A 2-3 tree of height h with the largest number of • http://slady.net/java/bt/view.php?w=600&h=450
keys is a full tree of 3-nodes, each two keys and • Text input box and three buttons
three children
– Enter a number in the text box and click ”Insert”
Insert
to have it entered into the 2-3 tree
n ≤ 2 ⋅1 + 2 ⋅ 3 + ... + 2 ⋅ 3 h −1
– Similarly, you can delete from the tree and
• So search the tree
– All three buttons work on data entered in the
h ≥ log 3 (n + 1) same text box

1
Practice 2-3-4 Trees
• Construct a 2-3 tree for the list 9, 5, 8, 3, 2, • Same as 2-3 tree, except now nodes can
4, 7 by successive insertions have four subtrees (and 3 items)
• Traversal and searching operations are
simply extensions of the same procedures
for Binary Search Trees (BST) and 2-3 Trees
• 2-3-4 trees are useful because the steps
required to resolve insertion and deletion
dilemmas are reduced compared to 2-3 trees

Insertion The difficulty with insertions


• In order to insert an item into a 2-3-(4) tree it • Recall that in 2-3 trees we blindly followed
is first necessary to locate the leaf that the this path to the leaf. Then we (over)filled
item will be inserted into nodes to accommodate the new item
• This requires that we start from the root and • Overfilled nodes were resolved by pushing
make comparison decisions until we arrive at items up to the parent
a leaf • This push had the possibility of overfilling the
• This effectively traverses a “path” from the parent as well and the problem was
root node down to that leaf propagated up the tree

2
Taking advantage (of descent) The simple modification
• 2-3-4 Trees make use of the descent from • While searching for the leaf on the way down if
the root to the leaf we encounter a “4” node then we immediately
split that node…
• 2
2-3
3 trees blindly descend to the leaf and 50
make modifications to the tree only on the
30
way back up as nodes become overfull 30 50 70 70

• 2-3-4 also make modifications to the tree 10 15 20 40 60 80 90 10 15 20 40 60 80 90


during the descent to find the leaf
• Why
Wh would
ld you d
do such
h craziness?
i ?
– To make room for items that might be pushed up!
(now 50, 30 and 70 can hold at least one more item
that might be pushed up from below)

Reasoning Special case


• 2-nodes and 3-nodes can always hold one
more item root
• The only nodes that can
can’tt hold one more item
are 4-nodes
• Items are only pushed up to ancestors
• If we encounter an ancestor on the way
down that won’t be able to accept p an item we
can transform it into 2-nodes that can before
the insertion is actually performed

3
Other cases Other cases
2-node parent 3-node parent

Guarantee Example
• This guarantees that the exact same • Adding 16 to the following tree
insertion routine presented for 2-3 trees can – Step 1a: (descent) Is current node a 4-node?
be p
performed without ever having g to worryy
* Yes *
about “resolving” overfull nodes 30 50 70 50

30 70
10 15 20 40 60 80 90

10 15 20 40 60 80 90

4
• Step 1b: Compare search key with current • Step 1a: Is current node a 4-node? Yes
node (16 < 50)
*
50
Yes
50 50
30 * 70 *
30 70 15 30 70
*
10 15 20 40 60 80 90 *
10 15 20 40 60 80 90 10 20 40 60 80 90
• Step 1a: Is current node (*) a 4-node? No.
• Step 1b: Compare search key with current
node (16<30)

• 16 < 30 which is the middle branch • The insertion …


50 50

15 30 70 15 30 70

*
10 20 40 60 80 90 10 16 20 40 60 80 90

• Th
The arrived
i d att lleaff iis a 2
2-node
d ((or maybe
b a3
3- • N
Nothing
thi can propagate
t up b
because th
the
node) so you can simply insert the new item necessary space was made during descent
and nothing will have to propagate up to the leaf

5
2-3-4 Tree Visualization
• http://www.cse.ohio-state.edu/~bondhugu/acads/234-tree/index.shtml

• Insert 60, 30, 10, 20, 50, 40, 70, 80, 15, 90, Inserting 60, 30, 10, 20 ...
100

... 50, 40 ...

Inserting 50, 40 ... Inserting 70 ...

... 70, ... ... 80, 15 ...

6
Inserting 80, 15 ... Inserting 90 ...

... 90 ... ... 100 ...

2-3-4 Deletions
• Like with the 2-3 tree, to delete we
Inserting 100 ... – find the node containing the value to be deleted
– find the value that comes next in the tree
– swap the next value with the value to be deleted
– delete the swapped value which is now in a leaf
node which is
• either a 3-node or 4-node and we do not have to
physically delete a node, or
• a 2-node and we have to take care of removing the
node from the tree

7
2-3-4 Deletions 2-3-4 Deletions
• Bottom-up strategy • Top-down strategy
– if the 2-node has a sibling that is a 3-node or 4- – when searching for the node containing the
node, redistribute the values between the sibling,g g ((collapse)
value to be deleted we will merge p ) anyy
parent and current node 2-node (except root) into a larger node
– if the 2-node has no nearby 3-node or 4-node – in this way, we can be assured that any physical
siblings, then merge the sibling with the parent removal will take place only from a 3-node or 4-
and move up to the parent level and perform the node
deletion recursively – there are manyy mergeg situations like there were
6 split situations
• the cases depend on the type of node that is the given
2-node’s parent and a next sibling to the left or right

2-3-4 Deletions 2-3-4 Deletions


Turning a 2-node into a 3-node ... Turning a 2-node into a 3-node ...
Case 1: an adjacent sibling has 2 or 3 items Case 2: each adjacent sibling has only one item
steal item from sibling by rotating items and moving subtree
Æ"steal"
Æ Æ "steal"
steal item from parent and merge node with sibling
(note: parent has at least two items, unless it is the root) (note: parent has at least two items, unless it is the root)

30 50 20 50 30 50 50

10 20 40 * 10 30 40 10 40 * 10 30 40
"rotation" merging

25 25 25 35 25 35

8
2-3-4 Deletions Example
Turning a 2-node into a 4-node ...
• Delete 40
40
Case 3: parent is root and parent and each adjacent sibling
has
a only
o y one
o item
Æ merge node with parent and sibling 20 50

30 10 30 40 14 32 43 62 70 79

40
* 25
10 35
merging 10 18 25 33 42 47 57 60 66 74 81

25 35

Red-Black Trees — Idea


Here again is our transformed 4-node
• We can represent a 4-node of the 2-3-4 tree using only 2-
nodes if we add two new nodes (shown in red) 7
4 7 12 4 12
7
4 7 12 2 5 9 18 2 5 9 18
4 12
2 5 9 18
2 5 9 18 We can also apply this process to a 3-node (in two different ways)

4 7 7 4
• Note that the ordering property of the 2-3-4 Tree translates or
4 9 2 7
into the ordering property of a Binary Search Tree 2 5 9
2 5 5 9
• The
Th text
t t talks
t lk about
b t redd andd black
bl k links
li k but
b t “links”
“li k ” or
children references do not have the ability to contain 2-nodes are essentially left alone
information about their “color” 4 4

2 5 2 5

9
• A Red-Black Tree is a Binary-Tree representation of a
2-3-4 Tree
– Think of black nodes as representing 2-3-4 Tree nodes
• Every red-black tree has an equivalent 2-3-4
– Think of red nodes as being the extra ones required to make a representation
Binary Tree out of the 2-3-4 Tree
28
20 28 I
20 42 I
5 12 15 24 35 42 12 24 35 48
C N
3 7 9 14 19 22 25 30 40 45 48 50 5 15 22 25 30 40 45 50
C NR A G M R
3 9 14 19

7 E H L P X
A EGH LM P SX
– It is no longer true that every leaf is at the same level. However,
given a node, every path from it down to a leaf goes through the S
same number of black nodes

Practice
• Thus we have the following restriction in the • If you take any 2-3-4 tree you can illustrate it
definition of red-black trees as a Red-Black Tree…
• There must be an equal number of black • Figure 12-20
12 20 from the book as a 2-3-4
2 3 4 tree
nodes on every path from the root to a node
with fewer than two children

10
• Same tree as a Red-Black tree • Red-black trees are undercover 2-3-4 trees
• We can tell if a node is really a 4-node by
checking if both its left and right children are
colored red
• A node is a 3-node if its not a 4-node and
has one red child

• There are 15 other valid variations

Red-black trees: benefits Red-black trees: insertion


• 2-3 and 2-3-4 trees typically waste a lot of • Insertion in a red-black tree can be
memory on unused items and child implemented as a direct translation of the
references top-down 2-3-4 splitting algorithm
– search from root to leaf and then insert the new
• 2-nodes are really 4-nodes with 2/3 of value
memory for that node being unused – we must maintain height-balancing. how?
• Red-black trees have the advantages of 2-3- • in the 2-3-4 tree, we split any 4-node on the way down
the tree using one of 6 cases
4 trees without the overhead • we do the same for our red-black
red black tree, implementing
the 6 cases from the point of view of red-black nodes
rather than 4-nodes
• most of the splitting is done by re-coloring nodes, but
there are some cases that require additional rotations

11
Red-black trees: insertion Red-black trees: insertion
• To split a root 4-node: recolor the two • To split a non-root 4-node: flip the color of all
children black three nodes
W W
Y Y
X Y Z Y
W W Y Y Y

X Z X Z X Z
X Y Z X Z X Z X Z

Red-black trees: insertion Convention


• One problem is that we may end up with two • Leaf nodes are null links; the rest are internal
adjacent red nodes nodes
– If the parent is a 3-node oriented the wrong way round 30
– We fix this with a single rotation and further recoloring
15 70

VW VWY V V
60 85
10 20
W
W W W
XYZ X Z
V Y 80 90
Y Y
V Y
50 65
X Z
X Z X Z X Z
5
40 55

12
Black heights
Definition of black-height 3
30
• The number of black nodes on any path 2
f
from a node,
d x, in
i aR R-B
B tree
t to
t a 15 70

descendent leaf, but excluding node x 1 10 60 85


20
itself, is the node’s black-height,
denoted bh(x) 50 65
80 90
1 5
• If T is an R-B
R B tree, we define bh(T) as
the black-height of its root 40 55
0

Node Height Theorem 1 – Any red-black tree with root x, has


• The height of a node v is the number of nodes on at least n = 2bh(x) – 1 internal nodes, where
the longest path from v to a leaf, but excluding bh(x) is the black height of node x
node v itself
Proof: by
y induction on heightg of x
A height of the tree is 4
– Base step: x has height 0 (i.e. null leaf node)
• What is bh(x)?
height of node B is 3 B C – Inductive step: x has positive height and 2 children
• Each child has black-height of bh(x) or bh(x) –
1 (Why?)
D E F G • Since the height of a child of x is less than the
height of x itself, we can apply the inductive
hypothesis to conclude that each child has ≥
H I
2bh(x)-1 - 1 internal nodes. So, the tree rooted at
x has at least (2bh(x)-1 - 1)+(2bh(x)-1 - 1)+1= 2bh(x) -
1 internal nodes

13
Theorem 2 – In a red-black tree, at least Theorem 3 – A red-black tree with n internal
half the nodes on any path from the root nodes has height h <= 2log(n + 1)
to a leaf, not including the root, must be Proof: Let h be the height of the red-black
black tree with root xx. By Theorem 2
2,
bh(x) >= h/2
Proof: If a red node on the path has a From Theorem 1, n >= 2bh(x) - 1
child, then the color of the child must be Therefore n >= 2 h/2 – 1
black
n + 1 >= 2h/2
log(n + 1) >= h/2
2log(n + 1) >= h

14
2-3-(4) Trees
Efficiency of 2-3 Tree
• As for any search tree, the efficiency depends
on the tree’s height
• A 2-3 tree of height h with the smallest number
of keys is a full tree of 2-nodes
level items
0 1

1 2
h−1 2h−1

h −1
n ≥ 1 + 2 + ... + 2
• So
h ≤ log 2 (n +1)
• A 2-3 tree of height h with the largest number of
keys is a full tree of 3-nodes, each two keys and
three children
h −1
n ≤ 2 ⋅1 + 2 ⋅ 3 + ... + 2 ⋅ 3
• So

h ≥ log 3 (n + 1)
2-3 Tree Visualization
• http://slady.net/java/bt/view.php?w=600&h=450
• Text input box and three buttons
– Enter a number in the text box and click ”Insert”
to have it entered into the 2-3 tree
– Similarly, you can delete from the tree and
search the tree
– All three buttons work on data entered in the
same text box
Practice
• Construct a 2-3 tree for the list 9, 5, 8, 3, 2,
4, 7 by successive insertions
2-3-4 Trees
• Same as 2-3 tree, except now nodes can
have four subtrees (and 3 items)
• Traversal and searching operations are
simply extensions of the same procedures
for Binary Search Trees (BST) and 2-3 Trees
• 2-3-4 trees are useful because the steps
required to resolve insertion and deletion
dilemmas are reduced compared to 2-3 trees
Insertion
• In order to insert an item into a 2-3-(4) tree it
is first necessary to locate the leaf that the
item will be inserted into
• This requires that we start from the root and
make comparison decisions until we arrive at
a leaf
• This effectively traverses a “path” from the
root node down to that leaf
The difficulty with insertions
• Recall that in 2-3 trees we blindly followed
this path to the leaf. Then we (over)filled
nodes to accommodate the new item
• Overfilled nodes were resolved by pushing
items up to the parent
• This push had the possibility of overfilling the
parent as well and the problem was
propagated up the tree
Taking advantage (of descent)
• 2-3-4 Trees make use of the descent from
the root to the leaf
• 2-3 trees blindly descend to the leaf and
make modifications to the tree only on the
way back up as nodes become overfull
• 2-3-4 also make modifications to the tree
during the descent to find the leaf
The simple modification
• While searching for the leaf on the way down if
we encounter a “4” node then we immediately
split that node…
50

30 50 70 30 70

10 15 20 40 60 80 90 10 15 20 40 60 80 90

• Why would you do such craziness?


– To make room for items that might be pushed up!
(now 50, 30 and 70 can hold at least one more item
that might be pushed up from below)
Reasoning
• 2-nodes and 3-nodes can always hold one
more item
• The only nodes that can’t hold one more item
are 4-nodes
• Items are only pushed up to ancestors
• If we encounter an ancestor on the way
down that won’t be able to accept an item we
can transform it into 2-nodes that can before
the insertion is actually performed
Special case

root
Other cases
2-node parent
Other cases
3-node parent
Guarantee
• This guarantees that the exact same
insertion routine presented for 2-3 trees can
be performed without ever having to worry
about “resolving” overfull nodes
Example
• Adding 16 to the following tree
– Step 1a: (descent) Is current node a 4-node?

* Yes *
30 50 70 50

30 70
10 15 20 40 60 80 90

10 15 20 40 60 80 90
• Step 1b: Compare search key with current
node (16 < 50)
*
50

30 * 70

*
10 15 20 40 60 80 90

• Step 1a: Is current node (*) a 4-node? No.


• Step 1b: Compare search key with current
node (16<30)
• Step 1a: Is current node a 4-node? Yes

Yes
50 50
*
30 70 15 30 70
*
10 15 20 40 60 80 90 10 20 40 60 80 90
• 16 < 30 which is the middle branch
50

15 30 70

*
10 20 40 60 80 90

• The arrived at leaf is a 2-node (or maybe a


3-node) so you can simply insert the new
item and nothing will have to propagate up
• The insertion …
50

15 30 70

10 16 20 40 60 80 90

• Nothing can propagate up because the


necessary space was made during descent
to the leaf
2-3-4 Tree Visualization
• http://www.cse.ohio-state.edu/~bondhugu/acads/234-tree/index.shtml

• Insert 60, 30, 10, 20, 50, 40, 70, 80, 15, 90,
100
Inserting 60, 30, 10, 20 ...

... 50, 40 ...


Inserting 50, 40 ...

... 70, ...


Inserting 70 ...

... 80, 15 ...


Inserting 80, 15 ...

... 90 ...
Inserting 90 ...

... 100 ...


Inserting 100 ...
2-3-4 Deletions
• Like with the 2-3 tree, to delete we
– find the node containing the value to be deleted
– find the value that comes next in the tree
– swap the next value with the value to be deleted
– delete the swapped value which is now in a leaf
node which is
• either a 3-node or 4-node and we do not have to
physically delete a node, or
• a 2-node and we have to take care of removing the
node from the tree
2-3-4 Deletions
• Bottom-up strategy
– if the 2-node has a sibling that is a 3-node or 4-
node, redistribute the values between the sibling,
parent and current node
– if the 2-node has no nearby 3-node or 4-node
siblings, then merge the sibling with the parent
and move up to the parent level and perform the
deletion recursively
2-3-4 Deletions
• Top-down strategy
– when searching for the node containing the value
to be deleted we will merge (collapse) any 2-
node (except root) into a larger node
– in this way, we can be assured that any physical
removal will take place only from a 3-node or 4-
node
– there are many merge situations like there were
6 split situations
• the cases depend on the type of node that is the given
2-node’s parent and a next sibling to the left or right
2-3-4 Deletions
Turning a 2-node into a 3-node ...
Case 1: an adjacent sibling has 2 or 3 items
"steal" item from sibling by rotating items and moving subtree
(note: parent has at least two items, unless it is the root)

30 50 20 50

10 20 40 * 10 30 40
"rotation"

25 25
2-3-4 Deletions
Turning a 2-node into a 3-node ...
Case 2: each adjacent sibling has only one item
 "steal" item from parent and merge node with sibling
(note: parent has at least two items, unless it is the root)

30 50 50

10 40 * 10 30 40
merging

25 35 25 35
2-3-4 Deletions
Turning a 2-node into a 4-node ...
Case 3: parent is root and parent and each adjacent sibling
has only one item
 merge node with parent and sibling

30 10 30 40

10 40
* 25 35
merging

25 35
Example
• Delete 40
40

20 50

14 32 43 62 70 79

10 18 25 33 42 47 57 60 66 74 81
Red-Black Trees — Idea
• We can represent a 4-node of the 2-3-4 tree using only 2-
nodes if we add two new nodes (shown in red)

7
4 7 12
4 12
2 5 9 18
2 5 9 18

• Note that the ordering property of the 2-3-4 Tree translates


into the ordering property of a Binary Search Tree
• The text talks about red and black links but “links” or
children references do not have the ability to contain
information about their “color”
Here again is our transformed 4-node

7
4 7 12 4 12

2 5 9 18 2 5 9 18

We can also apply this process to a 3-node (in two different ways)

4 7 7 4
or
4 9 2 7
2 5 9
2 5 5 9

2-nodes are essentially left alone


4 4

2 5 2 5
• A Red-Black Tree is a Binary-Tree representation of a
2-3-4 Tree
– Think of black nodes as representing 2-3-4 Tree nodes
– Think of red nodes as being the extra ones required to make a
Binary Tree out of the 2-3-4 Tree
28
20 28
20 42
5 12 15 24 35 42 12 24 35 48

3 7 9 14 19 22 25 30 40 45 48 50 5 15 22 25 30 40 45 50

3 9 14 19

– It is no longer true that every leaf is at the same level. However,


given a node, every path from it down to a leaf goes through the
same number of black nodes
• Every red-black tree has an equivalent 2-3-4
representation

I
I

C N

C NR A G M R

E H L P X
A EGH LM P SX

S
• Thus we have the following restriction in the
definition of red-black trees
• There must be an equal number of black
nodes on every path from the root to a node
with fewer than two children
Practice
• If you take any 2-3-4 tree you can illustrate it
as a Red-Black Tree…
• Figure 12-20 from the book as a 2-3-4 tree
• Same tree as a Red-Black tree

• There are 15 other valid variations


• Red-black trees are undercover 2-3-4 trees
• We can tell if a node is really a 4-node by
checking if both its left and right children are
colored red
• A node is a 3-node if its not a 4-node and
has one red child
Red-black trees: benefits
• 2-3 and 2-3-4 trees typically waste a lot of
memory on unused items and child
references
• 2-nodes are really 4-nodes with 2/3 of
memory for that node being unused
• Red-black trees have the advantages of 2-3-
4 trees without the overhead
Red-black trees: insertion
• Insertion in a red-black tree can be
implemented as a direct translation of the
top-down 2-3-4 splitting algorithm
– search from root to leaf and then insert the new
value
– we must maintain height-balancing. how?
• in the 2-3-4 tree, we split any 4-node on the way down
the tree using one of 6 cases
• we do the same for our red-black tree, implementing
the 6 cases from the point of view of red-black nodes
rather than 4-nodes
• most of the splitting is done by re-coloring nodes, but
there are some cases that require additional rotations
Red-black trees: insertion
• To split a root 4-node: recolor the two
children black

Y Y
Y
X Y Z

X Z X Z X Z
Red-black trees: insertion
• To split a non-root 4-node: flip the color of all
three nodes
W W

W W Y Y Y

X Y Z X Z X Z X Z
Red-black trees: insertion
• One problem is that we may end up with two
adjacent red nodes
– If the parent is a 3-node oriented the wrong way round
– We fix this with a single rotation and further recoloring

VW VWY V V

W W
W W
XYZ X Z
V Y V Y
Y Y

X Z
X Z X Z X Z
Convention
• Leaf nodes are null links; the rest are internal
nodes
30

15 70

60 85
10 20

80 90
50 65
5
40 55
Definition of black-height
• The number of black nodes on any path
from a node, x, in a R-B tree to a
descendent leaf, but excluding node x
itself, is the node’s black-height,
denoted bh(x)
• If T is an R-B tree, we define bh(T) as
the black-height of its root
Black heights
3
30
2
15 70

1 10 60 85
20

80 90
50 65
1 5
40 55
0
Node Height
• The height of a node v is the number of nodes on
the longest path from v to a leaf, but excluding
node v itself
A height of the tree is 4

height of node B is 3 B C

D E F G

H I
Theorem 1 – Any red-black tree with root x, has
at least n = 2bh(x) – 1 internal nodes, where
bh(x) is the black height of node x

Proof: by induction on height of x


– Base step: x has height 0 (i.e. null leaf node)
• What is bh(x)?
– Inductive step: x has positive height and 2 children
• Each child has black-height of bh(x) or bh(x) –
1 (Why?)
• Since the height of a child of x is less than the
height of x itself, we can apply the inductive
hypothesis to conclude that each child has ≥
2bh(x)-1 - 1 internal nodes. So, the tree rooted at
x has at least (2bh(x)-1 - 1)+(2bh(x)-1 - 1)+1= 2bh(x) - 1
internal nodes
Theorem 2 – In a red-black tree, at least
half the nodes on any path from the root
to a leaf, not including the root, must be
black

Proof: If a red node on the path has a


child, then the color of the child must be
black
Theorem 3 – A red-black tree with n internal
nodes has height h <= 2log(n + 1)
Proof: Let h be the height of the red-black
tree with root x. By Theorem 2,
bh(x) >= h/2
From Theorem 1, n >= 2bh(x) - 1
Therefore n >= 2 h/2 – 1
n + 1 >= 2h/2
log(n + 1) >= h/2
2log(n + 1) >= h
 
 


¯  


 

 
 ¿ 
   

¯  

  
 
  
    


   


  

 

p1 p2 p1 p2 p1 k3 p2

n1 n n2 n1 n n2 n1 n2
k1 k2 k3 k4 k1 k2 k3 k4 k1 k2 k4

c1 c2 c3 c4 c5 c1 c2 c3 c4 c5 c1 c2 c3 c4 c5

 
     
   


   
       !