
IEEE TRANSACTIONS ON COMPUTERS, VOL. C-33, NO. 4, APRIL 1984

Multidimensional Height-Balanced Trees

VIJAY K. VAISHNAVI
Abstract - A new multidimensional balanced tree structure is presented for the efficient management of multidimensional data. It is shown that the data structure can be used to manage a set of n k-dimensional records or data items such that the records can be searched or updated in O(log2 n) + k time, which is optimal. The data structure is a multidimensional generalization of the height-balanced trees and retains much of their simplicity and efficiency. The insertion algorithm, in particular, retains a very important property of the height-balanced trees: an insertion of a record results in the application of a restructuring operation at most once.

Index Terms - Computational geometry, dynamic databases, height-balanced trees, information storage and retrieval, multidimensional algorithms, multidimensional balanced tree structures, multidimensional data structures.

Manuscript received March 18, 1983; revised October 31, 1983. This work was supported in part by a research grant from the Research Council of the College of Business Administration, Georgia State University.
The author is with the Department of Information Systems, Georgia State University, University Plaza, Atlanta, GA 30303.

I. INTRODUCTION

THERE is a wide variety of data structures available [16] for managing a set of one-dimensional data items or records identified by a single attribute or key. Height-balanced trees (also known as AVL-trees) [1] were the first "balanced tree structure" proposed for representing a set of n one-dimensional data items such that the items can be retrieved, inserted, and deleted in at most O(log2 n) time. Even though a number of other balanced tree structures have been proposed (for example, [2], [12], [18], [24], [26], [27], [32]), the height-balanced tree remains an attractive data structure. Some of the important properties of this data structure are its efficient storage utilization and efficient "rebalancing" after an insertion or a deletion occurs. An insertion of an item into a height-balanced tree requires at most one application of a "restructuring operation." Empirical tests by Knuth [16] indicate that the management of such a tree in memory is reasonably fast. Brown [9] and Mehlhorn [22] have recently analyzed this data structure. The present paper is an attempt at a "neat" generalization of the height-balanced trees so as to be suitable for the management of multidimensional data.
Many present-day applications require the management of multidimensional data in which each data item or record is identified by several attributes. These include the management of geometric objects and collections of records which need to be accessed based on the value of any one of their several attributes or a combination thereof. There is thus much interest in data structures which can efficiently manage multidimensional data. Inverted files are such "multidimensional" data structures. These are extensions of data structures originally designed for one-dimensional data and do not provide an efficient solution to the problem. Lum [20] introduces the technique of combined indexes, which concatenates several attributes into a single key. This technique, however, needs excessive storage and update time for it to be really useful. Grid file [23] is very efficient for searching a record, but its storage and update time need to be investigated.
Recently, various generalizations of tree structures for one-dimensional data have been proposed. These are, however, either suitable only for fairly static data sets or need complex restructuring operations. Such data structures include quad trees [10], k-d trees [6], multidimensional trees [25], multiple attribute tree structure (MAT) [11], [15], segment trees [8], [33], multilayered tree structures [30], [34], completely balanced multidimensional trees [7], and multidimensional B-trees [13], [28].
Bentley and Saxe [7] present the "k-dimensional completely balanced search trees" for representing a set of n
k-dimensional records. In such a tree, the root node stores xk,
the kth attribute of a record such that xk is the median value
in the multiset of the kth attributes of all the records. The
"LOSON-subtree" at the root is a k-dimensional completely
balanced search tree for those records whose kth attribute is
less than xk. The "HISON-subtree" at the root is similarly a
k-dimensional completely balanced search tree for those
records whose kth attribute is greater than xk. The
"EQSON-subtree" (the middle subtree) at the root is a
(k - 1)-dimensional completely balanced search tree for the
(k - 1)-dimensional projection records obtained by omitting
xk, the identical kth attribute of the records. Such trees can be
searched for a record in O(log2 n) + k time, which is optimal. These trees are "severely" balanced and are natural
generalizations of the completely balanced search trees for
one-dimensional data. As is expected, these trees cannot be
updated efficiently. Recently there have been several attempts [13], [28] at proposing "relaxed" balancing restrictions on the "k-dimensional search trees" which ensure
O(log2 n) + k bound on their height and make it possible to
design simple and efficient update algorithms for them. Such
attempts have, however, been only partially successful.
The data structure proposed by Scheuermann and Ouksel [28] is a k-dimensional search tree which does not balance over all the attributes of the records but balances attribute-wise, using B-trees. The bound on the height of such a tree, and hence on the time complexity of the search and the update algorithms, is obviously O(k log2 n). Güting and Kriegel [13] suggest balancing the k-dimensional t-ary search trees, t ≥ 3, by constraints which are generalizations of the corresponding constraints for the B-trees of order t. The resulting trees achieve O(log2 n) + k bound on the time complexity of their searching and update algorithms, but the update algorithms are far more complex than those for the B-trees. Mehlhorn [21] shows that the D-trees suggested by him, which are based on the weight-balanced trees [24], achieve the desired bounds on their searching and update algorithms, but the algorithms are simple only at the cost of nonlinear storage. All these data structures attempt to capture the "right" generalization of "balancing" in the corresponding one-dimensional data structures, which is rather a difficult task [23].

0018-9340/84/0400-0334$01.00 © 1984 IEEE

In this paper we generalize the height-balancing constraint
for the height-balanced trees and apply it to the multidimensional search trees, resulting in the "multidimensional
height-balanced trees." We show that these trees have
O(log2 n) + k bound on their search and update times. We
believe that this generalization is "neat" and natural for various reasons. The restructuring operations used are almost the
same as those for the height-balanced trees. The update algorithms seem to maintain their simplicity and efficiency. Insertion of a record in the tree requires the application of a
restructuring operation at most once.
In Section II, we formally define the multidimensional
height-balanced trees and probe their structure. We also
prove the desired bound on their height. In Section III, we
describe the restructuring operations and prove their correctness. We also describe the restructuring algorithm which
is the heart of the deletion algorithm. In Section IV, we describe the insertion algorithm which follows from the restructuring operations. Finally, in Section V, we summarize the
main results and draw certain conclusions.
II. k-DIMENSIONAL HEIGHT-BALANCED TREES
AND THEIR STRUCTURE

A k-dimensional file or k-file is a collection of n records. Each record is identified by an ordered k-tuple (xk, ..., x1) of values which are called the keys or attributes. A key xi, 1 ≤ i ≤ k, in the record, is called a key of dimension i.
A k-dimensional search tree T, k ≥ 1, for representing a k-file F is defined as

T = (u, Tl, Te, Th)

where u is the root of the tree storing a key, say, s of dimension k; Tl, Te, and Th are called, respectively, the LOSON-, the EQSON-, and the HISON-subtrees of u; l is the left direct son, e is the EQSON, and h is the right direct son of u; Tl, the LOSON-subtree, is a k-dimensional search tree for all the records in F whose keys of dimension k are less than s; Te, the EQSON-subtree, is a (k - 1)-dimensional search tree for the (k - 1)-dimensional projection of records in F obtained by omitting s, their identical key of dimension k; Th, the HISON-subtree, is a k-dimensional search tree for all the records in F whose keys of dimension k are greater than s.
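The recursive definition can be made concrete with a small sketch. The Python below is illustrative only: the names `Node`, `insert`, and `contains` are assumptions and do not come from the paper, and the insertion shown performs no balancing, so it builds a plain k-dimensional search tree rather than a kHB-tree.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class Node:
    key: Any                      # the key s stored at this node
    dim: int                      # dimension of the stored key
    lo: Optional["Node"] = None   # LOSON-subtree: keys of this dimension < s
    eq: Optional["Node"] = None   # EQSON-subtree: (dim - 1)-dimensional projection
    hi: Optional["Node"] = None   # HISON-subtree: keys of this dimension > s


def insert(node: Optional[Node], record: Tuple) -> Optional[Node]:
    """Insert record = (x_k, ..., x_1) with no rebalancing (hypothetical helper)."""
    if not record:                # every key has been consumed
        return None
    head, rest = record[0], record[1:]
    if node is None:              # start a chain of nodes for the remaining keys
        return Node(head, len(record), eq=insert(None, rest))
    if head < node.key:
        node.lo = insert(node.lo, record)
    elif head > node.key:
        node.hi = insert(node.hi, record)
    else:                         # equal key: continue in one dimension lower
        node.eq = insert(node.eq, rest)
    return node


def contains(node: Optional[Node], record: Tuple) -> bool:
    """Search for a full-length record (x_k, ..., x_1)."""
    while record:
        if node is None:
            return False
        if record[0] < node.key:
            node = node.lo
        elif record[0] > node.key:
            node = node.hi
        else:
            node, record = node.eq, record[1:]
    return True


# The 2-file F of Example 2.1, inserted in listed order (not the order of Example 4.1).
F = [(1, "a"), (1, "b"), (2, "a"), (3, "a"), (3, "b"), (4, "a"),
     (4, "b"), (5, "a"), (5, "b"), (5, "c"), (5, "d")]
root = None
for rec in F:
    root = insert(root, rec)
```

Each comparison either moves down within the current dimension or drops one dimension through the EQSON link, which is why the search cost depends on the tree height plus k.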
Example 2.1: Fig. 1 shows a two-dimensional search tree for the 2-file F = {(1, a), (1, b), (2, a), (3, a), (3, b), (4, a), (4, b), (5, a), (5, b), (5, c), (5, d)}. Note that in the record (3, b), b is a key of dimension 1 and 3 is a key of dimension 2. In the figure, node I is the EQSON of D and the tree rooted at I, the EQSON-subtree of D, is a one-dimensional search tree for the set of records {a, b, c, d}, each of which is a key of dimension 1. The figure provides some additional information which will be used in later examples.

Fig. 1. (Figure not reproduced: a two-dimensional search tree for the 2-file F of Example 2.1.)

The height of a subtree in a k-dimensional search tree is equal to zero if it is empty and otherwise is equal to one plus the height of its tallest subtree. The height of a node is the height of the subtree rooted at the node. The height of a key is the height of the node in which it is stored. Let P, storing a key of dimension i, be a node at height h. Then, P is said to be an (h, i)-node and of dimension i. We denote the height of P by ht(P), where P is a node, a key, or a subtree.
Let a node R at height h1 have a left direct son P at height h2; h2 = 0 implies that the subtree is empty. Then, R is said to be the right direct father of P. Additionally, the nonexisting nodes on the path between R and P at the heights h2 + 1, ..., h1 - 1 are also said to be the left direct sons of R [see Fig. 2(a)]. Two nonexisting left direct sons of a node at the heights h and h + 1 are said to constitute a nonexisting (n.e.) pair (of nodes) at height h. If P is the left direct or indirect son of R, then Q, the right direct son of P, is a left indirect son of R, and R is the right indirect father of Q [see Fig. 2(b) and (c)]. The concepts of right direct sons, left direct fathers, right indirect sons, etc., are defined in a symmetric manner.
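The height and n.e. pair definitions translate directly into two small functions. This is a sketch under stated assumptions: subtrees are represented as nested `(lo, eq, hi)` tuples with `None` for an empty subtree, and the names `height` and `ne_pair_heights` are illustrative, not the paper's.

```python
def height(subtree):
    """Height of a subtree given as a nested (lo, eq, hi) tuple; None is empty."""
    if subtree is None:
        return 0
    lo, eq, hi = subtree
    return 1 + max(height(lo), height(eq), height(hi))


def ne_pair_heights(h1, h2):
    """Heights of the n.e. pairs between a node at height h1 and its
    direct son at height h2 (h2 == 0 means the son is empty).

    Nonexisting direct sons sit at heights h2 + 1, ..., h1 - 1, and a pair
    at height h needs nonexisting sons at both h and h + 1.
    """
    return list(range(h2 + 1, h1 - 1))


leaf = (None, None, None)
chain = (None, leaf, None)        # a chain of two nodes linked through EQSON
```

For a node at height 4 whose direct son on one side is empty, `ne_pair_heights(4, 0)` yields pairs at heights 1 and 2, which matches the situation described for node B in Example 2.2.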
Example 2.2: In the two-dimensional search tree shown
in Fig. 1, the height of each node is shown beside the node.
B and D are the left and the right direct sons of node A,
respectively, and H is a right indirect son of A. A is the right
direct father of node B, the left direct father of node D, and
the left indirect father of node H. B has nonexisting right
direct sons at the heights 3, 2, and 1. This gives rise to two
n.e. pairs at the heights 2 and 1, respectively.
Let P, a node in a k-dimensional search tree, have a left (respectively, right) n.e. pair of direct sons at height h. Let T1 of height h1 be the EQSON-subtree of P, and T2 of height h2 be the EQSON-subtree of the left (respectively, right) direct/indirect father of P (if it exists). T2 and T1 are said to be the left and the right adjoining EQSON-subtrees of the n.e. pair, respectively. The n.e. pair is said to be directly supported by T1 if h1 ≥ h, and is said to be indirectly supported by T2 if h1 < h and h2 ≥ h. It is said to be supported if it is directly or indirectly supported.
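The support test is a pair of height comparisons, sketched below. The function name `support` and its return values are assumptions made for illustration; `h2` is optional because the direct/indirect father of P may not exist.

```python
def support(h, h1, h2=None):
    """Classify the support of an n.e. pair at height h.

    h1: height of T1, the EQSON-subtree of P.
    h2: height of T2, the EQSON-subtree of the relevant direct/indirect
        father of P, or None if that father does not exist.
    Returns "direct", "indirect", or None (unsupported).
    """
    if h1 >= h:
        return "direct"
    if h2 is not None and h2 >= h:
        return "indirect"
    return None


def is_supported(h, h1, h2=None):
    return support(h, h1, h2) is not None
```

With illustrative heights h1 = 1 and h2 = 2, a pair at height 1 is directly supported and a pair at height 2 is indirectly supported, mirroring the two cases discussed in Example 2.3.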

Fig. 2. (Figure not reproduced: panels (a), (b), and (c) illustrate direct and indirect sons and fathers of the nodes R, P, and Q.)

Example 2.3: Let us apply some of the above concepts to the two-dimensional search tree shown in Fig. 1. The heights of the nodes are shown beside the nodes. Node B has n.e. pairs of right direct sons at heights 2 and 1. Their adjoining EQSON-subtrees are the EQSON-subtrees of B and A. The n.e. pair at height 1 is directly supported by the EQSON-subtree of B since its height is greater than or equal to 1. The n.e. pair at height 2 is indirectly supported by the EQSON-subtree of A.
The purpose behind introducing the above concepts is to
have a framework for stating the "height-balancing" restriction on the structure of k-dimensional search trees and for
facilitating further discussion. We are now in a position to
give the definition of the k-dimensional height-balanced
trees.
A k-dimensional height-balanced tree or kHB-tree, k ≥ 1, representing a k-file F is a k-dimensional search tree storing F with the restriction that every n.e. pair in the tree is supported.
The reader may verify that the one-dimensional height-balanced trees as defined above are the same as the well-known AVL-trees.
Example 2.4: The tree shown in Fig. 1 is, in fact, a
2HB-tree.
The following concept provides some useful information
about the structure of the kHB-trees, but it does not characterize them.
Let P be an (h, i)-node in a k-dimensional search tree, and let the tree rooted at P be an iHB-tree. The HB (Height-Balancing)-condition of P is said to be "r" if the height of its EQSON-subtree is h - 1. Otherwise, it is
"=" if the height of its LOSON- as well as HISON-subtrees is h - 1;
"-" if its LOSON-subtree is taller than its HISON-subtree;
"+" if its HISON-subtree is taller than its LOSON-subtree.
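The HB-condition depends only on the heights of the three subtrees of a node, so it can be computed as in the following sketch. The function name `hb_condition` and its argument order are assumptions made for illustration.

```python
def hb_condition(h_lo, h_eq, h_hi):
    """HB-condition of a node from the heights of its LOSON-, EQSON-,
    and HISON-subtrees (0 for an empty subtree)."""
    h = 1 + max(h_lo, h_eq, h_hi)     # height of the node itself
    if h_eq == h - 1:                 # the EQSON-subtree is a tallest subtree
        return "r"
    if h_lo == h_hi:                  # both are then necessarily h - 1
        return "="
    return "-" if h_lo > h_hi else "+"
```

Note that "r" takes precedence: the other three conditions are only consulted when the EQSON-subtree is strictly shorter than h - 1.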
Height of the kHB-Trees: We would like to show that the
height of a kHB-tree storing n records is less than or equal to
O(log2 n) + k. We first show that the number of records in a
kHB-tree T of height h is greater than or equal to that in an
"almost kHB-tree" T1 of the same height. Thus, the height of
a kHB-tree storing n records cannot be more than that of an
almost kHB-tree storing the same number of records. We then
show that O(log2 n) + k is an upper bound on the height of
an almost kHB-tree, which leads to the desired result.
An almost kHB-tree (akHB-tree) is a kHB-tree with the following modifications:
a) The definition of the direct support of an n.e. pair in the
tree is extended to an "n.e. quadruple" (of nodes). An n.e.
quadruple at height h is constituted by four nonexisting left
(right) direct sons of a node P at heights h, h + 1, h + 2, and
h + 3. Such a quadruple is directly supported if the height of
the EQSON-subtree of P is greater than or equal to h.
b) The structure of the tree is constrained by imposing
the requirement that every n.e. quadruple must be directly
supported.
It is easy to see that an almost kHB-tree can alternatively be simply defined as a k-dimensional search tree such that if Tl, Te, and Th are the LOSON-, EQSON-, and HISON-subtrees of a node, respectively, then either Te is the tallest subtree or the difference between the heights of any two of its three subtrees is at most 3.
Lemma 2.1: Given a kHB-tree T of height h, storing n records, it can be transformed into an akHB-tree T1 of the same height storing n or fewer records.
Proof: See Appendix A (Lemma A.1).
Let N(h, k) be the minimum number of records stored in an akHB-tree of height h.
Lemma 2.2: N(h, k) = N(h - k + 1, 1), h ≥ k.
Proof: See Appendix B (Lemma B.3).
Lemma 2.3: The height of an akHB-tree storing n records is at most 2.25 log2(n + 2) + k - 2.112.
Proof: N(h, k) = N(h - k + 1, 1), h ≥ k (Lemma 2.2). Therefore, h - k + 1 ≤ 2.25 log2(N(h, k) + 2) - 1.112 (see Luccio and Pagli [19]). Thus, h ≤ 2.25 log2(n + 2) + k - 2.112.
Theorem 2.1: The height of a kHB-tree storing n records is O(log2 n) + k.
Proof: Follows from Lemmas 2.1 and 2.3.
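To get a feel for the bound in Lemma 2.3, it can be evaluated numerically. The helper name `akhb_height_bound` is an assumption made for this sketch; the formula itself is the one stated in the lemma.

```python
import math


def akhb_height_bound(n, k):
    """Upper bound of Lemma 2.3 on the height of an akHB-tree
    storing n records of dimension k."""
    return 2.25 * math.log2(n + 2) + k - 2.112


# Tabulate the bound for a few file sizes and dimensions.
for n in (10, 1_000, 1_000_000):
    for k in (1, 2, 4):
        print(f"n={n:>9}  k={k}  bound={akhb_height_bound(n, k):.2f}")
```

The dimension k enters additively, so going from k to k + 1 raises the bound by exactly one level regardless of n.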

III. RESTRUCTURING OPERATIONS AND
A RESTRUCTURING ALGORITHM

Let us first introduce the following concepts about a kHB-tree with possibly a certain number of unsupported n.e. pairs.
Let a node P in T have an n.e. pair f of left (respectively, right) direct sons at height h which is the only possibly unsupported n.e. pair in the tree rooted at P as well as on the path between P and the root of T. Then, P is said to have a left (respectively, right) structure violation or simply a structure violation, if f is not directly supported. f is said to be the associated n.e. pair of the structure violation. The structure violation is said to be strong if f is also not indirectly supported. The structure violation is said to be special if the height of either the LOSON- (respectively, HISON-) subtree or the EQSON-subtree of P is h - 1. The restructuring operations described later are used to correct a structure violation.
Note that, by definition, a structure violation must have the possibility of being strong. The following result gives an important property of a structure violation.
Lemma 3.1: Let T be a kHB-tree with possibly a certain number of unsupported n.e. pairs. If there is a structure violation to a node P at height h with f as the associated pair, then the height of P must be h + 2.

Proof: Assume that the height of P is greater than h + 2. Then, P has an n.e. pair f' of left (respectively, right) direct sons at height h + 1. f' must be supported because, by the definition of a structure violation, f is the only n.e. pair in the tree rooted at P which may not be supported. Then, f too must be supported because f and f' have the same adjoining EQSON-subtrees, which is a contradiction.

Fig. 3. (Figure not reproduced: the rotation operation, (a) before and (b) after restructuring.)

A node P in T is said to have a double-sided structure violation at height h if P1, a left direct/indirect son of P, has a right strong structure violation at height h, and P2, a right direct/indirect son of P, has a left strong structure violation at height h. The associated n.e. pairs of the strong structure violations to P1 and P2 are said to be the associated n.e. pairs of the double-sided structure violation.
In case of the height-balanced trees, the restructuring operations used in the insertion and the deletion algorithms are the "rotation" and the "double rotation." These very operations along with a variation of the double rotation are useful for the insertion and the deletion algorithms of the kHB-trees. We will call them simply the restructuring operations. In the following we will describe these operations (omitting their symmetric variants) and establish their "correctness." A restructuring operation applied at the root of a subtree to correct a structure violation is said to be correct if the restructured subtree does not contain any unsupported n.e. pair; its height may, however, decrease by one. In the following, a "*" in square brackets against a node indicates that there is a structure violation to the node. A "*" in parentheses against the height of a node, similarly, indicates that the height of the node has just changed.

Fig. 4. (Figure not reproduced: the type A double rotation, (a) before and (b) after restructuring.)

Let T be a kHB-tree with an (h, i)-node (B in Fig. 3, C in Figs. 4 and 5) having a right structure violation at height h - 2. Let its associated n.e. pair be the only n.e. pair in the tree that may possibly be unsupported. In such a situation, the rotation or the double rotation (type A or B) is applied at the node. The rotation is applied when the HB-condition of A, the left direct son of the node, is "=", "r", or "-" (Fig. 3). The double rotation (type A or B) is applied when the HB-condition of A is "+" (Figs. 4 and 5). The type A double rotation (Fig. 4) is applied when the structure violation (to C) is special or the height of the HISON-subtree of B is h - 3, and the type B double rotation (Fig. 5) is applied when this is not the case.
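The case analysis for choosing a restructuring operation can be summarized as a small decision function. This is a sketch of the selection rule only, not of the operations themselves; the function name, parameter names, and returned labels are assumptions made for illustration.

```python
def choose_operation(hb_of_a, violation_is_special, hison_height_of_b, h):
    """Select the restructuring operation for an (h, i)-node with a right
    structure violation at height h - 2.

    hb_of_a:              HB-condition of A, the left direct son of the node.
    violation_is_special: whether the structure violation is special.
    hison_height_of_b:    height of the HISON-subtree of B (used only when
                          the HB-condition of A is "+").
    """
    if hb_of_a in ("=", "r", "-"):
        return "rotation"
    if hb_of_a != "+":
        raise ValueError(f"unknown HB-condition: {hb_of_a!r}")
    if violation_is_special or hison_height_of_b == h - 3:
        return "double rotation A"
    return "double rotation B"
```

The symmetric case (a left structure violation) would swap the roles of "-" and "+" and of the LOSON- and HISON-subtrees.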
Fig. 5. (Figure not reproduced: the type B double rotation, (a) before and (b) after restructuring.)

Fig. 3 and its symmetric variant describe the rotation operation. The restructured subtree is an iHB-tree with no change in its height or with decrease in its height by 1 (Fig. 3).
Figs. 4 and 5 and their symmetric variants describe the type A and the type B double rotation operations, respectively. The restructured subtree in each case is an iHB-tree with decrease in its height by 1.
Some of the above restructuring operations may decrease the height of a node by 1 to h - 1. The decrease in height may propagate all the way to the root without causing any structure violation. On the other hand, the propagation of the height decrease may stop at a node P. As a result, there may or may not be a structure violation or a double-sided structure violation to P.
A structure violation will result (to P) only in the following situations:
a) The decrease in the height of the LOSON- (respectively, HISON-) subtree of P by 1 to, say, height h - 1, creates a new n.e. pair, say, f of left (respectively, right) direct sons at height h, which is not directly supported. Note that, if the height of P is more than h + 2, then, by Lemma 3.1, f must be supported.
b) The decrease in the height of the EQSON-subtree of P by 1 to height, say, h - 1, makes an n.e. pair at height h or more lose its support or direct support.
A double-sided structure violation will result to P only if the EQSON-subtree of P decreases in height by 1 to, say, h - 1, and, as a result, at height h, an n.e. pair of right direct sons of P1 (a left son of P) as well as an n.e. pair of left direct sons of P2 (a right son of P) lose their support; P1 and P2 must both be at height h + 2.
We will now prove that the restructuring operations are
correct. We will do this by showing that they remove the
structure violation with or without decreasing the height of
the restructured subtree by 1 and without introducing any
unsupported n.e. pair in the subtree.
Theorem 3.1: The rotation operation is correct.
Proof: Unless otherwise stated, the following refers to
Fig. 3(b).
a) The height of subtree 3 is h - 2.
The height of node B is h - 1 because the height of
subtree 3 is h - 2. Thus, the height of A is h. The structure violation to B at height h - 2 [Fig. 3(a)] obviously
gets corrected.
The only new n.e. pair that may get created is one at height
h - 2. This will be created if the height of subtree 1 is h - 3
or less which can be the case only if the HB-condition of A
[Fig. 3(a)] is "r." In such a case, the n.e. pair is, however,
directly supported since the height of subtree 2 must be
h - 2.
All the other n.e. pairs continue to be supported by at least
one of their respective adjoining EQSON-subtrees.
b) The height of subtree 3 is less than h - 2.
The height of subtree 1 is h - 2 [if the HB-condition of A
is "-" in Fig. 3(a)] or the height of subtree 2 is h - 2 [if the
HB-condition of A is "r" in Fig. 3(a)]. Thus, the height of
A is h - 1.
The right structure violation to B obviously gets corrected.
If the height of 3 is less than h - 3, then the height of 2 or
4 must be h - 3 because otherwise there is an unsupported
n.e. pair of right direct sons of A at height h - 3 in Fig. 3(a).
There is an n.e. pair between A and B only if the height of
B is less than h - 3, but in that case the n.e. pair will obviously be directly supported. Any other n.e. pair continues to
be supported.
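As a rough illustration of the rotation analyzed above, the sketch below performs a single rotation on three-way nodes. Because Fig. 3 is not available in this text, the arrangement is an assumption: it follows the familiar AVL single rotation, with each node keeping its own EQSON-subtree, which appears consistent with the subtree numbering used in the proof. The class and function names are hypothetical, and heights and HB-conditions are not maintained here.

```python
class Node:
    def __init__(self, key, lo=None, eq=None, hi=None):
        self.key = key
        self.lo = lo      # LOSON-subtree
        self.eq = eq      # EQSON-subtree (left untouched by the rotation)
        self.hi = hi      # HISON-subtree


def rotate_right(b):
    """Lift A, the left direct son of B, above B (assumed shape of Fig. 3)."""
    a = b.lo
    b.lo = a.hi           # A's HISON-subtree becomes B's LOSON-subtree
    a.hi = b              # B becomes the right direct son of A
    return a              # A is the new root of the restructured subtree


def keys_in_order(node):
    """Keys of the top dimension in sorted order (EQSON-subtrees are skipped)."""
    if node is None:
        return []
    return keys_in_order(node.lo) + [node.key] + keys_in_order(node.hi)


b = Node(4, lo=Node(2, lo=Node(1), eq="E2", hi=Node(3)), eq="E4", hi=Node(5))
before = keys_in_order(b)
new_root = rotate_right(b)
```

The rotation preserves the ordering of the keys in the current dimension, and the EQSON-subtrees (the placeholder strings "E2" and "E4" here) stay attached to the nodes that own them.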
Theorem 3.2: The type A double rotation (Fig. 4) is correct.
Proof: We have the following in Fig. 4(b).
Since either the right structure violation to C [Fig. 4(a)] is special or the height of the subtree 5 is h - 3, the height of node C is h - 2. Thus, the height of B is h - 1.
The only n.e. pairs that may get created are those between A and B (in case the height of A becomes h - 4 or less). We will examine such n.e. pairs as to whether they can be unsupported. All other n.e. pairs in the tree continue to be supported by one or both of their respective adjoining EQSON-subtrees.
Let the height of A be h - 4 or less and thus the height of the subtree 1 is h - 5 or less. Thus, in Fig. 4(a), A has an n.e. pair, say, f of direct sons at height h - 3. f must clearly be supported by one of its adjoining EQSON-subtrees, and hence the height of at least one of them must be h - 3 or more. However, the height of 2, the right adjoining EQSON-subtree of f [Fig. 4(a)], cannot be h - 3 or more because that contradicts the assumption that the height of A is h - 4 or less. Thus, the height of the left adjoining EQSON-subtree of f must be h - 3 or more. Observe that this is also the left adjoining EQSON-subtree of every n.e. pair between A and B [Fig. 4(b)]. Thus, every such n.e. pair will be supported by its left adjoining EQSON-subtree.
Theorem 3.3: The type B double rotation (Fig. 5) is correct.
Proof: Unless otherwise stated, we have the following in Fig. 5(b).
The type B double rotation is applied only when the structure violation to C [Fig. 5(a)] is not special and the height of the subtree 5 is not h - 3. Thus, the height of C is h - 3 or less. The height of B [Fig. 5(a)] is h - 2 and therefore the height of the subtree 3 or that of 4 must be h - 3. Hence, the height of B is h - 2, and that of A is h - 1.
The only n.e. pairs that may get created are those between B and C (in case the height of C is h - 5 or less). All other n.e. pairs in the subtree continue to be supported by at least one of their respective adjoining EQSON-subtrees. Let us therefore examine the n.e. pairs between B and C.
Let the height of C be h - 5 or less. This implies that the heights of the subtrees 5, 6, and 7 are h - 6 or less. In Fig. 5(a), the n.e. pair of right direct sons of B at height h - 4 must be supported. Thus, the height of the subtree 4 or that of 6 must be h - 4 or more. The height of 6 cannot be h - 4 or more because it contradicts the assumption that the height of C is h - 5 or less. Therefore, the height of 4 must be h - 4 or more. But then, 4 will support any n.e. pair between B and C.
The restructuring operations lead to the following restructuring algorithm for removing a structure violation or a double-sided structure violation at height h to a node P in a kHB-tree of height H.
Algorithm R:
1) There is a possibly strong structure violation at height h, to P:
    Apply an appropriate restructuring operation;
    if the restructured subtree does not decrease in height
    then STOP
    else apply the algorithm recursively;
2) The height of a subtree decreases in height by 1:
    if the decrease in height causes a structure violation or a double-sided structure violation to a higher node
    then apply the algorithm recursively
    else STOP;
3) There is a double-sided structure violation at height h, to P:
    There are strong structure violations at height h, to P1 (a left son of P) and to P2 (a right son of P). Let S1 and S2 be the left and the right direct sons of P; Apply the algorithm recursively to correct the strong structure violations to P1 and P2 in the trees rooted at S1 and S2, respectively, till
    a) The algorithm terminates without decreasing the heights of S1 or S2:
        STOP;
    b) The height of S1 or S2 decreases by 1:
        Apply the algorithm recursively;
    c) The height of both S1 and S2 decreases by 1 which decreases the height of P by 1:
        Apply the algorithm recursively.

The following result follows.
Theorem 3.4: Algorithm R is correct and its time complexity is O(H - h).
IV. UPDATE ALGORITHMS
In this section we will describe the insertion algorithm,
which is a simple application of the restructuring operations.
The deletion algorithm is almost entirely based on the
restructuring algorithm R described in the previous section
and is left to the reader. Its complexity obviously is
O(log2 n) + k.
Insertion Algorithm: Let x = (xk, ..., x1) be the record to be inserted in the given kHB-tree T. If T is empty then obviously a chain of k nodes storing the keys xk, ..., x1 is created and the resulting tree is a valid kHB-tree. If T is not empty, the tree is searched for x. The search will end in some (h - 1, i)-node P, 1 ≤ i ≤ k, storing the key, say, p, such that P does not have a left (respectively, a right) son. A chain of i nodes storing the keys xi, ..., x1 is made the left (respectively, right) subtree of P. If h is greater than i + 1, then the insertion of x does not increase the height of P and the resulting tree is a valid kHB-tree. In case h is equal to i + 1, then the insertion of x increases the height of the subtree rooted at P by 1. The path followed from the root of the tree to P is now traced backwards. The following rules determine the action to be taken at each node:
a) If the HB-condition of the current node is "r" or "=" and the last step originated from one of the tallest subtrees, the associated information of the node (height and HB-condition) is updated. Next, the path upward is followed (with height increase).
b) If the current node has HB-condition of "r" or "=", and the last step originated from a subtree which is not the tallest subtree, the HB-condition of the current node is updated and the algorithm terminates.
c) If the current node has HB-condition of "+" or "-," and the last step originated from the EQSON-subtree, or from the shorter of the LOSON- and the HISON-subtrees, the HB-condition is updated and the algorithm terminates.
d) If the HB-condition of the current node is "+" or "-" and the last step originated from the taller of the LOSON- or the HISON-subtrees, the information associated with the node (height) is updated. Then,
    if the last two steps were taken in the same direction (both from the LOSON-subtrees or both from the HISON-subtrees), then an appropriate rotation (Fig. 3) is performed, and the algorithm terminates
    else if the last but one step originated from the EQSON-subtree, then an appropriate rotation (Fig. 3) is performed, and the algorithm terminates
        else if the last but one step and the last but two steps were taken in the same direction or the difference between the heights of the current node and one of its subtrees is 3 then an appropriate type A double rotation (Fig. 4) is performed, and the algorithm terminates
            else an appropriate type B double rotation (Fig. 5) is performed and the algorithm terminates.
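Rules a) to d) amount to a small dispatch on the HB-condition of the current node and on where the last step came from. The sketch below encodes only that dispatch; the function names, flag names, and returned labels are assumptions made for illustration, and the caller is assumed to evaluate "tallest" and "taller" with respect to the subtree heights before the insertion.

```python
def retrace_action(hb, from_tallest):
    """Action at the current node while tracing the search path backwards.

    hb:           HB-condition of the current node ("r", "=", "+", or "-").
    from_tallest: True if the last step originated from one of the tallest
                  subtrees of the node (for "+" or "-", the taller of the
                  LOSON- and HISON-subtrees).
    """
    if hb not in ("r", "=", "+", "-"):
        raise ValueError(f"unknown HB-condition: {hb!r}")
    if not from_tallest:
        return "update and stop"                # rules b) and c)
    if hb in ("r", "="):
        return "update and continue upward"     # rule a)
    return "restructure and stop"               # rule d)


def rule_d_operation(last_two_same_direction, last_but_one_from_eqson,
                     previous_two_same_direction, height_gap_is_3):
    """Operation selected by rule d), tested in the order given in the text."""
    if last_two_same_direction or last_but_one_from_eqson:
        return "rotation"
    if previous_two_same_direction or height_gap_is_3:
        return "double rotation A"
    return "double rotation B"
```

Only rule a) continues upward, and rule d) always terminates after a single operation, which is the behavior that Theorem 4.2 states.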
The following result follows.
Theorem 4.1: The insertion algorithm is correct and its time complexity is O(log2 n) + k.
The following result states a very attractive feature of the insertion algorithm.
Theorem 4.2: The insertion of a record to a kHB-tree
requires the application of a restructuring operation at
most once.
Example 4.1: The reader may verify that the 2HB-tree
shown in Fig. 1 results if the above insertion algorithm is
used to insert the following sequence of records into an empty
tree: (2, a), (1, a), (5, c), (3, b), (1, b), (5, b), (3, a), (5, d),
(5,a), (4,a), (4,b). Only the insertion of (4,b), the last
record, requires the application of a restructuring operation,
viz. type A double rotation. Deletion of the record (3, a) in
this tree causes a right-sided structure violation to B (see
Fig. 6). This is corrected by a rotation at B.
V. SUMMARY AND CONCLUSION
In this paper, we present a new multidimensional tree
structure called the multidimensional height-balanced trees.
We show that these trees can be used to manage a set of n
k-dimensional records or data items such that the records can
be searched or updated in O(log2 n) + k time, which is optimal. The data structure is a multidimensional generalization
of the well-known height-balanced trees (also known as the
AVL-trees). It retains much of the simplicity and the efficiency of the height-balanced trees. The update algorithms
are simple and use almost the same restructuring operations
as those used in the height-balanced trees. The insertion algorithm, in particular, retains a very important property: an
insertion of a record results in the application of a restructuring operation at most once.
The contribution of this paper is twofold: a) it provides a
simple and efficient basic data structure for the management
of multidimensional data; b) it offers new insight into how
a one-dimensional balanced tree structure can be generalized
so that it retains its efficiency without compromising some of
its nice properties. Further work needs to be done in both of
these directions. The usefulness of the multidimensional
height-balanced trees as a basic tool for the efficient organization of geometric objects and multiattribute records
needs to be further investigated. Similar generalization of
some other important balanced tree structures like the B-trees
[2], the weight-balanced trees [24], and the RB-trees [12] needs
to be examined. Some of this work is in progress.
It seems to be possible to use the two-dimensional height-balanced trees as a basic data structure for implementing a
dynamic dictionary [4] in a simple and worst-case (distinct
from amortized worst-case [4], [29]) efficient manner. The
available solutions [3], [5], [14], [21] are either inefficient in
certain operations or do not offer simple direct algorithms for
all the operations or need nonlinear storage. This possibility
is under investigation [31].
APPENDIX A
Lemma A.1: Given a kHB-tree T of height h, storing n
records, it can be transformed to an akHB-tree T1 of the same
height storing n or fewer records.
Proof: T is first transformed into T' by successively applying the following transformation procedure P1 to each
node in decreasing order of the dimension of the nodes
and, within nodes of the same dimension, in decreasing order
of their heights. T' is an akHB-tree of height h storing n or fewer
records except that the records may not be stored in the right
order. The records stored in T' are then changed so that they
are in the right order, resulting in T1.

Fig. 6.
Procedure P1: The procedure is applied to an (h, i)-node P
provided P satisfies Property A, stated as follows. Let U be
the direct/indirect father of P. There is no n.e. pair in the
subtree rooted at P which is indirectly supported by the
EQSON-subtree of U.
The procedure transforms Tp, the tree rooted at P, to T'p of
height h such that:
a) The direct sons of P satisfy Property A.
b) The subtrees rooted at the direct sons of P are
iHB-trees.
c) T'p is an aiHB-tree except that the records may not be
stored in the right order and b).
d) The number of records stored in T'p is less than or equal
to the number stored in Tp.
The following rules describe the transformation for the
different cases; their various symmetric variants are omitted.
1. HB-condition of P is "r" (see Fig. 7).
Transformation: 4 = ∅, 5 = 2, and 6 = ∅ (i.e., subtree 4 is empty, 5 is the same as subtree 2, and 6 is empty).
2. HB-condition of P is not "r."
2.1. Ht. (2) = h - 2.
2.1.1. Ht. (4) = h - 2 (see Fig. 8).
Transformation: 6 = 4, 7 = 2, 8 = 4, 9 = 4, 10 = 4.

Fig. 7.

Fig. 8.

2.1.2. Ht. (4) = h - 3 (see Fig. 8).
Transformation: 6 = 4, 7 = 4, 8 = , 9 = 2, 10 =
2.1.3. Ht. (4) = t, t ≤ h - 4 (see Fig. 9). In this case,
the height of R must be h - 2 or h - 3, in view of the fact
that P satisfies Property A. Let S be the right son of Q such
that its height has the smallest value greater than or equal to
t + 1.
2.1.3.1. Ht. (6) ≥ t [see Fig. 9(a)]. This implies that the
height of S will not decrease if 5 is replaced by an empty tree.
Transformation: 8 = flipped(13)¹, 9 = 4, 10 = 4,
11 = 2, 12 = 4, where 13 [see Fig. 9(b)] is the same as 7
except that 5 is replaced by an empty tree.
Note that the height of 8 is at least h - 3. It is easy to
verify that, in T'p, there is no n.e. pair in 8 or in the HISON-subtree of P which is unsupported, or indirectly supported by
9 or the EQSON-subtree of U.
2.1.3.2. Ht. (6) < t [see Fig. 9(a)].
Transformation: 8 = flipped(13), 9 = 6, 10 = 4,
11 = 2, 12 = 4, where 13 [see Fig. 9(b)] is 7 transformed
as: 14 = ∅, 15 = 4.
The height of 5 is t or less, and that of 6 is less than t.
Therefore, the height of S in 13 [Fig. 9(b)] will continue to
be the same as in 7. Note that the height of S in 7 cannot be
more than t + 2 because otherwise there is an unsupported
n.e. pair in the tree. Thus, if in Tp, 7 is replaced by 13 and 4
is replaced by 6, then there cannot be an n.e. pair in the
tree rooted at R which is either unsupported or indirectly
supported by 6. Therefore, in T'p the difference between the
LOSON- and the HISON-subtrees of P is at most 2, and there
is not an n.e. pair in the trees rooted at the left and the right
direct sons of P which is not supported or which is indirectly
supported by 9 or the EQSON-subtree of U.

¹ The tree flipped(T) is the tree T with its LOSON- and HISON-subtrees
exchanged.

Fig. 9. A curvy line indicates zero or more tree edges.

Fig. 10.

Fig. 11.
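The flipped operation used in cases 2.1.3.1 and 2.1.3.2 can be sketched in a few lines. This is only an illustration under stated assumptions: the Node class and its field names (key, loson, eqson, hison) are hypothetical, and the exchange is applied at the root only, which is how the footnote's wording is read here.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Hypothetical node with the three subtree pointers named in the text."""
    key: object
    loson: Optional["Node"] = None
    eqson: Optional["Node"] = None
    hison: Optional["Node"] = None


def flipped(tree: Optional[Node]) -> Optional[Node]:
    """Return a copy of tree with its LOSON- and HISON-subtrees exchanged.

    Only the root is changed (an assumption); the EQSON-subtree is kept as is.
    """
    if tree is None:
        return None
    return replace(tree, loson=tree.hison, hison=tree.loson)


if __name__ == "__main__":
    t = Node(5, loson=Node(1), eqson=Node("x"), hison=Node(9))
    f = flipped(t)
    print(f.loson.key, f.eqson.key, f.hison.key)
```

A flipped tree generally no longer holds its keys in search order, which matches the remark that the records of the transformed tree may not be stored in the right order until they are rearranged.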
2.2. Ht. (EQSON-subtree of P) = m ≤ h - 3. Assume
that the height of the LOSON-subtree of P is h - 1. Transformations described in I and II below are applied one after
the other. The transformation described in I results in an
intermediate tree, and that in II results in T'p.
I. Let S1 be the left son of P such that its height has the
smallest value larger than or equal to m + 1 (see Fig. 10).
I.1. Ht. (4) ≥ m.
Transformation: 8 = 1, 9 = 2, 10 = 4, 11 = 4,
12 = 4, 13 = 6, 14 = 7.
I.2. Ht. (4) < m.
Transformation: 8 = 1, 9 = 2, 10 = 4, 11 = 6,
12 = 4, 13 = 4, 14 = 7.

Fig. 12.
As a result of the transformation M satisfies Property A
while P continues to satisfy Property A. There may, however, be n.e. pairs in the HISON-subtree of P which are
unsupported because they are no more indirectly supported
by the EQSON-subtree of P.
II.1. Ht. (4) > h - 4 (see Fig. 11). Note that height of
Q will be h - 1 or h - 2 because P satisfies Property A.
Transformation: 6 = 1, 7 = 2, 8 = 4, 9 = 4,
10 = ∅.
II.2. Ht. (4) = t, t < h - 4 (see Fig. 12). Let S be the
right son of Q such that its height has the smallest value larger
than or equal to t + 1. Let the height of Q be r. The height
of R will obviously be r - 1 or r - 2.
II.2.1. Ht. (6) ≥ t.
Transformation: 10 = 1, 11 = 4, 12 = ∅, 13 = 6,
14 = 7, 15 = 8, 16 = 9.
II.2.2. Ht. (6) < t.
Transformation: 10 = 1, 11 = 6, 12 = 4, 13 = 4,
14 = 7, 15 = 8, 16 = 9.
It is easy to see that T'p satisfies the desired properties.
APPENDIX B
Let N(h, k) be the minimum number of records stored in an
akHB-tree of height h, k ≥ 1.
Lemma B.1: For h ≥ 2,
a) N(h, 2) = N(h - 1, 1),
b) N(h, 2) = N(h - 1, 2) + N(h - 4, 2) + 1.
Proof: By induction on h.
Basis h = 2 or 3.
a) N(2, 2) = 1 = N(1, 1)
N(3, 2) = 2 = N(2, 1) [see Fig. 13(a)].
b) N(2, 2) = 1 = N(1, 2) + N(-2, 2) + 1
because N(h, 2) = 0 if h < 2.
N(3, 2) = 2 = N(2, 2) + N(-1, 2) + 1.
Induction hypothesis: For 2 ≤ h ≤ s,
a) N(h, 2) = N(h - 1, 1),
b) N(h, 2) = N(h - 1, 2) + N(h - 4, 2) + 1.
Induction step: To prove that the result is true for
h = s + 1, s ≥ 3. We will now prove this:
a) N(4, 2) = min{N(3, 1), N(3, 2) + 1}
[see Fig. 13(b)]
N(3, 2) + 1 = N(2, 1) + 1
[by induction hypothesis a)]
= N(3, 1)
N(s + 1, 2) = min{N(s, 1), N(s, 2) + N(s - 3, 1),
N(s, 2) + N(s - 3, 2) + 1}, s ≥ 4
[see Fig. 13(b)]
N(s, 2) + N(s - 3, 1)
= N(s - 1, 1) + N(s - 3, 1)
[by induction hypothesis a)]
= N(s - 1, 1) + N(s - 4, 1)
+ 1 + N(s - 7, 1)
= N(s, 1) + N(s - 7, 1)
N(s, 2) + N(s - 3, 2) + 1
= N(s - 1, 1) + N(s - 4, 1) + 1
[by induction hypothesis a)]
= N(s, 1)
Thus, N(s + 1, 2) = N(s, 1).
b) N(s, 2) + N(s - 3, 2) + 1
= N(s - 1, 1) + N(s - 4, 1) + 1
[by induction hypothesis a)]
= N(s, 1)
N(s + 1, 2) = N(s, 1) (as proved above).
Thus, N(s + 1, 2) = N(s, 2) + N(s - 3, 2) + 1.
Lemma B.2: For i ≥ 2, h ≥ i,
a) N(h, i) = N(h - 1, i - 1),
b) N(h, i) = N(h - 1, i) + N(h - 4, i) + 1.
Proof: by induction on i.
Basis i = 2, proved by Lemma B.1.
Induction hypothesis: For 2 ≤ i ≤ t,
a) N(h, i) = N(h - 1, i - 1),
b) N(h, i) = N(h - 1, i) + N(h - 4, i) + 1.
Induction step: To prove that the result is true for i =
t + 1, t ≥ 2.
Proof: by induction on h.
Basis h = t + 1, or h = t + 2.
h = t + 1
N(t + 1, t + 1) = N(t, t) = 1,
N(t + 1, t + 1) = 1
= N(t, t + 1) + N(t - 3, t + 1) + 1
h = t + 2
N(t + 2, t + 1) = min{N(t + 1, t),
N(t + 1, t + 1) + 1}
[see Fig. 13(c)]
N(t + 1, t + 1) + 1 = 2
N(t + 1, t) = N(t, t) + N(t - 3, t) + 1 = 2
[by induction hypothesis b)].
Thus, N(t + 2, t + 1) = N(t + 1, t).
Induction hypothesis: For t + 1 ≤ h ≤ s,
c) N(h, t + 1) = N(h - 1, t),
d) N(h, t + 1) = N(h - 1, t + 1)
+ N(h - 4, t + 1) + 1.
Induction step: To show e) and f) as follows:
e) N(s + 1, t + 1) = N(s, t), s ≥ t + 2.
LHS
N(t + 3, t + 1) = min{N(t + 2, t),
N(t + 2, t + 1) + 1}
N(t + 2, t + 1) + 1 = N(t + 1, t) + 1
[by induction hypothesis c)]
= N(t + 2, t)
[by induction hypothesis b)]
N(s + 1, t + 1) = min{N(s, t), N(s, t + 1)
+ N(s - 3, t), N(s, t + 1)
+ N(s - 3, t + 1) + 1},
s ≥ t + 3
[see Fig. 13(d)]
N(s, t + 1) + N(s - 3, t)
= N(s - 1, t) + N(s - 3, t)
[by induction hypothesis c)]
= N(s - 1, t) + N(s - 4, t) + 1 + N(s - 7, t)
[by induction hypothesis b)]
= N(s, t) + N(s - 7, t)
[by induction hypothesis b)]
N(s, t + 1) + N(s - 3, t + 1) + 1
= N(s - 1, t) + N(s - 4, t) + 1
[by induction hypothesis c)]
= N(s, t) [by induction hypothesis b)].
Thus, N(s + 1, t + 1) = N(s, t).
f) N(s + 1, t + 1) = N(s, t + 1) + N(s - 3, t + 1)
+ 1, s ≥ t + 2
LHS
= N(s, t) (from the proof of e) above)
RHS
= N(s - 1, t) + N(s - 4, t) + 1
[by induction hypothesis c)]
= N(s, t) [by induction hypothesis b)]
Hence, the result.
Lemma B.3: For k ≥ 1, h ≥ k,
N(h, k) = N(h - k + 1, 1)
Proof: follows from Lemma B.2 a).

Fig. 13.
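The recurrences of Appendix B can be checked numerically with a short sketch. It rests on two assumptions that are not stated in this appendix but are read off from the proof steps: N(h, 1) = N(h - 1, 1) + N(h - 4, 1) + 1 for h ≥ 1, and N(h, k) = 0 whenever h < k. For k ≥ 2 the function evaluates the min-expressions quoted in the proofs (Fig. 13) and then compares the result with the closed forms of Lemmas B.1 to B.3. The name n_min is hypothetical.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def n_min(h: int, k: int) -> int:
    """N(h, k) as computed from the min-recurrences quoted in Appendix B."""
    if h < k:
        return 0  # assumption: a tree this short stores no record
    if k == 1:
        # assumption: one-dimensional recurrence implied by the proof steps
        return n_min(h - 1, 1) + n_min(h - 4, 1) + 1
    if h == k:
        return 1  # N(t + 1, t + 1) = 1 in the basis of Lemma B.2
    s, t = h - 1, k - 1
    if s <= t + 2:
        # cases N(t + 2, t + 1) and N(t + 3, t + 1)
        return min(n_min(s, t), n_min(s, k) + 1)
    # general case, s >= t + 3
    return min(n_min(s, t),
               n_min(s, k) + n_min(s - 3, t),
               n_min(s, k) + n_min(s - 3, k) + 1)


def check_lemmas(max_k: int = 6, max_h: int = 30) -> bool:
    """Verify Lemma B.2 a), b) and Lemma B.3 for small parameters."""
    for k in range(2, max_k + 1):
        for h in range(k, max_h + 1):
            if n_min(h, k) != n_min(h - 1, k - 1):
                return False
            if n_min(h, k) != n_min(h - 1, k) + n_min(h - 4, k) + 1:
                return False
            if n_min(h, k) != n_min(h - k + 1, 1):
                return False
    return True


if __name__ == "__main__":
    print([n_min(h, 1) for h in range(1, 9)])
    print(check_lemmas())
```

The check is only a finite sanity test of the stated identities, not a substitute for the induction proofs.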

ACKNOWLEDGMENT

The author is indebted to M. McNair for expertly typing
several versions of the manuscript, and to P. E. Kephart for
her efficient management of the support services.
REFERENCES
[1] G. M. Adel'son-Velskij and Y. M. Landis, "An algorithm for the organization of information," Sov. Math. Doklady, vol. 3, pp. 1259-1263,
1962.
[2] R. Bayer and E. McCreight, "Organization and maintenance of large
ordered indexes," Acta Inform., vol. 1, pp. 173-189, 1972.
[3] S. W. Bent, "Dynamic weighted data structures," Ph.D. dissertation,
Stanford Univ., Stanford, CA, Rep. STAN-CS-82-916, June 1982.
[4] S. W. Bent, D. D. Sleator, and R. E. Tarjan, "Biased 2-3 trees," in
Proc. 21st Annu. IEEE Symp. on Foundations of Comput. Sci., 1980,
pp. 248-254.
[5] S. W. Bent, D. D. Sleator, and R. E. Tarjan, "Biased search trees,"
Bell Labs., Tech. Memo., 1983.
[6] J. L. Bentley, "Multidimensional search trees used for associative searching," Commun. Ass. Comput. Mach., vol. 18, pp. 509-517, 1975.
[7] J. L. Bentley and J. B. Saxe, "Algorithms on vector sets," SIGACT News,
vol. 11, pp. 36-39, 1979.
[8] J. L. Bentley and D. Wood, "An optimal worst-case algorithm for
reporting intersections of rectangles," IEEE Trans. Comput., vol. C-29,
pp. 571-577, 1980.
[9] M. Brown, "A partial analysis of random height-balanced trees," SIAM
J. Comput., vol. 8, pp. 33-41, 1979.
[10] R. A. Finkel and J. L. Bentley, "Quad trees-A data structure for retrieval on composite keys," Acta Inform., vol. 4, pp. 1-9, 1974.
[11] V. Gopalakrishna and C. E. Veni Madhavan, "Performance evaluation of
attribute-based tree organization," ACM Trans. Database Syst., vol. 6,
pp. 69-87, 1980.
[12] L. J. Guibas and R. Sedgewick, "A dichromatic framework for balanced
trees," in Proc. 19th Annu. IEEE Symp. Foundations of Comput. Sci.,
1978, pp. 8-21.
[13] R. H. Gueting and H. P. Kriegel, "Multidimensional B-tree: An efficient
dynamic file structure for exact match queries," in Proc. 10th GI Annu.
Conf., Informatik Fachberichte, Springer-Verlag, 1980, pp. 375-388.
[14] R. H. Gueting and H. P. Kriegel, "Dynamic k-dimensional multiway search
under time-varying access frequencies," in Proc. 5th GI Conf. Theoretical
Comput. Sci., Lecture Notes in Computer Science, vol. 104, 1981, pp. 135-145.
[15] R. L. Kashyap, S. K. Subas, and S. Bing Yao, "Analysis of the multiple-attribute-tree data-base organization," IEEE Trans. Software Eng.,
vol. SE-3, pp. 451-467, 1977.
[16] D. E. Knuth, The Art of Computer Programming, Vol. III: Sorting and
Searching. Reading, MA: Addison-Wesley, 1973.
[17] H. P. Kriegel and V. K. Vaishnavi, "Weighted multidimensional B-trees
used as nearly optimal dictionaries," in Proc. 10th Int. Symp. Math.
Foundations of Comput. Sci., Lecture Notes in Computer Science,
vol. 118, 1981, pp. 410-417.
[18] H. P. Kriegel, V. K. Vaishnavi, and D. Wood, "2-3 brother trees," BIT,
vol. 18, pp. 425-435, 1978.
[19] F. Luccio and L. Pagli, "On the height of height-balanced trees," IEEE
Trans. Comput., vol. C-25, pp. 87-90, 1976.
[20] V. Y. Lum, "Multi-attribute retrieval with combined indexes," Commun.
Ass. Comput. Mach., vol. 13, pp. 660-665, 1970.
[21] K. Mehlhorn, "Dynamic binary search," SIAM J. Comput., vol. 8,
pp. 175-198, 1979.
[22] K. Mehlhorn, "A partial analysis of height-balanced trees," SIAM J. Comput.,
vol. 11, pp. 748-760, 1982.
[23] J. Nievergelt, H. Hinterberger, and K. C. Sevcik, "The grid file: An
adaptable, symmetric multi-key file structure," Eidgenoessische Technische Hochschule Zuerich, Zuerich, Swiss Institute fuer Informatik,
Rep. 46, 1981.
[24] J. Nievergelt and E. M. Reingold, "Binary search trees of bounded
balance," SIAM J. Comput., vol. 2, pp. 33-43, 1973.
[25] J. A. Orenstein, "Multidimensional TRIEs used for associative searching," Inform. Proc. Lett., vol. 13, pp. 150-157, 1982.
[26] T. Ottmann, H.-W. Six, and D. Wood, "Right brother trees,"
Commun. Ass. Comput. Mach., vol. 21, pp. 769-776, 1978.
[27] T. Ottmann and D. Wood, "1-2 brother trees or AVL trees revisited,"
Comput. J., vol. 23, pp. 248-255.
[28] P. Scheuermann and M. Ouksel, "Multidimensional B-trees for
associative searching in database systems," Inform. Syst., vol. 7,
pp. 123-137, 1982.
[29] D. D. Sleator and R. E. Tarjan, "Self-adjusting binary search," in Proc.
15th Annu. Symp. on Theory of Computing, 1983, pp. 235-245.
[30] V. K. Vaishnavi, "Computing point enclosures," IEEE Trans. Comput.,
vol. C-31, pp. 22-29, 1982.
[31] V. K. Vaishnavi, "On the worst-case efficient implementation of weighted
dynamic dictionaries," in Proc. 21st Annu. Allerton Conf. on Communication,
Control, and Computing, 1983, pp. 647-655.
[32] V. K. Vaishnavi, H. P. Kriegel, and D. Wood, "Height balanced 2-3
trees," Computing, vol. 21, pp. 195-211, 1979.
[33] V. K. Vaishnavi and D. Wood, "Data structures for the rectangle
containment and enclosure problems," Computer Graphics and Image
Processing, vol. 13, pp. 372-384, 1980.
[34] V. K. Vaishnavi and D. Wood, "Rectilinear line segment intersection,
layered segment trees and dynamization," J. Algorithms, vol. 2,
pp. 160-176, 1982.
Vijay K. Vaishnavi was born in Srinagar, Kashmir,
India. He received the B.E. degree (with distinction) in electrical engineering from the Jammu
and Kashmir University in 1968, and the M.Tech.
and the Ph.D. degrees in electrical engineering from
the Indian Institute of Technology, Kanpur, India.
He is an Associate Professor of Information
Systems at Georgia State University, Atlanta.
Previously, he has held faculty positions at
Ohio University, Athens; Concordia University,
Montreal; and the Indian Institute of Technology,
Kanpur. He has also been a Postdoctoral Fellow at McMaster University,
Hamilton, Ont., Canada. His current research interests include data structures,
algorithms, and artificial intelligence.
Dr. Vaishnavi is a member of the Association for Computing Machinery,
SIGACT, SIGMOD, the European Association for Theoretical Computer
Science, and the IEEE Computer Society.
