
Computers & Geosciences Vol. 18, No. 9, pp. 1213-1253, 1992
Printed in Great Britain. All rights reserved
0098-3004/92 $5.00 + 0.00
Copyright 1992 Pergamon Press Ltd

COMPRESSION OF DIGITIZED MAP IMAGES


DAVID A. SOUTHARD
The MITRE Corporation E073, 202 Burlington Road, Bedford, MA 01730-1420, U.S.A.

(Received 10 February 1992; accepted 30 April 1992)


Abstract--An efficient way to store and retrieve digitized chart images using a lossy vector quantization
(VQ) compression technique is presented. A VQ codebook is located using the fast pairwise nearest
neighbor (PNN) clustering algorithm. A k-d tree data structure is used for efficient image compression.
Some unique features of this approach are that compressed files use a predictable amount of storage, and
can be decompressed quickly for display on equipment ranging from portable personal computers to
advanced graphics workstations. The method also applies to panchromatic and multispectral satellite
imagery. In this application, digitized chart images with compression ratios of 24:1 exhibited good quality.
In certain applications, even higher compression ratios are feasible.

Key Words: Digital mapping, Image compression, Vector quantization, Clustering algorithms, k-d tree.

INTRODUCTION
This investigation into image compression techniques
was motivated by technical issues surrounding the
design of an automated mission planning program
termed the Air Force Mission Support System (AF
MSS) (Southard, 1992). The display of digitized
navigational charts is central to the function of this
system.
The Defense Mapping Agency (DMA) supplies
digitized charts on compact disk read-only memory
(CD ROM) media. These georeferenced digital images are termed Equal Arc-second Raster Chart
(ARC) Digitized Raster Graphics (ADRG) (Defense
Mapping Agency, 1989). It was estimated that AF
MSS would need about 200 gigabytes (GB) of
raw ADRG data on-line. Image compression is
one strategy that will be used to reduce this costly
storage requirement. The techniques are described
and evaluated, and a set of algorithms that meets the
goals set by AF MSS is presented. ANSI C
source code for these algorithms is provided.

Applications
The methods selected have broad applicability to
many automated mapping, charting, and geodesy
systems. Digitized charts can replace paper charts in
many display applications. Digital mapping, remote
sensing and analysis, computer-aided design for civil
engineering, architecture, and environmental analysis, global positioning and navigation, and geographical information systems (GIS), are some examples.
Digitized chart images are useful in military mission
planning and command, control, and communications (C3) systems.

Image compression techniques


Compression techniques can be classified into two
broad categories: lossless and lossy. Lossless algorithms
include such well-known techniques as run-length coding, quadtree coding, Huffman coding, and
the Lempel-Ziv algorithm, which is used by the
UNIX compress utility (Welch, 1984). A lossless
algorithm reconstructs compressed data exactly to its
original state. The degree of compression depends on
the content of the data. On typical chart images,
lossless techniques can achieve up to about a 2:1
compression ratio (the ratio of the original image size
to the compressed image size). Table 1 shows the
compression of selected chart and satellite images
using run-length, Huffman, and Lempel-Ziv coding.
In some situations, however, these coding schemes
increased the size of the images.
Lossy algorithms admit higher compression ratios.
Lossy algorithms include differential pulse code
modulation (DPCM), vector quantization (VQ), and
the discrete cosine transform (DCT), which is the
central technique in the popular JPEG and MPEG
standards (Rabbani and Jones, 1991). Lossy algorithms will degrade the original image somewhat,
depending on the image content, the amount of
compression, and the properties of the algorithm. In
many situations, however, some degradation is permissible. The loss may be visually imperceptible.
The classification and compression techniques that
are presented are appropriate for display and analysis
of many types of scientific and engineering images.
The lossy character of these algorithms, however,
would make them inappropriate for archival data
storage, or for situations that require complete accuracy. In these instances lossless algorithms must be
employed. For example, Digital Terrain Elevation
Data (DTED) supplied by the DMA for hilly terrain
in central Germany can be compressed to an average
compression ratio of 1.7 using Huffman coding, and
3.4 using Lempel-Ziv coding. It would be inappropriate to use a lossy technique, because this would
compromise the character of the data. A relief-shaded image of these data, however, can be compressed successfully using the methods described here.

Table 1. Compression ratios for lossless compression of selected images

                            Lossless compression method
Image    Size            Run length    Huffman    Lempel-Ziv
ONC      512 x 512       0.97          1.06       0.99
TPC*     1000 x 1000     0.98          1.10       1.00
TPC      500 x 500       0.97          1.25       1.27
JOG      1024 x 1024     1.02          1.12       1.12
TLM      512 x 512       0.98          1.11       1.06
SPOT†    2391 x 2249     1.01          1.71       1.93
SPOT‡    2391 x 2249     0.98          1.04       0.99

Unless indicated, images have been spatially filtered and subsampled.
* Raw ADRG image.
† Raw panchromatic image, with little contrast.
‡ Processed with contrast stretch, and spatially filtered to remove imaging artifacts.

Vector quantization

Vector quantization is a general method that is
used extensively for applications in speech coding,
image coding, segmentation, classification, and recognition problems. There are many forms of VQ, as
well as hybrids of VQ with other techniques (Abut,
1990; Rabbani and Jones, 1991). Several forms of
VQ, along with other compression schemes, have
been proposed for compressing digital charts (Barad
and Martinez, 1989; Jaisimha and others, 1989;
Lohrenz and others, 1990; Potlapalli and others,
1989; Potlapalli, Barad, and Martinez, 1989). In this
work, several competing VQ algorithms were tested
and evaluated, with an emphasis on image quality
and processing speed, and the best techniques for this
application were applied.
In VQ, an image is divided into small units, termed
vectors. In this context, the term vector simply refers
to an n-tuple; there is no geometric interpretation.
Each vector comprises a group of adjacent image
pixels. Typically, a vector is a small rectangular
grouping of from 2 to 25 pixels. Suppose, for a
moment, that we have a predefined list of vectors,
named the codebook. To compress, we find the codebook entry that most closely matches
each vector from the input image. In other words,
each image vector is classified into a category represented by a codebook entry. The index of each
entry is saved in the compressed image. To decompress, each index is used to look up a vector in the
codebook and reassemble the image. Vector quantization has several salient characteristics that distinguish it from other compression algorithms,
and that make VQ well suited to automated
mapping systems:
Asymmetric. Nearly all the computation associated with compression and decompression falls on
the side of compression. Image reconstruction is a
fast table look-up operation. Most other compression algorithms tend to require about as much
compute time for decompression as for compression (cf. Rabbani and Jones, 1991).
Efficient. The compression stage comprises two
steps: codebook development, and the compression
itself. Once there is a codebook for a class of
images, such as for a particular type of chart, the
codebook can be reused to compress any image in
that class. Although codebook development is intensive computationally, compression is simple and
quick.
Predictable. The compression ratio does not
change from image to image. The size of the vectors
and the length of the codebook determine the
amount of compression. This property simplifies disk space management on a heavily loaded
system.
Flexible. The decompression can be tailored to the
local resources, ranging from a PC with a SuperVGA card, to a super graphics workstation with a
high-resolution display. The same compressed file
can be decompressed for a gray-scale display, an
8-bit (256 colors) look-up table display, or a 24-bit
RGB (16.8 million colors) display. This adaptability requires only that we modify the VQ look-up
table as appropriate. There is no need to compress
the image differently for each situation.
Random-access. Many algorithms require decompression to continue from beginning to end in a
predetermined way. The method described can
begin at any point in the compressed image, and go
in any direction.
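
The compression and decompression steps described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the program listed in the Appendices: vectors are stored as contiguous bytes, the distortion measure is squared error, the encoder uses a plain linear (full) search, and the names vq_encode and vq_decode are hypothetical.

```c
#include <stddef.h>
#include <limits.h>

/* Squared-error distortion between two vectors of length n. */
static unsigned long vq_sq_error(const unsigned char *a,
                                 const unsigned char *b, size_t n)
{
    unsigned long err = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        long d = (long)a[i] - (long)b[i];
        err += (unsigned long)(d * d);
    }
    return err;
}

/* Encoder (full search): index of the codebook entry closest to vec.
 * codebook holds m entries of n bytes each, stored contiguously. */
size_t vq_encode(const unsigned char *codebook, size_t m, size_t n,
                 const unsigned char *vec)
{
    size_t best = 0, j;
    unsigned long best_err = ULONG_MAX;
    for (j = 0; j < m; j++) {
        unsigned long e = vq_sq_error(codebook + j * n, vec, n);
        if (e < best_err) {
            best_err = e;
            best = j;
        }
    }
    return best;
}

/* Decoder: a table look-up that copies the selected code vector. */
void vq_decode(const unsigned char *codebook, size_t n, size_t index,
               unsigned char *out)
{
    size_t i;
    for (i = 0; i < n; i++)
        out[i] = codebook[index * n + i];
}
```

The asymmetry noted above is visible in the sketch: vq_encode examines all m entries for every vector, whereas vq_decode copies n bytes. Because each index is decoded independently, any block of the image can be decoded without touching its neighbors.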

Filtering

Digitized chart images may contain two interesting, but undesirable, artifacts. Charts usually are
printed using only a few colors of ink. Intermediate
shades are reproduced with the half-tone printing
process, in which arrays of tiny dots of ink replicate
intermediate tones of color. When digitized with an
optical scanning process, the half-toned areas appear
stippled. The stipple can best be described as a
salt-and-pepper texture in the background areas. The
cartographer originally intended these areas to look
like a constant shade of color.

Figure 1. Overview of vector quantization image compression.

The second artifact is an interaction of the half-tone printing screens with the implicit array of optical
samples, which produces a moiré pattern in these
areas. This effect is understood as a beating of the
spatial frequency of the pixel array against those of
the half-tone screens. The moiré pattern can be
distinct.
If either of these artifacts is present, the image
should be filtered before compression. By filtering
before compression, stipple and moiré can be eliminated, and the legibility of the decompressed image
increased. Filtering may be accomplished by replacing each pixel by a weighted average of pixels in its
immediate neighborhood, as discussed by Pratt
(1991) or by Gonzalez and Wintz (1987). Eight
different filters were tested, and the best results were
obtained with a separable, piecewise cubic filter of
Mitchell and Netravali (1988). A separable filter
response function in two dimensions is F(x,y)=
f(x)f(y), where x and y represent the horizontal and
vertical distances from a sample point, measured in
pixel intervals from the sample point. To filter, the
weighted sum of all pixels that fall within the filter's
radius is calculated, where the weights are determined by the filter response function, then are
normalized by dividing by the sum of the weights.
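
The weighted-sum filtering just described can be sketched in C for a single image row. This is a minimal illustration rather than the program used in the study: the function names mn_filter and filter_row are hypothetical, the cubic is the published two-parameter family of Mitchell and Netravali (1988) with parameters B and C, and the sample position pos is assumed to be non-negative.

```c
/* Mitchell-Netravali cubic with parameters B and C.
 * The response is nonzero only for |x| < 2. */
double mn_filter(double x, double B, double C)
{
    double ax = (x < 0.0) ? -x : x;
    if (ax < 1.0)
        return ((12.0 - 9.0 * B - 6.0 * C) * ax * ax * ax
              + (-18.0 + 12.0 * B + 6.0 * C) * ax * ax
              + (6.0 - 2.0 * B)) / 6.0;
    if (ax < 2.0)
        return ((-B - 6.0 * C) * ax * ax * ax
              + (6.0 * B + 30.0 * C) * ax * ax
              + (-12.0 * B - 48.0 * C) * ax
              + (8.0 * B + 24.0 * C)) / 6.0;
    return 0.0;
}

/* Filtered value of a row of n samples at position pos (pos >= 0, in
 * pixel intervals; it need not be an integer). The result is the
 * weighted sum of the samples within the filter radius, divided by
 * the sum of the weights. */
double filter_row(const double *row, int n, double pos, double B, double C)
{
    double sum = 0.0, wsum = 0.0;
    int i, first = (int)pos - 2, last = (int)pos + 3;
    for (i = first; i <= last; i++) {
        double w;
        if (i < 0 || i >= n)
            continue;               /* skip samples outside the row */
        w = mn_filter(pos - (double)i, B, C);
        sum += w * row[i];
        wsum += w;
    }
    return (wsum != 0.0) ? sum / wsum : 0.0;
}
```

Because the filter is separable, a two-dimensional version could apply filter_row along every row and then along every column of the intermediate result. Calling it at non-integral positions gives subsampling or interpolation by an arbitrary factor.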
Separable filters can be used both for subsampling
and interpolating images. Because they are calculated
using analytical functions, either integral or nonintegral sampling factors can be used. The equation
for Mitchell and Netravali's family of cubic filters
is:

$$
f(x) = \frac{1}{6}
\begin{cases}
(12 - 9B - 6C)|x|^3 + (-18 + 12B + 6C)|x|^2 + (6 - 2B) & \text{if } |x| < 1 \\
(-B - 6C)|x|^3 + (6B + 30C)|x|^2 + (-12B - 48C)|x| + (8B + 24C) & \text{if } 1 \le |x| < 2 \\
0 & \text{otherwise}
\end{cases}
\tag{1}
$$

where B and C are parameters that define a particular
cubic spline. Mitchell and Netravali recommended
the values (B, C) = (1/3, 1/3).

Overview of VQ compression
Figure 1 illustrates the image compression procedure. The image is converted from the distribution
media to the local file system. After filtering, if
needed, a representative subset of vectors is extracted,
and assembled into a training image. The codebook
development program, vqinit, analyzes the training
image to determine a codebook. Then, the program
vqcomp uses the codebook to compress one or more
images from the class represented by the training
image. As an optional separate step, vqinit generates
a color table, and vqcmap uses the color table to
color-map the codebook. The decompression program, vqexp (VQ-expand), takes the codebook and
the compressed image, and assembles an image file. A
display program, specific to the local environment,
reads the image, and for color-mapped images, also
reads the color table, and writes the image to the
computer display.

CODEBOOK DEVELOPMENT

The key to VQ is the method by which a codebook
is obtained. The codebook determines the quality of
the reconstructed image. The codebook is located by
a process termed clustering. One or more training
images are analyzed to locate a codebook that will
minimize a distortion measure for the compressed
image. The training images represent the class of
images to be compressed with the codebook. The
training image could be the image to be compressed,
or it could be a collection of vectors randomly
sampled from images to be compressed. (The use of
the term training in this context should not be
confused with training data used in supervised multispectral classification.) For most applications the
distortion measure is squared error, although other
distortion measures are possible. There are two algorithms associated with VQ:

Linde-Buzo-Gray (LBG) algorithm. The LBG algorithm also is known as the generalized-Lloyd, or
K-means algorithm. It has been the "standard" VQ
algorithm since Linde, Buzo, and Gray (1980)
described it. This algorithm iteratively improves an
initial codebook until it reaches a local minimum of
the distortion measure. The particular result depends on the initial values selected for the codebook.
Pairwise nearest neighbor (PNN) algorithm. Equitz
(1989) proposed the PNN clustering algorithm.
The fast-PNN variant uses Bentley's (1975) k-d
tree to partition a set of training vectors. At each
step, the algorithm merges vector pairs that will
introduce the least error. Each leaf of the k-d tree
submits a candidate vector pair. As the clustering
proceeds, the algorithm merges leaves, and rebalances the tree. Eventually, the tree will contain only
the desired number of clusters.
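
The merge step at the heart of PNN clustering can be sketched in C. The sketch assumes the usual squared-error criterion, under which merging two clusters with n_a and n_b members and centroids c_a and c_b raises the total distortion by n_a n_b / (n_a + n_b) times the squared distance between the centroids. The structure pnn_cluster, the constant PNN_DIM, and the function names are hypothetical, and the k-d tree bookkeeping of the fast variant is omitted.

```c
#include <stddef.h>

#define PNN_DIM 12   /* illustrative vector length, e.g. a 2 x 2 RGB block */

typedef struct {
    double centroid[PNN_DIM];
    unsigned long count;     /* number of training vectors in the cluster */
} pnn_cluster;

/* Increase in total squared error if clusters a and b were merged:
 * n_a * n_b / (n_a + n_b) * ||c_a - c_b||^2 */
double pnn_merge_cost(const pnn_cluster *a, const pnn_cluster *b)
{
    double d2 = 0.0;
    double na = (double)a->count, nb = (double)b->count;
    size_t i;
    for (i = 0; i < PNN_DIM; i++) {
        double d = a->centroid[i] - b->centroid[i];
        d2 += d * d;
    }
    return na * nb / (na + nb) * d2;
}

/* Merge cluster b into cluster a. The new centroid is the
 * count-weighted mean of the two centroids. */
void pnn_merge(pnn_cluster *a, const pnn_cluster *b)
{
    double na = (double)a->count, nb = (double)b->count;
    size_t i;
    for (i = 0; i < PNN_DIM; i++)
        a->centroid[i] = (na * a->centroid[i] + nb * b->centroid[i])
                         / (na + nb);
    a->count += b->count;
}
```

A clustering loop built on these two routines would repeatedly evaluate pnn_merge_cost for candidate pairs, merge the cheapest pair, and stop when the number of clusters equals the desired codebook size.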
Both algorithms were tested, and it was determined
that the PNN clustering algorithm was not only faster
(an average of 400% faster on our test images), but
also led to better quality color images. LBG tends not
to preserve the relative ratios of the color components, so the colors may appear different from those
intended. This results in serious legibility problems
for charts with fine print. The PNN algorithm,
because it partitions the data first, does not suffer
from this problem. Appendix 1 contains a detailed
description and program listing for the codebook
development program.

Color VQ
The basic VQ technique must be extended for color
images. There are two alternatives (Murakami, Asai,
and Itoh, 1986):

Separate color components. The most common approach is to separate the 24-bit RGB image into
three images: one image for red, one for green, and
one for blue; then to compress each component
image separately. One variation is to transform the
RGB image into YIQ color space, or another color
space that decorrelates the black-and-white (luminance) and color (chrominance) components of the
image. Human vision resolves luminance (Y) better
than chrominance. Then a higher compression
ratio can be used on the chrominance (I and Q)
components.

Combined color components. The color components of each pixel can be included in the vector.

The separate color component arrangement is
complicated. Three codebooks are needed, one for
each color component. On decompression, this
approach requires additional work to reassemble
the image for display. The decompressed image
would be in RGB form, so an additional color-mapping stage would be required before the image could
be used on display systems that use color look-up
tables.
The combined component method is better suited
to the purpose. The color-mapping can be folded into
the codebook, eliminating any extra steps during
decompression.
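
One way to assemble a combined color component vector is to copy all the color components of every pixel in a block into a single array. The sketch below assumes an interleaved, row-major image layout and a particular pixel ordering within the vector; both are illustrative choices, and the function name block_to_vector is hypothetical.

```c
#include <stddef.h>

/* Copy the w x h block whose top-left pixel is (x0, y0) from an
 * interleaved image (c components per pixel, W pixels per row) into
 * vec, which must have room for c * w * h bytes. */
void block_to_vector(const unsigned char *image, size_t W, size_t c,
                     size_t x0, size_t y0, size_t w, size_t h,
                     unsigned char *vec)
{
    size_t x, y, k, n = 0;
    for (y = 0; y < h; y++)
        for (x = 0; x < w; x++)
            for (k = 0; k < c; k++)
                vec[n++] = image[((y0 + y) * W + (x0 + x)) * c + k];
}
```

For a 2 x 2 block of RGB pixels this produces a 12-element vector, so a single codebook entry carries the red, green, and blue values of all four pixels together.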

COMPRESSION

Suppose a pixel block has dimensions w x h, where
w is the block width and h is the block height. Each
pixel has c color components, so the vector length is
cwh. Suppose we compress using a codebook that has
M code vectors. On decompression, we again will
have an RGB color image. There will, however, be a
maximum of only Mwh unique colors, because each
code can contribute at most wh colors. For example,
if we have c = 3 (RGB), block dimensions of w = 2
and h = 2, and M = 256 codes, there will be a vector
length of 12, and at most 1024 unique colors will be
in the decompressed image. The compression ratio
will be

$$\frac{c\,b\,w\,h}{\lceil \log_2 M \rceil} \tag{2}$$

where b is the number of bits used to store each color
component (i.e. b = 8 for 24-bit RGB color pixels).
The numerator represents the number of bits in the
image vector; the denominator represents the number
of bits needed to represent the code vector index. In
practice, it may not be desirable to store the compressed image in a bit-packed format. If the index into the codebook is constrained
to an integral number of bytes, each b bits wide,
the effective storage compression ratio will be less, namely

$$\frac{c\,w\,h}{\lceil \log_2 M / b \rceil} \tag{3}$$

Now the numerator is the number of bytes in a
vector, and the denominator is the number of bytes
needed to store the code index. For example, if
c = 3, b = 8, h = w = 4, and M = 4096, the compression ratio
according to Equation (2) is 32:1. It is inconvenient
to pack the 12-bit code index values in the compressed image, so if 2 bytes per code index are used,
Equation (3) tells us the effective storage compression
ratio is 24:1.
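
The two ratios can be checked with a few lines of C. The sketch below assumes that the denominator of Equation (3) is the index length rounded up to whole bytes of b bits, as the text describes, and that M is at least 2. The function names are hypothetical.

```c
/* Smallest k with 2^k >= m, that is, ceil(log2 m), for m >= 1. */
unsigned ceil_log2(unsigned long m)
{
    unsigned k = 0;
    unsigned long p = 1;
    while (p < m) {
        p <<= 1;
        k++;
    }
    return k;
}

/* Equation (2): bit-packed ratio, c*b*w*h / ceil(log2 M). */
double ratio_bit_packed(unsigned c, unsigned b, unsigned w, unsigned h,
                        unsigned long M)
{
    return (double)(c * b * w * h) / (double)ceil_log2(M);
}

/* Equation (3): ratio when each index occupies whole bytes of b bits. */
double ratio_byte_aligned(unsigned c, unsigned b, unsigned w, unsigned h,
                          unsigned long M)
{
    unsigned index_bytes = (ceil_log2(M) + b - 1) / b;
    return (double)(c * w * h) / (double)index_bytes;
}
```

With c = 3, b = 8, w = h = 4, and M = 4096, ratio_bit_packed gives 32 and ratio_byte_aligned gives 24, matching the example in the text.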


Until now, it has been assumed that the codebook
is reused repeatedly for many images. Consequently,
the size of the codebook itself has been ignored in the
compression ratio. If a new codebook is created for
each image, the size of the codebook may be significant compared to the size of the image. In this
situation the contribution of the codebook must be
accounted for also. Now the compression ratio is

$$\frac{WHc}{\dfrac{WH}{wh}\left\lceil \dfrac{\log_2 M}{b} \right\rceil + Mcwh} \tag{4}$$

where W and H are the width and height of the image
in pixels, respectively. The numerator is the total
number of bytes in the uncompressed image; the
denominator is the number of bytes in the compressed image, plus the number of bytes in the
codebook. When the size of the codebook is small
compared to the compressed image, Equation (4)
reduces to Equation (3).
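
As a worked illustration of the single-image case, assume one 512 x 512 RGB image (W = H = 512, c = 3, b = 8) compressed with 4 x 4 blocks and M = 4096 codes. The image size is an assumption, chosen to match the test images of Table 2. The ratio of original bytes to index bytes plus full-color codebook bytes is then

$$
\frac{512 \cdot 512 \cdot 3}{\dfrac{512 \cdot 512}{4 \cdot 4}\left\lceil \dfrac{12}{8} \right\rceil + 4096 \cdot 3 \cdot 4 \cdot 4}
= \frac{786432}{32768 + 196608}
= \frac{786432}{229376} \approx 3.4
$$

Under these assumptions the codebook (196608 bytes) is six times larger than the index data (32768 bytes), so the 24:1 storage ratio falls to roughly 3.4:1 when the codebook serves only one image.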

CODEBOOK STRUCTURE

The structure of the codebook has an impact on the
speed of compression:
Full search. In the full search scheme, one scans the
entire codebook linearly, and selects the code that
minimizes the distortion measure.
Classified VQ. For classified VQ (Ramamurthi and
Gersho, 1986), the codebook is divided into several
smaller, specialized codebooks. Each codebook is
specialized for a perceptual class of feature: edges,
gradients, solids, and mixed. Human vision discriminates these types of features easily, but the
squared-error metric does not. A well-designed
classification scheme can improve the quality of the
decompressed image, and expedite both clustering
and compression, by confining searches to a small
subcodebook.
Tree structured. A tree structured codebook is
hierarchical. This structure arises naturally with the
LBG algorithm when a technique named "splitting" is used to develop the codebook (Abut, 1990;
Rabbani and Jones, 1991). Begin by matching
against a small number (say, two or four) of codes.
Each code has another set of codes associated with
it. Once a match is located, it is necessary to select
from the children of the selected code. Matches
proceed hierarchically until the lowest level of the
codebook is reached. A tree structured codebook
can speed the matching process by successively
narrowing the search as we traverse the tree structure.
K-d tree. Bentley's k-d tree structure can be
used to do a nearest neighbor search (Friedman,
Bentley, and Finkel, 1977). This gives us the same
quality as a full search, but in considerably less
time.

The full search, k-d tree, and classified approaches
were tested. The computing time of the full search
algorithm did not differ with chart type. With the
other algorithms, the time can differ depending on the
distribution of codes in an image. For 2 x 2 color
pixels, and a codebook of 256 codes, the k-d tree
algorithm was at least twice as fast as full search. The
k-d tree is even more effective for longer vectors and
larger codebooks. The classified VQ algorithm was
the quickest, but the quality was not as good as full
search or k-d tree. However, only a simplistic classification
scheme was tested; the quality of the classified VQ
approach could be improved with a better vector
classification scheme. The LBG algorithm was eliminated because of its adverse effects on combined color
component vectors, and so the tree structured approach was not tested.
Appendix 2 describes the image compression program using the k-d tree approach, and provides
source code listings.

DECOMPRESSION

It is necessary to display the decompressed charts
on systems ranging from PCs with a 256-color Super
VGA card, to super graphics workstations with 24-bit
RGB displays. This can be obtained by color-mapping the decompressed image. However, because
all the colors in the decompressed image must originate from the VQ codebook, the codebook can be
mapped before decompression. This achieves the goal
of a VQ algorithm that decompresses directly to an
8-bit color image. The next question is
how to select a color table for the color-mapped
image.

Color tables
Color mapping reduces the number of colors used
in an image. RGB images are stored as triples of 8
bits each, that is, 24-bit RGB. Combinatorially, this
represents about 16.8 million colors. At most only a
few hundred colors are needed to display a chart
image. There are two basic approaches: a uniform
lattice color table, or a customized color table.

Uniform lattice color table


If a set of colors is selected that forms a uniform
lattice in RGB color space, a color table is obtained
that can be used to map all color images. The
advantage of a uniform lattice color table is that only
one color table is needed, no matter how many
images are mapped. This is helpful especially when
seaming adjacent charts together, and when displaying several types of charts simultaneously. Some
low-end color display systems only have hardware for
one 8-bit color lookup table (LUT). High-end workstations generally will support either multiple 8-bit
lookup tables, or a larger (e.g. 12-bit) lookup table
that can be segmented logically into several smaller
tables. A uniform lattice color table, then, will apply
to a wide range of workstation hardware.

Table 2. Average times for VQ compression programs on 512 x 512 test images

Compression ratio          Program    Ave. time (sec)    Options/notes
3:1                        vqinit     24.2               -h 1 -w 1 -p 3 -l 256
                           vqcomp     73.3
                           vqexp       0.7
12:1                       vqinit     49.1               -h 2 -w 2 -p 3 -l 256
                           vqcomp     26.2
                           vqexp       0.4               full color
                           vqcmap      0.3
                           vqexp       0.3               color mapped
32:1 (24:1 effective       vqinit     23.8               -h 4 -w 4 -p 3 -l 4096
storage compression)       vqcomp     19.7
                           vqexp       6.3               full color
                           vqcmap     13.6
                           vqexp       2.4               color mapped

Unfortunately, it was determined that uniform
lattice color tables compromise the quality of the
displayed charts. By attempting to cover the whole
gamut of color space, the color table can devote only
a few entries to any set of closely related colors.
Although only a few colors of ink are used in the
printed charts, many transitional shades of these
colors are needed to reproduce a clear image on the
computer display. Many VQ codes contain only
slightly different colors. Color mapping using a uniform lattice color table can map many code vectors
to the same set of values. This results in duplicate
entries in the codebook, and in many color table
entries going unused. Duplicate entries are useless
codes. They ultimately reduce the fidelity of VQ
decompression.
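
The duplicate-code effect can be illustrated with a small uniform lattice. The sketch below assumes six levels per color component, giving 216 entries, which fits an 8-bit look-up table. The lattice size and the names lattice_build and lattice_index are illustrative assumptions, not values taken from the study.

```c
#define LATTICE_LEVELS 6   /* 6 x 6 x 6 = 216 entries (assumed size) */

/* Fill table[] with LATTICE_LEVELS^3 evenly spaced RGB entries. */
void lattice_build(unsigned char table[][3])
{
    int r, g, b, n = 0;
    for (r = 0; r < LATTICE_LEVELS; r++)
        for (g = 0; g < LATTICE_LEVELS; g++)
            for (b = 0; b < LATTICE_LEVELS; b++) {
                table[n][0] = (unsigned char)(r * 255 / (LATTICE_LEVELS - 1));
                table[n][1] = (unsigned char)(g * 255 / (LATTICE_LEVELS - 1));
                table[n][2] = (unsigned char)(b * 255 / (LATTICE_LEVELS - 1));
                n++;
            }
}

/* Index of the lattice entry nearest to (r, g, b): each component is
 * rounded to the nearest of the LATTICE_LEVELS steps. */
int lattice_index(unsigned char r, unsigned char g, unsigned char b)
{
    int ri = (r * (LATTICE_LEVELS - 1) + 127) / 255;
    int gi = (g * (LATTICE_LEVELS - 1) + 127) / 255;
    int bi = (b * (LATTICE_LEVELS - 1) + 127) / 255;
    return (ri * LATTICE_LEVELS + gi) * LATTICE_LEVELS + bi;
}
```

With this lattice, the distinct colors (200, 200, 200) and (210, 205, 198) both map to entry 172. Two code vectors that differ only in such transitional shades would therefore collapse into identical color-mapped codes.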

Customized color tables

The useless code problem can be eliminated by


using a customized color table. A customized color
table attempts to select the set of colors for each
compressed image, or class of images, that will result
in the least perceived change to the image. All color
table entries will be used in the decompressed image.
Six algorithms for developing custom color tables
were tested: popularity (Heckbert, 1982), median cut
(Heckbert, 1982), greedy seeding (Southard, 1992),
octree (Gervautz and Purgathofer, 1990), LBG clustering (Linde, Buzo, and Gray 1980), and PNN
clustering (Equitz, 1989).
The PNN clustering algorithm seems to have the
best balance of features. The perceived colors remain
faithful to the original chart because the algorithm
partitions the image vectors before clustering.
Although it is not as fast as the octree algorithm,
results are better. It is convenient that the same
algorithm can be used both for codebook development and color-table development. Using a PNN
customized color table, a reasonable chart rendition was obtained with as few as 60 color table
entries.

Color mapping

A color from the image was matched to an entry


in the color table by selecting the entry with the least
squared error. For simplicity, the programs presented
here perform this calculation in RGB color space. It
has been noted, however, that better quality will be
obtained if all color coordinates are transformed first
to a perceptually uniform color space, such as the
CIE 1976 L*u*v* or L*a*b* color spaces (Wyszecki
and Stiles, 1982). These color spaces were designed
for measuring color differences. This method is useful
both for code vector classification during image
compression, and for color mapping before decompression.
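
Matching by least squared error in RGB space, and folding the color map into the codebook, can be sketched in C as follows. This is an illustration only and not the vqcmap program of Appendix 4: the color table and codebook are assumed to be flat arrays of RGB triples, the table is assumed to hold at most 256 entries, and the function names are hypothetical.

```c
#include <stddef.h>
#include <limits.h>

/* Index of the color table entry with the least squared error to the
 * RGB triple rgb. table holds n_colors triples (3 bytes each). */
size_t nearest_color(const unsigned char *table, size_t n_colors,
                     const unsigned char *rgb)
{
    size_t best = 0, i;
    unsigned long best_err = ULONG_MAX;
    for (i = 0; i < n_colors; i++) {
        long dr = (long)rgb[0] - (long)table[3 * i];
        long dg = (long)rgb[1] - (long)table[3 * i + 1];
        long db = (long)rgb[2] - (long)table[3 * i + 2];
        unsigned long e = (unsigned long)(dr * dr + dg * dg + db * db);
        if (e < best_err) {
            best_err = e;
            best = i;
        }
    }
    return best;
}

/* Color-map a codebook: every RGB pixel of every code vector is
 * replaced by a one-byte color table index. rgb_codebook holds
 * n_pixels triples in total (M * w * h); mapped receives n_pixels
 * bytes. */
void map_codebook(const unsigned char *rgb_codebook, size_t n_pixels,
                  const unsigned char *table, size_t n_colors,
                  unsigned char *mapped)
{
    size_t p;
    for (p = 0; p < n_pixels; p++)
        mapped[p] = (unsigned char)nearest_color(table, n_colors,
                                                 rgb_codebook + 3 * p);
}
```

Because the mapping is applied once to the codebook rather than to every decompressed pixel, decompression with the mapped codebook remains a plain table look-up. Replacing the RGB distance with a distance in a perceptually uniform space would change only nearest_color.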
Color mapping reduces the size of the codebook.
Besides the savings in space, an image decompressed
with the color-mapped codebook will work with a
color look-up table display with no additional processing. If a customized color table is used, the results
can be indistinguishable from the full RGB color
image. The compression ratio for a single image,
including the codebook, now becomes

$$\frac{WHc}{\dfrac{WH}{wh}\left\lceil \dfrac{\log_2 M}{b} \right\rceil + Mwh + Nc} \tag{5}$$

where N represents the number of colors in the color
look-up table.
Appendix 3 discusses the image decompression
algorithm, and Appendix 4 presents the color mapping program.
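
The single-image ratio with a color-mapped codebook can be evaluated with the short C sketch below. It assumes that the compressed-image term in the denominator is the same as in Equation (4), that the color-mapped codebook stores one byte per pixel, that the color table stores c bytes per entry, and that W and H are multiples of w and h. The function names are hypothetical.

```c
/* Bytes per code index: ceil(log2 M) bits, rounded up to whole bytes
 * of b bits each. Assumes M >= 2. */
unsigned long index_bytes(unsigned long M, unsigned long b)
{
    unsigned long bits = 0, p = 1;
    while (p < M) {
        p <<= 1;
        bits++;
    }
    return (bits + b - 1) / b;
}

/* Ratio for one W x H image with c components per pixel, w x h
 * blocks, M codes, b bits per component, and an N-entry color table. */
double ratio_color_mapped(unsigned long W, unsigned long H, unsigned long c,
                          unsigned long w, unsigned long h,
                          unsigned long M, unsigned long b, unsigned long N)
{
    double image    = (double)(W * H * c);         /* original bytes     */
    double indices  = (double)((W * H) / (w * h) * index_bytes(M, b));
    double codebook = (double)(M * w * h);         /* one byte per pixel */
    double table    = (double)(N * c);             /* color table bytes  */
    return image / (indices + codebook + table);
}
```

For a 512 x 512 RGB image with 4 x 4 blocks, M = 4096, and an assumed N = 256 color table entries, the function returns about 7.94, in line with the 7.9:1 figure quoted under RESULTS.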
RESULTS
Image quality is generally in inverse proportion to
compression ratio. Figure 2B shows a chart image
compressed using a 2 x 2 combined component vector with a codebook having 256 entries, for a compression ratio of 12:1. Figure 2C shows the same
image compressed with a 4 x 4 combined component
vector with a codebook size of 4096 codes, having a
compression ratio of 32:1 using Equation (2), and an
effective storage compression of 24:1 using Equation
(3). For our application the higher compression ratio
worked as well. It seems that a larger vector and
codebook allow the clustering algorithm more degrees of freedom for tuning the codebook.
There are trade-offs, however. Larger vector sizes
and codebooks require more time for compression,
and more memory space for decompression. The
codebook indices will no longer fit neatly into 1 byte.
For example, in practice a 12-bit code may be stored
in 2 bytes, see Equation (3). This design eliminates
computational overhead for bit-packing and unpacking, but decreases the effective storage compression
ratio to 24:1. Finally, if we use the codebook to
compress only one image, Equation (5) must be used
to determine the effective compression ratio, which
now is reduced to only 7.9:1.

Figure 2. VQ compression of digitized chart image: A--original chart; B--after 12:1 compression;
C--after 24:1 compression.

Table 2 shows average running times for each of
the programs described in the Appendices. We
measured these times on a Silicon Graphics 4D/340
workstation, using the UNIX time command.
VQ of satellite imagery

The VQ compression techniques used for digitized
chart images work well for monochrome satellite
images, too. A panchromatic SPOT image was compressed at two levels of compression: 2 x 2 vectors
with 256 codes for a ratio of 4:1, and 4 x 4 vectors
with 4096 codes for a ratio of 8:1 using Equation (3).
Figure 3 shows VQ applied to satellite imagery.
The programs in the Appendices also can be used
for compression and unsupervised classification of
multispectral imagery, such as provided by SPOT or
LANDSAT (Lillesand and Kiefer, 1987). For
classification, we consider the "color" as comprising
all spectral bands. The LANDSAT Thematic
Mapper, for example, has seven spectral bands. The
PNN clustering algorithm can be used to group
pixels, each of seven components, into a small number of categories, which will correspond to various
classes of land cover. Furthermore, spatial code
vectors could be considered as unsupervised "texture
classes," groups of which may represent interesting image features. By arranging pixels and
spectral bands in various ways, one could produce
different combinations of classification and compression.

Compressing similar images

For VQ compression to be effective, the training
image must be representative of the images to be
compressed. A codebook can be reused as long as the
images belong to the same class. For example, a
codebook developed for a chart image containing
only desert would be unsuitable for images representing cities or forests. This notion of classes is easy to
understand intuitively, but image classes can be
difficult to identify a priori. This is a topic for further
study.

Selecting compression parameters

Selecting the right amount of compression depends
on the class of images to be compressed, and the
intended use after decompression. This must be determined through experimentation. A pixel block size
greater than 8 x 8 is not recommended. A large block
size requires a larger codebook, and at that size,
blocking in the decompressed image becomes apparent. As a rule of thumb, the block height and width
should be about the same size as the smallest significant details in the image. Then select the largest
codebook consistent with time and space performance constraints.

Figure 3. VQ compression of SPOT satellite image: A--original; B--after 4:1 compression; C--after 8:1
compression.

Acknowledgment--This work was performed for the Electronic Systems Division of the U.S. Air Force Systems
Command, Hanscom Air Force Base, Massachusetts, under
contract No. F19628-89-C-0001.

REFERENCES

Abut, H., ed., 1990, Vector quantization: IEEE Press, New
York, p. 332-405.
Barad, H., and Martinez, A. B., 1989, Final report, digital
map research: Tech. Rept. 89-22, Signal & Image Processing Lab, Electrical Engineering Dept., School of
Engineering, Tulane Univ., Chapt. 3-5.
Bentley, J. L., 1975, Multidimensional binary search trees
used for associative searching: Comm. ACM, v. 18, no.
9, p. 509-517.
Defense Mapping Agency, 1989, Product specifications for
ARC Digitized Raster Graphics (ADRG): DMA
Aerospace Center, St. Louis, Missouri, unpaginated.
Equitz, W. H., 1989, A new vector quantization clustering
algorithm: IEEE Trans. Acoust., Speech, Signal Processing, v. 37, no. 10, p. 1568-1575.
Friedman, J. H., Bentley, J. L., and Finkel, R. A., 1977, An
algorithm for finding best matches in logarithmic expected time: ACM Trans. Math. Software, v. 3, no. 3,
p. 209-226.
Gervautz, M., and Purgathofer, W., 1990, A simple method
for color quantization, in Glassner, A. S., ed., Graphics
gems: Academic Press, San Diego, California, p. 287-293.
Gonzalez, R. C., and Wintz, P., 1987, Digital image processing (2nd ed.): Addison-Wesley Publ. Co., Reading,
Massachusetts, p. 163-173.
Heckbert, P., 1982, Color image quantization for frame
buffer display: Computer Graphics (Proc. SIGGRAPH),
v. 16, no. 3, p. 297-307.
Jaisimha, M. Y., Potlapalli, H., Barad, H., Martinez, A. B.,
Lohrenz, M. C., Ryan, J., and Pollard, J., 1989, Data
compression techniques for maps: Proc. IEEE 1989
Southeastcon, Charleston, South Carolina, p. 878-883.
Lillesand, T. M., and Kiefer, R. W., 1987, Remote sensing
and image interpretation (2nd ed.): John Wiley & Sons,
New York, p. 685-689.
Linde, Y., Buzo, A., and Gray, R. M., 1980, An algorithm
for vector quantizer design: IEEE Trans. Commun.,
v. COM-28, no. 1, p. 84-95.


Lohrenz, M. C., Wischow, P. B., Rosche III, H., Trenchard,
M. E., and Riedlinger, L. M., 1990, The compressed
aeronautical chart database, support of naval aircraft's
digital moving map systems: IEEE PLANS '90, Position
Location and Navigation Symposium, Las Vegas,
Nevada, p. 67-73.
Mitchell, D. P., and Netravali, A. N., 1988, Reconstruction
filters for computer graphics: Computer Graphics (Proc.
SIGGRAPH), v. 22, no. 4, p. 221-228.
Murakami, T., Asai, K., and Itoh, A., 1986, Vector quantization of color images: Proc. IEEE Conf. Acoust.,
Speech, Signal Processing, v. 1, p. 133-136.
Potlapalli, H., Barad, H., and Martinez, H., 1989, Digital
color map compression by classified vector quantization:
Proc. SPIE v. 1199, pt. 1, Visual Communications
and Image Processing IV, Philadelphia, Pennsylvania,
p. 550-559.
Potlapalli, H., Jaisimha, M. Y., Barad, H., Martinez, A. B.,
Lohrenz, M. C., Ryan, J., and Pollard, J., 1989, Classification techniques for digital map compression:
Proc. 21st Southeastern Symposium on System Theory,
Tallahassee, Florida, p. 268-272.
Pratt, W. K., 1991, Digital image processing: John Wiley &
Sons, New York, p. 286-289.
Rabbani, M., and Jones, P. W., 1991, Digital image compression techniques: SPIE Optical Engineering Press,
New York, p. 45-221.
Ramamurthi, B., and Gersho, A., 1986, Classified vector
quantization: IEEE Trans. Commun., v. COM-34,
no. 11, p. 1105-1115.
Southard, D. A., 1992, Compression of digitized map
images: ESD-TR-92-052, Electronic Systems Division,
Hanscom Air Force Base, Massachusetts, unpaginated.
Welch, T. A., 1984, A technique for high performance data
compression: IEEE Computer, v. 17, no. 6, p. 8-19.
Wyszecki, G., and Stiles, W. S., 1982, Color science, concepts and methods, quantitative data and formulae (2nd
ed.): John Wiley & Sons, New York, p. 164-169.

APPENDIX

Program for VQ Codebook Development


The programs presented here in the Appendices are coded in ANSI C. They are designed to work in a generic UNIX
environment. The code could be adapted easily to a DOS environment. The standard C library functions for file I/O (e.g.
fopen, fprintf), memory allocation (e.g. malloc), and sorting (e.g. qsort) are assumed available.

Description
The program vqinit calculates a VQ codebook from a training image. The command-line format is

vqinit <rows> <columns> [in.image] [out.code]

The arguments, <rows> and <columns>, are required, and specify the size of the input image array. The [in.image]
argument is the name of the input image file. The input image is an array of pixel values stored in binary format, in
row-major order. For color images, each pixel comprises three bytes of R, G, and B, stored sequentially in that order. If
this file name is omitted, standard input is the default. An optional argument, [-p <pixel depth>], specifies the pixel
depth. A 1-byte pixel depth is used for a monochrome image; a 3-byte pixel depth is used for an RGB color image (this
is the default). The [out.code] argument is the name of the codebook file to be created. This codebook is created in an ASCII
text format. If this file name is omitted, standard output is the default. An optional argument, [-l <length>], specifies
the length of the codebook. The default length is 256. Two optional arguments, [-h <height>] and [-w <width>], specify
the size of the vector block. The default height and width are both 2. To use this program to determine a customized color table,
simply use [-h 1 -w 1]. The program then will cluster the colors of individual pixels.
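
The way a block of pixels becomes a vector can be sketched as follows. This is a simplified illustration that assumes the whole image is already in memory, whereas the listing reads one band of rows at a time from standard input. The function extract_block and its parameter names are hypothetical.

```c
#include <assert.h>

/* Copy the pixel block at block row `by`, block column `bx` of a
   row-major image into `vec`, one short per byte.  `vec` must hold
   height * width * depth elements.  Assumes n_cols is the image width
   in pixels and depth is the number of bytes per pixel. */
static void extract_block(const unsigned char *image, long n_cols,
                          long depth, long height, long width,
                          long by, long bx, short *vec)
{
    long row_bytes  = n_cols * depth;   /* bytes in one image row */
    long code_width = width * depth;    /* bytes in one block row */
    long r, k;

    assert(image != 0 && vec != 0);
    for (r = 0; r < height; r++) {
        const unsigned char *p =
            image + (by * height + r) * row_bytes + bx * code_width;
        for (k = 0; k < code_width; k++)
            *vec++ = *p++;
    }
}
```

With the default 2 x 2 block and 3-byte pixels, each vector produced this way has 12 components; with [-h 1 -w 1] it reduces to the 3 color components of a single pixel.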

Data structures
The central structure in this program is the k-d tree (Bentley, 1975). A k-d tree is a binary search tree for k-dimensional
data. It is an extension of a simple binary search tree, with the following properties:
Data items are k-dimensional records. The term vector is used to refer to a data item.
All records are stored in the terminal nodes (leaves) of the tree. Usually, a leaf contains more than one record. Such
a leaf also is termed a bucket node.
Nonterminal nodes of the tree do not contain records. Instead, they contain two values required for searching, and two
pointers, for the left and right subtrees. The first value is termed the discriminator, or index. The index selects a value
from the record for comparison during a search. The second value is the partition, or threshold. The left subtree contains
indexed data values that are less than the threshold value; the right subtree contains indexed data values that are greater
than or equal to the threshold value. The nonterminal nodes are termed split nodes, because they split data records into
two distinct partitions.
For optimized searches, select the index of the record coordinate whose value has the greatest variance, and select
the median of the indexed values for the threshold. For spatial partitioning, select the mean of the indexed values for the
threshold.
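
The selection rule for spatial partitioning can be sketched as below: pick the coordinate with the greatest variance as the index, and use the mean of that coordinate as the threshold. This is an illustration only, with hypothetical names (choose_split, side_of); it is not taken from the listing, whose split routine works from the per-bucket minimum and maximum arrays instead.

```c
#include <assert.h>

/* Given n vectors of k components stored contiguously in `vecs`,
   return the index of the coordinate with the greatest variance and
   store the mean of that coordinate in *thresh. */
static long choose_split(const short *vecs, long n, long k, double *thresh)
{
    long   best = 0;
    double best_var = -1.0;
    long   i, j;

    assert(n > 0 && k > 0);
    for (j = 0; j < k; j++) {
        double sum = 0.0, sumsq = 0.0, mean, var;

        for (i = 0; i < n; i++) {
            double x = vecs[i * k + j];
            sum   += x;
            sumsq += x * x;
        }
        mean = sum / n;
        var  = sumsq / n - mean * mean;     /* population variance */
        if (var > best_var) {
            best_var = var;
            best     = j;
            *thresh  = mean;
        }
    }
    return best;
}

/* 0 selects the left subtree (value < threshold),
   1 selects the right subtree (value >= threshold). */
static int side_of(const short *vec, long index, double thresh)
{
    return vec[index] >= thresh;
}
```

A search then descends from the root, calling a test like side_of at each split node until it reaches a bucket.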
The structures used in this program are defined using a series of typedef declarations. The k-d tree is used to spatially
partition the data. The basic component of the k-d tree is the Cluster, which contains the centroid of a group of vectors,
and the number of vectors in the group. The clusters are collected into Bucket nodes, which also maintain an array of
minimum and maximum values for the clusters in each bucket. This information is used to select index and threshold values
when the buckets become full, and must be split. A Split node contains the index and the threshold value, and pointers
to the left and right subtrees. A KDTree is a union of a Bucket node and a Split node, with a variable to keep track of
which type is stored currently in the node. Finally, an array of Candidate nodes is used to keep track of potential clusters
for merging during the reduction steps.

Major functions
The major routines are rd_image, build_tree, pnn, and wt_codebook. The function rd_image reads the input training
image file, and rearranges the pixels into vectors. The vectors then are passed to build_tree, which constructs a k-d tree
by inserting each vector into the tree, one at a time. The insert routine recursively sorts each vector into the proper bucket,
and if a bucket is full, splits the bucket into two smaller buckets. When the tree is complete, it is passed to pnn, which
does the clustering. Clustering proceeds in multiple passes, by examining each bucket in the tree, and determining the pair
of clusters in each bucket that would introduce the least error if merged. A pair of clusters from a bucket is a candidate.
After all buckets have submitted a candidate, the candidates are sorted by the amount of error they would introduce. Some
buckets may not contain any close clusters, so only one-half of the candidates actually are merged at each iteration. After
each merge, some of the buckets may be less than one-half full. If so, the tree is re-balanced by pruning off those buckets,
and inserting their clusters into sibling branches. Merging passes continue until the desired codebook length has been
reached. At this point wt_codebook outputs the remaining clusters to the codebook file.
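
The error used to rank candidate pairs, and the effect of a merge, can be sketched as follows. The cost is the squared distance between the two centroids weighted by n1*n2/(n1+n2), which is the quantity the listing's sqerr routine computes. The merged centroid as a count-weighted mean is the usual PNN update and is shown here as an assumption, since it is not spelled out in the text above. The names merge_cost and merge_into are hypothetical, and centroids are held as doubles for clarity.

```c
#include <assert.h>

/* Increase in total squared error if two clusters with centroids c1, c2
   (k components each) and populations n1, n2 were merged. */
static double merge_cost(const double *c1, long n1,
                         const double *c2, long n2, long k)
{
    double sum = 0.0;
    long   i;

    assert(n1 > 0 && n2 > 0);
    for (i = 0; i < k; i++) {
        double d = c1[i] - c2[i];
        sum += d * d;
    }
    return sum * n1 * n2 / (double) (n1 + n2);
}

/* Replace c1 by the count-weighted mean of c1 and c2 (assumed update
   rule); returns the population of the merged cluster. */
static long merge_into(double *c1, long n1,
                       const double *c2, long n2, long k)
{
    long i;

    for (i = 0; i < k; i++)
        c1[i] = (n1 * c1[i] + n2 * c2[i]) / (double) (n1 + n2);
    return n1 + n2;
}
```

Because the weight n1*n2/(n1+n2) grows with cluster population, two large clusters must be much closer together than two small ones before they become the cheapest pair to merge.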

Discussion
This program follows the description of Equitz (1989). The major portion of time is spent calculating the squared error
while searching for candidate cluster pairs. On some systems, however, the C library memory allocation routine, malloc,
could consume about 90% of execution time. This situation dramatically increases the time to build the tree. An optimized
version of malloc should be used, if available. Alternatively, a special routine could be written, which allocates all Cluster
nodes ahead of time, in one large block, and which doles them out one at a time. Another way to speed up the program
is to increase the proportion of clusters that are merged at each iteration.
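
The special allocation routine suggested above might look like the following minimal sketch. It obtains one large block from malloc and hands out fixed-size items in order. The type Pool and the functions pool_init, pool_get, and pool_destroy are hypothetical names, not part of the listing. The sketch does not return individual items to the pool; a free list would have to be added to support the per-cluster frees that the listing performs.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct pool {
    char   *block;          /* one large allocation */
    size_t  item_size;      /* size of each item handed out */
    size_t  capacity;       /* total number of items in the block */
    size_t  used;           /* number of items handed out so far */
} Pool;

/* Allocate room for `capacity` items of `item_size` bytes each.
   item_size should be sizeof(some type) so items stay aligned.
   Returns nonzero on success. */
static int pool_init(Pool *p, size_t item_size, size_t capacity)
{
    assert(item_size > 0 && capacity > 0);
    p->block     = (char *) malloc(item_size * capacity);
    p->item_size = item_size;
    p->capacity  = capacity;
    p->used      = 0;
    return p->block != NULL;
}

/* Hand out the next unused item, or NULL when the pool is exhausted. */
static void *pool_get(Pool *p)
{
    if (p->block == NULL || p->used == p->capacity)
        return NULL;
    return p->block + p->item_size * p->used++;
}

/* Release the whole block at once. */
static void pool_destroy(Pool *p)
{
    free(p->block);
    p->block    = NULL;
    p->capacity = 0;
    p->used     = 0;
}
```

Since at most one cluster is created per input vector, the number of vectors in the training image would be a safe capacity for such a pool.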

Program Listing
/*
    FILE:       vqinit.c
    PURPOSE:    Vector quantization codebook generation for images.
    USAGE:      vqinit <rows> <cols> [inimage] [codebook]
                       [-p <pixel_depth>]
                       [-l <codebook_length>]
                       [-h <height>] [-w <width>]

    ARGUMENTS:
        inimage         : input image file, default=stdin
        codebook        : output codebook file, default=stdout
        codebook_length : length of codebook, default=256
        height          : vector block height, default=2
        width           : vector block width, default=2
        pixel_depth     : number of bytes per pixel:
                            1: 8-bit monochrome
                            3: 24-bit RGB color (default)
                            7: LANDSAT TM

    LANGUAGE:   ANSI C, MIPS C compiler version
    AUTHOR:     David A. Southard, The MITRE Corporation
    CREATED:    18 Oct 1991

    NOTES/WARNINGS:
        The PNN clustering algorithm is described by William H. Equitz,
        'A New Vector Quantization Clustering Algorithm',
        IEEE Trans. Acoust., Speech, Signal Processing, Vol. 37,
        pp. 1568-1575, Oct. 1989.

        The image row and column dimensions should be integer multiples
        of the vector block height and width, respectively.
        For simplicity, this program ignores left-over rows and columns.
        An alternative, not implemented here, is to replicate
        rows and/or columns to the next vector block boundary,
        compress the augmented image, then discard the replicated rows
        and/or columns when the image is decompressed.
*/

#include <stdio.h>      /* standard I/O library definitions */
#include <stdlib.h>     /* atoi, exit, free, qsort */
#include <string.h>     /* memcpy */
#include <limits.h>     /* defines INT_MAX, etc. */

/* ...................................................................... */
/*    DATA STRUCTURES                                                     */
/*    Structures for K-d tree used in PNN clustering algorithm            */
/* ...................................................................... */

#define MAXBKT  8               /* max # of vectors per bucket */
#define MINBKT  (MAXBKT/2)      /* min # of vectors per bucket */

typedef unsigned char uchar;    /* ensure unsigned for portability */

typedef enum { BUCKET, SPLIT } NodeType;

typedef struct cluster_node     /* structure for a cluster */
{
    long            n;          /* number of vectors in cluster */
    long            error;      /* total squared error for cluster */
    short           *array;     /* buffer for vectors */
}
Cluster;

typedef struct bucket_node      /* structure for bucket nodes */
{
    long            count;      /* number of n-dimensional clusters */
    short           *max;       /* pointer to block of max values */
    short           *min;       /* pointer to block of min values */
    Cluster         **cluster;  /* pointer to MAXBKT pointers */
}
Bucket;

typedef struct split_node       /* structure for decision nodes */
{
    long            index;      /* index of split coordinate */
    long            thresh;     /* threshold for split coordinate */
    struct kd_node  *left;      /* for vectors < threshold */
    struct kd_node  *right;     /* for vectors >= threshold */
}
Split;

typedef struct kd_node          /* structure for K-D tree nodes */
{
    NodeType        type;       /* one of two types */
    union
    {
        Split       split;      /* decision node */
        Bucket      bucket;     /* data node */
    }
    node;
}
KDTree;

typedef struct candidate_node   /* structure for merge candidates */
{
    KDTree          *ptr;       /* pointer to a bucket node */
    long            err;        /* merging error for candidates */
    long            no1;        /* index of 1st candidate member */
    long            no2;        /* index of 2nd candidate member */
}
Candidate;

/* ...................................... */
/*    FUNCTION PROTOTYPES                 */
/* ...................................... */

void        append ( KDTree *, long, long, short * );
KDTree      *balance ( KDTree * );
KDTree      *build_tree ();
long        candidate ( KDTree *, long *, long * );
long        compare_candidates ( Candidate *, Candidate * );
Candidate   *find_candidates ( KDTree *, Candidate * );
void        freenode ( KDTree * );
void        freecluster ( Cluster * );
void        get_args ( int, char ** );
KDTree      *insert ( KDTree *, long, long, short * );
long        match ( KDTree *, short * );
void        merge ( Candidate * );
void        *myalloc ( long );
Cluster     *newcluster ( long, long, short * );
KDTree      *newnode ( NodeType );
KDTree      *pnn ( KDTree * );
void        setmaxmin ( KDTree * );
KDTree      *split ( KDTree * );
long        sqerr ( Cluster *, Cluster * );
void        traverse_tree ( KDTree * );
void        updmaxmin ( short *, short *, short * );
void        wt_codebook ( KDTree * );

/* ...................................... */
/*    Defaults, Manifest Constants        */
/* ...................................... */

#define DFLT_DEPTH      3           /* RGB pixels */
#define DFLT_HEIGHT     2           /* default vector block height */
#define DFLT_LEN        256         /* default codebook length */
#define DFLT_WIDTH      2           /* default vector block width */
#define ERROR           (-1)        /* error return code */
#define FALSE           0           /* logical FALSE */
#define MINARGS         2           /* minimum # of command-line args */
#define NORMAL          0           /* normal return code */
#define TRUE            (!FALSE)    /* logical TRUE */

/* ...................................... */
/*    Global Data                         */
/* ...................................... */

long    cluster_size;                   /* size of a cluster */
long    codebook_len = DFLT_LEN;        /* size of codebook */
long    block_height = DFLT_HEIGHT;     /* height of vector block */
long    block_width  = DFLT_WIDTH;      /* width of vector block */
long    n_buckets;                      /* number of bucket nodes */
long    n_clusters;                     /* number of clusters */
long    n_cols;                         /* number of columns in image */
long    n_rows;                         /* number of rows in image */
long    n_vectors;                      /* number of vectors in image */
long    pixel_depth = DFLT_DEPTH;       /* byte depth of a pixel */

/* ...................... */
/*    Program             */
/* ...................... */

int main (                              /* VQ codebook generation */
    int     argc,                       /* argument count */
    char    *argv[] )                   /* argument vector */
{
    get_args ( argc, argv );            /* get command-line arguments */
    wt_codebook (                       /* output codebook */
        pnn (                           /* compute PNN clusters */
            build_tree ()               /* build k-d tree */
    ) );
    return ( NORMAL );                  /* normal exit */
}

/* ...................... */
/*    get_args            */
/* ...................... */

char *prototype = "<rows> <cols> [image] [codebook] [-p <pixel_depth>] "
                  "[-h <height>] [-w <width>] [-l <codebook length>]";

void get_args (                         /* scan command-line arguments */
    int     argc,                       /* argument count */
    char    *argv[] )                   /* argument vector */
{
    register int    i;                  /* argument index */
    register int    j;                  /* positional argument index */

    if ( argc < MINARGS + 1 )
    {
        fprintf ( stderr, "usage: %s %s\n", argv[ 0 ], prototype );
        exit ( ERROR );
    }
    for ( i = 1, j = 0 ; i < argc ; i++ )
    {
        if ( argv[ i ][ 0 ] == '-' ) switch ( argv[ i ][ 1 ] )
        {
            case 'p':
                i++;
                pixel_depth = atoi ( argv[ i ] );
                fprintf ( stderr, "Pixel depth = %ld\n", pixel_depth );
                break;
            case 'h':
                i++;
                block_height = atoi ( argv[ i ] );
                fprintf ( stderr, "Block height = %ld\n", block_height );
                break;
            case 'l':
                i++;
                codebook_len = atoi ( argv[ i ] );
                fprintf ( stderr, "Codebook length = %ld\n", codebook_len );
                break;
            case 'w':
                i++;
                block_width = atoi ( argv[ i ] );
                fprintf ( stderr, "Block width = %ld\n", block_width );
                break;
            default:
                fprintf ( stderr,
                    "Unrecognized option flag \'%s\'\n",
                    argv[ i ] );
                break;
        }
        else switch ( ++j )
        {
            case 1:
                n_rows = atoi ( argv[ i ] );
                break;
            case 2:
                n_cols = atoi ( argv[ i ] );
                break;
            case 3:
                if ( freopen ( argv[ i ], "r", stdin ) == NULL )
                {
                    fprintf ( stderr,
                        "%s: Cannot open image file \'%s\' for reading!\n",
                        argv[ 0 ], argv[ i ] );
                    exit ( ERROR );
                }
                break;
            case 4:
                if ( freopen ( argv[ i ], "w", stdout ) == NULL )
                {
                    fprintf ( stderr,
                        "%s: Cannot open codebook file \'%s\' for output!\n",
                        argv[ 0 ], argv[ i ] );
                    exit ( ERROR );
                }
                break;
            default:
                fprintf ( stderr,
                    "Unrecognized positional argument \'%s\'\n",
                    argv[ i ] );
                break;
        }
    }
}

/* ...................... */
/*    build_tree          */
/* ...................... */

KDTree *build_tree ()                   /* construct k-d tree */
{
    long            code_width;     /* width of code block in bytes */
    uchar           **buf;          /* input image row buffers */
    long            buf_size;       /* size of input row buffer */
    register int    i;              /* index on block rows */
    register int    j;              /* index on block columns */
    register int    k;              /* index on pixel depth */
    register int    l;              /* index on block width */
    short           *v;             /* buffer for single vector */
    register short  *pv;            /* pointer into vector buffer */
    register uchar  *pb;            /* pointer into image buffer */
    long            row_size;       /* truncated row size in bytes */
    long            xsize;          /* truncated image width */
    long            ysize;          /* truncated image height */
    register int    y;              /* index on compressed image rows */
    KDTree          *tree = NULL;   /* tree to be constructed */

    n_buckets    = 0;
    n_clusters   = 0;
    xsize        = n_cols / block_width;
    ysize        = n_rows / block_height;
    n_vectors    = xsize * ysize;
    code_width   = block_width * pixel_depth;
    cluster_size = block_height * code_width;
    buf_size     = pixel_depth * n_cols;
    row_size     = buf_size - buf_size % code_width;

    v = (short *) myalloc ( cluster_size * sizeof ( short ) );
    buf = (uchar **) myalloc ( block_height * sizeof (uchar *) );
    for ( i = 0 ; i < block_height ; i++ )
    {
        buf[ i ] = (uchar *) myalloc ( buf_size * sizeof (uchar) );
    }
    for ( y = ysize ; y-- ; )
    {
        for ( i = 0 ; i < block_height ; i++ )
        {
            fread ( buf[ i ], sizeof ( uchar ), buf_size, stdin );
        }
        for ( l = 0 ; l < row_size ; l += code_width )
        {
            pv = v;
            for ( i = 0 ; i < block_height ; i++ )
            {
                pb = buf[ i ] + l;
                for ( j = block_width ; j-- ; )
                {
                    for ( k = pixel_depth ; k-- ; )
                    {
                        *pv++ = *pb++;
                    }
                }
            }
            tree = insert ( tree, 1, 0, v );
        }
    }
    for ( i = 0 ; i < block_height ; i++ )  /* free row buffers */
    {
        free ( buf[ i ] );
    }
    free ( buf );                       /* free buffer pointers */
    free ( v );                         /* free vector buffer */
    fclose ( stdin );                   /* close input file */
    return ( tree );                    /* return tree constructed */
}

/* ...................... */
/*    insert              */
/* ...................... */

KDTree *insert (                    /* insert one vector into k-d tree */
    KDTree  *tree,                  /* pointer to current tree node */
    long    n,                      /* number of vectors represented */
    long    e,                      /* total error represented */
    short   *vector )               /* vector to be inserted */
{
    if ( tree == NULL )                 /* initialize tree */
    {
        tree = newnode ( BUCKET );
        tree->node.bucket.count = 1;
        tree->node.bucket.cluster[ 0 ] = newcluster ( n, e, vector );
        memcpy ( tree->node.bucket.max,
            vector, cluster_size * sizeof (short) );
        memcpy ( tree->node.bucket.min,
            vector, cluster_size * sizeof (short) );
    }
    else if ( tree->type == BUCKET )    /* put vector in bucket */
    {
        register int i;

        if ( ( i = match ( tree, vector ) ) >= 0 )
        {
            tree->node.bucket.cluster[ i ]->n += n;
            tree->node.bucket.cluster[ i ]->error += e;
        }
        else if ( tree->node.bucket.count < MAXBKT )
        {
            append ( tree, n, e, vector );
        }
        else
        {
            tree = insert ( split ( tree ), n, e, vector );
        }
    }
    else if ( tree->type == SPLIT )     /* continue search down tree */
    {
        if ( vector[tree->node.split.index] > tree->node.split.thresh )
        {
            tree->node.split.right = insert (
                tree->node.split.right, n, e, vector );
        }
        else
        {
            tree->node.split.left = insert (
                tree->node.split.left, n, e, vector );
        }
    }
    else
    {
        fprintf ( stderr, "Error in insert: unknown tree node type\n" );
        exit ( ERROR );
    }
    return ( tree );
}

/* ...................... */
/*    newnode             */
/* ...................... */

KDTree *newnode (                   /* allocate and init a new tree node */
    NodeType    kind )              /* type of tree node */
{
    KDTree      *tree;              /* pointer to new node */

    tree = (KDTree *) myalloc ( sizeof (KDTree) );
    tree->type = kind;
    switch ( kind )
    {
        case BUCKET:                /* initialize bucket node */
            tree->node.bucket.count = 0;
            tree->node.bucket.max = (short *) myalloc (
                cluster_size * sizeof(short) );
            tree->node.bucket.min = (short *) myalloc (
                cluster_size * sizeof(short) );
            tree->node.bucket.cluster = (Cluster **) myalloc (
                MAXBKT * sizeof (Cluster *) );
            n_buckets++;
            break;
        case SPLIT:                 /* initialize split node */
            tree->node.split.index = -1;
            tree->node.split.left  = NULL;
            tree->node.split.right = NULL;
            break;
        default:
            fprintf ( stderr, "Error in newnode: unknown node type\n" );
            exit ( ERROR );
    }
    return ( tree );
}
/* ...................... */
/*    newcluster          */
/* ...................... */

Cluster *newcluster (               /* allocate and init a new cluster */
    long    num,                    /* number of vectors accumulated */
    long    err,                    /* total error in cluster */
    short   *vector )               /* pointer to block of size cluster_size */
{
    Cluster *c;                     /* pointer to new cluster */

    c = (Cluster *) myalloc ( sizeof (Cluster) );
    c->n     = num;
    c->error = err;
    c->array = (short *) myalloc ( cluster_size * sizeof ( short ) );
    memcpy ( c->array, vector, cluster_size * sizeof ( short ) );
    n_clusters++;
    return ( c );
}
/* ...................... */
/*    match               */
/* ...................... */

#define NOT_FOUND   (-1)

long match (                        /* search bucket for matching cluster */
    KDTree  *tree,                  /* pointer to bucket node */
    short   *vector )               /* pointer to vector to match */
{
    register int    i;      /* outer loop counter: cluster index */
    register int    j;      /* inner loop counter: vector element index */

    for ( i = 0 ; i < tree->node.bucket.count ; i++ )
    {
        for ( j = 0 ; j < cluster_size ; j++ )
        {
            if ( tree->node.bucket.cluster[ i ]->array[ j ] !=
                 vector[ j ] ) break;
        }
        if ( j == cluster_size ) return ( i );
    }
    return ( NOT_FOUND );
}
/* ...................... */
/*    split               */
/* ...................... */

KDTree *split (                     /* split a bucket into two pieces */
    KDTree  *tree )
{
    register int    i;              /* loop variable */
    register int    il;             /* left branch cluster counter */
    register int    ir;             /* right branch cluster counter */
    long            maxindex;       /* max range ordinate index */
    long            maxrange;       /* max ordinate range */
    KDTree          *newtree;       /* new tree node */
    long            range;          /* current ordinate range */
    long            threshold;      /* split threshold */

    /* ...................................................... */
    /*    Find the ordinate with the greatest variance        */
    /* ...................................................... */

    maxindex = 0;
    maxrange = tree->node.bucket.max[ 0 ] - tree->node.bucket.min[ 0 ];
    for ( i = 1 ; i < cluster_size ; i++ )
    {
        range = tree->node.bucket.max[ i ] - tree->node.bucket.min[ i ];
        if ( range > maxrange )
        {
            maxindex = i;
            maxrange = range;
        }
    }

    /* ...................................................... */
    /*    If clusters are identical, condense into one        */
    /*    cluster and return immediately                      */
    /* ...................................................... */

    if ( maxrange == 0 )
    {
        for ( i = 1 ; i < tree->node.bucket.count ; i++ )
        {
            tree->node.bucket.cluster[ 0 ]->n +=
                tree->node.bucket.cluster[ i ]->n;
            tree->node.bucket.cluster[ 0 ]->error +=
                tree->node.bucket.cluster[ i ]->error;
            freecluster ( tree->node.bucket.cluster[ i ] );
        }
        tree->node.bucket.count = 1;
        return ( tree );
    }

    /* ...................................................... */
    /*    Otherwise, allocate a new split node,               */
    /*    and two new bucket nodes, initialize                */
    /* ...................................................... */

    threshold = ( tree->node.bucket.max[ maxindex ] +
                  tree->node.bucket.min[ maxindex ] ) / 2;
    newtree                    = newnode ( SPLIT );
    newtree->node.split.index  = maxindex;
    newtree->node.split.thresh = threshold;
    newtree->node.split.left   = newnode ( BUCKET );
    newtree->node.split.right  = newnode ( BUCKET );

    /* ...................................................... */
    /*    Partitions existing clusters to left or right       */
    /*    branches according to threshold test.               */
    /* ...................................................... */

    il = ir = 0;
    for ( i = 0 ; i < tree->node.bucket.count ; i++ )
    {
        if ( tree->node.bucket.cluster[ i ]->array[ maxindex ] >
             threshold )
        {
            newtree->node.split.right->node.bucket.cluster[ ir++ ] =
                newcluster (
                    tree->node.bucket.cluster[ i ]->n,
                    tree->node.bucket.cluster[ i ]->error,
                    tree->node.bucket.cluster[ i ]->array );
        }
        else
        {
            newtree->node.split.left->node.bucket.cluster[ il++ ] =
                newcluster (
                    tree->node.bucket.cluster[ i ]->n,
                    tree->node.bucket.cluster[ i ]->error,
                    tree->node.bucket.cluster[ i ]->array );
        }
    }
    newtree->node.split.left->node.bucket.count  = il;
    newtree->node.split.right->node.bucket.count = ir;

    /* ...................................................... */
    /*    Reset the max/min arrays for both buckets           */
    /* ...................................................... */

    setmaxmin ( newtree->node.split.left );
    setmaxmin ( newtree->node.split.right );

    /* ...................................................... */
    /*    De-allocate bucket node, and                        */
    /*    return new split node.                              */
    /* ...................................................... */

    freenode ( tree );
    return ( newtree );
}
/* ...................... */
/*    append              */
/* ...................... */

void append (                       /* append one cluster to bucket */
    KDTree  *tree,                  /* pointer to bucket node */
    long    num,                    /* number of vectors represented */
    long    err,                    /* max squared error for cluster */
    short   *vector )               /* vector to be appended */
{
    register int    i;              /* new cluster index */

    i = tree->node.bucket.count++;
    tree->node.bucket.cluster[ i ] = newcluster ( num, err, vector );
    if ( i == 0 )                   /* bucket was empty */
    {
        setmaxmin ( tree );         /* initialize max/min arrays */
    }
    else                            /* update max/min arrays */
    {
        updmaxmin ( tree->node.bucket.max,
                    tree->node.bucket.min,
                    vector );
    }
}

/* ...................... */
/*    setmaxmin           */
/* ...................... */

void setmaxmin (                    /* set the max/min arrays in a bucket */
    KDTree  *tree )                 /* pointer to Bucket type tree node */
{
    register int    i;      /* loop counter over clusters in bucket */

    if ( tree->node.bucket.count > 0 )
    {
        memcpy ( tree->node.bucket.max,
                 tree->node.bucket.cluster[ 0 ]->array,
                 cluster_size * sizeof ( short ) );
        memcpy ( tree->node.bucket.min,
                 tree->node.bucket.cluster[ 0 ]->array,
                 cluster_size * sizeof ( short ) );
        for ( i = 1 ; i < tree->node.bucket.count ; i++ )
        {
            updmaxmin ( tree->node.bucket.max,
                        tree->node.bucket.min,
                        tree->node.bucket.cluster[ i ]->array );
        }
    }
}

/* ...................... */
/*    updmaxmin           */
/* ...................... */

void updmaxmin (                    /* update the max/min arrays in a bucket */
    short   *max,                   /* array of max values */
    short   *min,                   /* array of min values */
    short   *vector )               /* new cluster vector */
{
    register int    i;              /* looping counter */

    for ( i = 0 ; i < cluster_size ; i++ )      /* for each ordinate */
    {
        if ( max[ i ] < vector[ i ] )           /* greater than max? */
        {
            max[ i ] = vector[ i ];             /* save it */
        }
        else if ( min[ i ] > vector[ i ] )      /* less than min? */
        {
            min[ i ] = vector[ i ];             /* save it */
        }
    }
}

/* ...................... */
/*    cluster             */
/* ...................... */

KDTree *pnn (                       /* PNN clustering algorithm */
    KDTree  *tree )                 /* k-d tree of input image */
{
    Candidate       *candidates;        /* a list of merge candidates */
    register int    i;                  /* loop counter */
    Candidate       *next_candidate;    /* pointer into candidate list */
    long            n_candidates;       /* number of candidates (to merge) */

    while ( n_clusters > codebook_len )
    {
        tree = balance ( tree );
        candidates = (Candidate *) myalloc ( n_buckets *
            sizeof ( Candidate ) );
        n_candidates = find_candidates ( tree, candidates ) - candidates;
        qsort ( candidates, n_candidates, sizeof ( Candidate ),
            compare_candidates );
        n_candidates = n_clusters - codebook_len < n_candidates / 2 ?
            n_clusters - codebook_len :
            n_candidates / 2;
        for ( i = 0 ; i < n_candidates ; i++ )
        {
            merge ( candidates + i );
        }
        free ( candidates );
    }
    return ( tree );
}
/* ...................... */
/*    balance             */
/* ...................... */

KDTree *balance (           /* given a sub-tree return a balanced version */
    KDTree  *tree )         /* current root of sub-tree */
{
    register Cluster    *c;     /* pointer to cluster */
    register int        i;      /* looping counter */
    register KDTree     *tl;    /* left sub-tree */
    register KDTree     *tr;    /* right sub-tree */

    if ( tree->type == SPLIT )
    {
        tl = tree->node.split.left;
        tr = tree->node.split.right;
        if ( tl->type == BUCKET && tl->node.bucket.count < MINBKT )
        {
            for ( i = 0 ; i < tl->node.bucket.count ; i++ )
            {
                c = tl->node.bucket.cluster[ i ];
                tr = insert ( tr, c->n, c->error, c->array );
            }
            tree->node.split.right = NULL;
            freenode ( tree );
            tree = tr;
        }
        else if ( tr->type == BUCKET && tr->node.bucket.count < MINBKT )
        {
            for ( i = 0 ; i < tr->node.bucket.count ; i++ )
            {
                c = tr->node.bucket.cluster[ i ];
                tl = insert ( tl, c->n, c->error, c->array );
            }
            tree->node.split.left = NULL;
            freenode ( tree );
            tree = tl;
        }
        else                        /* recursive descent of tree */
        {
            tree->node.split.left  = balance ( tl );
            tree->node.split.right = balance ( tr );
        }
    }
    return ( tree );
}

/* ...................... */
/*    freenode            */
/* ...................... */

void freenode (                     /* free memory allocated for KDTree */
    KDTree  *tree )                 /* pointer to tree */
{
    register int    i;              /* looping counter */

    if ( tree != NULL ) switch ( tree->type )
    {
        case BUCKET:
            for ( i = 0 ; i < tree->node.bucket.count ; i++ )
            {
                freecluster ( tree->node.bucket.cluster[ i ] );
            }
            free ( tree->node.bucket.cluster );
            free ( tree->node.bucket.max );
            free ( tree->node.bucket.min );
            free ( tree );
            n_buckets--;
            break;
        case SPLIT:
            freenode ( tree->node.split.left );
            freenode ( tree->node.split.right );
            free ( tree );
            break;
        default:
            fprintf ( stderr, "Error in freenode: unknown node type\n" );
            exit ( ERROR );
    }
}

/* ...................... */
/*    freecluster         */
/* ...................... */

void freecluster (                  /* free a cluster array */
    Cluster *cluster )              /* pointer to cluster */
{
    free ( cluster->array );
    free ( cluster );
    n_clusters--;
}

/* ...................... */
/*    find_candidates     */
/* ...................... */

Candidate *find_candidates (        /* find candidate cluster pairs */
    KDTree      *tree,              /* tree to search */
    Candidate   *cand )             /* pointer into candidate array */
{
    if ( tree == NULL )
    {
        fprintf ( stderr, "Error in find_candidates: null tree pointer\n" );
        exit ( ERROR );
    }
    switch ( tree->type )
    {
        case SPLIT:
            cand = find_candidates ( tree->node.split.left, cand );
            cand = find_candidates ( tree->node.split.right, cand );
            break;
        case BUCKET:
            if ( tree->node.bucket.count > 1 )
            {
                cand->ptr = tree;
                cand->err = candidate ( tree, &cand->no1, &cand->no2 );
                cand++;
            }
            break;
        default:
            fprintf ( stderr,
                "Error in find_candidates: unknown tree type\n" );
            exit ( ERROR );
    }
    return ( cand );            /* return pointer to next empty slot */
}

/* ...................... */
/*   candidate            */
/* ...................... */
long candidate (                     /* find index of candidate for bucket */
    KDTree      *tree,               /* bucket node of tree */
    long        *mini,               /* index of first in candidate pair */
    long        *minj )              /* index of second in candidate pair */
{
    register Cluster **c;            /* cluster pointer array */
    register long    i;              /* looping index */
    register long    j;              /* looping index */
    register long    l;              /* looping limit */
    register long    l1;             /* looping limit - 1 */
    long             minerr;         /* minimum error encountered */
    long             error;          /* current error */

    c      = tree->node.bucket.cluster;     /* pointer to cluster */
    l      = tree->node.bucket.count;       /* length of cluster */
    l1     = l - 1;                         /* length minus one */
    minerr = LONG_MAX;                      /* a giant number */
    for ( i = 0 ; i < l1 ; i++ )            /* possible first cluster */
    {
        for ( j = i + 1 ; j < l ; j++ )     /* possible second cluster */
        {
            error = sqerr ( c[ i ], c[ j ] );
            if ( error < minerr )           /* possible candidate ? */
            {
                minerr = error;
                *mini  = i;                 /* save first index */
                *minj  = j;                 /* save second index */
            }
        }
    }
    return ( minerr );                      /* return value of min */
}

/* ...................... */
/*   sqerr                */
/* ...................... */
long sqerr (                         /* calculate weighted squared error for clusters */
    Cluster     *c1,
    Cluster     *c2 )                /* two clusters */
{
    register long d;                 /* difference value */
    register long i;                 /* ordinate index within cluster arrays */
    register long sum;               /* error accumulator */

    sum = 0;
    for ( i = 0 ; i < cluster_size ; i++ )
    {
        d    = c1->array[ i ] - c2->array[ i ];
        sum += d * d;
    }
    return ( sum * c1->n * c2->n / ( c1->n + c2->n ) );
}
/* ...................... */
/*   compare_candidates   */
/* ...................... */
long compare_candidates (            /* compare weighted error for sort */
    Candidate   *no1,                /* pointer to first candidate */
    Candidate   *no2 )               /* pointer to second candidate */
{
    return ( no1->err - no2->err );  /* <0 is LT, =0 is EQ, >0 is GT */
}
/* ...................... */
/*   merge                */
/* ...................... */

void merge (                         /* merge a candidate pair */
    Candidate   *cand )              /* a split node with two bucket children */
{
    register long i;                 /* looping counter */
    register long m;                 /* merged total */
    Cluster     *c1;                 /* first cluster */
    Cluster     *c2;                 /* second cluster */

    if ( cand->no2 <= cand->no1 )
    {
        fprintf ( stderr, "Error in merge: candidates out of order\n" );
        exit ( ERROR );
    }
    c1 = cand->ptr->node.bucket.cluster[ cand->no1 ];
    c2 = cand->ptr->node.bucket.cluster[ cand->no2 ];
    m  = c1->n + c2->n;
    for ( i = 0 ; i < cluster_size ; i++ )
    {
        c1->array[i] = ( c1->n * c1->array[i] + c2->n * c2->array[i] ) / m;
    }
    c1->n     = m;
    c1->error = cand->err;
    freecluster ( c2 );
    cand->ptr->node.bucket.count--;
    for ( i = cand->no2 ; i < cand->ptr->node.bucket.count ; i++ )
    {
        cand->ptr->node.bucket.cluster[ i ] =
            cand->ptr->node.bucket.cluster[ i + 1 ];
    }
}
/* ...................... */
/*   wt_codebook          */
/* ...................... */
void wt_codebook (                   /* output cluster in codebook form */
    KDTree      *tree )              /* tree holding merged clusters */
{
    printf ( "%ld\t# serial number of this codebook\n", (long) time ( 0 ) );
    printf ( "%ld %ld %ld\t# row, col, pixel size of vector blocks\n",
             block_height, block_width, pixel_depth );
    printf ( "%ld\t# size of codebook\n", n_clusters );
    traverse_tree ( tree );
}
/* ...................... */
/*   traverse_tree        */
/* ...................... */
void traverse_tree (                 /* recursive descent for output */
    KDTree      *tree )              /* pointer to tree holding clusters */
{
    if ( tree->type == SPLIT )
    {
        traverse_tree ( tree->node.split.left );
        traverse_tree ( tree->node.split.right );
    }
    else                             /* BUCKET node */
    {
        register Cluster *c;         /* pointer to a cluster */
        register long    i;          /* looping index for clusters */
        static int       index = 0;  /* code index */
        register long    j;          /* byte index within cluster */

        for ( i = 0 ; i < tree->node.bucket.count ; i++ )
        {
            c = tree->node.bucket.cluster[ i ];
            for ( j = 0 ; j < cluster_size ; j++ )
            {
                printf ( "%3d ", (int) c->array[ j ] );
            }
            printf ( "\t# code %3d (%5.3f%%)\n", index++,
                     100.0 * c->n / n_vectors );
        }
    }
}
/* ...................... */
/*   myalloc              */
/* ...................... */
void *myalloc (                      /* memory allocation w/ error check */
    register long nbytes )           /* number of bytes requested */
{
    register void *p;                /* pointer to allocated memory */

    p = (void *) malloc ( nbytes );  /* allocate memory */
    if ( p == NULL )                 /* check for error */
    {
        fprintf ( stderr, "Insufficient available memory\n" );
        exit ( ERROR );
    }
    return ( p );                    /* return pointer */
}

APPENDIX 2
Program for VQ Compression
Description
The VQ compression program, vqcomp, compresses an image based on a codebook supplied as input. The command
line format is

vqcomp <rows> <cols> <in.code> [in.image] [out.vq]

The arguments, <rows> and <cols>, are required, and specify the size of the input image. The argument <in.code> is
required, and is the name of a codebook file developed by the program vqinit, described in Appendix 1. The next argument,
[in.image], is the name of the image file. The input image is an array of pixel values stored in binary format, in row-major
order. If this file name is omitted, the standard input file is the default. The pixel depth (1-byte monochrome, or 3-byte RGB
color) is taken from the codebook file. The compressed image is placed in the file named [out.vq], or if this is omitted,
in the standard output file. The compressed image file contains a serial number in its header, so that only the correct
codebook can be used for decompression.

Data structures
The central data structure is once again the k-d tree. The structure is nearly identical to that used in vqinit, but this
time it is used for a different purpose. Instead of spatially partitioning image data, the k-d tree is used to speed up the
search of the codebook for closest matches.

Major functions
The major functions are rd_codes, build_tree, and match. The routine rd_codes reads the codebook from the input file,
and returns a CodeBook structure. Then build_tree constructs a k-d tree from the codebook. This routine uses a recursive,
bottom-up method for building an optimized k-d search tree (Friedman, Bentley, and Finkel, 1977), in contrast to the
top-down approach adopted in vqinit. The main routine, using several nested loops, rearranges the image data into vectors,
then calls match to locate the code that best matches each vector, using a squared error metric. The index of this code
is saved in the compressed image. If there are 256 codes or fewer, the compressed image is stored as an array of bytes.
Otherwise, each code requires 2 bytes of storage.
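The block-to-vector rearrangement and the index-width rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the routine from the listing: the helper names gather_block and bytes_per_code are hypothetical, and the whole image is assumed to be held in memory as one row-major array rather than read a strip at a time.

```c
#include <stddef.h>

typedef unsigned char uchar;

/* Hypothetical helper: copy one block_h x block_w block of pixels
   (depth bytes per pixel) out of a row-major image into a contiguous
   vector, one block row after another. */
void gather_block(const uchar *image, long n_cols, long depth,
                  long block_h, long block_w,
                  long row0, long col0, uchar *vector)
{
    long i, b;
    long row_bytes = n_cols * depth;    /* bytes in one image row */
    long blk_bytes = block_w * depth;   /* bytes in one block row */

    for (i = 0; i < block_h; i++) {
        const uchar *src = image + (row0 + i) * row_bytes + col0 * depth;
        for (b = 0; b < blk_bytes; b++) {
            *vector++ = src[b];
        }
    }
}

/* Hypothetical helper: one byte per stored index when the codebook
   holds at most 256 codes, two bytes otherwise. */
long bytes_per_code(long n_codes)
{
    return (n_codes <= 256) ? 1 : 2;
}
```

With a 4 x 4 single-byte image holding the values 0 to 15 in row-major order, gathering the 2 x 2 block whose top-left pixel is at row 2, column 2 yields the vector 10, 11, 14, 15.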
The bulk of the work is done by match, which searches the k-d tree for a code that matches each vector. The details
of this search algorithm are provided by Friedman, Bentley, and Finkel (1977). The outline is as follows. The function match
calls search, which recursively descends the tree. As the k-d tree is traversed, the threshold value at each split node
progressively limits the search to a smaller and smaller k-dimensional box. The boundaries of the box are kept track of
as the tree is descended. Eventually, a bucket node is reached. All vectors in the bucket are searched for the closest match.
There are at most eight vectors in each bucket, so this cost is reasonable. The closest match has the minimum squared error.
The minimum squared error defines a k-dimensional ball around the data vector. If the ball is contained entirely within
the boundaries of the current box, the search is done. This is termed the ball-within-bounds test; it is implemented in the function
bwb. When a recursive call of search returns, if we are not done, the other branch of the tree is checked to see if its bounding
box overlaps the ball. This is termed the bounds-overlap-ball test; it is implemented in the function bob. If the bounds
overlap, the other branch will not necessarily have a closer match, but it is necessary to descend the branch to check if
it does. The search continues to test and, if necessary, descend sibling branches higher up in the tree until it is done.
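The two geometric tests can be illustrated in isolation. The sketch below is not the listing's bwb and bob, which work on global bounds arrays; it assumes the box is passed explicitly as lower and upper arrays and the ball as a squared radius r2, and the function names are hypothetical.

```c
typedef unsigned char uchar;

/* Bounds-overlap-ball: returns 1 when the ball of squared radius r2
   around v reaches into the box [lower, upper], 0 otherwise. Only the
   coordinates where v lies outside the box contribute to the distance. */
int bounds_overlap_ball(const uchar *v, const uchar *lower,
                        const uchar *upper, long k, long r2)
{
    long sum = 0;
    long i;

    for (i = 0; i < k; i++) {
        long d = 0;
        if (v[i] < lower[i])      d = (long) lower[i] - (long) v[i];
        else if (v[i] > upper[i]) d = (long) v[i] - (long) upper[i];
        sum += d * d;
        if (sum >= r2) return 0;        /* box is out of reach */
    }
    return 1;
}

/* Ball-within-bounds: returns 1 when the ball around v (assumed to lie
   inside the box) does not touch any face of the box, 0 otherwise. */
int ball_within_bounds(const uchar *v, const uchar *lower,
                       const uchar *upper, long k, long r2)
{
    long i;

    for (i = 0; i < k; i++) {
        long dl = (long) v[i] - (long) lower[i];
        long du = (long) upper[i] - (long) v[i];
        if (dl * dl <= r2 || du * du <= r2) return 0;
    }
    return 1;
}
```

When ball_within_bounds succeeds, no code outside the current box can be closer than the best match found so far, so the search can stop. When bounds_overlap_ball fails for a sibling branch, that branch can be skipped.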
An effective optimizing technique is used throughout the routines search, bwb, and bob. Most of the time is spent
calculating the squared distance between two vectors. It is not really necessary to know the exact value in each situation; it is
only necessary to know whether the minimum value is obtained, or whether a boundary value has been exceeded. In most
instances this calculation can be terminated early, before all k dimensions are processed, as soon as the partial sum exceeds
the current minimum value. This technique dramatically improves the performance of the k-d tree search algorithm.
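The early-termination idea can be shown with a plain exhaustive search, leaving the k-d tree aside. This is a sketch under assumptions rather than code from the listing: the names partial_sqdist and nearest_code are hypothetical, and the codes are assumed to be stored contiguously, k bytes each.

```c
#include <limits.h>

typedef unsigned char uchar;

/* Squared distance between a and b over k coordinates, abandoned as
   soon as the running sum reaches limit. The result is the exact
   distance when it is below limit, otherwise some value >= limit. */
long partial_sqdist(const uchar *a, const uchar *b, long k, long limit)
{
    long sum = 0;
    long i;

    for (i = 0; i < k && sum < limit; i++) {
        long d = (long) a[i] - (long) b[i];
        sum += d * d;
    }
    return sum;
}

/* Index of the code closest to v among n codes of length k. The best
   distance so far is used as the cutoff for every later comparison. */
long nearest_code(const uchar *codes, long n, long k, const uchar *v)
{
    long best = -1;
    long nearest = LONG_MAX;
    long c;

    for (c = 0; c < n; c++) {
        long s = partial_sqdist(codes + c * k, v, k, nearest);
        if (s < nearest) {
            nearest = s;
            best = c;
        }
    }
    return best;
}
```

Once a good match has been found, most remaining candidates are rejected after only a few coordinates, which is where the saving comes from.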

Program Listing
/**  FILE:       vqcomp.c
/**  PURPOSE:    Fast VQ image compression, using K-d tree search method.
/**  USAGE:      vqcomp <rows> <cols> <codebook> [input] [output]
/**  ARGUMENTS:  rows      : number of rows in input image
/**              cols      : number of columns in input image
/**              codebook  : name of file containing VQ codebook
/**              input     : name of input image file (stdin)
/**              output    : name of output image file (stdout)
/**  LANGUAGE:   ANSI C
/**  AUTHOR:     David A. Southard, The MITRE Corporation
/**  CREATED:    29 Oct 1991
/**  NOTES/WARNINGS:
/**      K-d tree search is a version of the algorithm presented by
/**      JH Friedman, JL Bentley, RA Finkel, "An Algorithm
/**      for Finding Best Matches in Logarithmic Expected Time,"
/**      ACM Trans. Math. Softw. 3(3) Sep. 1977 pp. 209-226.
/**/

#include <stdio.h>                   /* standard I/O function definitions */
#include <stdlib.h>                  /* malloc (), exit (), qsort (), atoi () */
#include <limits.h>                  /* contains INT_MAX, etc. */
#include <math.h>                    /* math functions log (), etc. */

/* .............................................................. */
/*   MANIFEST CONSTANTS                                           */
/* .............................................................. */
#define BITS         8               /* number of bits per byte */
#define BUCKET_SIZE  8               /* number of vectors per bucket */
#define ERROR        (-1)            /* error return code */
#define FALSE        0               /* logical FALSE */
#define MINARGS      3               /* min. number of arguments required */
#define NORMAL       0               /* normal return code */
#define TRUE         (!FALSE)        /* logical TRUE */

/* .............................................................. */
/*   DATA STRUCTURES                                              */
/*   Vector Quantization Codebook                                 */
/* .............................................................. */
typedef unsigned char uchar;         /* ensure unsigned for portability */

typedef struct                       /* structure defining a single code */
{
    long        index;               /* index of the code */
    uchar       *vector;             /* pointer to code vector */
}
Code;

typedef struct                       /* structure defining a book of codes */
{
    long        num;                 /* number of code vectors in codebook */
    Code        *book;               /* pointer to a collection of Codes */
}
CodeBook;

#define CODE(J)      (codebook.book[J].vector)
#define VALUE(J,I)   (CODE(J)[I])

/* .............................................................. */
/*   Structures for K-d tree used in PNN search algorithm         */
/* .............................................................. */
typedef enum { BUCKET, SPLIT } NodeType;

typedef struct split_node            /* structure for decision nodes */
{
    long            disc;            /* index of discriminator coordinate */
    long            part;            /* partition value for discriminator */
    struct kd_node  *left;           /* for vectors <= threshold */
    struct kd_node  *right;          /* for vectors > threshold */
}
Split;

typedef struct kd_node               /* structure for K-D tree nodes */
{
    NodeType        type;            /* one of two types */
    union
    {
        Split       split;           /* decision node */
        CodeBook    bucket;          /* data node */
    }
    node;
}
KDTree;

/* .............................................................. */
/*   FUNCTION PROTOTYPES                                          */
/* .............................................................. */
long        bob ( uchar * );
KDTree      *build_tree ( CodeBook );
long        bwb ( uchar * );
int         cmpr ( Code *, Code * );
long        discrim ( CodeBook );
long        dist ( long, uchar *, uchar * );
void        get_args ( int, char ** );
long        medianof ( long, CodeBook );
KDTree      *newbucket ( CodeBook );
KDTree      *newsplit ( long, long, KDTree *, KDTree * );
Code        *match ( uchar *, KDTree * );
CodeBook    rd_codes ( FILE * );
long        search ( uchar *, KDTree * );
CodeBook    split_left ( long, CodeBook );
CodeBook    split_right ( long, CodeBook );
long        spreadof ( long, CodeBook );

/* .............................................................. */
/*   GLOBAL DATA                                                  */
/* .............................................................. */
long        block_height;            /* height of vector block */
long        block_width;             /* width of vector block */
FILE        *codef;                  /* codebook input file stream */
uchar       *lower;                  /* lower bounds for search */
Code        *matched;                /* pointer to matched code */
long        nearest;                 /* interim nearest code match index */
long        n_cols;                  /* number of columns in input image */
long        n_rows;                  /* number of rows in input image */
long        pixel_depth;             /* size of pixel */
long        serno;                   /* codebook serial number */
uchar       *upper;                  /* upper bounds for search */
long        vector_size;             /* size of code vector */

/* .............................................................. */
/*   PROGRAM                                                      */
/* .............................................................. */
int main (                           /* VQ compression */
    int         argc,                /* argument count */
    char        *argv[] )            /* argument vector */
{
    CodeBook        codebook;        /* the codebook */
    long            code_width;      /* width in bytes of a vector block */
    uchar           *cvq;            /* (uchar) compressed row buffer */
    register long   i;               /* vector block row index */
    uchar           **image;         /* rows of input image */
    register long   j;               /* vector block column index */
    register long   k;               /* pixel byte (depth) index */
    register long   l;               /* byte offset of block within image row */
    long            n_bytes;         /* bytes in input image row */
    register uchar  *pi;             /* pointer into image buffer */
    register uchar  *pv;             /* pointer into vector buffer */
    register short  *pvq;            /* pointer into compressed row buffer */
    short           *svq;            /* (short) compressed row buffer */
    KDTree          *tree;           /* the k-d search tree */
    uchar           *v;              /* vector buffer */
    long            x;               /* x image index */
    long            xsize;           /* x size of compressed image */
    long            y;               /* y image index */
    long            ysize;           /* y size of compressed image */
    long            zsize;           /* z size (depth) of compressed image */

    /* ...................................... */
    /*   get compression parameters           */
    /* ...................................... */
    get_args ( argc, argv );                 /* command-line arguments */
    codebook = rd_codes ( codef );           /* codebook */
    tree     = build_tree ( codebook );      /* make a search tree */

    /* ...................................... */
    /*   calculate derived parameters         */
    /* ...................................... */
    xsize      = n_cols / block_width;       /* compressed columns */
    ysize      = n_rows / block_height;      /* compressed rows */
    zsize      = ( log ((double) codebook.num ) / log (2.0) + BITS-1 ) / BITS;
                                             /* bytes per code */
    n_bytes    = pixel_depth * n_cols;       /* size of image row buffer */
    code_width = pixel_depth * block_width;  /* byte width of a block */

    /* ...................................... */
    /*   write output file header info        */
    /* ...................................... */
    fwrite ( &serno, sizeof ( long ), 1, stdout );   /* serial number */
    fwrite ( &xsize, sizeof ( long ), 1, stdout );   /* number of columns */
    fwrite ( &ysize, sizeof ( long ), 1, stdout );   /* number of rows */
    fwrite ( &zsize, sizeof ( long ), 1, stdout );   /* bytes per code */

    /* ...................................... */
    /*   allocate buffer memory               */
    /* ...................................... */
    upper = (uchar *)  malloc ( vector_size * sizeof ( uchar ) );
    lower = (uchar *)  malloc ( vector_size * sizeof ( uchar ) );
    v     = (uchar *)  malloc ( vector_size * sizeof ( uchar ) );
    svq   = (short *)  malloc ( xsize * sizeof ( short ) );
    cvq   = (uchar *)  malloc ( xsize * sizeof ( uchar ) );
    image = (uchar **) malloc ( block_height * sizeof (uchar *) );
    for ( i = 0 ; i < block_height ; i++ )   /* allocate row buffers */
    {
        image[ i ] = (uchar *) malloc ( n_bytes * sizeof (uchar));
    }

    /* ...................................... */
    /*   Scan image, perform compression      */
    /* ...................................... */
    for ( y = 0 ; y < n_rows ; y += block_height )
    {
        for ( i = 0 ; i < block_height ; i++ )
        {
            fread ( image[ i ], sizeof ( uchar ), n_bytes, stdin );
        }
        pvq = svq;
        for ( x = 0, l = 0 ; x < n_cols ; x += block_width, l += code_width )
        {
            pv = v;
            for ( i = 0 ; i < block_height ; i++ )
            {
                pi = image[ i ] + l;
                for ( j = block_width ; j-- ; )
                {
                    for ( k = pixel_depth ; k-- ; )
                    {
                        *pv++ = *pi++;
                    }
                }
            }
            *pvq++ = match ( v, tree )->index;
        }
        if ( zsize == sizeof ( uchar ) )
        {
            for ( i = 0 ; i < xsize ; i++ )
            {
                cvq[ i ] = svq[ i ];         /* move to byte buffer */
            }
            fwrite ( cvq, sizeof ( uchar ), xsize, stdout );
        }
        else
        {
            fwrite ( svq, sizeof ( short ), xsize, stdout );
        }
    }
    return ( NORMAL );                       /* normal exit */
}

/* ...................... */
/*   get_args             */
/* ...................... */
char *prototype = "<rows> <cols> <codebook> [in.image] [out.vq]";

void get_args (                      /* scan command-line arguments */
    int         argc,                /* argument count */
    char        *argv[] )            /* argument vector */
{
    register long i;                 /* argument index */
    register long j;                 /* positional argument index */

    if ( argc < MINARGS + 1 )
    {
        fprintf ( stderr, "usage: %s %s\n", argv[ 0 ], prototype );
        exit ( ERROR );
    }
    for ( i = 1, j = 0 ; i < argc ; i++ )
    {
        if ( argv[ i ][ 0 ] == '-' ) switch ( argv[ i ][ 1 ] )
        {
        default:
            fprintf ( stderr, "Unrecognized option flag \'%s\'\n",
                      argv[ i ] );
            break;
        }
        else switch ( ++j )
        {
        case 1:
            n_rows = atoi ( argv[ i ] );
            break;
        case 2:
            n_cols = atoi ( argv[ i ] );
            break;
        case 3:
            if ( ( codef = fopen ( argv[ i ], "r" ) ) == NULL )
            {
                fprintf ( stderr, "%s: Cannot open file \'%s\' for input!\n",
                          argv[ 0 ], argv[ i ] );
                exit ( ERROR );
            }
            break;
        case 4:
            if ( freopen ( argv[ i ], "r", stdin ) == NULL )
            {
                fprintf ( stderr, "%s: Cannot open file \'%s\' for input!\n",
                          argv[ 0 ], argv[ i ] );
                exit ( ERROR );
            }
            break;
        case 5:
            if ( freopen ( argv[ i ], "w", stdout ) == NULL )
            {
                fprintf ( stderr, "%s: Cannot open file \'%s\' for output!\n",
                          argv[ 0 ], argv[ i ] );
                exit ( ERROR );
            }
            break;
        default:
            fprintf ( stderr, "Unrecognized positional argument \'%s\'\n",
                      argv[ i ] );
            break;
        }
    }
}

/* ...................... */
/*   rd_codes             */
/* ...................... */
CodeBook rd_codes (                  /* read the codebook from file stream */
    FILE        *stream )            /* input file stream */
{
    CodeBook      cb;                /* the codebook */
    register long i;                 /* looping counter */
    register long j;                 /* looping counter */
    long          val;               /* a code vector element value */

    fscanf ( stream, "%ld\t# serial number of this codebook\n", &serno );
    fscanf ( stream, "%ld %ld %ld\t# row, col, pixel size of vector blocks\n",
             &block_height, &block_width, &pixel_depth );
    fscanf ( stream, "%ld\t# size of codebook\n", &cb.num );
    vector_size = block_height * block_width * pixel_depth;
    cb.book = (Code *) malloc ( cb.num * sizeof ( Code ) );
    for ( i = 0 ; i < cb.num ; i++ )
    {
        cb.book[ i ].index  = i;
        cb.book[ i ].vector = (uchar *) malloc ( vector_size );
        for ( j = 0 ; j < vector_size ; j++ )
        {
            fscanf ( stream, "%ld ", &val );
            cb.book[ i ].vector[ j ] = val;
        }
        fscanf ( stream, "# code %*d (%*f%%)\n" );
    }
    return ( cb );
}

/* ...................... */
/*   build_tree           */
/* ...................... */
KDTree *build_tree (                 /* construct optimized search tree */
    CodeBook    codebook )           /* the codebook */
{
    KDTree      *tree;               /* pointer to tree node */

    if ( codebook.num <= BUCKET_SIZE )
    {
        tree = newbucket ( codebook );
    }
    else
    {
        register long d;             /* index of discriminator */
        register long p;             /* partition value */
        register long pi;            /* partition value index */

        d  = discrim ( codebook );
        pi = medianof ( d, codebook );
        if ( pi < 0 )
        {
            tree = newbucket ( codebook );
        }
        else
        {
            p    = VALUE( pi, d );
            tree = newsplit ( d, p,
                       build_tree ( split_left ( pi, codebook ) ),
                       build_tree ( split_right( pi, codebook ) ) );
        }
    }
    return ( tree );
}

/* ...................... */
/*   newbucket            */
/* ...................... */
KDTree *newbucket (                  /* returns terminal k-d tree node */
    CodeBook    codebook )           /* codebook to place in new bucket */
{
    KDTree      *tree;               /* pointer to a tree node */

    tree              = (KDTree *) malloc ( sizeof ( *tree ) );
    tree->type        = BUCKET;
    tree->node.bucket = codebook;
    return ( tree );
}
/* ...................... */

/*   discrim              */
/* ...................... */
long discrim (                       /* find discriminator for codebook */
    CodeBook    codebook )           /* the codebook */
{
    register long d;                 /* index of discriminator */
    register long j;                 /* looping index */
    register long max;               /* maximum spread */
    register long spread;            /* spread for a coordinate */

    d   = 0;                                 /* first index */
    max = spreadof ( 0, codebook );          /* first value */
    for ( j = 1 ; j < vector_size ; j++ )    /* scan remaining */
    {
        spread = spreadof ( j, codebook );   /* find spread */
        if ( spread > max )                  /* greater than max? */
        {
            d   = j;                         /* save index */
            max = spread;                    /* save max value */
        }
    }
    return ( d );
}

/* ...................... */
/*   spreadof             */
/* ...................... */
long spreadof (                      /* value spread of vector coordinate */
    long        j,                   /* coordinate index */
    CodeBook    codebook )           /* codebook */
{
    register long i;                 /* looping index */
    register long max;               /* max value */
    register long min;               /* min value */
    register long v;                 /* discriminator coordinates value */

    max = min = VALUE( 0, j );                   /* init with first value */
    for ( i = 1 ; i < codebook.num ; i++ )       /* scan remaining values */
    {
        v = VALUE( i, j );                       /* fetch value */
        if ( v > max ) max = v;                  /* greater than max? */
        else if ( v < min ) min = v;             /* less than min? */
    }
    return ( max - min );                        /* spread value */
}

/* ...................... */
/*   cmpr                 */
/* ...................... */
long key;                            /* index of key item in array */

int cmpr (                           /* sort comparison function */
    Code        *a,                  /* first item */
    Code        *b )                 /* second item */
{
    return ( (long) a->vector[ key ] - (long) b->vector[ key ] );
}
/* ...................... */
/*   medianof             */
/* ...................... */
long medianof (                      /* find index of median value */
    long        d,                   /* discriminator coordinate index */
    CodeBook    codebook )           /* pointer to codebook */
{
    register long m;                 /* index of median */

    key = d;                         /* sort on key indexed by discrim. */
    qsort ( codebook.book, codebook.num, sizeof ( Code ), cmpr );
    m = codebook.num / 2 + 1;        /* nominal index of median */
    while ( m > 0 && VALUE( m, d ) == VALUE( m - 1, d ) )
    {
        m--;                         /* decrement while values are equal */
    }
    return ( m - 1 );                /* -1 = no split possible */
}

/* ...................... */
/*   newsplit             */
/* ...................... */
KDTree *newsplit (                   /* returns non-terminal k-d tree node */
    long        d,                   /* discriminator index */
    long        p,                   /* partition value for discriminator */
    KDTree      *left,               /* left sub-tree */
    KDTree      *right )             /* right sub-tree */
{
    KDTree      *tree;               /* pointer to new k-d tree node */

    tree                   = (KDTree *) malloc ( sizeof ( *tree ) );
    tree->type             = SPLIT;
    tree->node.split.disc  = d;
    tree->node.split.part  = p;
    tree->node.split.left  = left;
    tree->node.split.right = right;
    return ( tree );
}
/* ...................... */
/*   split_left           */
/* ...................... */
CodeBook split_left (                /* return left partition */
    long        pi,                  /* index of partition */
    CodeBook    codebook )           /* codebook */
{
    CodeBook    left;                /* the left partition */

    left.num  = pi + 1;              /* number of codes */
    left.book = codebook.book;       /* pointer to codes */
    return ( left );
}

/* ...................... */
/*   split_right          */
/* ...................... */
CodeBook split_right (               /* return right partition */
    long        pi,                  /* index of partition */
    CodeBook    codebook )           /* codebook */
{
    CodeBook    right;               /* the right partition */

    right.num  = codebook.num - pi - 1;      /* number of codes */
    right.book = codebook.book + pi + 1;     /* pointer to codes */
    return ( right );
}
/* ...................... */
/*   search               */
/* ...................... */
long search (                        /* recursive search thru k-d tree */
    uchar       vector[],            /* vector to be matched */
    KDTree      *tree )              /* tree to be traversed */
{
    if ( tree->type == BUCKET )      /* check all codes in bucket */
    {
        register Code *c;            /* code pointer */
        register long i;             /* code loop index */
        register long j;             /* coordinate loop index */
        register long matches;       /* logical predicate */
        register long sum;           /* partial distance sum */

        c = tree->node.bucket.book;
        for ( i = 0 ; i < tree->node.bucket.num ; i++, c++ )
        {
            sum = dist ( 0, vector, c->vector );
            for (
                j = 1 ;
                ( matches = sum < nearest ) && j < vector_size ;
                j++ )
            {
                sum += dist ( j, vector, c->vector );
            }
            if ( matches )
            {
                nearest = sum;
                matched = c;
            }
        }
    }
    else                             /* recursive descent of k-d tree */
    {
        register long d;             /* discriminator index */
        register long p;             /* partition value */
        register long temp;          /* temporary storage */
        register long done;          /* termination flag */

        d = tree->node.split.disc;   /* get discriminator index */
        p = tree->node.split.part;   /* get partition value */
        if ( vector[ d ] <= p )      /* search left subtree */
        {
            temp = upper[ d ];       /* save upper bound */
            upper[ d ] = p;          /* new upper bound */
            done = search ( vector, tree->node.split.left );
            if ( done ) return done;
            upper[ d ] = temp;       /* restore upper bound */
            temp = lower[ d ];       /* save lower bound */
            lower[ d ] = p;          /* new lower bound */
            if ( bob ( vector ) )    /* bounds overlap ball? */
            {
                done = search ( vector, tree->node.split.right );
                if ( done ) return done;
            }
            lower[ d ] = temp;       /* restore lower bound */
        }
        else                         /* search right subtree */
        {
            temp = lower[ d ];       /* save lower bound */
            lower[ d ] = p;          /* new lower bound */
            done = search ( vector, tree->node.split.right );
            if ( done ) return done;
            lower[ d ] = temp;       /* restore lower bound */
            temp = upper[ d ];       /* save upper bound */
            upper[ d ] = p;          /* new upper bound */
            if ( bob ( vector ) )    /* bounds overlap ball? */
            {
                done = search ( vector, tree->node.split.left );
                if ( done ) return done;
            }
            upper[ d ] = temp;       /* restore upper bound */
        }
    }
    return ( bwb ( vector ) );       /* ball within bounds? */
}

/* ...................... */
/*   match                */
/* ...................... */
Code *match (                        /* match input vector to codebook */
    uchar       vector[],            /* the input vector */
    KDTree      *tree )              /* k-d tree form of codebook */
{
    register long i;                 /* looping counter */

    matched = NULL;                  /* nothing matched yet */
    nearest = INT_MAX;               /* set to "infinity" */
    for ( i = 0 ; i < vector_size ; i++ )    /* initialize bounds arrays */
    {
        upper[ i ] = UCHAR_MAX;      /* set to max uchar value */
        lower[ i ] = 0;              /* set to min uchar value */
    }
    search ( vector, tree );         /* recursively search tree */
    return ( matched );              /* matched code pointer */
}
/* ...................... */

/*   dist                 */
/* ...................... */
long dist (                          /* coordinate distance function */
    register long  i,                /* array index */
    register uchar *a,               /* 1st array */
    register uchar *b )              /* 2nd array */
{
    register long d;                 /* temp variable */

    d = (long) a[ i ] - (long) b[ i ];   /* take the difference */
    return ( d * d );                    /* this is squared distance */
}
/* ...................... */

/*   bob                  */
/* ...................... */
long bob (                           /* Bounds Overlap Ball test */
    uchar       vector[] )           /* vector being tested */
{
    register long i;                 /* looping index */
    register long sum;               /* partial sum */
    register long q;                 /* logical value of predicate */

    if ( vector[ 0 ] < lower[ 0 ] ) sum = dist ( 0, vector, lower );
    else if ( vector[ 0 ] > upper[ 0 ] ) sum = dist ( 0, vector, upper );
    else sum = 0;
    for ( i = 1 ; ( q = sum < nearest ) && i < vector_size ; i++ )
    {
        if ( vector[ i ] < lower[ i ] ) sum += dist ( i, vector, lower );
        else if ( vector[ i ] > upper[ i ] ) sum += dist ( i, vector, upper );
    }
    return ( q );
}

/* ...................... */
/*   bwb                  */
/* ...................... */
long bwb (                           /* Ball Within Bounds test */
    uchar       vector[] )           /* vector to be tested */
{
    register long i;                 /* loop index */
    register long q;                 /* logical value of predicate */

    i = vector_size;
    while ( i-- && ( q =
        dist ( i, vector, lower ) > nearest &&
        dist ( i, vector, upper ) > nearest ) ) { }
    return ( q );
}

APPENDIX 3
Program for VQ Decompression


Description
The VQ decompression program is termed vqexp. Its command line format is

vqexp <in.code> [in.vq] [out.image]
The <in.code> argument is required. The serial number of the codebook must match the serial number in the compressed
image file for proper decompression. If [in.vq] is omitted, the standard input file is default. The program will create a file
named [out.image]. If this argument is omitted, the decompressed image will be written to the standard output file. As
for the previous programs, the image is an array of pixel values stored in binary format, in row-major order. The pixel
depth is taken from the <in.code> file.

Major functions
This program is simple, and uses no structure more complicated than arrays. The function rd_codes reads the codebook.
Most of the work is done in main. Here, each code is looked up in the codebook by indexing the codebook array, and
the decompressed image is assembled with a series of nested loops.

Program Listing
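The table-lookup expansion can be sketched for a single row of codes. This is an illustration under assumptions, not the main routine of vqexp: the function name expand_row is hypothetical, the codebook is assumed to be an array of pointers to code vectors, and each vector is assumed to hold its block rows one after another.

```c
typedef unsigned char uchar;

/* Expand one row of code indices into block_h image rows. Each code
   vector holds block_h block rows of code_width bytes; block row i of
   code x is copied to image row i at byte offset x * code_width. */
void expand_row(const short *codes, long xsize,
                uchar **codebook,
                long block_h, long code_width,
                uchar **rows)
{
    long x, i, b;

    for (x = 0; x < xsize; x++) {
        const uchar *pv = codebook[codes[x]];    /* look up the code */
        for (i = 0; i < block_h; i++) {
            uchar *pi = rows[i] + x * code_width;
            for (b = 0; b < code_width; b++) {
                *pi++ = *pv++;
            }
        }
    }
}
```

Decompression needs no search and no arithmetic beyond indexing, which is why it runs quickly even on modest hardware.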
/**  FILE:       vqexp.c
/**  PURPOSE:    Decompress (expand) VQ compressed image files.
/**  USAGE:      vqexp <codebook> [input.vq] [output.image]
/**  ARGUMENTS:  codebook      : name of codebook file, created by vqinit
/**              input.vq      : name of VQ-compressed image file (stdin)
/**              output.image  : name of decompressed image file (stdout)
/**  LANGUAGE:   ANSI C, MIPS C compiler version
/**  AUTHOR:     David A. Southard, The MITRE Corporation
/**  CREATED:    12 Nov 1991
/**  NOTES/WARNINGS:
/**/

#include <stdio.h>                   /* standard I/O library definitions */

/* ...................................................... */
/*   MANIFEST CONSTANTS                                   */
/* ...................................................... */
#define ERROR    (-1)                /* error return status code */
#define MINARGS  1                   /* valid minimum number of arguments */
#define NORMAL   0                   /* normal return status code */

/* ...................................................... */
/*   FUNCTION PROTOTYPES                                  */
/* ...................................................... */
typedef unsigned char uchar;         /* ensure unsigned for portability */

void    get_args ( int, char ** );
uchar   **rd_codes ( FILE * );

/* ...................................................... */
/*   GLOBAL VARIABLES                                     */
/* ...................................................... */
long    codebook_length;             /* number of codes in codebook */
FILE    *codef;                      /* file pointer for the codebook */
long    block_height;                /* vector block height */
long    block_width;                 /* vector block width */
long    pixel_depth;                 /* bytes per pixel */
long    serial;                      /* serial number of codebook */

/* ...................................................... */
/*      PROGRAM                                           */
/* ...................................................... */
int main (                      /* VQ decompression */
  int    argc,                  /* argument count */
  char **argv )                 /* argument list */
{
  uchar           *codebook;    /* codebook for decompression */
  long             code_width;  /* width of one vector row */
  long             code_size;   /* number of bytes in a vector */
  register long    i;           /* vector block row index */
  uchar          **image;       /* array of pointers to image rows */
  register long    j;           /* vector block column index */
  register long    k;           /* vector block byte index */
  register long    l;           /* index into image rows */
  long             n_cols;      /* number of cols in expanded image */
  register uchar  *pi;          /* pointer into an image row */
  register uchar  *pv;          /* pointer into a code vector */
  long             serno;       /* compressed image serial number */
  short           *svq;         /* (short) array of codes buffer */
  uchar           *cvq;         /* (uchar) array of codes buffer */
  long             x;           /* compressed image column index */
  long             xsize;       /* number of cols in compressed image */
  long             y;           /* compressed image row index */
  long             ysize;       /* number of rows in compressed image */
  long             zsize;       /* number of bytes per code index */

  /* .............................................. */
  /*      Read arguments and codebooks              */
  /* .............................................. */
  get_args ( argc, argv );                      /* command-line arguments */
  codebook = rd_codes ( codef );                /* get codebook from file */

  /* .............................................. */
  /*      Read compressed image header info         */
  /* .............................................. */
  fread ( &serno, sizeof ( long ), 1, stdin );  /* serial number */
  fread ( &xsize, sizeof ( long ), 1, stdin );  /* number of columns */
  fread ( &ysize, sizeof ( long ), 1, stdin );  /* number of rows */
  fread ( &zsize, sizeof ( long ), 1, stdin );  /* bytes per code */
  if ( serno != serial )                        /* verify serial numbers */
  {
    fprintf ( stderr,
              "%s: codebook inconsistent with compressed image!\n",
              argv[ 0 ] );
    exit ( ERROR );
  }

  /* .............................................. */
  /*      Allocate buffer memory                    */
  /* .............................................. */
  n_cols     = xsize * block_width;             /* bytes across column */
  code_width = block_width * pixel_depth;       /* bytes across block */
  code_size  = code_width * block_height;       /* bytes entire block */
  image = (uchar **) malloc ( block_height * sizeof (uchar *) );
  for ( i = 0 ; i < block_height ; i++ )        /* allocate image row buffers */
  {
    image[ i ] = (uchar *) malloc ( n_cols * pixel_depth );
  }
  svq = (short *) malloc ( xsize * sizeof (short) );    /* code row buffer */
  cvq = (uchar *) malloc ( xsize * sizeof (uchar) );    /* code row buffer */

  /* .............................................. */
  /*      Read codes, and re-assemble image         */
  /* .............................................. */
  for ( y = 0 ; y < ysize ; y++ )               /* loop over vq rows */
  {
    if ( zsize == sizeof ( uchar ) )            /* read row of uchar */
    {
      fread ( cvq, sizeof ( uchar ), xsize, stdin );
      for ( i = 0 ; i < xsize ; i++ )           /* move to short array */
      {
        svq[ i ] = cvq[ i ];
      }
    }
    else                                        /* read row of shorts */
    {
      fread ( svq, sizeof ( short ), xsize, stdin );
    }
    for ( x = 0, l = 0 ; x < xsize ; x++, l += code_width )
    {
      pv = codebook + svq[ x ] * code_size;     /* code vector */
      for ( i = 0 ; i < block_height ; i++ )
      {
        pi = image[ i ] + l;                    /* find proper row */
        for ( j = block_width ; j-- ; )
        {
          for ( k = pixel_depth ; k-- ; )
          {
            *pi++ = *pv++;                      /* move bytes */
          }
        }
      }
    }
    for ( i = 0 ; i < block_height ; i++ )      /* output image rows */
    {
      fwrite ( image[ i ], pixel_depth, n_cols, stdout );
    }
  }
  return ( NORMAL );
}

/* ...................... */
/*      get_args          */
/* ...................... */
char *prototype = "<codebook> [in.vq] [out.image]";

void get_args (                 /* scan command-line arguments */
  int   argc,                   /* argument count */
  char *argv[ ] )               /* argument vector */
{
  register int i;               /* argument index */
  register int j;               /* positional argument index */

  if ( argc < MINARGS + 1 )
  {
    fprintf ( stderr, "usage: %s %s\n", argv[ 0 ], prototype );
    exit ( ERROR );
  }
  for ( i = 1, j = 0 ; i < argc ; i++ )
  {
    if ( argv[ i ][ 0 ] == '-' ) switch ( argv[ i ][ 1 ] )
    {
      default:
        fprintf ( stderr,
                  "Unrecognized option flag \'%s\'\n",
                  argv[ i ] );
        break;
    }
    else switch ( ++j )
    {
      case 1:
        if ( ( codef = fopen ( argv[ i ], "r" ) ) == NULL )
        {
          fprintf ( stderr,
                    "%s: Cannot open file \'%s\' for input!\n",
                    argv[ 0 ], argv[ i ] );
          exit ( ERROR );
        }
        break;
      case 2:
        if ( freopen ( argv[ i ], "r", stdin ) == NULL )
        {
          fprintf ( stderr,
                    "%s: Cannot open file \'%s\' for input!\n",
                    argv[ 0 ], argv[ i ] );
          exit ( ERROR );
        }
        break;
      case 3:
        if ( freopen ( argv[ i ], "w", stdout ) == NULL )
        {
          fprintf ( stderr,
                    "%s: Cannot open file \'%s\' for output!\n",
                    argv[ 0 ], argv[ i ] );
          exit ( ERROR );
        }
        break;
      default:
        fprintf ( stderr,
                  "Unrecognized positional argument \'%s\'\n",
                  argv[ i ] );
        break;
    }
  }
}
/* ...................... */
/*      rd_codes          */
/* ...................... */
uchar *rd_codes (               /* read a codebook */
  FILE *stream )                /* input file stream */
{
  register uchar *cb;           /* pointer into codebook codes */
  uchar          *codebook;     /* a codebook */
  register long   i;            /* looping counter */
  register long   k;            /* looping counter */
  long            val;          /* a code vector element value */
  long            vector_size;  /* bytes per vector */

  /* ...................................... */
  /*      scan codebook header info         */
  /* ...................................... */
  fscanf ( stream, "%ld\t# serial number of this codebook\n",
           &serial );
  fscanf ( stream, "%ld %ld %ld\t# row, col, pixel size of vector blocks\n",
           &block_height, &block_width, &pixel_depth );
  fscanf ( stream, "%ld\t# size of codebook\n", &codebook_length );

  /* ...................................... */
  /*      allocate codebook memory          */
  /* ...................................... */
  vector_size = block_height * block_width * pixel_depth;
  cb = codebook = (uchar *) malloc ( codebook_length * vector_size );

  /* ...................................... */
  /*      scan-in code vectors              */
  /* ...................................... */
  for ( i = codebook_length ; i-- ; )           /* for each code */
  {
    for ( k = vector_size ; k-- ; )             /* read each value */
    {
      fscanf ( stream, "%ld ", &val );
      *cb++ = val;                              /* assign to codebook */
    }
    fscanf ( stream, "\t# code %*d (%*f%%)\n" );
  }
  return ( codebook );
}
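The block re-assembly step of the listing above can be isolated as a small routine. The following is a minimal sketch, not part of the original program: the function name expand_row and its parameter names are hypothetical, the code indices are taken as unsigned short, and memcpy replaces the byte-by-byte copy loops. It assumes each code vector is stored row by row, as in the listing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expand one row of code indices into block_h image rows.
 *   codebook : code vectors, each block_h * block_w * depth bytes, stored row by row
 *   codes    : n_blocks code indices for this row of blocks
 *   rows     : block_h output buffers, each n_blocks * block_w * depth bytes      */
void expand_row(const unsigned char *codebook, const unsigned short *codes,
                size_t n_blocks, size_t block_h, size_t block_w, size_t depth,
                unsigned char **rows)
{
    size_t code_width = block_w * depth;      /* bytes across one block row */
    size_t code_size  = code_width * block_h; /* bytes in one code vector   */

    for (size_t x = 0; x < n_blocks; x++) {
        /* locate the code vector selected by this index */
        const unsigned char *pv = codebook + (size_t)codes[x] * code_size;
        for (size_t i = 0; i < block_h; i++) {
            /* copy row i of the block into row i of the output strip */
            memcpy(rows[i] + x * code_width, pv + i * code_width, code_width);
        }
    }
}
```

Calling this once per row of codes and then writing out the block_h row buffers reproduces the structure of the main loop in vqdecomp.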

APPENDIX 4

Program for VQ Color Mapping


The final program adapts codebooks for decompression to computer displays that use a color look-up table. The
command line syntax is

vqcmap <cmap.code> [in.code] [out.code]

The <cmap.code> file is a VQ codebook generated by vqinit, with a block height = 1 and a block width = 1. The pixel
depth should be set to the number of color components, namely, 3 for RGB data, or 7 for LANDSAT TM data. Usually,
the codebook length should be set to 256. Most engineering workstations support an 8-bit, 256-entry color look-up table.
PCs equipped with a Super VGA card have the same capability. Regular VGA cards support only 16 colors in high
resolution mode. Although the PNN clustering algorithm cannot successfully produce such a small color table from most
images, we have been able to select 16 colors by eye, and to enter them manually into the codebook format using a text
editor.
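As an illustration of the codebook text format into which such colors are entered, the sketch below writes a 1 x 1 color map codebook for RGB data. The layout is inferred from the fscanf and printf format strings in the program listings. The function name write_cmap_codebook is hypothetical, and the percentage written after each code is a placeholder value of 0, because the readers in the listings parse it but vqdecomp and rd_codes discard it.

```c
#include <stdio.h>

/* Hypothetical helper: write n RGB colors as a 1 x 1 x 3 codebook in the
 * text layout that the rd_codes routines parse. Returns 0 on success.   */
int write_cmap_codebook(FILE *out, long serial,
                        const unsigned char (*rgb)[3], long n)
{
    long i;

    fprintf(out, "%ld\t# serial number of this codebook\n", serial);
    fprintf(out, "%d %d %d\t# row, col, pixel size of vector blocks\n", 1, 1, 3);
    fprintf(out, "%ld\t# size of codebook\n", n);
    for (i = 0; i < n; i++) {
        /* one code vector per line: the three color components */
        fprintf(out, "%3d %3d %3d ", rgb[i][0], rgb[i][1], rgb[i][2]);
        /* trailing comment; the frequency is a placeholder (assumption) */
        fprintf(out, "\t# code %3ld (%.3f%%)\n", i, 0.0);
    }
    return ferror(out) ? -1 : 0;
}
```

A 16-entry table for a regular VGA card would be written the same way, with n = 16.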


The [in.code] argument is optional, and defaults to the standard input file. It specifies the name of a VQ codebook
generated by vqinit. The pixel depth in this codebook must match the pixel depth in <cmap.code>. The program then
maps each color in [in.code] to the index of the color from <cmap.code> that most closely matches it, by the squared
error criterion. The modified codebook is then written to [out.code], or to the standard output file if this argument
is omitted. A file decompressed with this modified codebook will be color-mapped. Proper colors for the decompressed
image will be displayed when the color table contained in <cmap.code> is loaded into the computer display.
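The squared error criterion can be illustrated with an exhaustive search over the color table. This is a sketch only: the program itself uses a k-d tree search rather than the linear scan shown here, and the function name nearest_color and the flat layout of the color table (n_colors entries of depth bytes each) are assumptions made for the example.

```c
#include <limits.h>

/* Hypothetical helper: return the index of the color-table entry with the
 * smallest squared error relative to the given pixel, or -1 if the table
 * is empty. Ties go to the lowest index.                                 */
long nearest_color(const unsigned char *pixel, const unsigned char *cmap,
                   long n_colors, long depth)
{
    long best = -1;
    unsigned long best_err = ULONG_MAX;

    for (long c = 0; c < n_colors; c++) {
        unsigned long err = 0;
        for (long k = 0; k < depth; k++) {
            /* difference of one color component */
            long d = (long)pixel[k] - (long)cmap[c * depth + k];
            err += (unsigned long)(d * d);
        }
        if (err < best_err) {   /* strictly better match found */
            best_err = err;
            best = c;
        }
    }
    return best;
}
```

With a 256-entry table this scan costs 256 distance evaluations per pixel, which is the cost that a tree search is intended to reduce.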
The same data structures are used as in vqcomp for performing a k-d tree closest match search. The same code is used
for build_tree and match. For the sake of brevity, these routines are not reproduced again in the program listing. The
primary new function is rw_codes. It reads the codebook from the input file, calls match to locate the closest color from
the color table, and writes the index of this code to the output file. The serial number of the input codebook is retained
in the output codebook, so that the new codebook can be used to decompress images that were compressed with the original
codebook.
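The effect of rw_codes on a single code vector can be sketched as follows. Each pixel of depth bytes is replaced by one color-table index, so the rewritten codebook has a pixel size of 1. The function name remap_vector is hypothetical, and a linear scan stands in for the match routine, which is not reproduced in the listing. The sketch assumes a non-empty color table of at most 256 entries, so that each index fits in one byte.

```c
#include <limits.h>

/* Hypothetical helper: convert a code vector of n_pixels pixels, each depth
 * bytes wide, into n_pixels one-byte indices into the color table cmap.
 * A linear search stands in for the k-d tree match of the original program. */
void remap_vector(const unsigned char *vec, long n_pixels, long depth,
                  const unsigned char *cmap, long n_colors,
                  unsigned char *out)
{
    for (long p = 0; p < n_pixels; p++) {
        const unsigned char *pixel = vec + p * depth;  /* current pixel */
        unsigned long best_err = ULONG_MAX;
        long best = 0;

        for (long c = 0; c < n_colors; c++) {
            unsigned long err = 0;
            for (long k = 0; k < depth; k++) {
                long d = (long)pixel[k] - (long)cmap[c * depth + k];
                err += (unsigned long)(d * d);         /* squared error */
            }
            if (err < best_err) {
                best_err = err;
                best = c;
            }
        }
        out[p] = (unsigned char)best;                  /* index replaces pixel */
    }
}
```

Because only the vector contents change, the block height, block width, codebook size, and serial number carry over unchanged to the output header.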

Program Listing
/**
/**  FILE:      vqcmap.c
/**
/**  PURPOSE:   Color-mapping for VQ codebooks.
/**
/**  USAGE:     vqcmap <cmap.code> [input.code] [output.code]
/**
/**  ARGUMENTS:
/**    cmap.code    : name of color map codebook file
/**    input.code   : name of input codebook file (stdin)
/**    output.code  : name of output codebook file (stdout)
/**
/**  LANGUAGE:  ANSI C
/**
/**  AUTHOR:    David A. Southard, The MITRE Corporation
/**
/**  CREATED:   14 Nov 1991
/**
/**  NOTES/WARNINGS:
/**    K-d tree search is a version of the algorithm presented by
/**    JH Friedman, JL Bentley, RA Finkel, "An Algorithm
/**    for Finding Best Matches in Logarithmic Expected Time,"
/**    ACM Trans. Math. Softw. 3(3) Sep. 1977 pp. 209-226.
/**/
#include <stdio.h>              /* standard I/O function definitions */
#include <stdlib.h>             /* malloc, exit */
#include <limits.h>             /* contains MAX_INT, etc. */

/* .............................................................. */
/*      MANIFEST CONSTANTS                                        */
/* .............................................................. */
#define BUCKET_SIZE  8          /* number of vectors per bucket */
#define ERROR        (-1)       /* error return code */
#define FALSE        0          /* logical FALSE */
#define MAXCHAR      256        /* max. number indexed by char */
#define MINARGS      1          /* min. number of arguments required */
#define NORMAL       0          /* normal return code */
#define TRUE         (!FALSE)   /* logical TRUE */

/* .............................................................. */
/*      DATA STRUCTURES                                           */
/*      Vector Quantization Codebook                              */
/* .............................................................. */
typedef unsigned char uchar;    /* ensure unsigned for portability */

typedef struct                  /* structure defining a single code */
{
  long   index;                 /* index of the code */
  uchar *vector;                /* pointer to code vector */
}
Code;

typedef struct                  /* structure defining a book of codes */
{
  long  num;                    /* number of code vectors in codebook */
  Code *book;                   /* pointer to a collection of Codes */
}
CodeBook;

#define CODE(J)     (codebook.book[J].vector)
#define VALUE(J,I)  (CODE(J)[I])

/* .............................................................. */
/*      Structures for K-d tree used in search algorithm          */
/* .............................................................. */
typedef enum { BUCKET, SPLIT } NodeType;

typedef struct split_node       /* structure for decision nodes */
{
  long            disc;         /* index of discriminator coordinate */
  long            part;         /* partition value for discriminator */
  struct kd_node *left;         /* for vectors <= threshold */
  struct kd_node *right;        /* for vectors > threshold */
}
Split;

typedef struct kd_node          /* structure for K-D tree nodes */
{
  NodeType type;                /* one of two types */
  union
  {
    Split    split;             /* decision node */
    CodeBook bucket;            /* data node */
  }
  node;
}
KDTree;
/* .............................................................. */
/*      FUNCTION PROTOTYPES                                       */
/* .............................................................. */
long      bob ( uchar * );
KDTree   *build_tree ( CodeBook );
long      bwb ( uchar * );
int       cmpr ( Code *, Code * );
long      discrim ( CodeBook );
long      dist ( long, uchar *, uchar * );
void      get_args ( int, char ** );
Code     *match ( uchar *, KDTree * );
long      medianof ( long, CodeBook );
KDTree   *newbucket ( CodeBook );
KDTree   *newsplit ( long, long, KDTree *, KDTree * );
CodeBook  rd_codes ( FILE * );
void      rw_codes ( FILE *, KDTree * );
long      search ( uchar *, KDTree * );
CodeBook  split_left ( long, CodeBook );
CodeBook  split_right ( long, CodeBook );
long      spreadof ( long, CodeBook );

/* .............................................................. */
/*      GLOBAL DATA                                               */
/* .............................................................. */
long   block_height;            /* height of vector block */
long   block_width;             /* width of vector block */
long   codebook_length;         /* number of codes in codebook */
FILE  *cmapf;                   /* color map codebook file stream */
uchar *lower;                   /* lower bounds for search */
Code  *matched;                 /* pointer to matched code */
long   nearest;                 /* interim nearest code match index */
long   n_cols;                  /* number of columns in input image */
long   n_rows;                  /* number of rows in input image */
long   pixel_depth;             /* size of pixel */
long   serno;                   /* codebook serial number */
uchar *upper;                   /* upper bounds for search */
long   vector_size;             /* size of code vector */

/* .............................................................. */
/*      PROGRAM                                                   */
/* .............................................................. */
int main (                      /* VQ codebook color-mapping */
  int   argc,                   /* argument count */
  char *argv[] )                /* argument vector */
{
  get_args ( argc, argv );      /* command-line arguments */
  rw_codes ( stdin,             /* rewrite codebook */
    build_tree (                /* searching k-d tree */
      rd_codes ( cmapf ) ) );   /*   built from color map */
  return ( NORMAL );            /* normal exit */
}

/* ...................... */
/*      get_args          */
/* ...................... */
char *prototype = "<cmap.code> [in.code] [out.code]";

void get_args (                 /* scan command-line arguments */
  int   argc,                   /* argument count */
  char *argv[] )                /* argument vector */
{
  register int i;               /* argument index */
  register int j;               /* positional argument index */

  if ( argc < MINARGS + 1 )
  {
    fprintf ( stderr, "usage: %s %s\n", argv[ 0 ], prototype );
    exit ( ERROR );
  }
  for ( i = 1, j = 0 ; i < argc ; i++ )
  {
    if ( argv[ i ][ 0 ] == '-' ) switch ( argv[ i ][ 1 ] )
    {
      default:
        fprintf ( stderr,
                  "Unrecognized option flag \'%s\'\n",
                  argv[ i ] );
        break;
    }
    else switch ( ++j )
    {
      case 1:
        if ( ( cmapf = fopen ( argv[ i ], "r" ) ) == NULL )
        {
          fprintf ( stderr,
                    "%s: Cannot open file \'%s\' for input!\n",
                    argv[ 0 ], argv[ i ] );
          exit ( ERROR );
        }
        break;
      case 2:
        if ( freopen ( argv[ i ], "r", stdin ) == NULL )
        {
          fprintf ( stderr,
                    "%s: Cannot open file \'%s\' for input!\n",
                    argv[ 0 ], argv[ i ] );
          exit ( ERROR );
        }
        break;
      case 3:
        if ( freopen ( argv[ i ], "w", stdout ) == NULL )
        {
          fprintf ( stderr,
                    "%s: Cannot open file \'%s\' for output!\n",
                    argv[ 0 ], argv[ i ] );
          exit ( ERROR );
        }
        break;
      default:
        fprintf ( stderr,
                  "Unrecognized positional argument \'%s\'\n",
                  argv[ i ] );
        break;
    }
  }
}
/* ...................... */
/*      rd_codes          */
/* ...................... */
CodeBook rd_codes (             /* read the codebook from file stream */
  FILE *stream )                /* input file stream */
{
  CodeBook      cb;             /* the codebook */
  register long i;              /* looping counter */
  register long j;              /* looping counter */
  long          val;            /* a code vector element value */

  fscanf ( stream, "%ld\t# serial number of this codebook\n", &serno );
  fscanf ( stream, "%ld %ld %ld\t# row, col, pixel size of vector blocks\n",
           &block_height, &block_width, &pixel_depth );
  fscanf ( stream, "%ld\t# size of codebook\n", &codebook_length );

  vector_size = block_height * block_width * pixel_depth;
  cb.num  = codebook_length;
  cb.book = (Code *) malloc ( cb.num * sizeof ( Code ) );
  for ( i = 0 ; i < cb.num ; i++ )
  {
    cb.book[ i ].index  = i;
    cb.book[ i ].vector = (uchar *) malloc ( vector_size );
    for ( j = 0 ; j < vector_size ; j++ )
    {
      fscanf ( stream, "%ld ", &val );
      cb.book[ i ].vector[ j ] = val;
    }
    fscanf ( stream, "\t# code %*d (%*f%%)\n" );
  }
  return ( cb );
}
/* ...................... */
/*      rw_codes          */
/* ...................... */
void rw_codes (                 /* read the codebook from file stream */
  FILE   *stream,               /* input file stream */
  KDTree *tree )                /* search tree */
{
  long            bh;           /* block height */
  long            bw;           /* block width */
  float           f;            /* code frequency */
  register long   j;            /* loop index for vector pixels */
  register long   k;            /* loop index for color components */
  long            n;            /* code number */
  long            pd;           /* pixel depth */
  long            sn;           /* codebook serial number */
  long            sz;           /* codebook size */
  long            val;          /* a code vector element value */
  register uchar *vector;       /* single vector buffer */
  long            vs;           /* vector size */

  if ( block_height != 1 || block_width != 1 || codebook_length > 256 )
  {
    fprintf ( stderr, "Unsuitable color map\n" );
    exit ( ERROR );
  }
  fscanf ( stream, "%ld\t# serial number of this codebook\n", &sn );
  fscanf ( stream, "%ld %ld %ld\t# row, col, pixel size of vector blocks\n",
           &bh, &bw, &pd );
  fscanf ( stream, "%ld\t# size of codebook\n", &sz );

  if ( pixel_depth != pd )
  {
    fprintf ( stderr, "Color map and codebook are incompatible\n" );
    exit ( ERROR );
  }

  vs = bh * bw * pd;                    /* vector size */
  vector = (uchar *) malloc ( vs );     /* allocate vector buffer */
  upper  = (uchar *) malloc ( pd );     /* allocate upper bound buffer */
  lower  = (uchar *) malloc ( pd );     /* allocate lower bound buffer */

  printf ( "%ld\t# serial number of this codebook\n", sn );
  printf ( "%ld %ld %d\t# row, col, pixel size of vector blocks\n",
           bh, bw, 1 );
  printf ( "%ld\t# size of codebook\n", sz );

  for ( ; sz-- ; )                      /* for each code */
  {
    for ( j = vs ; j ; j -= pd )        /* for each pixel */
    {
      for ( k = 0 ; k < pd ; k++ )      /* assemble pixel */
      {
        fscanf ( stream, "%ld ", &val );
        vector[ k ] = val;
      }
      printf ( "%3ld ", match ( vector, tree )->index );
    }
    fscanf ( stream, "\t# code %ld (%f%%)\n", &n, &f );
    printf ( "\t# code %3ld (%.3f%%)\n", n, f );
  }
}
