
Image Segmentation using Nearest Neighbor Classifier in Matlab

Dr. Rashi Agarwal, IT Deptt,UIET, CSJM University, Kanpur

Discrete Cosine Transform


The DCT is, loosely, the real part of the Fourier Transform.

1D Forward DCT
Given a list of n intensity values I(x), where x = 0, ..., n-1,
compute the n DCT coefficients:

    F(u) = sqrt(2/n) * C(u) * sum_{x=0}^{n-1} I(x) * cos( (2x+1) * u * pi / (2n) ),   u = 0 ... n-1

where

    C(u) = 1/sqrt(2)   for u = 0
    C(u) = 1           otherwise
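As a sketch, the 1D forward DCT above can be written directly in Python. This is the straightforward O(n^2) form of the formula, not the fast factorized transform used in real codecs; the function name is illustrative.

```python
import math

def dct1(signal):
    # 1-D DCT-II: F(u) = sqrt(2/n) * C(u) * sum I(x) * cos((2x+1)*u*pi / (2n))
    n = len(signal)
    out = []
    for u in range(n):
        c = 1 / math.sqrt(2) if u == 0 else 1.0  # C(u) from the definition
        s = sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for x in range(n))
        out.append(math.sqrt(2 / n) * c * s)
    return out
```

For a constant signal, only F(0) (the average term) is non-zero, which matches the intuition that a flat signal has no higher-frequency content.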

Visualization of 1D DCT Basis Functions
[Figure: the eight 1D DCT basis functions, F(0) through F(7)]

Extend DCT from 1D to 2D
Perform a 1D DCT on each row of the block, and then again on each
column of the resulting 1D coefficients.
Equations for 2D DCT

    F(u, v) = ( 2 / sqrt(nm) ) * C(u) * C(v) * sum_{y=0}^{m-1} sum_{x=0}^{n-1} I(x, y)
              * cos( (2x+1) * u * pi / (2n) ) * cos( (2y+1) * v * pi / (2m) )
Visualization of 2D DCT Basis Functions

F(0,0) includes the lowest frequency in both directions and is called
the DC coefficient; it determines the fundamental colour of the block.
F(0,1) ... F(7,7) are called the AC coefficients; their frequency is
non-zero in one or both directions.

Huffman Coding

Example: symbols with probabilities 0.4, 0.3, 0.1, 0.1, 0.06 and 0.04,
assigned code lengths 1, 2, 3, 4, 5 and 5.

Average length = 1*0.4 + 2*0.3 + 3*0.1 + 4*0.1 + 5*0.06 + 5*0.04 = 2.2 bits/symbol

MISSISSIPPIRIVER
16 symbols in total
Counts: M:1, I:5, S:4, P:2, R:2, V:1, E:1

Symbol | Count | Probability     | Code
I      | 5     | 5/16 = 0.3125   | 01
S      | 4     | 4/16 = 0.25     | 00
P      | 2     | 2/16 = 0.125    | 100
R      | 2     | 2/16 = 0.125    | 101
M      | 1     | 1/16 = 0.0625   | 111
V      | 1     | 1/16 = 0.0625   | 1100
E      | 1     | 1/16 = 0.0625   | 1101

Build the tree by repeatedly merging the two smallest probabilities:
0.0625 + 0.0625 = 0.125; 0.0625 + 0.125 = 0.1875; 0.125 + 0.125 = 0.25;
0.1875 + 0.25 = 0.4375; 0.25 + 0.3125 = 0.5625; 0.4375 + 0.5625 = 1.
(The codes above are one valid assignment; ties in the merge order can
be broken either way, giving different but equally efficient codes.)

Average length = (5*2 + 4*2 + 2*3 + 2*3 + 1*3 + 1*4 + 1*4) / 16 = 41/16 = 2.5625 bits/symbol
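The merge procedure above can be reproduced with a priority queue. This sketch returns the code length per symbol rather than exact bit patterns, since several equally valid bit assignments exist; the function name is illustrative.

```python
import heapq
from collections import Counter

def huffman_code_lengths(text):
    # Build a Huffman tree bottom-up and return {symbol: code length}.
    counts = Counter(text)
    # Heap items: (weight, tiebreak, {symbol: depth-so-far}); the unique
    # tiebreak index stops Python from ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)   # two smallest weights
        w2, _, b = heapq.heappop(heap)
        # Merging pushes every symbol in both subtrees one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, i, merged))
        i += 1
    return heap[0][2]
```

Running it on "MISSISSIPPIRIVER" gives lengths 2 for I and S, 3 for P and R, and 3 or 4 for the three single-occurrence letters, for an average of 41/16 = 2.5625 bits per symbol, matching the table above.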

JPEG using DCT

Uncompressed Image
Let us first consider how much memory an uncompressed bitmap
raster image uses. Take a desktop 1280 x 1024 pixel image, for
example. Each pixel requires 3 memory locations to store the RGB
colours.
So 3 blocks (arrays/grids) of 1280 x 1024 memory locations are used =
3,932,160 memory locations.
With larger images the huge file sizes create a problem for data
storage and transmission over networks.
To overcome these problems, data compression is used to reduce
the file size.
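The arithmetic is easy to check:

```python
width, height = 1280, 1024
bytes_per_pixel = 3  # one memory location each for R, G and B
print(width * height * bytes_per_pixel)  # 3932160 memory locations
```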

Lossless and Lossy Compression

The first data compression methods devised were lossless: after
compression and decompression you get back the original data.
These methods relied on the data being inefficiently coded in
the first place to achieve good compression ratios.
Graphic image data with lots of fine detail, when compressed
using a lossless method, entails lots of processing for little
compression effect.
New ideas were needed to overcome this problem, which led
to a detailed examination of the information stored in an
image.
An image is shades of light and dark of different hues. The
viewer is the human eye and brain. The new ideas were
centred around exploiting the strengths and weaknesses of the
human visual system.

JPEG
In 1987, two groups were combined to form a joint committee,
the Joint Photographic Experts Group (JPEG), that would
research and produce a single standard.
JPEG, unlike other compression methods, is not a single
algorithm but may be thought of as a toolkit of image
compression methods to suit the user's needs.
JPEG uses a lossy compression method that throws useless
data away during encoding.
This is why lossy schemes manage to obtain superior
compression ratios over most lossless schemes.
JPEG is designed to discard information the human eye
cannot easily see: the eye barely notices slight changes in
colour but will pick out slight changes in brightness or contrast.

STEPS
Convert RGB to YCbCr format (luminance &
chroma)
Y = 0.299 R + 0.587 G + 0.114 B
Cb = - 0.1687 R - 0.3313 G + 0.5 B + 128
Cr = 0.5 R - 0.4187 G - 0.0813 B + 128
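The colour-space conversion above maps directly to code. A minimal per-pixel sketch (the function name is illustrative; real encoders vectorize this over the whole image):

```python
def rgb_to_ycbcr(r, g, b):
    # Full-range RGB -> YCbCr conversion, using the slide's coefficients.
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
    cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128
    return y, cb, cr
```

A quick sanity check: any grey pixel (R = G = B) has zero chroma, i.e. Cb = Cr = 128, which is why the chroma channels can be subsampled aggressively.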
Jpeg Chroma sampling
The luminance channel is retained at full resolution.
Both chrominance channels are typically downsampled
2:1 horizontally and either 1:1 or 2:1 vertically.

The luminance and chrominance components of the image are
divided into an array of 8 x 8 pixel blocks.
Padding is added if required to ensure that blocks on the
right and bottom edges of the image are full.
These 8 x 8 pixel blocks are fed into a process that
performs a forward Discrete Cosine Transform (DCT).
The output of this process is a set of 64 values per block.
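A direct (unoptimized) implementation of the 2D DCT formula given earlier, applied to one block, could look like the sketch below; production codecs use a fast factorized transform instead of these nested loops.

```python
import math

def C(k):
    # Normalising factor from the DCT definition
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct2(block):
    # block[y][x]: m rows by n columns of intensity values
    m, n = len(block), len(block[0])
    out = [[0.0] * n for _ in range(m)]
    for v in range(m):
        for u in range(n):
            s = 0.0
            for y in range(m):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * m)))
            out[v][u] = 2 / math.sqrt(n * m) * C(u) * C(v) * s
    return out
```

A flat 8 x 8 block concentrates all its energy in the single DC term F(0,0); every AC term is (numerically) zero.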
Jpeg Quantization
The next step is the quantization process, which is the
main source of the lossy compression. The values in
the quantization table are chosen to preserve low-frequency
information and discard high-frequency
(noise-like) detail, as humans are less sensitive to the
loss of information in this area.

Each DCT term is divided by the
corresponding position in the quantization
table and then rounded to the nearest
integer. In each table the
low-frequency terms are in the top left-hand
corner and the high-frequency terms are in
the bottom right-hand corner.
This is the point at which we can control the
quality and amount of compression of the
JPEG. The lower the quality setting, the
greater the divisor, increasing the chance of
a zero result.
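The divide-and-round step is simple enough to sketch. The uniform table of 16s below is purely illustrative, not a real JPEG quantization table (actual tables vary per frequency position):

```python
def quantize(dct_block, qtable):
    # Divide each DCT coefficient by its quantizer step and round;
    # this rounding is the irreversible (lossy) part of JPEG.
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(dct_block, qtable)]
```

Notice how small AC coefficients collapse to zero after division, which is exactly what the run-length stage later exploits.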

Camera manufacturers independently choose an
arbitrary "image quality" name (or level) to assign to
the 64-value quantization matrix that they devise, so
the names cannot be compared between makes or
even between models from the same manufacturer. The
quantization tables used are stored as part of the JPEG file.
The first output is the steady (DC) level of all 64
image pixels averaged together; the other 63
outputs represent the different frequency (AC) levels found
in the 8 x 8 pixel image block.
If, through the user's compression-level (quality factor)
setting, the quantization stage discarded all of the 63
AC outputs, the resulting image would show 8 x 8 pixel
areas of the same tone.
The image would then get maximum compression, typically
something in excess of 120:1, but a lot of image
information would have been discarded to achieve it.

Jpeg Huffman
After quantization, the 63 AC DCT terms are
collected in zigzag order, starting from the
low-frequency corner of the block.
This collection order takes advantage of the
fact that high-frequency terms tend to zero
after quantization, improving the chance of
getting longer runs of zeros, which is ideal for
good run-length compression.
The 63 AC components from the DCT process are
compressed using lossless run-length encoding.
The DC component, however, is treated differently.
It is assumed that neighbouring 8 x 8 blocks will
have a similar average value, so instead of
storing a large number the encoder stores a
small number that represents the difference
in level from the previous block, thereby requiring
less code to store the information.
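The zigzag collection order described above can be generated programmatically; indices walk the anti-diagonals of the block, alternating direction, so the low-frequency coefficients come first:

```python
def zigzag_order(n=8):
    # Generate the (row, col) visiting order for an n x n block.
    order = []
    for s in range(2 * n - 1):          # s = row + col picks one anti-diagonal
        diag = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        order.extend(diag if s % 2 else diag[::-1])
    return order
```

Reading a quantized block in this order groups the near-zero high-frequency terms at the end of the sequence, ready for run-length coding; the DC term at position (0, 0) is then peeled off and delta-coded against the previous block as described above.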
