
Image Segmentation using Nearest Neighbor Classifier in Matlab

Dr. Rashi Agarwal, IT Deptt,UIET, CSJM University, Kanpur

Discrete Cosine Transform


The DCT is, loosely, the real part of the Fourier Transform.

1D Forward DCT
Given a list of n intensity values I(x), where x = 0, ..., n-1,
compute the n DCT coefficients:

    F(u) = sqrt(2/n) * C(u) * sum_{x=0}^{n-1} I(x) * cos( (2x+1) * u * pi / (2n) ),   u = 0 ... n-1

where

    C(u) = 1/sqrt(2)   for u = 0
    C(u) = 1           otherwise
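As a sketch, the 1D forward DCT above can be written directly in Python. This is the straightforward O(n^2) form of the formula, not the fast factorized transform used in real codecs; the function name is illustrative.

```python
import math

def dct1(signal):
    # 1-D DCT-II: F(u) = sqrt(2/n) * C(u) * sum I(x) * cos((2x+1)*u*pi / (2n))
    n = len(signal)
    out = []
    for u in range(n):
        c = 1 / math.sqrt(2) if u == 0 else 1.0  # C(u) from the definition
        s = sum(signal[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for x in range(n))
        out.append(math.sqrt(2 / n) * c * s)
    return out
```

For a constant signal, only F(0) (the average term) is non-zero, which matches the intuition that a flat signal has no higher-frequency content.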

Visualization of 1D DCT Basis Functions
[Figure: the eight 1D DCT basis functions, F(0) through F(7)]

Extend DCT from 1D to 2D
Perform a 1D DCT on each row of the block, and then again on each
column of the resulting 1D coefficients.
Equations for 2D DCT

    F(u, v) = ( 2 / sqrt(nm) ) * C(u) * C(v) * sum_{y=0}^{m-1} sum_{x=0}^{n-1} I(x, y)
              * cos( (2x+1) * u * pi / (2n) ) * cos( (2y+1) * v * pi / (2m) )
Visualization of 2D DCT Basis Functions

F(0,0) includes the lowest frequency in both directions and is called
the DC coefficient; it determines the fundamental colour of the block.
F(0,1) ... F(7,7) are called the AC coefficients; their frequency is
non-zero in one or both directions.

Huffman Coding

Example: symbols with probabilities 0.4, 0.3, 0.1, 0.1, 0.06 and 0.04,
assigned code lengths 1, 2, 3, 4, 5 and 5.

Average length = 1*0.4 + 2*0.3 + 3*0.1 + 4*0.1 + 5*0.06 + 5*0.04 = 2.2 bits/symbol

MISSISSIPPIRIVER
16 symbols in total
Counts: M:1, I:5, S:4, P:2, R:2, V:1, E:1

Symbol | Count | Probability     | Code
I      | 5     | 5/16 = 0.3125   | 01
S      | 4     | 4/16 = 0.25     | 00
P      | 2     | 2/16 = 0.125    | 100
R      | 2     | 2/16 = 0.125    | 101
M      | 1     | 1/16 = 0.0625   | 111
V      | 1     | 1/16 = 0.0625   | 1100
E      | 1     | 1/16 = 0.0625   | 1101

Build the tree by repeatedly merging the two smallest probabilities:
0.0625 + 0.0625 = 0.125; 0.0625 + 0.125 = 0.1875; 0.125 + 0.125 = 0.25;
0.1875 + 0.25 = 0.4375; 0.25 + 0.3125 = 0.5625; 0.4375 + 0.5625 = 1.
(The codes above are one valid assignment; ties in the merge order can
be broken either way, giving different but equally efficient codes.)

Average length = (5*2 + 4*2 + 2*3 + 2*3 + 1*3 + 1*4 + 1*4) / 16 = 41/16 = 2.5625 bits/symbol
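The merge procedure above can be reproduced with a priority queue. This sketch returns the code length per symbol rather than exact bit patterns, since several equally valid bit assignments exist; the function name is illustrative.

```python
import heapq
from collections import Counter

def huffman_code_lengths(text):
    # Build a Huffman tree bottom-up and return {symbol: code length}.
    counts = Counter(text)
    # Heap items: (weight, tiebreak, {symbol: depth-so-far}); the unique
    # tiebreak index stops Python from ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    i = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)   # two smallest weights
        w2, _, b = heapq.heappop(heap)
        # Merging pushes every symbol in both subtrees one level deeper.
        merged = {s: d + 1 for s, d in {**a, **b}.items()}
        heapq.heappush(heap, (w1 + w2, i, merged))
        i += 1
    return heap[0][2]
```

Running it on "MISSISSIPPIRIVER" gives lengths 2 for I and S, 3 for P and R, and 3 or 4 for the three single-occurrence letters, for an average of 41/16 = 2.5625 bits per symbol, matching the table above.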

JPEG using DCT

Uncompressed Image
Let us first consider how much memory an uncompressed bitmap
raster image uses. Take a desktop 1280 x 1024 pixel image, for
example. Each pixel requires 3 memory locations to store the RGB
colours.
So 3 blocks (arrays/grids) of 1280 x 1024 memory locations are used =
3,932,160 memory locations.
With larger images the huge file sizes create a problem for data
storage and transmission over networks.
To overcome these problems, data compression is used to reduce
the file size.
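The arithmetic is easy to check:

```python
width, height = 1280, 1024
bytes_per_pixel = 3  # one memory location each for R, G and B
print(width * height * bytes_per_pixel)  # 3932160 memory locations
```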

Lossless and Lossy Compression

The first data compression methods devised were lossless: after
compression and decompression you get back the original data.
These methods relied on the data being inefficiently coded in
the first place to achieve good compression ratios.
Graphic image data with lots of fine detail, when compressed
using a lossless method, entails lots of processing for little
compression effect.
New ideas were needed to overcome this problem, which led
to a detailed examination of the information stored in an
image.
An image is shades of light and dark of different hues. The
viewer is the human eye and brain. The new ideas were
centred around exploiting the strengths and weaknesses of the
human visual system.

JPEG
In 1987, two groups were combined to form a joint committee,
the Joint Photographic Experts Group (JPEG), that would
research and produce a single standard.
JPEG, unlike other compression methods, is not a single
algorithm but may be thought of as a toolkit of image
compression methods to suit the user's needs.
JPEG uses a lossy compression method that throws useless
data away during encoding.
This is why lossy schemes manage to obtain superior
compression ratios over most lossless schemes.
JPEG is designed to discard information the human eye
cannot easily see: the eye barely notices slight changes in
colour but will pick out slight changes in brightness or contrast.

STEPS
Convert RGB to YCbCr format (luminance &
chroma)
Y = 0.299 R + 0.587 G + 0.114 B
Cb = - 0.1687 R - 0.3313 G + 0.5 B + 128
Cr = 0.5 R - 0.4187 G - 0.0813 B + 128
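The colour-space conversion above maps directly to code. A minimal per-pixel sketch (the function name is illustrative; real encoders vectorize this over the whole image):

```python
def rgb_to_ycbcr(r, g, b):
    # Full-range RGB -> YCbCr conversion, using the slide's coefficients.
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128
    cr =  0.5 * r - 0.4187 * g - 0.0813 * b + 128
    return y, cb, cr
```

A quick sanity check: any grey pixel (R = G = B) has zero chroma, i.e. Cb = Cr = 128, which is why the chroma channels can be subsampled aggressively.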
Jpeg Chroma sampling
The luminance channel is retained at full resolution.
Both chrominance channels are typically downsampled
2:1 horizontally and either 1:1 or 2:1 vertically.

The luminance and chrominance components of the image are
divided into an array of 8 x 8 pixel blocks.
Padding is added if required to ensure that blocks on the
right and bottom edges of the image are full.
These 8 x 8 pixel blocks are fed into a process that
performs a forward Discrete Cosine Transform (DCT).
The output of this process is a set of 64 values per block.
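A direct (unoptimized) implementation of the 2D DCT formula given earlier, applied to one block, could look like the sketch below; production codecs use a fast factorized transform instead of these nested loops.

```python
import math

def C(k):
    # Normalising factor from the DCT definition
    return 1 / math.sqrt(2) if k == 0 else 1.0

def dct2(block):
    # block[y][x]: m rows by n columns of intensity values
    m, n = len(block), len(block[0])
    out = [[0.0] * n for _ in range(m)]
    for v in range(m):
        for u in range(n):
            s = 0.0
            for y in range(m):
                for x in range(n):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * m)))
            out[v][u] = 2 / math.sqrt(n * m) * C(u) * C(v) * s
    return out
```

A flat 8 x 8 block concentrates all its energy in the single DC term F(0,0); every AC term is (numerically) zero.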
Jpeg Quantization
The next step is the quantization process, which is the
main source of the lossy compression. The values in
the quantization table are chosen to preserve low-frequency
information and discard high-frequency
(noise-like) detail, as humans are less sensitive to the
loss of information in this area.

Each DCT term is divided by the
corresponding position in the quantization
table and then rounded to the nearest
integer. In each table the
low-frequency terms are in the top left-hand
corner and the high-frequency terms are in
the bottom right-hand corner.
This is the point at which we can control the
quality and amount of compression of the
JPEG. The lower the quality setting, the
greater the divisor, increasing the chance of
a zero result.
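The divide-and-round step is simple enough to sketch. The uniform table of 16s below is purely illustrative, not a real JPEG quantization table (actual tables vary per frequency position):

```python
def quantize(dct_block, qtable):
    # Divide each DCT coefficient by its quantizer step and round;
    # this rounding is the irreversible (lossy) part of JPEG.
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(dct_block, qtable)]
```

Notice how small AC coefficients collapse to zero after division, which is exactly what the run-length stage later exploits.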

Camera manufacturers independently choose an
arbitrary "image quality" name (or level) to assign to
the 64-value quantization matrix that they devise, so
the names cannot be compared between makes or
even between models from the same manufacturer. The
quantization tables used are stored as part of the JPEG file.
The first output is the steady (DC) level of all 64
image pixels averaged together; the other 63
outputs represent the different frequency (AC) levels found
in the 8 x 8 pixel image block.
If, through the user's compression-level (quality factor)
setting, the quantization stage discarded all of the 63
AC outputs, the resulting image would show 8 x 8 pixel
areas of the same tone.
The image would then get maximum compression, typically
something in excess of 120:1, but a lot of image
information would have been discarded to achieve it.

Jpeg Huffman
After quantization, the 63 AC DCT terms are
collected in zigzag order, starting from the
low-frequency corner of the block.
This collection order takes advantage of the
fact that high-frequency terms tend to zero
after quantization, improving the chance of
getting longer runs of zeros, which is ideal for
good run-length compression.
The 63 AC components from the DCT process are
compressed using lossless run-length encoding.
The DC component, however, is treated differently.
It is assumed that neighbouring 8 x 8 blocks will
have a similar average value, so instead of
storing a large number the encoder stores a
small number that represents the difference
in level from the previous block, thereby requiring
less code to store the information.
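The zigzag collection order described above can be generated programmatically; indices walk the anti-diagonals of the block, alternating direction, so the low-frequency coefficients come first:

```python
def zigzag_order(n=8):
    # Generate the (row, col) visiting order for an n x n block.
    order = []
    for s in range(2 * n - 1):          # s = row + col picks one anti-diagonal
        diag = [(y, s - y) for y in range(n) if 0 <= s - y < n]
        # Odd diagonals run top-right to bottom-left, even ones the reverse.
        order.extend(diag if s % 2 else diag[::-1])
    return order
```

Reading a quantized block in this order groups the near-zero high-frequency terms at the end of the sequence, ready for run-length coding; the DC term at position (0, 0) is then peeled off and delta-coded against the previous block as described above.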
