
Digital image processing

1.What is Digital image processing?


Digital image processing deals with the manipulation of digital images using a digital computer. It is a subfield of signals and systems that focuses particularly on images. DIP is concerned with developing computer systems that can perform processing on an image: the input to such a system is a digital image, the system processes that image using efficient algorithms, and it gives an image as output. The most common example is Adobe Photoshop, one of the most widely used applications for processing digital images.

In the above figure, an image has been captured by a camera and sent to a digital system that removes all the surrounding detail and focuses on the water drop by zooming in on it in such a way that the quality of the image remains the same.

2. What is an Image?
An image is nothing more than a two-dimensional signal. It is defined by the mathematical function f(x,y), where x and y are the two coordinates, horizontal and vertical.

The value of f(x,y) at any point gives the pixel value at that point of the image.

The above figure is an example of a digital image like the one you are now viewing on your computer screen. In reality, this image is nothing but a two-dimensional array of numbers ranging between 0 and 255.

128 30 123

232 123 231

123 77 89

80 255 255

Each number represents the value of the function f(x,y) at that point. In this case the values 128, 30 and 123 in the first row each represent an individual pixel value. The dimensions of the picture are actually the dimensions of this two-dimensional array.
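As an illustrative sketch (using the small array above), an image can be represented directly as a 2-D NumPy array:

import numpy as np

# A grayscale image is just a 2-D array f(x, y) of intensities in 0..255.
f = np.array([[128,  30, 123],
              [232, 123, 231],
              [123,  77,  89],
              [ 80, 255, 255]], dtype=np.uint8)

print(f.shape)   # (4, 3) -> the dimensions of the picture
print(f[0, 0])   # 128    -> an individual pixel value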

3. Write Applications of Digital Image Processing in detail?


Some of the major fields in which digital image processing is widely used are mentioned
below

 Image sharpening and restoration

 Medical field

 Remote sensing
 Transmission and encoding

 Machine/Robot vision

 Color processing

 Pattern recognition

 Video processing

 Microscopic Imaging

 Others

Image sharpening and restoration

Image sharpening and restoration refers here to processing images captured by a modern camera to make them better, or to manipulating those images in order to achieve a desired result. It covers what Photoshop usually does.

This includes zooming, blurring, sharpening, grayscale-to-colour conversion and the reverse, edge detection, image retrieval and image recognition. The common examples are:

(Example figures: the original image, the zoomed image, a blurred image, a sharpened image, and the detected edges.)
Medical field
The common applications of DIP in the medical field are:

 Gamma ray imaging

 PET scan

 X Ray Imaging

 Medical CT

 UV imaging

Remote sensing
In the field of remote sensing, an area of the earth is scanned by a satellite or from a very high altitude and is then analysed to obtain information about it. One particular application of digital image processing in the field of remote sensing is detecting the infrastructure damage caused by an earthquake.

Even when only serious damage is considered, it takes a long time to grasp its extent. The area affected by an earthquake is sometimes so wide that it is not possible to examine it with the human eye in order to estimate the damage, and even where it is possible, the procedure is very hectic and time consuming. A solution to this is found in digital image processing: an image of the affected area is captured from above and then analysed to detect the various types of damage done by the earthquake.
The key steps involved in the analysis are

 The extraction of edges

 Analysis and enhancement of various types of edges

Transmission and encoding


The very first image transmitted over a wire was sent from London to New York via a submarine cable. The picture that was sent is shown below.

The picture took three hours to travel from one place to the other.

Now consider that today we are able to watch live video feeds, or live CCTV footage, from one continent to another with a delay of just seconds. This means that a lot of work has been done in this field too. The field does not focus only on transmission but also on encoding: many different formats have been developed for high or low bandwidth to encode photos and then stream them over the internet.

Machine/Robot vision
Apart from the many challenges that a robot faces today, one of the biggest challenges is still to improve the vision of the robot: making the robot able to see things, identify them, identify obstacles, and so on. Much work has been contributed by this field, and a complete separate field of computer vision has been introduced to work on it.
Hurdle detection

Hurdle detection is one of the common tasks done through image processing, by identifying the different types of objects in an image and then calculating the distance between the robot and the hurdles.

Line follower robot

Many robots today work by following a line and are therefore called line follower robots. This helps a robot move along its path and perform some tasks, and it has also been achieved through image processing.

Color processing
Color processing includes the processing of coloured images and of the different colour spaces that are used, for example the RGB colour model, YCbCr and HSV. It also involves studying the transmission, storage and encoding of these colour images.

Pattern recognition
Pattern recognition involves study from image processing and from various other fields, including machine learning (a branch of artificial intelligence). In pattern recognition, image processing is used to identify the objects in an image, and machine learning is then used to train the system for changes in pattern. Pattern recognition is used in computer-aided diagnosis, handwriting recognition, image recognition, etc.

Video processing
A video is nothing but a very fast sequence of pictures. The quality of a video depends on the number of frames (pictures) per second and on the quality of each frame being used. Video processing involves noise reduction, detail enhancement, motion detection, frame rate conversion, aspect ratio conversion, colour space conversion, etc.

4. What is pixel?
Pixel

A pixel is the smallest element of an image. Each pixel corresponds to a single value. In an 8-bit grayscale image, the value of a pixel lies between 0 and 255. The value of a pixel at any point corresponds to the intensity of the light photons striking that point; each pixel stores a value proportional to the light intensity at that particular location.

PEL

A pixel is also known as a PEL. You can gain a better understanding of the pixel from the pictures given below.

In the above picture, there may be thousands of pixels that together make up this image. We will zoom into that image to the extent that we are able to see individual pixels, as shown in the image below.

5. What are edges?


Sudden changes or discontinuities in an image are called edges. In other words, significant local transitions in intensity are called edges.

Types of edges

Generally edges are of three types:

 Horizontal edges

 Vertical Edges

 Diagonal Edges

Why detect edges

Most of the shape information of an image is contained in its edges. So we first detect these edges in an image using edge-detection filters, and then, by enhancing those areas of the image which contain edges, the sharpness of the image increases and the image becomes clearer.

Here are some of the masks for edge detection that we will discuss in the upcoming tutorials.

 Prewitt Operator

 Sobel Operator

 Robinson Compass Masks

 Kirsch Compass Masks

 Laplacian Operator.

All of the masks mentioned above are linear derivative filters (derivative masks).

Prewitt Operator

Prewitt operator is used for detecting edges horizontally and vertically.

Sobel Operator

The Sobel operator is very similar to the Prewitt operator. It is also a derivative mask and is used for edge detection. It too calculates edges in both the horizontal and vertical directions.

Robinson Compass Masks

This operator is also known as a direction mask. In this operator we take one mask and rotate it through all eight major compass directions to calculate the edges in each direction.
Kirsch Compass Masks

The Kirsch compass mask is also a derivative mask used for finding edges, and it too calculates edges in all directions.

Laplacian Operator

The Laplacian operator is also a derivative operator used to find edges in an image. The Laplacian is a second-order derivative mask, and it can be further divided into the positive Laplacian and the negative Laplacian.

All these masks find edges: some find them horizontally and vertically, some in one direction only, and some in all directions. The next concept after this is sharpening, which can be done once the edges have been extracted from the image.

6. Prewitt operator in detail?

Prewitt operator is used for edge detection in an image. It detects two types of edges

 Horizontal edges

 Vertical Edges

Edges are calculated by using the difference between corresponding pixel intensities of an image. All the masks used for edge detection are also known as derivative masks, because, as stated many times before in this series of tutorials, an image is also a signal, and changes in a signal can only be calculated using differentiation. That is why these operators are also called derivative operators or derivative masks.

All the derivative masks should have the following properties:

 Opposite sign should be present in the mask.

 Sum of mask should be equal to zero.

 More weight means more edge detection.

The Prewitt operator provides us with two masks, one for detecting edges in the horizontal direction and another for detecting edges in the vertical direction.

Vertical direction
-1 0 1

-1 0 1

-1 0 1

The above mask will find edges in the vertical direction, because the column of zeros lies along the vertical direction. When you convolve this mask with an image, it will give you the vertical edges of the image.

How it works

When we apply this mask to the image, it makes the vertical edges prominent. It simply works like a first-order derivative and calculates the difference of pixel intensities in an edge region. As the centre column consists of zeros, it does not include the original values of the image but rather calculates the difference between the pixel values to the right and to the left of the edge. This increases the edge intensity, which becomes enhanced compared with the original image.

Horizontal Direction

-1 -1 -1

0 0 0

1 1 1

The above mask will find edges in the horizontal direction, because the row of zeros lies along the horizontal direction. When you convolve this mask with an image, it will make the horizontal edges in the image prominent.

How it works

This mask makes the horizontal edges in an image prominent. It works on the same principle as the mask above and calculates the difference between the pixel intensities across a particular edge. As the centre row of the mask consists of zeros, it does not include the original values of the edge in the image, but rather calculates the difference of the pixel intensities above and below the particular edge, thus increasing the sudden change of intensities and making the edge more visible. Both of the above masks follow the principle of a derivative mask: both have opposite signs in them, and the sum of each mask equals zero. The third condition does not apply to this operator, since both of the above masks are standardised and we cannot change the values in them.
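A minimal sketch of applying the two Prewitt masks with NumPy/SciPy is given below; the image is only a random placeholder, and scipy.ndimage.convolve is assumed to be available.

import numpy as np
from scipy.ndimage import convolve

# Placeholder image; in practice this would be a loaded grayscale picture.
image = np.random.randint(0, 256, (64, 64)).astype(float)

prewitt_vertical = np.array([[-1, 0, 1],
                             [-1, 0, 1],
                             [-1, 0, 1]], dtype=float)
prewitt_horizontal = np.array([[-1, -1, -1],
                               [ 0,  0,  0],
                               [ 1,  1,  1]], dtype=float)

vertical_edges = convolve(image, prewitt_vertical)      # responds to vertical edges
horizontal_edges = convolve(image, prewitt_horizontal)  # responds to horizontal edges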

Now it’s time to see these masks in action:

Sample Image

Following is a sample picture on which we will apply the above two masks, one at a time.

After applying Vertical Mask

After applying the vertical mask to the above sample image, the following image is obtained. This image contains the vertical edges. You can judge it more accurately by comparing it with the horizontal edges picture.

After applying Horizontal Mask

After applying horizontal mask on the above sample image, following image will be obtained.
Comparison

As you can see, in the first picture, to which we applied the vertical mask, all the vertical edges are more visible than in the original image. Similarly, in the second picture we applied the horizontal mask, and as a result all the horizontal edges are visible. In this way you can see that we can detect both horizontal and vertical edges in an image.

7. Write Sobel operator in detail?


The Sobel operator is very similar to the Prewitt operator. It is also a derivative mask and is used for edge detection. Like the Prewitt operator, the Sobel operator detects two kinds of edges in an image:

 Vertical direction

 Horizontal direction

Difference with Prewitt Operator

The major difference is that in the Sobel operator the coefficients of the masks are not fixed; they can be adjusted according to our requirements, as long as they do not violate any property of derivative masks.

Following is the vertical Mask of Sobel Operator:

-1 0 1
-2 0 2

-1 0 1

This mask works exactly like the Prewitt operator's vertical mask, with only one difference: it has the values -2 and 2 in the centres of the first and third columns. When applied to an image, this mask highlights the vertical edges.

How it works

When we apply this mask to the image, it makes the vertical edges prominent. It simply works like a first-order derivative and calculates the difference of pixel intensities in an edge region.

As the centre column consists of zeros, it does not include the original values of the image but rather calculates the difference between the pixel values to the right and to the left of the edge. Also, the centre values of the first and third columns are -2 and 2 respectively.

This gives more weight to the pixel values around the edge region, which increases the edge intensity and makes it enhanced compared with the original image.

Following is the horizontal Mask of Sobel Operator

-1 -2 -1

0 0 0

1 2 1

The above mask will find edges in the horizontal direction, because the row of zeros lies along the horizontal direction. When you convolve this mask with an image, it makes the horizontal edges in the image prominent. The only difference from the Prewitt horizontal mask is that it has -2 and 2 as the centre elements of the first and third rows.

How it works

This mask makes the horizontal edges in an image prominent. It works on the same principle as the mask above and calculates the difference between the pixel intensities across a particular edge. As the centre row of the mask consists of zeros, it does not include the original values of the edge in the image, but rather calculates the difference of the pixel intensities above and below the particular edge, thus increasing the sudden change of intensities and making the edge more visible.

Now it’s time to see these masks in action:

Sample Image

Following is a sample picture on which we will apply the above two masks, one at a time.

After applying Vertical Mask

After applying vertical mask on the above sample image, following image will be obtained.

After applying Horizontal Mask

After applying horizontal mask on the above sample image, following image will be obtained
Comparison

As you can see, in the first picture, to which we applied the vertical mask, all the vertical edges are more visible than in the original image. Similarly, in the second picture we applied the horizontal mask, and as a result all the horizontal edges are visible.

In this way we can detect both horizontal and vertical edges in an image. Also, if you compare the result of the Sobel operator with that of the Prewitt operator, you will find that the Sobel operator finds more edges, or makes the edges more visible, than the Prewitt operator.

This is because the Sobel operator allots more weight to the pixel intensities around the edges.
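As a rough sketch (assuming SciPy is available and using a random placeholder image), the two Sobel masks can be applied and combined into a single edge-strength map as follows.

import numpy as np
from scipy.ndimage import convolve

image = np.random.randint(0, 256, (64, 64)).astype(float)  # placeholder image

sobel_vertical = np.array([[-1, 0, 1],
                           [-2, 0, 2],
                           [-1, 0, 1]], dtype=float)
sobel_horizontal = sobel_vertical.T  # [[-1,-2,-1],[0,0,0],[1,2,1]]

gx = convolve(image, sobel_vertical)
gy = convolve(image, sobel_horizontal)
edge_strength = np.hypot(gx, gy)     # larger values = stronger edges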

Applying more weight to mask

We can also see that the more weight we assign in the mask, the more edges it will bring out for us. Also, as mentioned at the start of this tutorial, there are no fixed coefficients in the Sobel operator, so here is another weighted operator:

-1 0 1

-5 0 5

-1 0 1

If you compare the result of this mask with that of the Prewitt vertical mask, it is clear that this mask will bring out more edges than the Prewitt one, simply because we have allotted more weight in the mask.
8. Write note on laplacian operator ?
The Laplacian operator is also a derivative operator used to find edges in an image. The major difference between the Laplacian and other operators such as Prewitt, Sobel, Robinson and Kirsch is that those are all first-order derivative masks, whereas the Laplacian is a second-order derivative mask. Within this mask we have two further classifications: the positive Laplacian operator and the negative Laplacian operator.

Another difference is that, unlike the other operators, the Laplacian does not extract edges in any particular direction; instead it extracts edges according to the following classification:

 Inward Edges

 Outward Edges

Let’s see that how Laplacian operator works.

Positive Laplacian Operator

In the positive Laplacian we have a standard mask in which the centre element is negative, the corner elements are zero and the remaining elements are 1.

0 1 0

1 -4 1

0 1 0

The positive Laplacian operator is used to extract the outward edges in an image.

Negative Laplacian Operator

In the negative Laplacian operator we also have a standard mask, in which the centre element is positive, all the corner elements are zero and the rest of the elements in the mask are -1.
0 -1 0

-1 4 -1

0 -1 0

The negative Laplacian operator is used to extract the inward edges in an image.

How it works

The Laplacian is a derivative operator; it is used to highlight gray-level discontinuities in an image and to de-emphasise regions with slowly varying gray levels. This operation produces images that have grayish edge lines and other discontinuities on a dark background, bringing out the inward and outward edges in an image.

The important thing is how to apply these filters to an image. Remember that we cannot apply both the positive and the negative Laplacian operator to the same image; we apply just one. The thing to remember is that if we apply the positive Laplacian operator to the image, we subtract the resulting image from the original image to get the sharpened image. Similarly, if we apply the negative Laplacian operator, we add the resulting image to the original image to get the sharpened image.
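A small sketch of this rule, assuming SciPy is available and using a random placeholder image: the positive-Laplacian response is computed and subtracted from the original to sharpen it.

import numpy as np
from scipy.ndimage import convolve

image = np.random.randint(0, 256, (64, 64)).astype(float)  # placeholder image

positive_laplacian = np.array([[0,  1, 0],
                               [1, -4, 1],
                               [0,  1, 0]], dtype=float)

laplacian_response = convolve(image, positive_laplacian)
sharpened = np.clip(image - laplacian_response, 0, 255)     # subtract, as described above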

9. Write about Conversion of analog to digital signals?


There are many concepts related to analog-to-digital conversion and vice versa; we will discuss only those related to digital image processing. Two main concepts are involved in the conversion:

 Sampling

 Quantization

Sampling
Sampling, as its name suggests, can be defined as taking samples: taking samples of a signal along the x axis. Sampling is done on the independent variable. In the case of the mathematical equation y = sin(x), sampling is done on the x variable. We can also say that the conversion of the x axis (infinitely many values) to digital is done under sampling.

Sampling is further divided into up sampling and down sampling. If the number of samples along the x axis is too small, we increase the number of samples; this is known as up sampling, and its opposite is known as down sampling.

Quantization
Quantization, as its name suggests, can be defined as dividing into quanta (partitions). Quantization is done on the dependent variable; it is the counterpart of sampling.

In the case of the mathematical equation y = sin(x), quantization is done on the y variable, i.e. on the y axis. The conversion of the infinitely many values on the y axis to a finite set of levels such as 1, 0, -1 (or any other levels) is known as quantization.

These are the two basic steps involved in converting an analog signal to a digital signal.
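A minimal sketch of both steps on y = sin(x), with an arbitrary choice of 16 samples and three quantization levels:

import numpy as np

x = np.linspace(0, 2 * np.pi, 16)    # sampling: 16 samples along the x axis
y = np.sin(x)                        # continuous-valued samples of y = sin(x)

levels = np.array([-1.0, 0.0, 1.0])  # quantization: three allowed output levels
quantized = levels[np.argmin(np.abs(y[:, None] - levels[None, :]), axis=1)]
print(quantized)                     # every sample snapped to -1, 0 or 1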

The quantization of a signal has been shown in the figure below.

Why do we need to convert an analog signal to a digital signal?

The first and obvious reason is that digital image processing deals with digital images, which are digital signals. Whenever an image is captured, it is converted into digital format and then processed.

The second and more important reason is that, in order to perform operations on an analog signal with a digital computer, you have to store that analog signal in the computer, and storing an analog signal would require infinite memory. Since that is not possible, we convert the signal into digital format, store it in the digital computer and then perform operations on it.

10. Discuss Fundamental Steps of Digital Image Processing in detail?


Fundamental Steps of Digital Image Processing

There are some fundamental steps, and since they are fundamental, each of these steps may have sub-steps. The fundamental steps are described below.

1. Image Acquisition :

This is the first step or process of the fundamental steps of digital image processing.
Image acquisition could be as simple as being given an image that is already in digital form.
Generally, the image acquisition stage involves preprocessing, such as scaling etc.

2. Image Enhancement : Image enhancement is among the simplest and most appealing areas of digital image processing. Basically, the idea behind enhancement techniques is to bring out detail that is obscured, or simply to highlight certain features of interest in an image, such as by changing brightness and contrast.
3. Image Restoration : Image restoration is an area that also deals with improving the
appearance of an image. However, unlike enhancement, which is subjective, image
restoration is objective, in the sense that restoration techniques tend to be based on
mathematical or probabilistic models of image degradation.

4. Color Image Processing : Color image processing is an area that has been gaining its
importance because of the significant increase in the use of digital images over the
Internet. This may include color modeling and processing in a digital domain etc.

5. Wavelets and Multiresolution Processing : Wavelets are the foundation for representing images at various degrees of resolution. Images are subdivided successively into smaller regions for data compression and for pyramidal representation.

6. Compression : Compression deals with techniques for reducing the storage required to save an image or the bandwidth needed to transmit it. It is particularly necessary for data used over the internet.

7. Morphological Processing :

Morphological processing deals with tools for extracting image components that are
useful in the representation and description of shape.

8. Segmentation : Segmentation procedures partition an image into its constituent parts or objects. In general, autonomous segmentation is one of the most difficult tasks in digital image processing. A rugged segmentation procedure brings the process a long way toward the successful solution of imaging problems that require objects to be identified individually.
9. Representation and Description : Representation and description almost always follow
the output of a segmentation stage, which usually is raw pixel data, constituting either
the boundary of a region or all the points in the region itself. Choosing a representation
is only part of the solution for transforming raw data into a form suitable for subsequent
computer processing. Description deals with extracting attributes that result in some
quantitative information of interest or are basic for differentiating one class of objects
from another.
10. Object recognition : Recognition is the process that assigns a label, such as, “vehicle” to
an object based on its descriptors.
11. Knowledge Base : Knowledge may be as simple as detailing regions of an image where
the information of interest is known to be located, thus limiting the search that has to
be conducted in seeking that information. The knowledge base also can be quite
complex, such as an interrelated list of all major possible defects in a materials
inspection problem or an image database containing high-resolution satellite images of
a region in connection with change-detection applications.

11.Discuss Colour Models in detail ?

Colour models provide a standard way to specify a particular colour by defining a 3D coordinate system and a subspace that contains all constructible colours within a particular model. Any colour that can be specified using a model corresponds to a single point within the subspace it defines. Each colour model is oriented towards either specific hardware (RGB, CMY, YIQ) or image processing applications (HSI).

1 The RGB Model

In the RGB model, an image consists of three independent image planes, one in each of the primary colours: red, green and blue. (The standard wavelengths for the three primaries are as shown in figure 1.) A particular colour is specified by giving the amount of each of the primary components present. Figure 5 shows the geometry of the RGB colour model for specifying colours using a Cartesian coordinate system. The greyscale spectrum, i.e. those colours made from equal amounts of each primary, lies on the line joining the black and white vertices.


Figure 5: The RGB colour cube. The greyscale spectrum lies on the line joining the black and
white vertices.

This is an additive model, i.e. the colours present in the light add to form new colours; it is appropriate, for example, for the mixing of coloured light. The image on the left of figure 6 shows the additive mixing of the red, green and blue primaries to form the three secondary colours yellow (red + green), cyan (blue + green) and magenta (red + blue), and white (red + green + blue).

The RGB model is used for colour monitors and most video cameras.

2 The CMY Model

The CMY (cyan-magenta-yellow) model is a subtractive model appropriate to the absorption of colours, for example due to pigments in paints. Whereas the RGB model asks what is added to black to get a particular colour, the CMY model asks what is subtracted from white. In this case the primaries are cyan, magenta and yellow, with red, green and blue as the secondary colours (see the image on the right of figure 6).

When a surface coated with cyan pigment is illuminated by white light, no red light is reflected; similarly magenta absorbs green, and yellow absorbs blue. The relationship between the RGB and CMY models is given by:

[ C ]   [ 1 ]   [ R ]
[ M ] = [ 1 ] - [ G ]
[ Y ]   [ 1 ]   [ B ]

i.e. C = 1 - R, M = 1 - G and Y = 1 - B, with R, G and B normalised to the range [0, 1].

The CMY model is used by printing devices and filters.
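A one-line sketch of this relationship, assuming R, G and B are normalised to [0, 1]:

import numpy as np

rgb = np.array([1.0, 0.0, 0.0])  # pure red, with components in [0, 1]
cmy = 1.0 - rgb                  # -> [0., 1., 1.]: no cyan, full magenta and yellow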


Figure 6: The figure on the left shows the additive mixing of red, green and blue primaries to form the three secondary colours yellow (red + green), cyan (blue + green) and magenta (red + blue), and white (red + green + blue). The figure on the right shows the three subtractive primaries, and their pairwise combinations to form red, green and blue, and finally black by subtracting all three primaries from white.

Why does blue paint plus yellow paint give green?


As all schoolchildren know, the way to make green paint is to mix blue paint with yellow. But how does this work? If blue paint absorbs all but blue light, and yellow absorbs only blue, then when they are combined no light should be reflected and black paint should result.

However, what actually happens is that imperfections in the paint are exploited. In practice, blue paint reflects not only blue but also some green. Since yellow paint also reflects green (since yellow = green + red), some green is reflected by both pigments and all other colours are absorbed, resulting in green paint.

3 The HSI Model

As mentioned above, colour may be specified by the three quantities hue, saturation and
intensity. This is the HSI model, and the entire space of colours that may be specified in this way
is shown in figure 7.


Figure 7: The HSI model, showing the HSI solid on the left, and the HSI triangle on the right,
formed by taking a horizontal slice through the HSI solid at a particular intensity. Hue is
measured from red, and saturation is given by distance from the axis. Colours on the surface of
the solid are fully saturated, i.e. pure colours, and the greyscale spectrum is on the axis of the
solid. For these colours, hue is undefined.

Conversion between the RGB model and the HSI model is quite complicated. The intensity is
given by

I = (R + G + B) / 3,

where the quantities R, G and B are the amounts of the red, green and blue components, normalised to the range [0,1]. The intensity is therefore just the average of the red, green and blue components. The saturation is given by:

S = 1 - 3 min(R,G,B) / (R + G + B) = 1 - min(R,G,B) / I,

where the min(R,G,B) term is really just indicating the amount of white present. If any of R, G or
B are zero, there is no white and we have a pure colour. The expression for the hue, and details
of the derivation may be found in reference [1].
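A small sketch of these two formulas (hue omitted), assuming R, G and B are normalised to [0, 1]; the function name is chosen here only for illustration:

def intensity_saturation(r, g, b):
    # r, g, b are assumed to be normalised to [0, 1]
    i = (r + g + b) / 3.0
    s = 0.0 if i == 0 else 1.0 - min(r, g, b) / i
    return i, s

print(intensity_saturation(1.0, 0.0, 0.0))  # pure red: I = 1/3, S = 1 (fully saturated)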

4 The YIQ Model

The YIQ (luminance-inphase-quadrature) model is a recoding of RGB for colour television, and is
a very important model for colour image processing. The importance of luminance was
discussed in § 1.

The conversion from RGB to YIQ is given by:

[ Y ]   [ 0.299  0.587  0.114 ] [ R ]
[ I ] = [ 0.596 -0.275 -0.321 ] [ G ]
[ Q ]   [ 0.212 -0.523  0.311 ] [ B ]

The luminance (Y) component contains all the information required for black and white
television, and captures our perception of the relative brightness of particular colours. That we
perceive green as much lighter than red, and red lighter than blue, is indicated by their
respective weights of 0.587, 0.299 and 0.114 in the first row of the conversion matrix above.
These weights should be used when converting a colour image to greyscale if you want the
perception of brightness to remain the same. This is not the case for the intensity component in
an HSI image, as shown in figure 8.

The Y component is the same as the CIE primary Y (see § 2.1).
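A minimal sketch of using these luminance weights to convert an RGB image to greyscale (the image below is a random placeholder):

import numpy as np

rgb_image = np.random.randint(0, 256, (4, 4, 3)).astype(float)  # placeholder RGB image
luminance = (0.299 * rgb_image[..., 0]
             + 0.587 * rgb_image[..., 1]
             + 0.114 * rgb_image[..., 2])  # perceptually weighted greyscale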



Figure 8: Image (a) shows a colour test pattern, consisting of horizontal stripes of black, blue,
green, cyan, red, magenta and yellow, a colour ramp with constant intensity, maximal
saturation, and hue changing linearly from red through green to blue, and a greyscale ramp
from black to white. Image (b) shows the intensity for image (a). Note how much detail is lost.
Image (c) shows the luminance. This third image accurately reflects the brightness variations perceived in the original image.

12. What is Pixel connectivity?


Pixel connectivity is defined in terms of pixel neighbourhoods. A normal rectangular sampling
pattern producing a finite arithmetic lattice {(x,y): x = 0, 1, ..., X−1; y = 0, 1, ..., Y−1} supporting
digital images allows us to define two types of neighbourhood surrounding a pixel. A 4-
neighbourhood {(x−1,y), (x,y+1), (x+1,y), (x,y−1)} contains only the pixels above, below, to the
left and to the right of the central pixel (x,y). An 8-neighbourhood adds to the 4-neighbourhood
four diagonal neighbours: {(x−1,y−1),(x−1,y), (x−1,y+1), (x,y+1), (x+1,y+1), (x+1,y), (x+1,y−1),
(x,y−1)}.
A 4-connected path from a pixel p1 to another pixel pn is defined as the sequence of pixels
{p1, p2, ..., pn} such that pi+1 is a 4-neighbour of pi for all i = 1, ..., n−1. The path is 8-
connected if pi+1 is an 8-neighbour of pi. A set of pixels is a 4-connected region if there exists at
least one 4-connected path between any pair of pixels from that set. The 8-connected
region has at least one 8-connected path between any pair of pixels from that set.
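A small sketch of these definitions in Python; the function name and the boundary handling (neighbours outside the lattice are simply dropped) are illustrative choices:

def neighbours(x, y, X, Y, connectivity=4):
    # Returns the 4- or 8-neighbourhood of pixel (x, y) inside an X-by-Y lattice;
    # neighbours falling outside the lattice are dropped.
    if connectivity == 4:
        offsets = [(-1, 0), (0, 1), (1, 0), (0, -1)]
    else:
        offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
    return [(x + dx, y + dy) for dx, dy in offsets
            if 0 <= x + dx < X and 0 <= y + dy < Y]

print(neighbours(0, 0, 5, 5, connectivity=8))  # a corner pixel has only 3 neighbours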

13. Discuss introduction to Histograms?


A histogram is a graph: a graph that shows the frequency of anything. Usually a histogram has bars that represent the frequency of occurrence of data in the whole data set.

A histogram has two axes, the x axis and the y axis.

The x axis contains the event whose frequency you have to count.

The y axis contains the frequency.

The different heights of the bars show the different frequencies of occurrence of the data.

Usually a histogram looks like this.


Now we will see an example of how a histogram is built.

Example

Consider a class of programming students to whom you are teaching Python.

At the end of the semester, you get the result shown in the table. It is very messy and does not show the overall result of the class, so you have to make a histogram of your results, showing the overall frequency of occurrence of grades in your class. Here is how you are going to do it.

Result sheet

Name Grade

John A

Jack D

Carter B

Tommy A

Lisa C+

Derek A-

Tom B+

Histogram of result sheet

Now what you have to do is find what goes on the x axis and what goes on the y axis.

One thing is certain: the y axis contains the frequency. So what goes on the x axis? The x axis contains the event whose frequency has to be calculated. In this case, the x axis contains the grades.
Now we will see how we use a histogram for an image.

Histogram of an image

The histogram of an image, like other histograms, also shows frequency. But an image histogram shows the frequency of pixel intensity values: the x axis shows the gray-level intensities and the y axis shows the frequency of these intensities.

For example

The histogram of the above picture of Einstein would be something like this.
The x axis of the histogram shows the range of pixel values. Since it is an 8 bpp image, it has 256 levels, or shades, of gray. That is why the range of the x axis starts at 0 and ends at 255, with tick marks every 50. The y axis shows the count of these intensities.

As you can see from the graph, most of the bars with high frequency lie in the first half, which is the darker portion. This means that the image we have is dark, and this can be verified from the image too.
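A minimal NumPy sketch of computing an image histogram (the image is a random placeholder):

import numpy as np

image = np.random.randint(0, 256, (64, 64))                 # placeholder 8 bpp image
hist, bin_edges = np.histogram(image, bins=256, range=(0, 256))
print(hist.sum() == image.size)                             # every pixel is counted once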

Applications of Histograms
Histograms have many uses in image processing. The first use, as discussed above, is the analysis of the image: we can make predictions about an image just by looking at its histogram, much like looking at an X-ray of a bone.

The second use of the histogram is for brightness purposes. Histograms have wide application in image brightness, and not only in brightness: histograms are also used in adjusting the contrast of an image.

Another important use of the histogram is to equalize an image.

And last but not least, the histogram has wide use in thresholding, which is mostly used in computer vision.

14. What is PMF?


PMF stands for probability mass function. As its name suggests, it gives the probability of each number in the data set; it is obtained from the count, or frequency, of each element divided by the total number of elements.

1 2 7 5 6

7 2 3 4 5

0 1 5 7 3
1 2 5 6 7

6 1 0 3 4

How PMF is calculated:

We will calculate the PMF in two different ways: first from a matrix, because in the next tutorial we have to calculate the PMF from a matrix, and an image is nothing more than a two-dimensional matrix.

Then we will take another example in which we calculate the PMF from a histogram.

Consider the matrix shown above.

If we were to calculate the PMF of this matrix, here is how we would do it.

First, we take the first value in the matrix and count how many times this value appears in the whole matrix. After counting, the results can be represented either in a histogram or in a table like the one below.

Value Count PMF

0 2 2/25

1 4 4/25

2 3 3/25

3 3 3/25

4 2 2/25

5 4 4/25

6 3 3/25

7 4 4/25

Note that the sum of the counts must be equal to the total number of values.
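The same calculation can be sketched with NumPy for the 5 x 5 matrix above:

import numpy as np

matrix = np.array([[1, 2, 7, 5, 6],
                   [7, 2, 3, 4, 5],
                   [0, 1, 5, 7, 3],
                   [1, 2, 5, 6, 7],
                   [6, 1, 0, 3, 4]])

values, counts = np.unique(matrix, return_counts=True)
pmf = counts / matrix.size
print(dict(zip(values.tolist(), pmf.tolist())))  # e.g. value 0 -> 2/25 = 0.08
print(pmf.sum())                                 # a PMF always sums to 1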

Calculating PMF from histogram

The above histogram shows the frequency of gray-level values for an 8 bits per pixel image.

Now, if we have to calculate its PMF, we simply take the count of each bar from the vertical axis and divide it by the total count.

So the PMF of the above histogram is this.

Another important thing to note about the above histogram is that it is not monotonically increasing. To obtain a monotonically increasing function from it, we calculate its CDF.

15. What is CDF?


CDF stands for cumulative distribution function. It is a function that calculates the cumulative sum of the values given by the PMF: each CDF value is the sum of all the PMF values up to and including that point.

How it is calculated?

We will calculate the CDF using a histogram. Here is how it is done. Consider the histogram shown above, which shows the PMF.

Since this histogram is not monotonically increasing, we construct from it a function that does grow monotonically.

We simply keep the first value as it is, then add the first value to the second, and so on.

Here is the CDF of the above PMF function.

As you can see from the graph above, the first value of the PMF remains as it is, the second value of the PMF is added to the first, the third is added to the running total, and so on; the final value of the CDF is always equal to 1.
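A one-line NumPy sketch, reusing the PMF of the earlier matrix example:

import numpy as np

pmf = np.array([2, 4, 3, 3, 2, 4, 3, 4]) / 25.0  # the PMF from the matrix example
cdf = np.cumsum(pmf)                             # running sum of the PMF
print(cdf[-1])                                   # 1.0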

16. Write note on Histogram Equalization?


Histogram equalization is used to enhance contrast. It is not necessarily the case that contrast will always increase; there may be cases where histogram equalization makes things worse, and in those cases the contrast is decreased.

Let's start histogram equalization by taking the image below as a simple example.

Image
Histogram of this image:

The histogram of this image has been shown below.

Now we will perform histogram equalization to it.

PMF:

First we have to calculate the PMF (probability mass function) of all the pixels in this image. If you do not know how to calculate the PMF, please visit our tutorial on PMF calculation.

CDF:

Our next step involves calculating the CDF (cumulative distribution function). Again, if you do not know how to calculate the CDF, please visit our tutorial on CDF calculation.

Calculate CDF according to gray levels

Let's assume, for instance, that the CDF calculated in the second step looks like this.
Gray Level Value CDF

0 0.11

1 0.22

2 0.55

3 0.66

4 0.77

5 0.88

6 0.99

7 1

Then, in this step, you multiply each CDF value by (number of gray levels - 1) and round the result down to the nearest integer.

Since we have a 3 bpp image, the number of levels is 8, and 8 minus 1 is 7. So we multiply the CDF by 7. Here is what we get after multiplying.

Gray Level Value CDF CDF * (Levels-1)

0 0.11 0

1 0.22 1

2 0.55 3
3 0.66 4

4 0.77 5

5 0.88 6

6 0.99 6

7 1 7

Now comes the last step, in which we have to map the new gray-level values onto the numbers of pixels.

Let's assume our old gray-level values have the following numbers of pixels.

Gray Level Value Frequency

0 2

1 4

2 6

3 8

4 10

5 12

6 14
7 16

Now, if we map in our new values, this is what we get.

Gray Level Value New Gray Level Value Frequency

0 0 2

1 1 4

2 3 6

3 4 8

4 5 10

5 6 12

6 6 14

7 7 16

Now map these new values onto the histogram, and you are done.
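The whole procedure can be written compactly with NumPy; the sketch below assumes an 8 bpp image held in a 2-D array (a random placeholder is used here) and follows the PMF -> CDF -> scale -> round down -> remap steps described above.

import numpy as np

image = np.random.randint(0, 256, (64, 64))      # placeholder 8 bpp image
levels = 256

hist, _ = np.histogram(image, bins=levels, range=(0, levels))
pmf = hist / image.size
cdf = np.cumsum(pmf)
new_gray_levels = np.floor(cdf * (levels - 1)).astype(np.uint8)  # lookup table

equalized = new_gray_levels[image]               # remap every pixel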

Let's apply this technique to our original image. After applying it, we get the following image and its histogram.

Histogram Equalization Image


Cumulative Distributive function of this image

Histogram Equalization histogram

Comparing both the histograms and images


Conclusion

As you can clearly see from the images, the contrast of the new image has been enhanced and its histogram has also been equalized. There is one important thing to note here: during histogram equalization the overall shape of the histogram changes, whereas in histogram stretching the overall shape of the histogram remains the same.

17. What is Convolution?


Convolution means convolving a mask over an image. It is done in this way: place the centre of the mask at each element of the image, multiply the corresponding elements, add them up, and write the result onto the element of the image on which the centre of the mask was placed.
The box in red is the mask, and the values in orange are the values of the mask. The black boxes and values belong to the image. Now, for the first pixel of the image, the value will be calculated as

First pixel = (5*2) + (4*4) + (2*8) + (1*10)

= 10 + 16 + 16 + 10

= 52

Place 52 in the output image at the first index and repeat this procedure for each pixel of the image.
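A minimal sketch of this procedure in Python/NumPy is given below; apply_mask is a name chosen here only for illustration, the image is a placeholder, and edge pixels are handled by replicating the border.

import numpy as np

def apply_mask(image, mask):
    # Slides the mask over the image; edge pixels are handled by replicating the border.
    # (Strictly speaking, true convolution also flips the mask first; image-processing
    # texts often use the term loosely for this correlation-style operation.)
    h, w = image.shape
    mh, mw = mask.shape
    pad_y, pad_x = mh // 2, mw // 2
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x)), mode='edge')
    out = np.zeros_like(image, dtype=float)
    for y in range(h):
        for x in range(w):
            region = padded[y:y + mh, x:x + mw]   # neighbourhood under the mask
            out[y, x] = np.sum(region * mask)     # multiply and add
    return out

image = np.arange(25, dtype=float).reshape(5, 5)  # placeholder image
mask = np.ones((3, 3)) / 9.0                      # 3x3 averaging mask
print(apply_mask(image, mask))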

18. What is Weighted average filter?


In a weighted average filter, we give more weight to the centre value, so that the contribution of the centre becomes greater than that of the rest of the values. With weighted average filtering, we can actually control the amount of blurring.

Properties of the weighted average filter are.

 It must be odd ordered

 The sum of all the elements should be 1

 The weight of the centre element should be greater than that of all the other elements
Filter 1

1 1 1

1 2 1

1 1 1

Two of the properties are satisfied (1 and 3), but property 2 is not. In order to satisfy it, we simply divide the whole filter by 10, i.e. multiply it by 1/10.

Filter 2

1 1 1

1 10 1

1 1 1

Dividing factor = 18.
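A tiny sketch of normalising such a filter so that property 2 holds (using Filter 2 above):

import numpy as np

weighted = np.array([[1,  1, 1],
                     [1, 10, 1],
                     [1,  1, 1]], dtype=float)

dividing_factor = weighted.sum()          # 18 for Filter 2 above
normalised = weighted / dividing_factor   # now the elements sum to 1
print(normalised.sum())                   # 1.0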

19. What do you mean by Mean filter?


The mean filter is also known as the box filter or the average filter. A mean filter has the following properties.

 It must be odd ordered

 The sum of all the elements should be 1

 All the elements should be the same


If we follow these rules for a 3x3 mask, we get the following result.

1/9 1/9 1/9

1/9 1/9 1/9

1/9 1/9 1/9

Since it is a 3x3 mask, it has 9 cells. The condition that the sum of all the elements should equal 1 is achieved by dividing each value by 9, as

1/9 + 1/9 + 1/9 + 1/9 + 1/9 + 1/9 + 1/9 + 1/9 + 1/9 = 9/9 = 1

The result of applying a 3x3 mask to an image is shown below.

Original Image

Blurred Image

Maybe the results are not very clear. Let's increase the blurring. The blurring can be increased by increasing the size of the mask: the larger the mask, the greater the blurring, because a greater number of pixels is included and one smooth transition is defined.
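A short sketch, assuming SciPy is available and using a random placeholder image, of a 3x3 box filter and a larger 9x9 one to increase the blurring:

import numpy as np
from scipy.ndimage import convolve

image = np.random.randint(0, 256, (64, 64)).astype(float)    # placeholder image

blur_3x3 = convolve(image, np.ones((3, 3)) / 9.0)             # mild blurring
blur_9x9 = convolve(image, np.ones((9, 9)) / 81.0)            # noticeably stronger blurring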

20. What is median filter ?


The median filter is normally used to reduce noise in an image, somewhat like the mean filter.
However, it often does a better job than the mean filter of preserving useful detail in the image.

How It Works

Like the mean filter, the median filter considers each pixel in the image in turn and looks at its
nearby neighbors to decide whether or not it is representative of its surroundings. Instead of
simply replacing the pixel value with the mean of neighboring pixel values, it replaces it with
the median of those values. The median is calculated by first sorting all the pixel values from
the surrounding neighborhood into numerical order and then replacing the pixel being
considered with the middle pixel value. (If the neighborhood under consideration contains an
even number of pixels, the average of the two middle pixel values is used.) Figure 1 illustrates
an example calculation.

Figure 1 Calculating the median value of a pixel neighborhood. As can be seen, the central pixel
value of 150 is rather unrepresentative of the surrounding pixels and is replaced with the
median value: 124. A 3×3 square neighborhood is used here --- larger neighborhoods will
produce more severe smoothing
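A small sketch of a 3x3 median filter written directly from this description (in practice scipy.ndimage.median_filter does the same job in one call); the image is a random placeholder:

import numpy as np

def median_filter_3x3(image):
    # Replaces each pixel with the median of its 3x3 neighbourhood.
    padded = np.pad(image, 1, mode='edge')
    out = np.empty_like(image)
    h, w = image.shape
    for y in range(h):
        for x in range(w):
            out[y, x] = np.median(padded[y:y + 3, x:x + 3])
    return out

noisy = np.random.randint(0, 256, (32, 32))   # placeholder noisy image
print(median_filter_3x3(noisy).shape)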
21. What is Bit-plane slicing?
Instead of highlighting gray-level ranges, we may wish to highlight the contribution made to the total image appearance by specific bits. Suppose that each pixel in an image is represented by 8 bits. Imagine that the image is composed of eight 1-bit planes, ranging from bit plane 0 (the LSB) to bit plane 7 (the MSB).

In terms of 8-bit bytes, plane 0 contains all the lowest-order bits in the bytes comprising the pixels in the image and plane 7 contains all the highest-order bits.


Separating a digital image into its bit planes is useful for analysing the relative importance played by each bit of the image; it helps determine the adequacy of the number of bits used to quantize each pixel, which is useful for image compression.

In terms of bit-plane extraction for an 8-bit image, the binary image for bit plane 7 can be obtained by processing the input image with a thresholding gray-level transformation function that maps all levels between 0 and 127 to one level (e.g. 0) and maps all levels from 128 to 255 to another (e.g. 255).
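A minimal NumPy sketch of extracting the bit planes of an 8-bit image (a random placeholder); it also checks the thresholding remark above for plane 7:

import numpy as np

image = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # placeholder 8-bit image

bit_planes = [(image >> k) & 1 for k in range(8)]   # plane 0 = LSB, ..., plane 7 = MSB
msb_plane = bit_planes[7]
print(np.array_equal(msb_plane, (image >= 128).astype(np.uint8)))   # True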

22. Write detailed note on Thresholding and types of it?


Thresholding
Fig. 1.: The process of thresholding along with its inputs and outputs.

Definition
Thresholding is a process of converting a grayscale input image to a bi-level image by using an
optimal threshold.

Purpose
The purpose of thresholding is to extract from an image those pixels which represent an object (either text or other line-image data such as graphs or maps). Although the output is binary, the input pixels cover a range of intensities; the objective of binarization is therefore to mark the pixels that belong to true foreground regions with a single intensity and the background regions with a different intensity.

Thresholding algorithms
For a thresholding algorithm to be really effective, it should preserve logical and semantic
content. There are two types of thresholding algorithms

1. Global thresholding algorithms

2. Local or adaptive thresholding algorithms

In global thresholding, a single threshold for all the image pixels is used. When the pixel values
of the components and that of background are fairly consistent in their respective values over
the entire image, global thresholding could be used.

In adaptive thresholding, different threshold values for different local areas are used.

Quadratic Integral Ratio (QIR) algorithm


Fig. 2: Three sub images of QIR method

Method: QIR is a global, two-stage thresholding technique that uses the intensity histogram to find the threshold.

The first stage of the algorithm divides an image into three sub-images: foreground, background, and a fuzzy sub-image where it is hard to determine whether a pixel actually belongs to the foreground or the background. Two important parameters separate the sub-images: A, which separates the foreground from the fuzzy sub-image, and C, which separates the fuzzy sub-image from the background. If a pixel's intensity is less than or equal to A, the pixel belongs to the foreground. If a pixel's intensity is greater than or equal to C, the pixel belongs to the background. If a pixel has an intensity value between A and C, it belongs to the fuzzy sub-image, and more information is needed from the image to decide whether it actually belongs to the foreground or the background.

The strategy is to eliminate all pixels with intensity levels in [0, A] and [C, 255], thus producing a range of promising threshold values delimited by the parameters A and C (T in [A, C]).

Performance (with respect to our experiments): QIR performed well, as it was generally able to separate definite foreground (dark) pixels and definite background pixels. The uncertain, or fuzzy, pixels were clearly identified and required further processing to determine their appropriate assignment to background or foreground.

OTSU algorithm
Method: This is a global thresholding technique. It stores the intensities of the pixels in an array. The threshold is calculated using the total mean and variance, and based on this threshold value each pixel is set to either 0 or 1, i.e. background or foreground. The change of the image therefore takes place only once.

The following quantities are used to calculate the threshold.

The pixels are divided into 2 classes, C1 with gray levels [1, ..., t] and C2 with gray levels [t+1, ..., L].

The probability distributions for the two classes are:

w1(t) = p(1) + ... + p(t)   and   w2(t) = p(t+1) + ... + p(L).

Also, the means for the two classes are:

u1(t) = [1·p(1) + ... + t·p(t)] / w1(t)   and   u2(t) = [(t+1)·p(t+1) + ... + L·p(L)] / w2(t).

Using discriminant analysis, Otsu defined the between-class variance of the thresholded image as

sB^2(t) = w1(t) · w2(t) · [u1(t) - u2(t)]^2.

For bi-level thresholding, Otsu verified that the optimal threshold t* is chosen so that the between-class variance sB^2 is maximized; that is,

t* = arg max over t of sB^2(t).
Performance (with respect to our experiments): Otsu works well with some images and performs badly with others. The majority of the results from Otsu have too much noise, in the form of the background being detected as foreground. Otsu can be used for thresholding if the subsequent noise removal and character recognition implementations are really good. The main advantage is the simplicity of calculating the threshold. Since it is a global algorithm, it is well suited only to images with fairly uniform intensities; it might not give a good result for images with a lot of variation in pixel intensities.
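A compact sketch of Otsu's method as summarised above, written with NumPy; the image is a random placeholder and the function name is illustrative:

import numpy as np

def otsu_threshold(image):
    # Tries every threshold t and keeps the one maximising the between-class variance.
    hist, _ = np.histogram(image, bins=256, range=(0, 256))
    p = hist / hist.sum()                    # probability of each gray level
    best_t, best_var = 0, 0.0
    for t in range(1, 256):
        w1, w2 = p[:t].sum(), p[t:].sum()    # class probabilities
        if w1 == 0 or w2 == 0:
            continue
        mu1 = (np.arange(t) * p[:t]).sum() / w1
        mu2 = (np.arange(t, 256) * p[t:]).sum() / w2
        between_var = w1 * w2 * (mu1 - mu2) ** 2
        if between_var > best_var:
            best_t, best_var = t, between_var
    return best_t

image = np.random.randint(0, 256, (64, 64))  # placeholder image
t = otsu_threshold(image)
binary = (image >= t).astype(np.uint8)       # 1 = foreground, 0 = background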
23. What is Redundancy and types of redundancy in image compression?
Data compression is the process of reducing the amount of data required to represent a given
quantity of information. Different amounts of data might be used to communicate the same
amount of information. If the same information can be represented using different amounts of
data, it is reasonable to believe that the representation that requires more data contains what
is technically called data redundancy.

Image compression and coding techniques exploit three types of redundancy: coding redundancy, interpixel (spatial) redundancy, and psychovisual redundancy. The way each of them is exploited is briefly described below.

 Coding redundancy: consists in using variable-length codewords selected so as to match the statistics of the original source, in this case the image itself or a processed version of its pixel values. This type of coding is always reversible and usually implemented using look-up tables (LUTs). Examples of image coding schemes that exploit coding redundancy are Huffman codes and the arithmetic coding technique.

 Interpixel redundancy: this type of redundancy – sometimes called spatial redundancy, interframe redundancy, or geometric redundancy – exploits the fact that an image very often contains strongly correlated pixels, in other words, large regions whose pixel values are the same or almost the same. This redundancy can be exploited in several ways, one of which is by predicting a pixel value based on the values of its neighboring pixels. In order to do so, the original 2-D array of pixels is usually mapped into a different format, e.g., an array of differences between adjacent pixels. If the original image pixels can be reconstructed from the transformed data set, the mapping is said to be reversible. Examples of compression techniques that exploit interpixel redundancy include Constant Area Coding (CAC), (1-D or 2-D) Run-Length Encoding (RLE) techniques, and many predictive coding algorithms such as Differential Pulse Code Modulation (DPCM).

 Psychovisual redundancy: many experiments on the psychophysical aspects of human vision have shown that the human eye does not respond with equal sensitivity to all incoming visual information; some pieces of information are more important than others. Knowledge of which particular types of information are more or less relevant to the final human user has led to image and video compression techniques that aim at eliminating or reducing any amount of data that is psychovisually redundant. The end result of applying these techniques is a compressed image file whose size (and quality) is smaller than the original, but whose resulting quality is still acceptable for the application at hand. The loss of quality that ensues as a byproduct of such techniques is frequently called quantization, to indicate that a wider range of input values is mapped into a narrower range of output values through an irreversible process. In order to establish the nature and extent of the information loss, different fidelity criteria can be used (some objective, such as root mean square (RMS) error, some subjective, such as pairwise comparison of two images encoded with different quality settings). Most of the image coding algorithms in use today exploit this type of redundancy, such as the Discrete Cosine Transform (DCT)-based algorithm at the heart of the JPEG encoding standard.

24. What is Lossless Compression ?


Lossless compression algorithms reduce file size with no loss in image quality. When the file is
saved it is compressed, when it is decompressed (opened) the original data is retrieved. The file
data is only temporarily 'thrown away', so that the file can be transferred.

This type of compression can be applied not just to graphics but to any kind of computer data, such as spreadsheets, text documents and software applications. If you need to send files as an email attachment, it may be best to compress them first. A common format used to do this is the .zip format. If you have downloaded a software program from the Internet, it may have been in this or another compressed format; when you open the file, all the original data is retrieved. Think of it like this: if you compress a Word document with a lossless algorithm, it looks for repeated letters and temporarily discards them. When the document is decompressed, the letters are retrieved; if they weren't, the document wouldn't make sense. The following example gives a very simple illustration of how this works:

Uncompressed Data Compressed Data Decompressed Data

Hello Sandra HllSndr Hello Sandra
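The "HllSndr" line above is only a toy illustration. A slightly more realistic, still minimal, sketch of the lossless idea is run-length encoding, where decoding recovers the data exactly; the function names below are illustrative, not a real codec.

def rle_encode(text):
    # Collapse runs of identical characters into (character, run length) pairs.
    out, i = [], 0
    while i < len(text):
        j = i
        while j < len(text) and text[j] == text[i]:
            j += 1
        out.append((text[i], j - i))
        i = j
    return out

def rle_decode(pairs):
    return ''.join(ch * count for ch, count in pairs)

original = "aaaabbbccd"
encoded = rle_encode(original)            # [('a', 4), ('b', 3), ('c', 2), ('d', 1)]
assert rle_decode(encoded) == original    # lossless: the original is fully recovered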

25. What is Lossy Compression ?


Lossy compression also looks for 'redundant' pixel information; however, it permanently discards it. This means that when the file is decompressed, the original data is not retrieved. You are probably wondering how this can be effective: won't it look obvious that data is missing? Well, it is effective, and the data doesn't look as though it's missing.

As stated before, lossy compression isn't used for data such as text based documents and
software, since they need to keep all their information. Lossy is only effective with media
elements that can still 'work' without all their original data. These include audio, video, images
and detailed graphics for screen design (computers, TVs, projector screens).

"Lossy compression algorithms take advantage of the inherent limitations of the human eye and
discard information that cannot be seen".

26. What is Image enhancement?

The aim of image enhancement is to improve the interpretability or perception of information in images for human viewers, or to provide `better' input for other automated image processing techniques.

Image enhancement techniques can be divided into two broad categories:

1. Spatial domain methods, which operate directly on pixels, and


2. frequency domain methods, which operate on the Fourier transform of an image.

Unfortunately, there is no general theory for determining what is `good' image enhancement
when it comes to human perception. If it looks good, it is good! However, when image
enhancement techniques are used as pre-processing tools for other image processing
techniques, then quantitative measures can determine which techniques are most appropriate.

27. Discuss the Convolution Theorem?

Convolution Theorem

The relationship between the spatial domain and the frequency domain can be established by
convolution theorem.

The convolution theorem can be written as:

f(x, y) * h(x, y)  <=>  F(u, v) H(u, v)

f(x, y) h(x, y)  <=>  F(u, v) * H(u, v)

That is, convolution in the spatial domain is equivalent to multiplication (filtering) in the frequency domain, and vice versa.

Filtering in the frequency domain can therefore be represented as:

G(u, v) = H(u, v) F(u, v),   g(x, y) = inverse Fourier transform of G(u, v)

The steps in filtering are given below.

 At the first step we do some pre-processing of the image in the spatial domain, for example increasing its contrast or brightness

 Then we take the discrete Fourier transform of the image

 Then we centre the discrete Fourier transform, i.e. we shift the DFT from the corners to the centre

 Then we apply the filtering, i.e. we multiply the Fourier transform by a filter function

 Then we shift the DFT back from the centre to the corners

 The last step is to take the inverse discrete Fourier transform, to bring the result back from the frequency domain to the spatial domain

 The post-processing step is optional, just like pre-processing; it simply improves the appearance of the image. A sketch of these steps in code is given below.
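A minimal NumPy sketch of the steps listed above; the ideal low-pass filter H (a centred circular mask with a cutoff of 10) is an arbitrary illustrative choice, not one prescribed by the text:

import numpy as np

image = np.random.rand(64, 64)                          # placeholder image

F = np.fft.fftshift(np.fft.fft2(image))                 # DFT, shifted so DC is centred
rows, cols = image.shape
y, x = np.ogrid[:rows, :cols]
distance = np.sqrt((y - rows / 2) ** 2 + (x - cols / 2) ** 2)
H = (distance <= 10).astype(float)                      # ideal low-pass filter, cutoff 10

G = F * H                                               # filtering in the frequency domain
filtered = np.real(np.fft.ifft2(np.fft.ifftshift(G)))   # back to the spatial domain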
28. Write down Lossy compression algorithms in detail?

Lossy predictive coding

A quantizer, which also performs rounding, is now added between the calculation of the prediction error en and the symbol encoder. It maps en to a limited range of values qn and determines both the amount of extra compression and the deviation from error-free compression. This happens in a closed loop with the predictor to restrict an increase in errors. The predictor does not use en but rather qn, because qn is known by both the encoder and the decoder.

Delta Modulation is a simple but well known form of it:

pn = α · pi(n-1), with α < 1 (here, pi stands for the "predictor input", i.e. the previously reconstructed value)

qn = ζ · sign(en), which can be represented by a 1-bit value: -ζ or +ζ

Disadvantages are the so-called "slope overload", because a big step in fn must be broken down into several smaller steps of size ζ, and the "granular noise", because steps of size ζ must be made repeatedly even where fn is nearly constant; see [fig 8.22].
With Differential Pulse Code Modulation (DPCM), p_n = Σ_{i=1..m} α_i · pin_{n-i}. Under the
assumption that the quantization error (e_n - q_n) is small, the optimal values of the α_i can be
found by minimizing E{e_n²} = E{[f_n - p_n]²}. The α_i turn out to depend on the autocorrelation
matrices of the image. These calculations are almost never done for each individual image, but
rather for a few typical images or for models of them. See [fig 8.23 and 8.24] for the prediction
error of 4 prediction functions on a given image.

Instead of a single step size, the quantizer can also have L levels: the Lloyd-Max quantizers. As
shown in [fig 8.25], the quantizer levels can be determined by minimizing the expectation of the
quantization error. Adapting the quantizer (for example once every 17 pixels) with a restricted
number of scale factors (for example 4) yields a substantial improvement of the error in the
decoded image against a small decrease in the compression ratio (1/8 bit per pixel), see
[table 8.10]. In [fig. 8.26, 8.27] the decoded images and their deviations are given for several
DPCM variants.
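
The same encoder-decoder loop generalizes to DPCM with an m-tap predictor, as sketched below;
the coefficients (0.75, 0.25) and the uniform quantizer with step size 8 are illustrative
assumptions rather than a true Lloyd-Max design.

import numpy as np

def dpcm(f, a=(0.75, 0.25), step=8.0):
    # Lossy DPCM of a 1-D signal with an m-tap predictor and a uniform quantizer.
    f = np.asarray(f, dtype=np.float64)
    history = [0.0] * len(a)                             # previously decoded values, newest first
    decoded = np.zeros_like(f)
    for n in range(len(f)):
        p = sum(ai * hi for ai, hi in zip(a, history))   # prediction p_n
        e = f[n] - p                                     # prediction error e_n
        q = step * np.round(e / step)                    # quantized error q_n
        decoded[n] = p + q                               # decoder-side reconstruction
        history = [decoded[n]] + history[:-1]            # the predictor uses q_n, not e_n
    return decoded

signal = np.linspace(0, 100, 16)
print(np.round(dpcm(signal) - signal, 2))    # reconstruction error stays within +/- step/2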

Transformation coding

A linear, reversible transformation (such as the Fourier transform) maps the image to a set of
coefficients, which are then quantized and coded.

Often small sub-images are used (8*8 or 16*16), and small coefficients are left out or quantized
with fewer bits. See [fig. 8.31] for the DFT, the Discrete Cosine Transform (DCT) and the
Walsh-Hadamard Transform with 8*8 sub-images where the smallest 50% of the coefficients are left
out. The DCT is often the best of the three for natural images. As [fig 8.33] shows, a
Karhunen-Loeve Transform (KLT) is better still, but it costs far more processor time. The DCT
also has the advantage over the DFT that it introduces fewer discontinuities at the sub-image
boundaries, which is less disturbing to the human eye.

The coefficients can be quantized with fewer bits by dividing them by certain optimal values
[fig. 8.37]; the higher the frequency, the larger the divisor. The DC (frequency 0) component is
often treated separately by the symbol encoder, because it is larger than the other coefficients.

JPEG makes use of 8*8 sub-images, a DCT transformation, quantization of the coefficients by
dividing them by a quantization matrix [fig. 8.37b for Y], a zigzag ordering [fig. 8.36d] of the
result, followed by a Huffman encoder, with the DC component coded separately. It uses the YUV
color model; the U and V component blocks of 2 by 2 pixels are combined into 1 pixel. The
quantization matrices can be scaled to yield several compression ratios. There are standard
coding tables and quantization matrices, but the user can also supply others to obtain better
results for a particular image.
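
A minimal JPEG-style sketch for one 8*8 block (level shift, DCT, division by a quantization
matrix, rounding, and reconstruction) is given below. The scipy DCT helpers are assumed to be
available, and the simple ramp quantization matrix is an illustrative assumption, not the
standard JPEG luminance table.

import numpy as np
from scipy.fft import dctn, idctn    # assumed available (scipy >= 1.4)

def code_block(block, Q):
    # Quantize one 8x8 block in the DCT domain and reconstruct it.
    shifted = block.astype(np.float64) - 128.0       # level shift as in JPEG
    coeffs = dctn(shifted, norm='ortho')             # 2-D DCT of the block
    quantized = np.round(coeffs / Q)                 # the lossy step: divide and round
    reconstructed = idctn(quantized * Q, norm='ortho') + 128.0
    return quantized, np.clip(reconstructed, 0, 255)

# Illustrative quantization matrix: coarser quantization for higher frequencies.
i, j = np.meshgrid(np.arange(8), np.arange(8), indexing='ij')
Q = 8.0 + 4.0 * (i + j)

block = np.random.randint(0, 256, size=(8, 8))
quantized, reconstructed = code_block(block, Q)
print(int(np.count_nonzero(quantized)), "nonzero coefficients out of 64")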

29. Write Lossless compression algorithms or Error-free compression in detail?


Huffman coding

This is a popular method to reduce coding redundancy. Under the condition that the symbols are
coded one by one, an optimal code for the given set of symbols and probabilities is generated.
The resulting code is a:

block code: every source symbol is mapped to a fixed sequence of code symbols
instantaneous code: every code word can be decoded without reference to the code symbols that
follow it, and the code is uniquely decodable

It can be generated in the following manner. The two symbols with the lowest probabilities are
repeatedly combined until only two (composite) symbols are left over. These get the codes 0 and
1, and the components of a composite symbol each get a 0 or a 1 appended behind its code:

Sym   Prob   Code  |  Prob   Code  |  Prob   Code  |  Prob   Code
a1    0.6    1     |  0.6    1     |  0.6    1     |  0.6    1
a2    0.2    00    |  0.2    00    |  0.2    00    |  0.4    0
a3    0.1    010   |  0.1    010   |  0.2    01    |
a4    0.06   0110  |  0.1    011   |               |
a5    0.04   0111  |               |               |

A scan from left to right of 00010101110110 results in a2a3a1a5a4.


This code results in an average 1.7 bits per symbol instead of 3 bits.
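
A minimal construction of the code above with Python's heapq module is sketched below; depending
on how ties are broken, the 0/1 labels may come out mirrored, but the code lengths and the
average of 1.7 bits per symbol are the same.

import heapq

def huffman_code(probabilities):
    # Build a Huffman code for a dict {symbol: probability}.
    # Heap entries are (probability, tie_breaker, {symbol: partial code}).
    heap = [(p, i, {sym: ''}) for i, (sym, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, codes1 = heapq.heappop(heap)          # the two least probable groups
        p2, _, codes2 = heapq.heappop(heap)
        merged = {s: '0' + c for s, c in codes1.items()}
        merged.update({s: '1' + c for s, c in codes2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {'a1': 0.6, 'a2': 0.2, 'a3': 0.1, 'a4': 0.06, 'a5': 0.04}
codes = huffman_code(probs)
print(codes)
print(sum(probs[s] * len(codes[s]) for s in probs))   # 1.7 bits per symbol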

Lempel-Ziv coding

This maps variable-length sequences of source symbols (of roughly equal probability) to codes of
fixed (or predictable) length. The method is adaptive: the table of symbol sequences is built up
in a single pass over the data set, during both compression and decompression. A variant by
Welch (LZW coding) is used in the UNIX compress program. Just like Huffman coding, this is a
symbol encoder that can be used both directly on the input and after a mapper and quantizer.
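
A compact LZW encoder sketch (the variant used by the UNIX compress program and by GIF) is shown
below; the 256-entry initial dictionary of single bytes is an assumption of this sketch, and real
implementations also manage variable code widths and dictionary resets.

def lzw_encode(data: bytes):
    # Return a list of integer codes for the input byte string.
    dictionary = {bytes([i]): i for i in range(256)}   # initial single-byte entries
    next_code = 256
    w = b''
    codes = []
    for byte in data:
        wc = w + bytes([byte])
        if wc in dictionary:
            w = wc                           # keep extending the current phrase
        else:
            codes.append(dictionary[w])      # emit the code of the longest known phrase
            dictionary[wc] = next_code       # add the new phrase to the table
            next_code += 1
            w = bytes([byte])
    if w:
        codes.append(dictionary[w])
    return codes

print(lzw_encode(b'TOBEORNOTTOBEORTOBEORNOT'))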

Run Length Encoding

Many variations on this method are possible; the FAX standards (both Group 3 and Group 4) are
based on it. The run lengths themselves can be coded with a variable-length code, possibly with
separate tables for black and white runs if their probabilities differ strongly. In 2-D we can
use the fact that black-white transitions in consecutive scan lines are correlated: Relative
Address Coding [fig 8.17] and Contour Tracing [fig. 8.18] exist in several variations.
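
A basic run-length encoder for one binary scan line is sketched below; the convention that each
line starts with a white (0) run, as in the FAX standards, is an assumption of this sketch.

def run_lengths(scan_line):
    # Encode a binary scan line as alternating run lengths, starting with white (0).
    # A line that starts with black simply begins with a white run of length 0.
    runs = []
    current = 0
    count = 0
    for pixel in scan_line:
        if pixel == current:
            count += 1
        else:
            runs.append(count)
            current = pixel
            count = 1
    runs.append(count)
    return runs

print(run_lengths([0, 0, 0, 1, 1, 0]))   # [3, 2, 1]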

Bit plane decomposition

A gray level image of 8 bits can be decomposed into 8 binary images [fig. 8.15], each of which is
then coded independently with a suitable code. The most significant bit planes contain the
longest runs and can be coded using the RLE methods. The least significant bit planes mostly
contain noise, so RLE will not yield good results there.
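
With numpy, the 8 bit planes can be extracted with a shift and a mask, as sketched below; the
gradient test image exists only for illustration, and plane 7 is the most significant bit.

import numpy as np

img = np.arange(256, dtype=np.uint8).reshape(16, 16)   # simple gradient test image
planes = [(img >> k) & 1 for k in range(8)]            # planes[7] = most significant bit
print(planes[7])   # long runs of equal bits: well suited for RLE
print(planes[0])   # alternating bits: RLE gains little here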

Constant area coding

The gray value image is divided into blocks of m*n pixels which are entirely black, entirely
white or mixed. The most probable type of block gets the code 0, the other two get the codes 10
and 11, and a mixed block is followed by its bit pattern. A variant is the quadtree, where the
image is divided into 4 quadrants and mixed quadrants are recursively divided further. The
resulting tree is then flattened into a bit stream.
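
The quadtree variant can be sketched with a small recursive function, as below; a square binary
image whose side is a power of two is assumed, and the flattening of the tree into a bit stream
is left out.

import numpy as np

def quadtree(block):
    # Return 'W' or 'B' for a uniform block, otherwise a list of four sub-results.
    if block.min() == block.max():
        return 'B' if block.max() == 1 else 'W'
    half = block.shape[0] // 2
    return [quadtree(block[:half, :half]), quadtree(block[:half, half:]),
            quadtree(block[half:, :half]), quadtree(block[half:, half:])]

img = np.zeros((4, 4), dtype=int)
img[2:, 2:] = 1
print(quadtree(img))   # ['W', 'W', 'W', 'B']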

Predictive coding

Starting from the previous source symbols or pixel values, the next value is predicted and only
the difference with the real value is passed on. For example, the predictor can be:

p_n = round( Σ_{i=1..m} a_i · f_{n-i} )

with a_i well-chosen coefficients. For the first m values, f_n itself must be passed on. If m = 1
we get differential or previous-pixel coding, see [fig. 8.20] for an example.

For 2-D images the rows are placed one after another in the model above. We could also use a
pixel from the previous row, for example

p(x,y) = round( a1·f[x,y-1] + a2·f[x-1,y] )

to make e[x,y] as small as possible; however, a good initialization then becomes more difficult.
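
A previous-pixel (m = 1, a1 = 1) predictive coder for one image row is sketched below; since
nothing is quantized, the decoder recovers the row exactly, and the small prediction errors are
what the symbol encoder would then compress.

import numpy as np

def previous_pixel_encode(row):
    # e[0] = f[0]; e[n] = f[n] - f[n-1] for n > 0 (error-free predictive coding).
    row = np.asarray(row, dtype=np.int64)
    errors = np.empty_like(row)
    errors[0] = row[0]                  # the first value must be passed on as-is
    errors[1:] = row[1:] - row[:-1]     # prediction error against the previous pixel
    return errors

def previous_pixel_decode(errors):
    return np.cumsum(errors)            # undo the differencing

row = np.array([100, 102, 103, 103, 110])
e = previous_pixel_encode(row)
print(e)                                # errors cluster around 0
print(previous_pixel_decode(e))         # exact reconstruction of the row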

30. What is image compression ?


Image compression is minimizing the size in bytes of a graphics file without degrading the
quality of the image to an unacceptable level. The reduction in file size allows more images to
be stored in a given amount of disk or memory space. It also reduces the time required for
images to be sent over the Internet or downloaded from Web pages.

There are several different ways in which image files can be compressed. For Internet use, the
two most common compressed graphic image formats are the JPEG format and the GIF format.
The JPEG method is more often used for photographs, while the GIF method is commonly used
for line art and other images in which geometric shapes are relatively simple.

Other techniques for image compression include the use of fractals and wavelets. These
methods have not gained widespread acceptance for use on the Internet as of this writing.
However, both methods offer promise because they offer higher compression ratios than the
JPEG or GIF methods for some types of images. Another new method that may in time replace
the GIF format is the PNG format.

A text file or program can be compressed without the introduction of errors, but only up to a
certain extent. This is called lossless compression. Beyond this point, errors are introduced. In
text and program files, it is crucial that compression be lossless because a single error can
seriously damage the meaning of a text file, or cause a program not to run. In image
compression, a small loss in quality is usually not noticeable. There is no "critical point" up to
which compression works perfectly, but beyond which it becomes impossible. When there is
some tolerance for loss, the compression factor can be greater than it can when there is no loss
tolerance. For this reason, graphic images can be compressed more than text files or programs.

Balraj gill
