
Image and Video Processing

Module 3
Image Segmentation

by
Dr. S. D. Ruikar
Syllabus

Introduction, classification of image segmentation algorithms, detection of discontinuities, edge detection, Hough transform, corner detection, thresholding, region growing, segmentation approaches.
Introduction to image segmentation
The purpose of image segmentation is to partition an image into meaningful regions with respect to a particular application.
The segmentation is based on measurements taken from the image, which might be greylevel, colour, texture, depth or motion.
Introduction to image segmentation
Usually image segmentation is an initial and
vital step in a series of processes aimed at
overall image understanding
Applications of image segmentation include
Identifying objects in a scene for object-based measurements
such as size and shape
Identifying objects in a moving scene for object-based video
compression (MPEG4)
Identifying objects which are at different distances from a
sensor using depth measurements from a laser range finder
enabling path planning for a mobile robots
Introduction to image segmentation
Example 1: Segmentation based on greyscale
A very simple model of greyscale leads to inaccuracies in object labelling.
Introduction to image segmentation
Example 2: Segmentation based on texture
Enables object surfaces with varying patterns of grey to be segmented.
Introduction to image segmentation

Example 3
Segmentation based on depth
This example shows a range image, obtained with
a laser range finder
A segmentation based on the range (the object
distance from the sensor) is useful in guiding
mobile robots
Introduction to image segmentation
[Figure: original image, range image, and segmented image]
Principal approaches
Segmentation algorithms generally are based on one of two basic properties of intensity values:
discontinuity: partition an image based on sharp changes in intensity (such as edges)
similarity: partition an image into regions that are similar according to a set of predefined criteria.
Detection of Discontinuities
Detect the three basic types of gray-level discontinuities: points, lines and edges.
The common way is to run a mask through the image.
Goal: Extract Blobs

What are blobs? Regions of an image that are somehow coherent.
Why? Object extraction, object removal, compositing, etc.
But are blobs objects? No, not in general.
Blobs coherence
The simplest way to define blob coherence is as similarity in brightness or color:
the house, grass, and sky make different blobs; the tools become blobs.
The meaning of a blob
Other interpretations of blobs are
possible, depending on how you define
the input image:
Image can be a response of a particular
detector
Color Detector
Face detector
Motion Detector
Edge Detector
Why is this useful?

AIBO RoboSoccer (Veloso Lab)
[Figure: ideal segmentation vs. result of segmentation]
Thresholding
Basic segmentation operation:
mask(x,y) = 1 if im(x,y) > T
mask(x,y) = 0 if im(x,y) ≤ T
where T is the threshold, either user-defined or chosen automatically.
This is the same as partitioning the histogram.
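A minimal MATLAB sketch of this operation (the image and threshold value are illustrative; any greyscale image will do):

im = im2double(imread('coins.png'));  % example greyscale image
T = 0.5;                              % user-defined threshold
mask = im > T;                        % mask(x,y) = 1 where im(x,y) > T
figure, imshow(mask)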
As Edge Detection
Threshold the squared gradient magnitude: mark an edge where gx² + gy² > T.
Sometimes works well, but more often not. What are the potential problems?
Adaptive thresholding
Region growing
Start with an initial set of pixels K.
Add to K any neighbours that are within a similarity threshold.
Repeat until nothing changes.
Is this the same as a global threshold? What can go wrong?
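A minimal MATLAB sketch of this loop, assuming greylevel similarity to the seed as the criterion (the seed position and tolerance are illustrative):

im = im2double(imread('coins.png'));
seed = [100 100];                       % assumed (row, col) inside the object
tol = 0.1;                              % similarity threshold
K = false(size(im));
K(seed(1), seed(2)) = true;             % initial set of pixels K
changed = true;
while changed                           % repeat until nothing changes
    nbrs = conv2(double(K), ones(3), 'same') > 0 & ~K;     % 8-neighbours of K
    accept = nbrs & abs(im - im(seed(1), seed(2))) < tol;  % within threshold?
    changed = any(accept(:));
    K = K | accept;
end

Unlike a global threshold, only pixels connected to the seed are labelled, which is one answer to the questions above.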
Boundary Extraction
Point Detection
A point has been detected at the location on which the mask is centered if
|R| ≥ T
where T is a nonnegative threshold and R is the sum of products of the coefficients with the gray levels contained in the region encompassed by the mask.
Point Detection
Note that the mask is the same as the mask of the Laplacian operation.
The only differences that are considered of interest are those large enough (as determined by T) to be considered isolated points: |R| ≥ T.
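As a sketch, the test |R| ≥ T can be implemented by convolving the image with the point mask (the image and threshold choice are illustrative):

w = [-1 -1 -1; -1 8 -1; -1 -1 -1];   % point mask; coefficients sum to zero
f = im2double(imread('moon.tif'));
R = conv2(f, w, 'same');             % sum of products at every location
T = 0.9 * max(abs(R(:)));            % nonnegative threshold
points = abs(R) >= T;                % detected isolated points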
Example
Line Detection

The horizontal mask gives its maximum response when a line passes through the middle row of the mask against a constant background.
The same idea is used with the other masks.
Note: the preferred direction of each mask is weighted with a larger coefficient (i.e., 2) than the other possible directions.
Line Detection
Apply every mask to the image.
Let R1, R2, R3, R4 denote the responses of the horizontal, +45 degree, vertical and −45 degree masks, respectively.
If, at a certain point in the image, |Ri| > |Rj| for all j ≠ i, that point is said to be more likely associated with a line in the direction of mask i.
Line Detection
Alternatively, if we are interested in detecting all lines in an image in the direction defined by a given mask, we simply run the mask through the image and threshold the absolute value of the result.
The points that are left are the strongest responses which, for lines one pixel thick, correspond closest to the direction defined by the mask.
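A short MATLAB sketch of running all four masks and picking the strongest response per pixel (the test image is illustrative):

f = im2double(imread('circuit.tif'));
m = {[-1 -1 -1;  2  2  2; -1 -1 -1], ...   % horizontal
     [-1 -1  2; -1  2 -1;  2 -1 -1], ...   % +45 degrees
     [-1  2 -1; -1  2 -1; -1  2 -1], ...   % vertical
     [ 2 -1 -1; -1  2 -1; -1 -1  2]};      % -45 degrees
R = zeros([size(f) 4]);
for i = 1:4
    R(:,:,i) = conv2(f, m{i}, 'same');     % response Ri of mask i
end
[Rmax, direction] = max(abs(R), [], 3);    % per pixel: |Ri| > |Rj| for all j ~= i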
Example
Edge Detection
We previously discussed approaches for implementing the first-order derivative (gradient operator) and the second-order derivative (Laplacian operator); both were introduced in Chapter 3. Here we will talk only about their properties for edge detection.
Edge Detection
Our goal is to extract a line drawing representation from an image.
Useful for recognition: edges contain shape information; they also offer a degree of invariance.
Edge Detection

The ability to measure gray-level transitions in a meaningful way.
(R. C. Gonzalez & R. E. Woods, Digital Image Processing, 2nd Edition, Prentice-Hall, 2001)
Gray-Level Transition
[Figure: ideal and ramp gray-level transitions]
Detecting the Edge (1)
First derivative: an edge is detected at x where |∂I(x,y)/∂x| exceeds the threshold TRSH.
Detecting the Edge (2)
If the transition is too gradual, |∂I(x,y)/∂x| never exceeds TRSH and the edge is not detected.
Gradient Operators
The gradient of the image I(x,y) at location (x,y) is the vector:
∇I = [Gx, Gy]ᵀ = [∂I(x,y)/∂x, ∂I(x,y)/∂y]ᵀ
The magnitude of the gradient: |∇I| = √(Gx² + Gy²)
The direction of the gradient vector: α(x,y) = tan⁻¹(Gy/Gx)
The Meaning of the Gradient
It represents the direction of the strongest variation in intensity (horizontal, vertical, or generic edge).
Edge strength: |∇I| = √(Gx² + Gy²)
Edge direction: α(x,y) = tan⁻¹(Gy/Gx)
The direction of the edge at location (x,y) is perpendicular to the gradient vector at that point.
Calculating the Gradient
For each pixel the gradient is calculated based on a 3×3 neighbourhood around this pixel:
z1 z2 z3
z4 z5 z6
z7 z8 z9
The Sobel Edge Detector
Gx mask:            Gy mask:
-1 -2 -1            -1  0  1
 0  0  0            -2  0  2
 1  2  1            -1  0  1
Gx = (z7 + 2z8 + z9) − (z1 + 2z2 + z3)
Gy = (z3 + 2z6 + z9) − (z1 + 2z4 + z7)
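A minimal MATLAB sketch of these formulas via 2-D convolution (the test image is illustrative; conv2 rotates the mask by 180 degrees, which only flips the signs of Gx and Gy and leaves the magnitude unchanged):

f = im2double(imread('cameraman.tif'));
Gx = conv2(f, [-1 -2 -1; 0 0 0; 1 2 1], 'same');   % Gx mask
Gy = conv2(f, [-1 0 1; -2 0 2; -1 0 1], 'same');   % Gy mask
G = sqrt(Gx.^2 + Gy.^2);                           % gradient magnitude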
The Prewitt Edge Detector
Gx mask:            Gy mask:
-1 -1 -1            -1  0  1
 0  0  0            -1  0  1
 1  1  1            -1  0  1
Gx = (z7 + z8 + z9) − (z1 + z2 + z3)
Gy = (z3 + z6 + z9) − (z1 + z4 + z7)
The Roberts Edge Detector
Gx mask:            Gy mask:
 0  0  0             0  0  0
 0 -1  0             0  0 -1
 0  0  1             0  1  0
Gx = z9 − z5
Gy = z8 − z6
The Roberts edge detector is in fact a 2×2 operator embedded in a 3×3 mask.
The Canny Method
Two possible implementations:
1. The image is convolved with a Gaussian filter before gradient evaluation:
h(r) = e^(−r²/2σ²), where r² = x² + y²
2. The image is convolved with the gradient of the Gaussian filter.
The Edge Detection Algorithm
The gradient is calculated (using any of the four
methods described in the previous slides), for each
pixel in the picture.
If the absolute value exceeds a threshold, the pixel
belongs to an edge.
The Canny method uses two thresholds, and enables the detection of two edge types: strong and weak edges. If a pixel's magnitude in the gradient image exceeds the high threshold, the pixel corresponds to a strong edge. Any pixel connected to a strong edge and having a magnitude greater than the low threshold corresponds to a weak edge.
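A sketch of this two-threshold (hysteresis) step, assuming a gradient-magnitude image G has already been computed; the threshold values are illustrative, and imreconstruct (Image Processing Toolbox) keeps exactly the weak pixels that are 8-connected to a strong one. MATLAB's built-in edge(f,'canny') implements the complete method.

strong = G > 0.2;                      % high threshold: strong edges
weak = G > 0.1;                        % low threshold: all candidate edges
edges = imreconstruct(strong, weak);   % weak pixels connected to strong survive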
Ideal and Ramp Edges
Edges are ramps rather than ideal steps because of optics, sampling and other image acquisition imperfections.
Thick edge
The slope of the ramp is inversely proportional to the
degree of blurring in the edge.
We no longer have a thin (one pixel thick) path.
Instead, an edge point now is any point contained in
the ramp, and an edge would then be a set of such
points that are connected.
The thickness is determined by the length of the
ramp.
The length is determined by the slope, which is in
turn determined by the degree of blurring.
Blurred edges tend to be thick and sharp edges
tend to be thin
First and Second derivatives

the signs of the derivatives


would be reversed for an edge
that transitions from light to
dark
Second derivatives
produces 2 values for every edge in an
image (an undesirable feature)
an imaginary straight line joining the
extreme positive and negative values of
the second derivative would cross zero
near the midpoint of the edge. (zero-
crossing property)
Zero-crossing
quite useful for locating the centers of
thick edges
we will talk about it again later
Noise Images
First column: images and gray-level profiles of a ramp edge corrupted by random Gaussian noise of mean 0 and σ = 0.0, 0.1, 1.0 and 10.0, respectively.
Second column: first-derivative images and gray-level profiles.
Third column: second-derivative images and gray-level profiles.
Keep in mind
Fairly little noise can have a significant impact on the two key derivatives used for edge detection in images.
Image smoothing should be a serious consideration prior to the use of derivatives in applications where noise is likely to be present.
Edge point
To determine a point as an edge point:
the transition in grey level associated with the point has to be significantly stronger than the background at that point;
use a threshold to determine whether a value is significant or not;
the point's two-dimensional first-order derivative must be greater than a specified threshold.
Gradient Operator
∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
First derivatives are implemented using the magnitude of the gradient:
∇f = mag(∇f) = [Gx² + Gy²]^(1/2)
commonly approximated as
∇f ≈ |Gx| + |Gy|
With either form, the magnitude is a nonlinear operator.
Gradient Masks
Diagonal edges with Prewitt
and Sobel masks
Example
Example
Example
Laplacian
Laplacian operator (a linear operator):
∇²f(x,y) = ∂²f/∂x² + ∂²f/∂y²
∇²f = [f(x+1,y) + f(x−1,y) + f(x,y+1) + f(x,y−1)] − 4f(x,y)
Laplacian of Gaussian
Laplacian combined with smoothing to find edges via zero-crossings:
h(r) = −e^(−r²/2σ²), where r² = x² + y² and σ is the standard deviation
∇²h(r) = −[(r² − σ²)/σ⁴] e^(−r²/2σ²)
Positive central term, surrounded by an adjacent negative region (a function of distance), and a zero outer region: the "Mexican hat".
The coefficients must sum to zero.
Linear Operation
The second derivative is a linear operation; thus convolving the image with ∇²h (the LoG) is the same as convolving the image with the Gaussian smoothing function first and then computing the Laplacian of the result.
Example

a). Original image


b). Sobel Gradient
c). Spatial Gaussian
smoothing function
d). Laplacian mask
e). LoG
f). Threshold LoG
g). Zero crossing
Zero crossing & LoG
Approximate the zero crossings from the LoG image:
threshold the LoG image by setting all its positive values to white and all negative values to black;
the zero crossings occur between positive and negative values of the thresholded LoG.
Canny Edge Detection
Compute edge strength and orientation
at all pixels
Non-max suppression
Reduce thick edge strength responses
around true edges
Link and threshold using hysteresis
Simple method of contour completion
Non-maximum suppression:
Select the single maximum point across the width of an edge.
Non-maximum suppression
At q, the value must be larger than the values interpolated at p or r.
Examples: Non-Maximum Suppression
[Figure: original image; gradient magnitude; non-maxima suppressed]
Linking to the next edge point
Assume the marked point is an edge point. Take the normal to the gradient at that point and use this to predict continuation points (either r or s).
Edge Hysteresis
Hysteresis: A lag or momentum factor
Idea: Maintain two thresholds khigh and klow
Use khigh to find strong edges to start edge
chain
Use klow to find weak edges which continue
edge chain
Typical ratio of thresholds is roughly
khigh / klow = 2
Example: Canny Edge Detection
[Figure: original image; strong edges only; weak edges; strong + connected weak edges (the gap is gone)]
[Figure: fine scale, high threshold; coarse scale, high threshold; coarse scale, low threshold]
Finding lines in an image
Option 1:
Search for the line at every possible
position/orientation
What is the cost of this operation?

Option 2:
Use a voting scheme: Hough transform
Finding lines in an image
[Figure: image space (x,y) and Hough space (m,b): a line with slope m0 and intercept b0 maps to the point (m0, b0)]
Connection between image (x,y) and Hough (m,b) spaces:
A line in the image corresponds to a point in Hough space.
To go from image space to Hough space: given a set of points (x,y), find all (m,b) such that y = mx + b.
Finding lines in an image
[Figure: a point (x0, y0) in image space maps to a line in Hough space]
What does a point (x0, y0) in the image space map to?
A: the solutions of b = −x0·m + y0; this is a line in Hough space.
Hough transform algorithm
Typically use a different parameterization: d = x cos θ + y sin θ
d is the perpendicular distance from the line to the origin.
θ is the angle this perpendicular makes with the x axis.
Why?
Basic Hough transform algorithm
1. Initialize H[d, θ] = 0.
2. For each edge point I[x,y] in the image:
   for θ = 0 to 180: compute d = x cos θ + y sin θ and set H[d, θ] += 1.
3. Find the value(s) of (d, θ) where H[d, θ] is maximum.
4. The detected line in the image is given by d = x cos θ + y sin θ.
What's the running time (measured in # votes)?
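A direct MATLAB sketch of this voting loop, using d = x cos(θ) + y sin(θ) (the edge detector and image are illustrative):

E = edge(imread('circuit.tif'), 'canny');          % binary edge map
[y, x] = find(E);                                  % edge point coordinates
dmax = ceil(hypot(size(E,1), size(E,2)));
H = zeros(2*dmax + 1, 180);                        % accumulator H[d, theta]
for k = 1:numel(x)
    for t = 0:179
        d = round(x(k)*cosd(t) + y(k)*sind(t));    % d = x cos(t) + y sin(t)
        H(d + dmax + 1, t + 1) = H(d + dmax + 1, t + 1) + 1;
    end
end
[~, peak] = max(H(:));                             % strongest line
[dpk, tpk] = ind2sub(size(H), peak);

Each edge point casts one vote per sampled θ, so the running time is (# edge points) × 180 votes.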
Extensions
Extension 1: use the image gradient.
1. same
2. for each edge point I[x,y] in the image, compute a unique (d, θ) based on the image gradient at (x,y), then H[d, θ] += 1
3. same
4. same
What's the running time measured in votes?
Extension 2: give more votes for stronger edges.
Extension 3: change the sampling of (d, θ) to give more/less resolution.
Extension 4: the same procedure can be used with circles, squares, or any other shape.
Hough demos
Line : http://www/dai.ed.ac.uk/HIPR2/houghdemo.html
http://www.dis.uniroma1.it/~iocchi/slides/icra2001/java/hough.html

Circle : http://www.markschulze.net/java/hough/
Hough Transform for Curves
The H.T. can be generalized to detect any curve that can be expressed in parametric form:
y = f(x, a1, a2, …, ap)
a1, a2, …, ap are the parameters.
The parameter space is p-dimensional.
The accumulating array is LARGE!
Thresholding
[Figure: an image with dark background and a light object; an image with dark background and two light objects]
Multilevel thresholding
A point (x,y) belongs to
an object class if T1 < f(x,y) ≤ T2,
another object class if f(x,y) > T2,
the background if f(x,y) ≤ T1.
T depends on
only f(x,y) (gray-level values only): global threshold;
both f(x,y) and p(x,y) (gray-level values and the values of its neighbours): local threshold.
Global thresholding is easy to use when object and background are well separated.
The Role of Illumination
f(x,y) = i(x,y) · r(x,y)
a) computer generated reflectance function
b) histogram of reflectance function
c) computer generated illumination function (poor)
d) product of a) and c)
e) histogram of product image: difficult to segment
Basic Global Thresholding
Use T midway between the max and min gray levels; generate a binary image.
Basic Global Thresholding
Based on visual inspection of the histogram:
1. Select an initial estimate for T.
2. Segment the image using T. This will produce two groups of pixels: G1 consisting of all pixels with gray level values > T, and G2 consisting of pixels with gray level values ≤ T.
3. Compute the average gray level values μ1 and μ2 for the pixels in regions G1 and G2.
4. Compute a new threshold value: T = 0.5 (μ1 + μ2).
5. Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter T0.
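A compact MATLAB sketch of this iteration (the image and the tolerance T0 are illustrative; it assumes both groups stay non-empty):

f = im2double(imread('coins.png'));
T = (max(f(:)) + min(f(:))) / 2;    % step 1: initial estimate (midway value)
T0 = 1e-4;                          % predefined parameter
dT = Inf;
while dT > T0                       % step 5: iterate until T settles
    mu1 = mean(f(f > T));           % steps 2-3: mean of G1 (values > T)
    mu2 = mean(f(f <= T));          % steps 2-3: mean of G2 (values <= T)
    Tnew = 0.5 * (mu1 + mu2);       % step 4
    dT = abs(Tnew - T);
    T = Tnew;
end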
Example: Heuristic method
Note the clear valley of the histogram and the effectiveness of the segmentation between object and background.
T0 = 0; 3 iterations, with result T = 125.
Basic Adaptive Thresholding
Subdivide the original image into small areas.
Use a different threshold to segment each subimage.
Since the threshold used for each pixel depends on the location of the pixel in terms of the subimages, this type of thresholding is adaptive.
Example : Adaptive Thresholding
Further subdivision
a). Properly and improperly
segmented subimages from
previous example
b)-c). corresponding histograms
d). further subdivision of the
improperly segmented subimage.
e). histogram of small subimage at
top
f). result of adaptively segmenting
d).
Greylevel histogram-based
segmentation
We will look at two very simple image segmentation techniques that are based on the greylevel histogram of an image: thresholding and clustering.
We will use a very simple object-background test image, and consider zero, low and high noise versions of it.
Greylevel histogram-based
segmentation

Noise free Low noise High noise


Greylevel histogram-based
segmentation
How do we characterise low noise and high noise? We can consider the histograms of our images.
For the noise free image, it's simply two spikes at i=100 and i=150.
For the low noise image, there are two clear peaks centred on i=100 and i=150.
For the high noise image, there is a single peak: the two greylevel populations corresponding to object and background have merged.
Greylevel histogram-based
segmentation
[Figure: histograms h(i) of the noise free, low noise and high noise images]
Greylevel histogram-based
segmentation
We can define the input image signal-to-noise ratio in terms of the mean greylevel values of the object and background pixels and the additive noise standard deviation:
S/N = (μb − μo) / σ
Greylevel histogram-based
segmentation
For our test images:
S/N (noise free) = ∞
S/N (low noise) = 5
S/N (high noise) = 2
Greylevel thresholding
We can easily understand segmentation based on thresholding by looking at the histogram of the low noise object/background image: there is a clear valley between the two peaks.
Greylevel thresholding
[Figure: histogram h(i) showing the object and background peaks, with a threshold T in the valley between them]
Greylevel thresholding
We can define the greylevel thresholding algorithm as follows:
If the greylevel of pixel p ≤ T then pixel p is an object pixel,
else pixel p is a background pixel.
Greylevel thresholding
This simple threshold test begs the obvious question: how do we determine the threshold? Many approaches are possible:
Interactive threshold
Adaptive threshold
Minimisation method
Greylevel thresholding
We will consider in detail a minimisation method for determining the threshold: minimisation of the within group variance (Robot Vision, Haralick & Shapiro, Volume 1, page 20).
Greylevel thresholding
Idealized object/background image histogram
[Figure: bimodal histogram h(i) with threshold T]
Greylevel thresholding
Any threshold separates the histogram into
2 groups with each group having its own
statistics (mean, variance)
The homogeneity of each group is measured
by the within group variance
The optimum threshold is that threshold
which minimizes the within group variance
thus maximizing the homogeneity of each
group
Greylevel thresholding
Let group o (object) be those pixels with greylevel ≤ T.
Let group b (background) be those pixels with greylevel > T.
The prior probability of group o is po(T); the prior probability of group b is pb(T).
Greylevel thresholding
The following expressions can easily be derived for the prior probabilities of object and background:
po(T) = Σ_{i=0..T} P(i)
pb(T) = Σ_{i=T+1..255} P(i)
P(i) = h(i) / N
where h(i) is the histogram of an N pixel image.
Greylevel thresholding
The mean and variance of each group are as follows:
μo(T) = Σ_{i=0..T} i P(i) / po(T)
μb(T) = Σ_{i=T+1..255} i P(i) / pb(T)
σo²(T) = Σ_{i=0..T} (i − μo(T))² P(i) / po(T)
σb²(T) = Σ_{i=T+1..255} (i − μb(T))² P(i) / pb(T)
Greylevel thresholding
The within group variance is defined as:
σw²(T) = σo²(T) po(T) + σb²(T) pb(T)
We determine the optimum T by minimizing this expression with respect to T.
This only requires 256 evaluations for an 8-bit greylevel image.
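A brute-force MATLAB sketch of this minimisation over all candidate thresholds (the image is illustrative; minimising the within group variance is equivalent to Otsu's method, which maximises the between-group variance):

f = imread('coins.png');                         % 8-bit greylevel image
h = histcounts(f(:), 0:256);                     % histogram h(i), i = 0..255
P = h / sum(h);                                  % P(i) = h(i)/N
i = 0:255;
best = Inf;
for T = 0:254
    po = sum(P(1:T+1));  pb = 1 - po;            % prior probabilities
    if po == 0 || pb == 0, continue, end
    mo = sum(i(1:T+1) .* P(1:T+1)) / po;         % object mean
    mb = sum(i(T+2:end) .* P(T+2:end)) / pb;     % background mean
    vo = sum((i(1:T+1) - mo).^2 .* P(1:T+1)) / po;      % object variance
    vb = sum((i(T+2:end) - mb).^2 .* P(T+2:end)) / pb;  % background variance
    W = vo*po + vb*pb;                           % within group variance
    if W < best, best = W; Topt = T; end
end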
Greylevel thresholding
[Figure: histogram and within group variance versus threshold; the optimum threshold Topt lies at the minimum of the within group variance]
Greylevel thresholding
We can examine the performance of this algorithm on our low and high noise images.
For the low noise case it gives an optimum threshold of T=124, almost exactly halfway between the object and background peaks.
We can apply this optimum threshold to both the low and high noise images.
Greylevel thresholding
[Figure: low noise image, thresholded at T=124]
Greylevel thresholding
[Figure: high noise image, thresholded at T=124]
Greylevel thresholding
A high level of pixel misclassification is noticeable. This is typical performance for thresholding: the extent of pixel misclassification is determined by the overlap between the object and background histograms.
Greylevel thresholding
[Figure: object and background greylevel distributions p(x) with means μo and μb and threshold T]
Greylevel thresholding
[Figure: the same distributions with greater overlap between object and background]
Greylevel thresholding
It is easy to see that, in both cases, for any value of the threshold, some object pixels will be misclassified as background and vice versa.
For greater histogram overlap, the pixel misclassification is obviously greater.
We could even quantify the probability of error in terms of the means and standard deviations of the object and background histograms.
Greylevel clustering
Consider an idealized object/background histogram:
[Figure: bimodal histogram with cluster centres c1 (object) and c2 (background)]
Greylevel clustering

Clustering tries to separate the histogram into 2 groups, defined by two cluster centres c1 and c2.
Greylevels are classified according to the nearest cluster centre.
Greylevel clustering

A nearest neighbour clustering algorithm allows us to perform a greylevel segmentation using clustering.
It is a simple case of the more general and widely used K-means clustering: a simple iterative algorithm with known convergence properties.
Greylevel clustering

Given a set of greylevels {g(1), g(2), …, g(N)} we can partition this set into two groups:
{g1(1), g1(2), …, g1(N1)} and {g2(1), g2(2), …, g2(N2)}
Greylevel clustering

Compute the local means of each group:
c1 = (1/N1) Σ_{i=1..N1} g1(i)
c2 = (1/N2) Σ_{i=1..N2} g2(i)
Greylevel clustering

Re-define the new groupings:
|g1(k) − c1| < |g1(k) − c2| for k = 1..N1
|g2(k) − c2| < |g2(k) − c1| for k = 1..N2
In other words, all grey levels in set 1 are nearer to cluster centre c1 and all grey levels in set 2 are nearer to cluster centre c2.
Greylevel clustering

But we have a chicken and egg situation: each group mean is defined in terms of the partitions, and vice versa.
The solution is to define an iterative algorithm and worry about the convergence of the algorithm later.
Greylevel clustering

The iterative algorithm is as follows:
Initialize the label of each pixel randomly.
Repeat:
  c1 = mean of pixels assigned to the object label
  c2 = mean of pixels assigned to the background label
  Compute partition {g1(1), g1(2), …, g1(N1)}
  Compute partition {g2(1), g2(2), …, g2(N2)}
Until no pixel labelling changes.
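A minimal MATLAB sketch of this iteration on greylevels (the image is illustrative; labels start random, exactly as in the algorithm above):

g = double(imread('coins.png'));
labels = rand(size(g)) > 0.5;             % random initial labelling
changed = true;
while changed                             % until no pixel labelling changes
    c1 = mean(g(labels));                 % mean of pixels with label 1
    c2 = mean(g(~labels));                % mean of pixels with label 2
    newlab = abs(g - c1) < abs(g - c2);   % reassign to nearest centre
    changed = any(newlab(:) ~= labels(:));
    labels = newlab;
end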
Greylevel clustering

Two questions to answer:
Does this algorithm converge? If so, to what does it converge?
We can show that the algorithm is guaranteed to converge, and also that it converges to a sensible result.
Greylevel clustering

Outline proof of algorithm convergence: define a cost function at iteration r:
E^(r) = (1/N1) Σ_{i=1..N1} (g1^(r)(i) − c1^(r−1))² + (1/N2) Σ_{i=1..N2} (g2^(r)(i) − c2^(r−1))²
E^(r) > 0
Greylevel clustering

Now update the cluster centres:
c1^(r) = (1/N1) Σ_{i=1..N1} g1^(r)(i)
c2^(r) = (1/N2) Σ_{i=1..N2} g2^(r)(i)
Finally update the cost function:
E1^(r) = (1/N1) Σ_{i=1..N1} (g1^(r)(i) − c1^(r))² + (1/N2) Σ_{i=1..N2} (g2^(r)(i) − c2^(r))²
Greylevel clustering

It is easy to show that
E^(r+1) ≤ E1^(r) ≤ E^(r)
Since E^(r) > 0, we conclude that the algorithm must converge. But what does the algorithm converge to?
Greylevel clustering

E1 is simply the sum of the variances within each cluster, which is minimised at convergence.
This gives sensible results for well separated clusters, and similar performance to thresholding.
Greylevel clustering

[Figure: converged cluster centres c1 and c2 partition the greylevels into groups g1 and g2]
Relaxation labelling

All of the segmentation algorithms we have considered thus far have been based on the histogram of the image.
This ignores the greylevels of each pixel's neighbours, which will strongly influence the classification of each pixel: objects are usually represented by a spatially contiguous set of pixels.
Relaxation labelling

The following is a trivial example of a likely pixel misclassification:
[Figure: a single pixel labelled Object isolated inside a Background region]
Relaxation labelling

Relaxation labelling is a fairly general technique in computer vision which is able to incorporate constraints (such as spatial continuity) into image labelling problems.
We will look at a simple application to greylevel image segmentation; it could be extended to colour/texture/motion segmentation.
Relaxation labelling

Assume a simple object/background image.
p(i) is the probability that pixel i is a background pixel.
(1 − p(i)) is the probability that pixel i is an object pixel.
Relaxation labelling

Define the 8-neighbourhood of pixel i as {i1, i2, …, i8}:
i1 i2 i3
i4  i i5
i6 i7 i8
Relaxation labelling

Define consistencies cs and cd.
Positive cs and negative cd encourage neighbouring pixels to have the same label.
Setting these consistencies to appropriate values will encourage spatially contiguous object and background regions.
Relaxation labelling

We assume again a bi-modal object/background histogram with maximum greylevel gmax.
[Figure: object and background peaks in the range 0 to gmax]
Relaxation labelling

We can initialize the probabilities:
p^(0)(i) = g(i) / gmax
Our relaxation algorithm must drive the background pixel probabilities p(i) to 1 and the object pixel probabilities to 0.
Relaxation labelling

We want to take into account:
the neighbouring probabilities p(i1), p(i2), …, p(i8);
the consistency values cs and cd.
We would like our algorithm to saturate, such that p(i) ~ 1 (or 0).
We can then convert the probabilities to labels by multiplying by 255.
Relaxation labelling

We can derive the equation for relaxation labelling by first considering a neighbour i1 of pixel i.
We would like to evaluate the contribution to the increment in p(i) from i1. Let this increment be q(i1).
We can evaluate q(i1) by taking into account the consistencies.
Relaxation labelling

We can apply a simple decision rule to determine the contribution to the increment q(i1) from pixel i1:
If p(i1) > 0.5, the contribution from pixel i1 increments p(i).
If p(i1) < 0.5, the contribution from pixel i1 decrements p(i).
Relaxation labelling

Since cs > 0 and cd < 0, it's easy to see that the following expression for q(i1) has the right properties:
q(i1) = cs p(i1) + cd (1 − p(i1))
We can now average all the contributions from the 8-neighbours of i to get the total increment to p(i):
δp(i) = (1/8) Σ_{h=1..8} [cs p(ih) + cd (1 − p(ih))]
Relaxation labelling

Easy to check that −1 < δp(i) < 1 for −1 < cs, cd < 1.
We can update p(i) as follows:
p^(r)(i) ~ p^(r−1)(i) (1 + δp(i))
This ensures that p(i) remains positive, and is the basic form of the relaxation equation.
Relaxation labelling

We need to normalize the probabilities p(i), as they must stay in the range {0..1}.
After every iteration, p^(r)(i) is rescaled to bring it back into the correct range.
Remember our requirement that likely background pixel probabilities are driven to 1.
Relaxation labelling

One possible approach is to use a constant normalisation factor:
p^(r)(i) = p^(r)(i) / max_i p^(r)(i)
In the following example, the central background pixel probability may get stuck at 0.9 if max(p(i)) = 1:
0.9 0.9 0.9
0.9 0.9 0.9
0.9 0.9 0.9
Relaxation labelling

The following normalisation equation has all the right properties, and can be derived from the general theory of relaxation labelling:
p^(r)(i) = p^(r−1)(i)(1 + δp(i)) / [p^(r−1)(i)(1 + δp(i)) + (1 − p^(r−1)(i))(1 − δp(i))]
It's easy to check that 0 < p^(r)(i) < 1.
Relaxation labelling

We can check that this normalisation equation has the correct saturation properties:
When p^(r−1)(i) = 1, p^(r)(i) = 1.
When p^(r−1)(i) = 0, p^(r)(i) = 0.
When p^(r−1)(i) = 0.9 and δp(i) = 0.9, p^(r)(i) = 0.994.
When p^(r−1)(i) = 0.1 and δp(i) = −0.9, p^(r)(i) = 0.012.
We can see that p(i) converges to 0 or 1.
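Putting the update and normalisation together, a MATLAB sketch of the whole iteration (consistency values, iteration count and image are illustrative; borders are handled crudely by zero padding):

g = double(imread('coins.png'));
p = g / max(g(:));                        % p0(i) = g(i)/gmax
cs = 0.5; cd = -0.5;                      % consistencies
for r = 1:20
    nb = conv2(p, [1 1 1; 1 0 1; 1 1 1]/8, 'same');  % mean neighbour probability
    dp = cs*nb + cd*(1 - nb);             % average increment (the sum is linear in p(ih))
    num = p .* (1 + dp);
    p = num ./ (num + (1 - p) .* (1 - dp));          % normalisation equation
end
labels = round(p * 255);                  % convert probabilities to labels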
Relaxation labelling

Algorithm performance on the high noise image, compared with thresholding:
[Figure: high noise circle image; optimum threshold; relaxation labelling, 20 iterations]
Relaxation labelling

The following is an example of a case where the algorithm has problems due to the thin structure in the clamp image:
[Figure: original clamp image; clamp image with noise added; segmented clamp image after 10 iterations]
Relaxation labelling

Applying the algorithm to normal greylevel images, we can see a clear separation into light and dark areas:
[Figure: original; 2 iterations; 5 iterations; 10 iterations]
Relaxation labelling

The histogram of each image shows the clear saturation to 0 and 255:
[Figure: histograms h(i) of the original image and after 2, 5 and 10 iterations]
The Expectation/Maximization (EM)
algorithm
In relaxation labelling we have seen that we represent the probability that a pixel has a certain label.
In general we may imagine that an image comprises L segments (labels).
Within segment l, the pixels (feature vectors) have a probability distribution represented by p_l(x | θ_l).
θ_l represents the parameters of the data in segment l:
mean and variance of the greylevels;
mean vector and covariance matrix of the colours;
texture parameters.
The Expectation/Maximization (EM)
algorithm

[Figure: an image comprising segments labelled 1, 5 and 3]
The Expectation/Maximization (EM)
algorithm
Once again a chicken and egg problem arises.
If we knew θ_l : l = 1..L, then we could obtain a labelling for each x by simply choosing the label which maximizes p_l(x | θ_l).
If we knew the label for each x, we could obtain θ_l : l = 1..L by using a simple maximum likelihood estimator.
The EM algorithm is designed to deal with this type of problem, but it frames it slightly differently: it regards segmentation as a missing (or incomplete) data estimation problem.
The Expectation/Maximization (EM)
algorithm
The incomplete data are just the measured pixel greylevels or feature vectors. We can define a probability distribution of the incomplete data as p_i(x; θ_1, θ_2, …, θ_L).
The complete data are the measured greylevels or feature vectors plus a mapping function f(.) which indicates the labelling of each pixel.
Given the complete data (pixels plus labels) we can easily work out estimates of the parameters θ_l : l = 1..L, but from the incomplete data no closed form solution exists.
The Expectation/Maximization (EM)
algorithm
Once again we resort to an iterative strategy and hope that we get convergence. The algorithm is as follows:
Initialize an estimate of θ_l : l = 1..L
Repeat
  Step 1 (E step): obtain an estimate of the labels based on the current parameter estimates.
  Step 2 (M step): update the parameter estimates based on the current labelling.
Until convergence
The Expectation/Maximization (EM)
algorithm
A recent approach to applying EM to image segmentation is to assume the image pixels or feature vectors follow a mixture model. Generally we assume that each component of the mixture model is a Gaussian: a Gaussian mixture model (GMM).
p(x | Θ) = Σ_{l=1..L} π_l p_l(x | θ_l)
p_l(x | θ_l) = (1 / ((2π)^(d/2) det(Σ_l)^(1/2))) exp(−(1/2)(x − μ_l)ᵀ Σ_l⁻¹ (x − μ_l))
Σ_{l=1..L} π_l = 1
The Expectation/Maximization (EM)
algorithm
Our parameter space for the distribution now includes the mean vectors and covariance matrices for each component in the mixture, plus the mixing weights:
Θ = {π_1, μ_1, Σ_1, ……, π_L, μ_L, Σ_L}
We choose a Gaussian for each component because the ML estimate of each parameter in the M-step becomes linear.
The Expectation/Maximization (EM)
algorithm
Define a posterior probability P(l | x_j, θ_l) as the probability that pixel j belongs to region l, given the value of the feature vector x_j.
Using Bayes' rule we can write the following equation:
P(l | x_j, θ_l) = π_l p_l(x_j | θ_l) / Σ_{k=1..L} π_k p_k(x_j | θ_k)
This is effectively the E-step of our EM algorithm, as it allows us to assign probabilities to each label at each pixel.
The Expectation/Maximization (EM)
algorithm
The M step simply updates the parameter estimates using ML estimation:
π_l^(m+1) = (1/n) Σ_{j=1..n} P(l | x_j, θ_l^(m))
μ_l^(m+1) = Σ_{j=1..n} x_j P(l | x_j, θ_l^(m)) / Σ_{j=1..n} P(l | x_j, θ_l^(m))
Σ_l^(m+1) = Σ_{j=1..n} P(l | x_j, θ_l^(m)) (x_j − μ_l^(m)) (x_j − μ_l^(m))ᵀ / Σ_{j=1..n} P(l | x_j, θ_l^(m))
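A sketch of the complete E/M loop for a two-component 1-D Gaussian mixture of greylevels (initial guesses and iteration count are illustrative; the multivariate case replaces the scalar variances with covariance matrices as above):

x = double(imread('coins.png')); x = x(:).';   % 1 x n row of greylevels
n = numel(x); L = 2;
mu = [60 160]; s2 = [400 400]; w = [0.5 0.5];  % means, variances, mixing weights
for m = 1:50
    p = zeros(L, n);
    for l = 1:L                                % E step: Bayes' rule
        p(l,:) = w(l) * exp(-(x - mu(l)).^2 / (2*s2(l))) / sqrt(2*pi*s2(l));
    end
    P = p ./ sum(p, 1);                        % posterior P(l | xj)
    Nl = sum(P, 2).';                          % M step: ML updates
    w = Nl / n;
    mu = (P * x.').' ./ Nl;
    s2 = sum(P .* (x - mu.').^2, 2).' ./ Nl;
end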
Boundary Characteristics for Histogram Improvement and Local Thresholding
Light object on a dark background:
s(x,y) = 0 if ∇f < T
s(x,y) = + if ∇f ≥ T and ∇²f ≥ 0
s(x,y) = − if ∇f ≥ T and ∇²f < 0
The gradient gives an indication of whether a pixel is on an edge.
The Laplacian can yield information regarding whether a given pixel lies on the dark or the light side of an edge.
All pixels that are not on an edge are labeled 0; all pixels on the dark side of an edge are labeled +; all pixels on the light side of an edge are labeled −.
Example
Region-Based Segmentation - Region
Growing

Start with a set of seed points; grow by appending to each seed those neighbors that have similar properties, such as specific ranges of gray level.
(In this example: select all seed points with gray level 255.)
Region Growing
Criteria:
1. The absolute gray-level difference between any pixel and the seed has to be less than 65.
2. The pixel has to be 8-connected to at least one pixel in that region (if it is connected to more than one region, the regions are merged).
Corner detection
Corners contain more edges than lines.
A point on a line is hard to match; a corner is easier.
Edge Detectors Tend to Fail at
Corners
Finding Corners

Intuition:
Right at a corner, the gradient is ill defined.
Near a corner, the gradient has two different values.
Formula for Finding Corners
We look at the matrix of gradient products (gradient with respect to x, times gradient with respect to y), summed over a small region around the hypothetical corner:
C = Σ [ Ix²   IxIy ]
      [ IxIy  Iy²  ]
The matrix is symmetric. WHY THIS?


First, consider the case where:
C = [ λ1  0 ]
    [ 0  λ2 ]
This means all gradients in the neighborhood are (k, 0) or (0, c) or (0, 0) (or the off-diagonals cancel).
What is the region like if:
1. λ1 = 0?
2. λ2 = 0?
3. λ1 = 0 and λ2 = 0?
4. λ1 > 0 and λ2 > 0?
General Case:
From linear algebra, it follows that, because C is symmetric,
C = R⁻¹ [ λ1  0 ] R
        [ 0  λ2 ]
with R a rotation matrix, so every case is like one on the last slide.
So, to detect corners
Filter the image.
Compute the magnitude of the gradient everywhere.
Construct C in a window.
Use linear algebra to find λ1 and λ2.
If they are both big, we have a corner.
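A sketch of these steps, using the closed form for the eigenvalues of the 2×2 symmetric matrix C (window size and threshold are illustrative; checkerboard is an Image Processing Toolbox helper; thresholding the smaller eigenvalue is the Shi-Tomasi variant of this idea):

f = checkerboard(10);                             % synthetic test image
Ix = conv2(f, [-1 0 1; -2 0 2; -1 0 1], 'same');  % gradient w.r.t. x (Sobel)
Iy = conv2(f, [-1 -2 -1; 0 0 0; 1 2 1], 'same');  % gradient w.r.t. y (Sobel)
wdw = ones(5);                                    % summation window
Sxx = conv2(Ix.^2, wdw, 'same');                  % entries of C per pixel
Syy = conv2(Iy.^2, wdw, 'same');
Sxy = conv2(Ix.*Iy, wdw, 'same');
tr = Sxx + Syy;  dt = Sxx.*Syy - Sxy.^2;          % trace and determinant of C
lmin = tr/2 - sqrt(max(tr.^2/4 - dt, 0));         % smaller eigenvalue
corners = lmin > 0.5 * max(lmin(:));              % both eigenvalues big => corner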
Image Segmentation
Segmentation divides an image into its
constituent regions or objects.
Segmentation of images is a difficult task in
image processing. Still under research.
Segmentation allows to extract objects in
images.
Segmentation is unsupervised learning.
Model based object extraction, e.g., template
matching, is supervised learning.
What it is useful for
After successfully segmenting an image, the contours of objects can be extracted using edge detection and/or border following techniques.
The shape of objects can then be described.
Based on shape, texture, and color, objects can be identified.
Segmentation Algorithms
Segmentation algorithms are based on one of two basic properties of color, gray values, or texture: discontinuity and similarity.
The first category partitions an image based on abrupt changes in intensity, such as edges in an image.
The second category partitions an image into regions that are similar according to a set of predefined criteria; the histogram thresholding approach falls under this category.
Domain spaces
spatial domain (row-column (rc) space)
histogram spaces
color space
texture space
other complex feature spaces
Clustering in Color Space
1. Each image point is mapped to a point in a color space, e.g.:
Color(i, j) = (R(i, j), G(i, j), B(i, j))
This is a many-to-one mapping.
2. The points in the color space are grouped into clusters.
3. The clusters are then mapped back to regions in the image.
Examples
Original pictures segmented pictures

Mnp: 30, percent 0.05, cluster number 4

Mnp : 20, percent 0.05, cluster number 7


Displaying objects in the
Segmented Image
The objects can be distinguished by
assigning an arbitrary pixel value or
average pixel value to the pixels
belonging to the same clusters.
Segmentation by Thresholding
Suppose that the gray-level histogram
corresponds to an image f(x,y) composed of
dark objects on the light background, in
such a way that object and background
pixels have gray levels grouped into two
dominant modes. One obvious way to extract
the objects from the background is to select
a threshold T that separates these modes.
Then any point (x,y) for which f(x,y) < T is
called an object point, otherwise, the point is
called a background point.
Gray Scale Image Example

Image of a Finger Print with light background


Histogram
Segmented Image

Image after Segmentation


In MATLAB, histograms for images can be constructed using the imhist command.

I = imread('pout.tif');
figure, imshow(I);
figure, imhist(I) %look at the hist to get a threshold, e.g., 110
BW=roicolor(I, 110, 255); % makes a binary image
figure, imshow(BW) % all pixels in (110, 255) will be 1 and white
% the rest is 0 which is black

roicolor returns a region of interest selected as those pixels in I that match the values in the gray level interval.
BW is a binary image with 1's where the values of I match the values of the interval.
Thresholding Bimodal Histograms
Basic Global Thresholding:
1) Select an initial estimate for T.
2) Segment the image using T. This will produce two groups of pixels: G1, consisting of all pixels with gray level values > T, and G2, consisting of pixels with values <= T.
3) Compute the average gray level values mean1 and mean2 for the pixels in regions G1 and G2.
4) Compute a new threshold value: T = (1/2)(mean1 + mean2).
5) Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter T0.
Gray Scale Image - bimodal

Image of rice with black background


Segmented Image

Image histogram of rice Image after segmentation


Basic Adaptive Thresholding:

Images having uneven illumination are difficult to segment using a single histogram. This approach divides the original image into subimages and applies the thresholding process to each of the subimages.
Multimodal Histogram
If there are three or more dominant modes in the image histogram, the histogram has to be partitioned by multiple thresholds.
Multilevel thresholding classifies a point (x,y) as belonging to one object class if T1 < f(x,y) <= T2, to the other object class if f(x,y) > T2, and to the background if f(x,y) <= T1.
Thresholding multimodal histograms
A method based on Discrete Curve Evolution can be used to find thresholds in the histogram.
The histogram is treated as a polyline and is simplified until a few vertices remain.
Thresholds are determined by vertices that are local minima.
Discrete Curve Evolution (DCE)
It yields a sequence: P = P0, …, Pm.
Pi+1 is obtained from Pi by deleting the vertices of Pi that have minimal relevance measure:
K(v, Pi) = |d(u,v) + d(v,w) − d(u,w)|
[Figure: a polyline vertex v with neighbours u and w, before and after deletion]
Gray Scale Image - Multimodal

Original Image of lena


Multimodal Histogram

Histogram of lena
Segmented Image

Image after segmentation: we get an outline of her face, hat, shadow, etc.
Color Image - bimodal

Colour Image having a bimodal histogram


Histogram

Histograms for the three colour spaces


Segmented Image

Segmented image, skin color is shown


Split and Merge
The goal of image segmentation is to find regions that represent objects or meaningful parts of objects. Major problems of image segmentation are a result of noise in the image.
An image domain X must be segmented into N different regions R(1), …, R(N).
The segmentation rule is a logical predicate of the form P(R).
Split and Merge
Image segmentation with respect to predicate P partitions the image X into subregions R(i), i = 1, …, N, such that:
X = ∪_{i=1..N} R(i)
R(i) ∩ R(j) = ∅ for i ≠ j
P(R(i)) = TRUE for i = 1, 2, …, N
P(R(i) ∪ R(j)) = FALSE for adjacent i ≠ j
Split and Merge
The segmentation property is a logical predicate of the form P(R, x, t):
x is a feature vector associated with region R;
t is a set of parameters (usually thresholds).
A simple segmentation rule has the form:
P(R) : I(r,c) < T for all (r,c) in R
Split and Merge
In the case of color images, the feature vector x can be the three RGB image components (R(r,c), G(r,c), B(r,c)).
A simple segmentation rule may have the form:
P(R) : (R(r,c) < T(R)) && (G(r,c) < T(G)) && (B(r,c) < T(B))
Region Growing (Merge)
A simple approach to image segmentation is
to start from some pixels (seeds) representing
distinct image regions and to grow them,
until they cover the entire image
For region growing we need a rule describing
a growth mechanism and a rule checking the
homogeneity of the regions after each growth
step
Region Growing
The growth mechanism: at each stage k and for each region Ri(k), i = 1,…,N, we check if there are unclassified pixels in the 8-neighbourhood of each pixel of the region border.
Before assigning such a pixel x to a region Ri(k), we check that the region homogeneity P(Ri(k) ∪ {x}) = TRUE still holds.
Region Growing Predicate
The arithmetic mean m and standard deviation std of a region R having n = |R| pixels:
m(R) = (1/n) Σ_{(r,c)∈R} I(r,c)
std(R) = sqrt( (1/(n−1)) Σ_{(r,c)∈R} (I(r,c) − m(R))² )
The predicate
P: |m(R1) − m(R2)| < k·min{std(R1), std(R2)}
is used to decide if the merging of the two regions R1, R2 is allowed, i.e., if |m(R1) − m(R2)| < k·min{std(R1), std(R2)}, the two regions R1, R2 are merged.
Split
The opposite approach to region growing is
region splitting.
It is a top-down approach and it starts with
the assumption that the entire image is
homogeneous
If this is not true, the image is split into four
sub images
This splitting procedure is repeated
recursively until we split the image into
homogeneous regions
Split
If the original image is square N × N, having dimensions that are powers of 2 (N = 2^n):
all regions produced by the splitting algorithm are squares having dimensions M × M, where M is a power of 2 as well.
Since the procedure is recursive, it produces an image representation that can be described by a tree whose nodes have four sons each. Such a tree is called a quadtree.
Split
Quadtree:
[Figure: the image is split into quadrants R0, R1, R2, R3; R0 is further split into R00, R01, R02, R03]
Split
A disadvantage of splitting techniques is that they may create regions that are adjacent and homogeneous but not merged.
The split and merge method is an iterative algorithm that includes both splitting and merging at each iteration:
Split / Merge
If a region R is inhomogeneous (P(R) = FALSE), then it is split into four subregions.
If two adjacent regions Ri, Rj are homogeneous (P(Ri ∪ Rj) = TRUE), they are merged.
The algorithm stops when no further splitting or merging is possible.
Split / Merge
The split and merge algorithm
produces more compact regions than
the pure splitting algorithm
Applications
3D Imaging : A basic task in 3-D image processing
is the segmentation of an image which classifies
voxels/pixels into objects or groups. 3-D image
segmentation makes it possible to create 3-D
rendering for multiple objects and perform
quantitative analysis for the size, density and other
parameters of detected objects.
Several applications in the field of Medicine like
magnetic resonance imaging (MRI).
Results Region grow
Results Region Split
Results Region Split and
Merge
Introduction
All pixels belong to a region
Object
Part of object
Background
Find region
Constituent pixels
Boundary
Watersheds of Gradient
Magnitude
Compare geographical watersheds
Divide landscape into catchment basins
Edges correspond to watersheds
Algorithm
Locate local minima
Flood image from these points
When two floods meet
Identify a watershed pixel
Build a dam
Continue flooding
Example
[Figure: watersheds, local minima, watershed points, and a dam]
Image Segmentation
Background
First-order derivative:
∂f/∂x = f′(x) = f(x+1) − f(x)
Second-order derivative:
∂²f/∂x² = f(x+1) + f(x−1) − 2f(x)
Characteristics of First and Second Order
Derivatives
First-order derivatives generally produce thicker edges in an image.
Second-order derivatives have a stronger response to fine detail, such as thin lines, isolated points, and noise.
Second-order derivatives produce a double-edge response at ramp and step transitions in intensity.
The sign of the second derivative can be used to determine whether a transition into an edge is from light to dark or dark to light.
Detection of Isolated Points
The Laplacian:
∇²f(x,y) = ∂²f/∂x² + ∂²f/∂y²
= f(x+1,y) + f(x−1,y) + f(x,y+1) + f(x,y−1) − 4f(x,y)
g(x,y) = 1 if |R(x,y)| ≥ T, 0 otherwise, where R = Σ_{k=1..9} wk zk
Image Segmentation

Image segmentation divides an image into regions that are connected and have some similarity within the region and some difference between adjacent regions.
The goal is usually to find individual objects in an image.
For the most part there are fundamentally two kinds of approaches to segmentation: discontinuity and similarity.
Similarity may be due to pixel intensity, color or texture.
Differences are sudden changes (discontinuities) in any of these, but especially sudden changes in intensity along a boundary line, which is called an edge.
Detection of Discontinuities

There are three kinds of discontinuities of intensity: points, lines and edges.
The most common way to look for discontinuities is to scan a small mask over the image. The mask determines which kind of discontinuity to look for.
R = w1 z1 + w2 z2 + … + w9 z9 = Σ_{i=1..9} wi zi
Detection of Discontinuities
Point Detection

|R| ≥ T, where T is a nonnegative threshold
Detection of Discontinuities
Line Detection

Only slightly more common than point detection is finding one pixel wide lines in an image.
For digital images, the only three-point straight lines are horizontal, vertical, or diagonal (±45°).
Detection of Discontinuities
Line Detection
Detection of Discontinuities
Edge Detection
Detection of Discontinuities
Edge Detection
Detection of Discontinuities
Edge Detection
Detection of Discontinuities
Edge Detection
Detection of Discontinuities
Gradient Operators

First-order derivatives:
The gradient of an image f(x,y) at location (x,y) is defined as the vector:
∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
The magnitude of this vector: ∇f = mag(∇f) = [Gx² + Gy²]^(1/2)
The direction of this vector: α(x,y) = tan⁻¹(Gy/Gx)
The direction of an edge is perpendicular to the direction of the gradient vector (rotated by 90°).
Basic Edge Detection by Using First-Order Derivative
Edge normal: ∇f = grad(f) = [gx, gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
Edge unit normal: ∇f / mag(∇f)
In practice, the magnitude is sometimes approximated by
mag(∇f) = |∂f/∂x| + |∂f/∂y|  or  mag(∇f) = max(|∂f/∂x|, |∂f/∂y|)
Detection of Discontinuities
Gradient Operators

Roberts cross-gradient operators

Prewitt operators

Sobel operators
Detection of Discontinuities
Gradient Operators

Prewitt masks for


detecting diagonal edges

Sobel masks for


detecting diagonal edges
Detection of Discontinuities
Gradient Operators: Example

∇f ≈ |Gx| + |Gy|
Detection of Discontinuities
Gradient Operators: Example
Detection of Discontinuities
Gradient Operators: Example
Detection of Discontinuities
Gradient Operators

Second-order derivatives: (the Laplacian)
The Laplacian of a 2D function f(x,y) is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y²
Two forms are used in practice: [Figure: 4-neighbour and 8-neighbour Laplacian masks]
Detection of Discontinuities
Gradient Operators

Consider the function (a Gaussian):
h(r) = −e^(−r²/2σ²), where r² = x² + y² and σ is the standard deviation.
The Laplacian of h is
∇²h(r) = −[(r² − σ²)/σ⁴] e^(−r²/2σ²)   (the Laplacian of a Gaussian, LoG)
The Laplacian of a Gaussian is sometimes called the Mexican hat function. It can also be computed by smoothing the image with the Gaussian smoothing mask, followed by application of the Laplacian mask.
Detection of Discontinuities
Gradient Operators
Detection of Discontinuities
Gradient Operators: Example

[Figure: Sobel gradient; Gaussian smoothing function; Laplacian mask]
Detection of Discontinuities
Gradient Operators: Example
Edge Linking and Boundary Detection
Local Processing

Two properties of edge points are useful for edge linking:
the strength (or magnitude) of the detected edge points;
their directions (determined from gradient directions).
This is usually done in local neighborhoods. Adjacent edge points with similar magnitude and direction are linked.
For example, an edge pixel with coordinates (x0,y0) in a predefined neighborhood of (x,y) is similar to the pixel at (x,y) if
|∇f(x,y) − ∇f(x0,y0)| ≤ E, with E a nonnegative threshold, and
|α(x,y) − α(x0,y0)| < A, with A a nonnegative angle threshold.
Edge Linking and Boundary Detection
Local Processing: Example

In this example,
we can find the
license plate
candidate after
edge linking
process.
Edge Linking and Boundary Detection
Global Processing via the Hough Transform

Hough transform: a way of finding edge points in an image that lie along a straight line.
Example: xy-plane vs. ab-plane (parameter space), using yi = a·xi + b.
Edge Linking and Boundary Detection
Global Processing via the Hough Transform

The Hough transform consists of finding all pairs of values of ρ and θ which satisfy the equation
x cos θ + y sin θ = ρ
for each point (x,y). These are accumulated in what is basically a 2-dimensional histogram.
When plotted, these pairs of ρ and θ look like a sine wave. The process is repeated for all appropriate (x,y) locations.
Edge Linking and Boundary Detection
Hough Transform Example
The intersections of the curves correspond to the points 1,3,5; 2,3,4; and 1,4.
Edge Linking and Boundary Detection
Hough Transform Example
Thresholding

Assumption: the range of intensity levels covered by objects of interest is different from the background.
g(x,y) = 1 if f(x,y) > T; g(x,y) = 0 if f(x,y) ≤ T
Single threshold vs. multiple thresholds.
Thresholding
The Role of Illumination
f(x,y) = i(x,y) · r(x,y)
[Figure: reflectance r(x,y) (a), illumination i(x,y) (c), their product (d), and the corresponding histograms (e)]
Thresholding
Basic Global Thresholding
Thresholding
Basic Global Thresholding
Thresholding
Basic Adaptive Thresholding
Thresholding
Basic Adaptive Thresholding

How to solve this problem?


Thresholding
Basic Adaptive Thresholding

Answer: subdivision
Thresholding
Optimal Global and Adaptive Thresholding

This method treats pixel values as probability density functions.
The goal of this method is to minimize the probability of misclassifying pixels as either object or background. There are two kinds of error:
mislabeling an object pixel as background, and
mislabeling a background pixel as object.
Thresholding
Use of Boundary Characteristics
Thresholding
Thresholds Based on Several Variables

Color image
Region-Based Segmentation

Edges and thresholds sometimes do not give good results for segmentation.
Region-based segmentation is based on the connectivity of similar pixels in a region:
each region must be uniform;
connectivity of the pixels within the region is very important.
There are two main approaches to region-based segmentation: region growing and region splitting.
Region-Based Segmentation
Basic Formulation

Let R represent the entire image region. Segmentation is a process that partitions R into subregions R1, R2, …, Rn, such that
(a) ∪_{i=1..n} Ri = R
(b) Ri is a connected region, i = 1, 2, …, n
(c) Ri ∩ Rj = ∅ for all i and j, i ≠ j
(d) P(Ri) = TRUE for i = 1, 2, …, n
(e) P(Ri ∪ Rj) = FALSE for any adjacent regions Ri and Rj
where P(Rk) is a logical predicate defined over the points in set Rk.
For example: P(Rk) = TRUE if all pixels in Rk have the same gray level.
Region-Based Segmentation
Region Growing
Region-Based Segmentation
Region Growing

Fig. 10.41 shows the histogram of Fig. 10.40(a). It is difficult to segment the defects by thresholding methods; applying region growing methods is better in this case.
[Figure 10.40(a) and Figure 10.41]
Region-Based Segmentation
Region Splitting and Merging

Region splitting is the opposite of region growing.
First there is a large region (possibly the entire image).
Then a predicate (measurement) is used to determine if the region is uniform.
If not, the method requires that the region be split into two regions.
Then each of these two regions is independently tested by the predicate (measurement).
This procedure continues until all resulting regions are uniform.
Region-Based Segmentation
Region Splitting

The main problem with region splitting is determining where to split a region.
One method to divide a region is to use a quadtree structure.
Quadtree: a tree in which each node has exactly four descendants.
Region-Based Segmentation
Region Splitting and Merging

The split and merge procedure:
Split into four disjoint quadrants any region Ri for which P(Ri) = FALSE.
Merge any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = TRUE. (The quadtree structure may not be preserved.)
Stop when no further merging or splitting is possible.
Segmentation by Morphological Watersheds

The concept of watersheds is based on visualizing an image in three dimensions: two spatial coordinates versus gray levels.
In such a topographic interpretation, we consider three types of points:
(a) points belonging to a regional minimum;
(b) points at which a drop of water would fall with certainty to a single minimum;
(c) points at which water would be equally likely to fall to more than one such minimum.
The principal objective of segmentation algorithms based on these concepts is to find the watershed lines.
Watershed Segmentation Algorithm
Visualize an image in 3D: spatial coordinates and gray levels.
In such a topographic interpretation, there are 3 types of points:
Points belonging to a regional minimum

Points at which a drop of water would fall to a single


minimum. (The catchment basin or watershed of that
minimum.)
Points at which a drop of water would be equally likely to fall
to more than one minimum. (The divide lines or watershed
lines.)
Watershed lines
Watershed Segmentation
Algorithm
The objective is to find the watershed lines. The idea is simple:
Suppose that a hole is punched in each regional minimum and that the entire topography is flooded from below by letting water rise through the holes at a uniform rate.
When rising water in distinct catchment basins is about to merge, a dam is built to prevent merging. These dam boundaries correspond to the watershed lines.
Segmentation by Morphological Watersheds
Example
Segmentation by Morphological Watersheds
Example
The watershed algorithm is often applied to the gradient of an image, rather than to the image itself.
Regional minima of catchment basins correlate nicely with the small values of the gradient corresponding to the objects of interest; boundaries are highlighted as the watershed lines.
Watershed Segmentation Algorithm
Start with all pixels with the lowest
possible value.
These form the basis for initial watersheds
For each intensity level k:
For each group of pixels of intensity k
If adjacent to exactly one existing region, add these
pixels to that region
Else if adjacent to more than one existing regions,
mark as boundary
Else start a new region
Watershed Segmentation Algorithm
Watershed algorithm might be used on the gradient image instead
of the original image.
Watershed Segmentation Algorithm

Due to noise and other local irregularities of the gradient,


oversegmentation might occur.
Watershed Segmentation Algorithm

A solution is to limit the number of regional minima. Use


markers to specify the only allowed regional minima.
The Use of Markers
Internal markers are used to limit the number of regions by specifying the objects of interest:
like seeds in the region growing method;
can be assigned manually or automatically;
regions without markers are allowed to be merged (no dam is to be built).
External markers: those pixels we are confident belong to the background.
Watershed lines are typical external markers, and they belong to the same (background) region.
Watershed Segmentation Algorithm

A solution is to limit the number of regional minima. Use


markers to specify the only allowed regional minima. (For
example, gray-level values might be used as a marker.)
Segmentation by Morphological Watersheds
Example
The Use of Motion in Segmentation

ADI: accumulative difference image


The Use of Motion in Segmentation

MATLAB Example
A: original image f.
B: direct watershed transform result using the following commands (g is the gradient image of A):
L = watershed(g);
wr = (L == 0);
C: all of the regional minima of g, using
rm = imregionalmin(g);
D: internal markers, obtained by
im = imextendedmin(g, 2);
fim = f;
fim(im) = 175;
E: external markers, using
Lim = watershed(bwdist(im));
em = (Lim == 0);
F: modified gradient image obtained from the internal and external markers:
g2 = imimposemin(g, im | em);
G: final segmentation result:
L2 = watershed(g2);
f2 = f;
f2(L2 == 0) = 255;
