
Image and Video Processing

Module 3
Image Segmentation

by
Dr. S. D. Ruikar
Syllabus

Introduction, classification of image segmentation algorithms, detection of discontinuities, edge detection, Hough transform, corner detection, thresholding, region growing, segmentation approaches.
Introduction to image segmentation
The purpose of image segmentation is to partition an image into meaningful regions with respect to a particular application.
The segmentation is based on measurements taken from the image, which might be greylevel, colour, texture, depth or motion.
Introduction to image segmentation
Usually image segmentation is an initial and
vital step in a series of processes aimed at
overall image understanding
Applications of image segmentation include
Identifying objects in a scene for object-based measurements
such as size and shape
Identifying objects in a moving scene for object-based video
compression (MPEG4)
Identifying objects which are at different distances from a
sensor using depth measurements from a laser range finder
enabling path planning for a mobile robots
Introduction to image segmentation
Example 1: Segmentation based on greyscale
A very simple model of greyscale leads to inaccuracies in object labelling.
Introduction to image segmentation
Example 2: Segmentation based on texture
Enables object surfaces with varying patterns of grey to be segmented.
Introduction to image segmentation

Example 3
Segmentation based on depth
This example shows a range image, obtained with
a laser range finder
A segmentation based on the range (the object
distance from the sensor) is useful in guiding
mobile robots
Introduction to image segmentation
[Figure: original image, range image, and segmented image]
Principal approaches
Segmentation algorithms generally are based on one of two basic properties of intensity values:
discontinuity: partition an image based on sharp changes in intensity (such as edges)
similarity: partition an image into regions that are similar according to a set of predefined criteria.
Detection of Discontinuities
Detect the three basic types of gray-level discontinuities: points, lines and edges.
The common way is to run a mask through the image.
Goal: Extract Blobs

What are blobs? Regions of an image that are somehow coherent.
Why? Object extraction, object removal, compositing, etc.
But are blobs objects? No, not in general.
Blobs coherence
The simplest way to define blob coherence is as similarity in brightness or color:
the house, grass, and sky make different blobs; the tools become blobs.
The meaning of a blob
Other interpretations of blobs are
possible, depending on how you define
the input image:
Image can be a response of a particular
detector
Color Detector
Face detector
Motion Detector
Edge Detector
Why is this useful?

AIBO RoboSoccer (Veloso Lab)
[Figure: ideal segmentation vs. result of segmentation]
Thresholding
Basic segmentation operation:
mask(x,y) = 1 if im(x,y) > T
mask(x,y) = 0 if im(x,y) ≤ T
where T is the threshold, either user-defined or chosen automatically.
This is the same as partitioning the histogram.
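A minimal MATLAB sketch of this operation (the image and threshold value are illustrative; any greyscale image will do):

im = im2double(imread('coins.png'));  % example greyscale image
T = 0.5;                              % user-defined threshold
mask = im > T;                        % mask(x,y) = 1 where im(x,y) > T
figure, imshow(mask)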
As Edge Detection
Threshold the squared gradient magnitude: mark an edge where gx² + gy² > T.
Sometimes works well, but more often not. What are the potential problems?
Adaptive thresholding
Region growing
Start with an initial set of pixels K.
Add to K any neighbours that are within a similarity threshold.
Repeat until nothing changes.
Is this the same as a global threshold? What can go wrong?
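A minimal MATLAB sketch of this loop, assuming greylevel similarity to the seed as the criterion (the seed position and tolerance are illustrative):

im = im2double(imread('coins.png'));
seed = [100 100];                       % assumed (row, col) inside the object
tol = 0.1;                              % similarity threshold
K = false(size(im));
K(seed(1), seed(2)) = true;             % initial set of pixels K
changed = true;
while changed                           % repeat until nothing changes
    nbrs = conv2(double(K), ones(3), 'same') > 0 & ~K;     % 8-neighbours of K
    accept = nbrs & abs(im - im(seed(1), seed(2))) < tol;  % within threshold?
    changed = any(accept(:));
    K = K | accept;
end

Unlike a global threshold, only pixels connected to the seed are labelled, which is one answer to the questions above.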
Boundary Extraction
Point Detection
A point has been detected at the location on which the mask is centered if
|R| ≥ T
where T is a nonnegative threshold and R is the sum of products of the coefficients with the gray levels contained in the region encompassed by the mask.
Point Detection
Note that the mask is the same as the mask of the Laplacian operation.
The only differences that are considered of interest are those large enough (as determined by T) to be considered isolated points: |R| ≥ T.
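As a sketch, the test |R| ≥ T can be implemented by convolving the image with the point mask (the image and threshold choice are illustrative):

w = [-1 -1 -1; -1 8 -1; -1 -1 -1];   % point mask; coefficients sum to zero
f = im2double(imread('moon.tif'));
R = conv2(f, w, 'same');             % sum of products at every location
T = 0.9 * max(abs(R(:)));            % nonnegative threshold
points = abs(R) >= T;                % detected isolated points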
Example
Line Detection

The horizontal mask gives its maximum response when a line passes through the middle row of the mask against a constant background.
The same idea is used with the other masks.
Note: the preferred direction of each mask is weighted with a larger coefficient (i.e., 2) than the other possible directions.
Line Detection
Apply every mask to the image.
Let R1, R2, R3, R4 denote the responses of the horizontal, +45 degree, vertical and −45 degree masks, respectively.
If, at a certain point in the image, |Ri| > |Rj| for all j ≠ i, that point is said to be more likely associated with a line in the direction of mask i.
Line Detection
Alternatively, if we are interested in detecting all lines in an image in the direction defined by a given mask, we simply run the mask through the image and threshold the absolute value of the result.
The points that are left are the strongest responses which, for lines one pixel thick, correspond closest to the direction defined by the mask.
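A short MATLAB sketch of running all four masks and picking the strongest response per pixel (the test image is illustrative):

f = im2double(imread('circuit.tif'));
m = {[-1 -1 -1;  2  2  2; -1 -1 -1], ...   % horizontal
     [-1 -1  2; -1  2 -1;  2 -1 -1], ...   % +45 degrees
     [-1  2 -1; -1  2 -1; -1  2 -1], ...   % vertical
     [ 2 -1 -1; -1  2 -1; -1 -1  2]};      % -45 degrees
R = zeros([size(f) 4]);
for i = 1:4
    R(:,:,i) = conv2(f, m{i}, 'same');     % response Ri of mask i
end
[Rmax, direction] = max(abs(R), [], 3);    % per pixel: |Ri| > |Rj| for all j ~= i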
Example
Edge Detection
We previously discussed approaches for implementing the first-order derivative (gradient operator) and the second-order derivative (Laplacian operator); both were introduced in Chapter 3. Here we will talk only about their properties for edge detection.
Edge Detection
Our goal is to extract a line drawing representation from an image.
Useful for recognition: edges contain shape information; they also offer a degree of invariance.
Edge Detection

The ability to measure gray-level transitions in a meaningful way.
(R. C. Gonzalez & R. E. Woods, Digital Image Processing, 2nd Edition, Prentice-Hall, 2001)
Gray-Level Transition
[Figure: ideal and ramp gray-level transitions]
Detecting the Edge (1)
First derivative: an edge is detected at x where |∂I(x,y)/∂x| exceeds the threshold TRSH.
Detecting the Edge (2)
If the transition is too gradual, |∂I(x,y)/∂x| never exceeds TRSH and the edge is not detected.
Gradient Operators
The gradient of the image I(x,y) at location (x,y) is the vector:
∇I = [Gx, Gy]ᵀ = [∂I(x,y)/∂x, ∂I(x,y)/∂y]ᵀ
The magnitude of the gradient: |∇I| = √(Gx² + Gy²)
The direction of the gradient vector: α(x,y) = tan⁻¹(Gy/Gx)
The Meaning of the Gradient
It represents the direction of the strongest variation in intensity (horizontal, vertical, or generic edge).
Edge strength: |∇I| = √(Gx² + Gy²)
Edge direction: α(x,y) = tan⁻¹(Gy/Gx)
The direction of the edge at location (x,y) is perpendicular to the gradient vector at that point.
Calculating the Gradient
For each pixel the gradient is calculated based on a 3×3 neighbourhood around this pixel:
z1 z2 z3
z4 z5 z6
z7 z8 z9
The Sobel Edge Detector
Gx mask:            Gy mask:
-1 -2 -1            -1  0  1
 0  0  0            -2  0  2
 1  2  1            -1  0  1
Gx = (z7 + 2z8 + z9) − (z1 + 2z2 + z3)
Gy = (z3 + 2z6 + z9) − (z1 + 2z4 + z7)
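A minimal MATLAB sketch of these formulas via 2-D convolution (the test image is illustrative; conv2 rotates the mask by 180 degrees, which only flips the signs of Gx and Gy and leaves the magnitude unchanged):

f = im2double(imread('cameraman.tif'));
Gx = conv2(f, [-1 -2 -1; 0 0 0; 1 2 1], 'same');   % Gx mask
Gy = conv2(f, [-1 0 1; -2 0 2; -1 0 1], 'same');   % Gy mask
G = sqrt(Gx.^2 + Gy.^2);                           % gradient magnitude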
The Prewitt Edge Detector
Gx mask:            Gy mask:
-1 -1 -1            -1  0  1
 0  0  0            -1  0  1
 1  1  1            -1  0  1
Gx = (z7 + z8 + z9) − (z1 + z2 + z3)
Gy = (z3 + z6 + z9) − (z1 + z4 + z7)
The Roberts Edge Detector
Gx mask:            Gy mask:
 0  0  0             0  0  0
 0 -1  0             0  0 -1
 0  0  1             0  1  0
Gx = z9 − z5
Gy = z8 − z6
The Roberts edge detector is in fact a 2×2 operator embedded in a 3×3 mask.
The Canny Method
Two possible implementations:
1. The image is convolved with a Gaussian filter before gradient evaluation:
h(r) = e^(−r²/2σ²), where r² = x² + y²
2. The image is convolved with the gradient of the Gaussian filter.
The Edge Detection Algorithm
The gradient is calculated (using any of the four
methods described in the previous slides), for each
pixel in the picture.
If the absolute value exceeds a threshold, the pixel
belongs to an edge.
The Canny method uses two thresholds, and enables the detection of two edge types: strong and weak edges. If a pixel's magnitude in the gradient image exceeds the high threshold, the pixel corresponds to a strong edge. Any pixel connected to a strong edge and having a magnitude greater than the low threshold corresponds to a weak edge.
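A sketch of this two-threshold (hysteresis) step, assuming a gradient-magnitude image G has already been computed; the threshold values are illustrative, and imreconstruct (Image Processing Toolbox) keeps exactly the weak pixels that are 8-connected to a strong one. MATLAB's built-in edge(f,'canny') implements the complete method.

strong = G > 0.2;                      % high threshold: strong edges
weak = G > 0.1;                        % low threshold: all candidate edges
edges = imreconstruct(strong, weak);   % weak pixels connected to strong survive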
Ideal and Ramp Edges
Edges are ramps rather than ideal steps because of optics, sampling and other image acquisition imperfections.
Thick edge
The slope of the ramp is inversely proportional to the
degree of blurring in the edge.
We no longer have a thin (one pixel thick) path.
Instead, an edge point now is any point contained in
the ramp, and an edge would then be a set of such
points that are connected.
The thickness is determined by the length of the
ramp.
The length is determined by the slope, which is in
turn determined by the degree of blurring.
Blurred edges tend to be thick and sharp edges
tend to be thin
First and Second derivatives

the signs of the derivatives


would be reversed for an edge
that transitions from light to
dark
Second derivatives
produces 2 values for every edge in an
image (an undesirable feature)
an imaginary straight line joining the
extreme positive and negative values of
the second derivative would cross zero
near the midpoint of the edge. (zero-
crossing property)
Zero-crossing
quite useful for locating the centers of
thick edges
we will talk about it again later
Noise Images
First column: images and gray-level profiles of a ramp edge corrupted by random Gaussian noise of mean 0 and σ = 0.0, 0.1, 1.0 and 10.0, respectively.
Second column: first-derivative images and gray-level profiles.
Third column: second-derivative images and gray-level profiles.
Keep in mind
Fairly little noise can have a significant impact on the two key derivatives used for edge detection in images.
Image smoothing should be a serious consideration prior to the use of derivatives in applications where noise is likely to be present.
Edge point
To determine a point as an edge point:
the transition in grey level associated with the point has to be significantly stronger than the background at that point;
use a threshold to determine whether a value is significant or not;
the point's two-dimensional first-order derivative must be greater than a specified threshold.
Gradient Operator
∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
First derivatives are implemented using the magnitude of the gradient:
∇f = mag(∇f) = [Gx² + Gy²]^(1/2)
commonly approximated as
∇f ≈ |Gx| + |Gy|
With either form, the magnitude is a nonlinear operator.
Gradient Masks
Diagonal edges with Prewitt
and Sobel masks
Example
Example
Example
Laplacian
Laplacian operator (a linear operator):
∇²f(x,y) = ∂²f/∂x² + ∂²f/∂y²
∇²f = [f(x+1,y) + f(x−1,y) + f(x,y+1) + f(x,y−1)] − 4f(x,y)
Laplacian of Gaussian
Laplacian combined with smoothing to find edges via zero-crossings:
h(r) = −e^(−r²/2σ²), where r² = x² + y² and σ is the standard deviation
∇²h(r) = −[(r² − σ²)/σ⁴] e^(−r²/2σ²)
Positive central term, surrounded by an adjacent negative region (a function of distance), and a zero outer region: the "Mexican hat".
The coefficients must sum to zero.
Linear Operation
The second derivative is a linear operation; thus convolving the image with ∇²h (the LoG) is the same as convolving the image with the Gaussian smoothing function first and then computing the Laplacian of the result.
Example

a). Original image


b). Sobel Gradient
c). Spatial Gaussian
smoothing function
d). Laplacian mask
e). LoG
f). Threshold LoG
g). Zero crossing
Zero crossing & LoG
Approximate the zero crossings from the LoG image:
threshold the LoG image by setting all its positive values to white and all negative values to black;
the zero crossings occur between positive and negative values of the thresholded LoG.
Canny Edge Detection
Compute edge strength and orientation
at all pixels
Non-max suppression
Reduce thick edge strength responses
around true edges
Link and threshold using hysteresis
Simple method of contour completion
Non-maximum suppression:
Select the single maximum point across the width of an edge.
Non-maximum suppression
At q, the value must be larger than the values interpolated at p or r.
Examples: Non-Maximum Suppression
[Figure: original image; gradient magnitude; non-maxima suppressed]
Linking to the next edge point
Assume the marked point is an edge point. Take the normal to the gradient at that point and use this to predict continuation points (either r or s).
Edge Hysteresis
Hysteresis: A lag or momentum factor
Idea: Maintain two thresholds khigh and klow
Use khigh to find strong edges to start edge
chain
Use klow to find weak edges which continue
edge chain
Typical ratio of thresholds is roughly
khigh / klow = 2
Example: Canny Edge Detection
[Figure: original image; strong edges only; weak edges; strong + connected weak edges (the gap is gone)]
[Figure: fine scale, high threshold; coarse scale, high threshold; coarse scale, low threshold]
Finding lines in an image
Option 1:
Search for the line at every possible
position/orientation
What is the cost of this operation?

Option 2:
Use a voting scheme: Hough transform
Finding lines in an image
[Figure: image space (x,y) and Hough space (m,b): a line with slope m0 and intercept b0 maps to the point (m0, b0)]
Connection between image (x,y) and Hough (m,b) spaces:
A line in the image corresponds to a point in Hough space.
To go from image space to Hough space: given a set of points (x,y), find all (m,b) such that y = mx + b.
Finding lines in an image
[Figure: a point (x0, y0) in image space maps to a line in Hough space]
What does a point (x0, y0) in the image space map to?
A: the solutions of b = −x0·m + y0; this is a line in Hough space.
Hough transform algorithm
Typically use a different parameterization: d = x cos θ + y sin θ
d is the perpendicular distance from the line to the origin.
θ is the angle this perpendicular makes with the x axis.
Why?
Basic Hough transform algorithm
1. Initialize H[d, θ] = 0.
2. For each edge point I[x,y] in the image:
   for θ = 0 to 180: compute d = x cos θ + y sin θ and set H[d, θ] += 1.
3. Find the value(s) of (d, θ) where H[d, θ] is maximum.
4. The detected line in the image is given by d = x cos θ + y sin θ.
What's the running time (measured in # votes)?
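A direct MATLAB sketch of this voting loop, using d = x cos(θ) + y sin(θ) (the edge detector and image are illustrative):

E = edge(imread('circuit.tif'), 'canny');          % binary edge map
[y, x] = find(E);                                  % edge point coordinates
dmax = ceil(hypot(size(E,1), size(E,2)));
H = zeros(2*dmax + 1, 180);                        % accumulator H[d, theta]
for k = 1:numel(x)
    for t = 0:179
        d = round(x(k)*cosd(t) + y(k)*sind(t));    % d = x cos(t) + y sin(t)
        H(d + dmax + 1, t + 1) = H(d + dmax + 1, t + 1) + 1;
    end
end
[~, peak] = max(H(:));                             % strongest line
[dpk, tpk] = ind2sub(size(H), peak);

Each edge point casts one vote per sampled θ, so the running time is (# edge points) × 180 votes.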
Extensions
Extension 1: use the image gradient.
1. same
2. for each edge point I[x,y] in the image, compute a unique (d, θ) based on the image gradient at (x,y), then H[d, θ] += 1
3. same
4. same
What's the running time measured in votes?
Extension 2: give more votes for stronger edges.
Extension 3: change the sampling of (d, θ) to give more/less resolution.
Extension 4: the same procedure can be used with circles, squares, or any other shape.
Hough demos
Line : http://www/dai.ed.ac.uk/HIPR2/houghdemo.html
http://www.dis.uniroma1.it/~iocchi/slides/icra2001/java/hough.html

Circle : http://www.markschulze.net/java/hough/
Hough Transform for Curves
The H.T. can be generalized to detect any curve that can be expressed in parametric form:
y = f(x, a1, a2, …, ap)
a1, a2, …, ap are the parameters.
The parameter space is p-dimensional.
The accumulating array is LARGE!
Thresholding
[Figure: an image with dark background and a light object; an image with dark background and two light objects]
Multilevel thresholding
A point (x,y) belongs to
an object class if T1 < f(x,y) ≤ T2,
another object class if f(x,y) > T2,
the background if f(x,y) ≤ T1.
T depends on
only f(x,y) (gray-level values only): global threshold;
both f(x,y) and p(x,y) (gray-level values and the values of its neighbours): local threshold.
Global thresholding is easy to use when object and background are well separated.
The Role of Illumination
f(x,y) = i(x,y) · r(x,y)
a) computer generated reflectance function
b) histogram of reflectance function
c) computer generated illumination function (poor)
d) product of a) and c)
e) histogram of product image: difficult to segment
Basic Global Thresholding
Use T midway between the max and min gray levels; generate a binary image.
Basic Global Thresholding
Based on visual inspection of the histogram:
1. Select an initial estimate for T.
2. Segment the image using T. This will produce two groups of pixels: G1 consisting of all pixels with gray level values > T, and G2 consisting of pixels with gray level values ≤ T.
3. Compute the average gray level values μ1 and μ2 for the pixels in regions G1 and G2.
4. Compute a new threshold value: T = 0.5 (μ1 + μ2).
5. Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter T0.
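A compact MATLAB sketch of this iteration (the image and the tolerance T0 are illustrative; it assumes both groups stay non-empty):

f = im2double(imread('coins.png'));
T = (max(f(:)) + min(f(:))) / 2;    % step 1: initial estimate (midway value)
T0 = 1e-4;                          % predefined parameter
dT = Inf;
while dT > T0                       % step 5: iterate until T settles
    mu1 = mean(f(f > T));           % steps 2-3: mean of G1 (values > T)
    mu2 = mean(f(f <= T));          % steps 2-3: mean of G2 (values <= T)
    Tnew = 0.5 * (mu1 + mu2);       % step 4
    dT = abs(Tnew - T);
    T = Tnew;
end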
Example: Heuristic method
Note the clear valley of the histogram and the effectiveness of the segmentation between object and background.
T0 = 0; 3 iterations, with result T = 125.
Basic Adaptive Thresholding
Subdivide the original image into small areas.
Use a different threshold to segment each subimage.
Since the threshold used for each pixel depends on the location of the pixel in terms of the subimages, this type of thresholding is adaptive.
Example : Adaptive Thresholding
Further subdivision
a). Properly and improperly
segmented subimages from
previous example
b)-c). corresponding histograms
d). further subdivision of the
improperly segmented subimage.
e). histogram of small subimage at
top
f). result of adaptively segmenting
d).
Greylevel histogram-based
segmentation
We will look at two very simple image segmentation techniques that are based on the greylevel histogram of an image: thresholding and clustering.
We will use a very simple object-background test image, and consider zero, low and high noise versions of it.
Greylevel histogram-based
segmentation

Noise free Low noise High noise


Greylevel histogram-based
segmentation
How do we characterise low noise and high noise? We can consider the histograms of our images.
For the noise free image, it's simply two spikes at i=100 and i=150.
For the low noise image, there are two clear peaks centred on i=100 and i=150.
For the high noise image, there is a single peak: the two greylevel populations corresponding to object and background have merged.
Greylevel histogram-based
segmentation
[Figure: histograms h(i) of the noise free, low noise and high noise images]
Greylevel histogram-based
segmentation
We can define the input image signal-to-noise ratio in terms of the mean greylevel values of the object and background pixels and the additive noise standard deviation:
S/N = (μb − μo) / σ
Greylevel histogram-based
segmentation
For our test images:
S/N (noise free) = ∞
S/N (low noise) = 5
S/N (high noise) = 2
Greylevel thresholding
We can easily understand segmentation based on thresholding by looking at the histogram of the low noise object/background image: there is a clear valley between the two peaks.
Greylevel thresholding
[Figure: histogram h(i) showing the object and background peaks, with a threshold T in the valley between them]
Greylevel thresholding
We can define the greylevel thresholding algorithm as follows:
If the greylevel of pixel p ≤ T then pixel p is an object pixel,
else pixel p is a background pixel.
Greylevel thresholding
This simple threshold test begs the obvious question: how do we determine the threshold? Many approaches are possible:
Interactive threshold
Adaptive threshold
Minimisation method
Greylevel thresholding
We will consider in detail a minimisation method for determining the threshold: minimisation of the within group variance (Robot Vision, Haralick & Shapiro, Volume 1, page 20).
Greylevel thresholding
Idealized object/background image histogram
[Figure: bimodal histogram h(i) with threshold T]
Greylevel thresholding
Any threshold separates the histogram into
2 groups with each group having its own
statistics (mean, variance)
The homogeneity of each group is measured
by the within group variance
The optimum threshold is that threshold
which minimizes the within group variance
thus maximizing the homogeneity of each
group
Greylevel thresholding
Let group o (object) be those pixels with greylevel ≤ T.
Let group b (background) be those pixels with greylevel > T.
The prior probability of group o is po(T); the prior probability of group b is pb(T).
Greylevel thresholding
The following expressions can easily be derived for the prior probabilities of object and background:
po(T) = Σ_{i=0..T} P(i)
pb(T) = Σ_{i=T+1..255} P(i)
P(i) = h(i) / N
where h(i) is the histogram of an N pixel image.
Greylevel thresholding
The mean and variance of each group are as follows:
μo(T) = Σ_{i=0..T} i P(i) / po(T)
μb(T) = Σ_{i=T+1..255} i P(i) / pb(T)
σo²(T) = Σ_{i=0..T} (i − μo(T))² P(i) / po(T)
σb²(T) = Σ_{i=T+1..255} (i − μb(T))² P(i) / pb(T)
Greylevel thresholding
The within group variance is defined as:
σw²(T) = σo²(T) po(T) + σb²(T) pb(T)
We determine the optimum T by minimizing this expression with respect to T.
This only requires 256 evaluations for an 8-bit greylevel image.
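A brute-force MATLAB sketch of this minimisation over all candidate thresholds (the image is illustrative; minimising the within group variance is equivalent to Otsu's method, which maximises the between-group variance):

f = imread('coins.png');                         % 8-bit greylevel image
h = histcounts(f(:), 0:256);                     % histogram h(i), i = 0..255
P = h / sum(h);                                  % P(i) = h(i)/N
i = 0:255;
best = Inf;
for T = 0:254
    po = sum(P(1:T+1));  pb = 1 - po;            % prior probabilities
    if po == 0 || pb == 0, continue, end
    mo = sum(i(1:T+1) .* P(1:T+1)) / po;         % object mean
    mb = sum(i(T+2:end) .* P(T+2:end)) / pb;     % background mean
    vo = sum((i(1:T+1) - mo).^2 .* P(1:T+1)) / po;      % object variance
    vb = sum((i(T+2:end) - mb).^2 .* P(T+2:end)) / pb;  % background variance
    W = vo*po + vb*pb;                           % within group variance
    if W < best, best = W; Topt = T; end
end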
Greylevel thresholding
[Figure: histogram and within group variance versus threshold; the optimum threshold Topt lies at the minimum of the within group variance]
Greylevel thresholding
We can examine the performance of this algorithm on our low and high noise images.
For the low noise case it gives an optimum threshold of T=124, almost exactly halfway between the object and background peaks.
We can apply this optimum threshold to both the low and high noise images.
Greylevel thresholding
[Figure: low noise image, thresholded at T=124]
Greylevel thresholding
[Figure: high noise image, thresholded at T=124]
Greylevel thresholding
A high level of pixel misclassification is noticeable. This is typical performance for thresholding: the extent of pixel misclassification is determined by the overlap between the object and background histograms.
Greylevel thresholding
[Figure: object and background greylevel distributions p(x) with means μo and μb and threshold T]
Greylevel thresholding
[Figure: the same distributions with greater overlap between object and background]
Greylevel thresholding
It is easy to see that, in both cases, for any value of the threshold, some object pixels will be misclassified as background and vice versa.
For greater histogram overlap, the pixel misclassification is obviously greater.
We could even quantify the probability of error in terms of the means and standard deviations of the object and background histograms.
Greylevel clustering
Consider an idealized object/background histogram:
[Figure: bimodal histogram with cluster centres c1 (object) and c2 (background)]
Greylevel clustering

Clustering tries to separate the histogram into 2 groups, defined by two cluster centres c1 and c2.
Greylevels are classified according to the nearest cluster centre.
Greylevel clustering

A nearest neighbour clustering algorithm allows us to perform a greylevel segmentation using clustering.
It is a simple case of the more general and widely used K-means clustering: a simple iterative algorithm with known convergence properties.
Greylevel clustering

Given a set of greylevels {g(1), g(2), …, g(N)} we can partition this set into two groups:
{g1(1), g1(2), …, g1(N1)} and {g2(1), g2(2), …, g2(N2)}
Greylevel clustering

Compute the local means of each group:
c1 = (1/N1) Σ_{i=1..N1} g1(i)
c2 = (1/N2) Σ_{i=1..N2} g2(i)
Greylevel clustering

Re-define the new groupings:
|g1(k) − c1| < |g1(k) − c2| for k = 1..N1
|g2(k) − c2| < |g2(k) − c1| for k = 1..N2
In other words, all grey levels in set 1 are nearer to cluster centre c1 and all grey levels in set 2 are nearer to cluster centre c2.
Greylevel clustering

But we have a chicken and egg situation: each group mean is defined in terms of the partitions, and vice versa.
The solution is to define an iterative algorithm and worry about the convergence of the algorithm later.
Greylevel clustering

The iterative algorithm is as follows:
Initialize the label of each pixel randomly.
Repeat:
  c1 = mean of pixels assigned to the object label
  c2 = mean of pixels assigned to the background label
  Compute partition {g1(1), g1(2), …, g1(N1)}
  Compute partition {g2(1), g2(2), …, g2(N2)}
Until no pixel labelling changes.
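A minimal MATLAB sketch of this iteration on greylevels (the image is illustrative; labels start random, exactly as in the algorithm above):

g = double(imread('coins.png'));
labels = rand(size(g)) > 0.5;             % random initial labelling
changed = true;
while changed                             % until no pixel labelling changes
    c1 = mean(g(labels));                 % mean of pixels with label 1
    c2 = mean(g(~labels));                % mean of pixels with label 2
    newlab = abs(g - c1) < abs(g - c2);   % reassign to nearest centre
    changed = any(newlab(:) ~= labels(:));
    labels = newlab;
end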
Greylevel clustering

Two questions to answer:
Does this algorithm converge? If so, to what does it converge?
We can show that the algorithm is guaranteed to converge, and also that it converges to a sensible result.
Greylevel clustering

Outline proof of algorithm convergence: define a cost function at iteration r:
E^(r) = (1/N1) Σ_{i=1..N1} (g1^(r)(i) − c1^(r−1))² + (1/N2) Σ_{i=1..N2} (g2^(r)(i) − c2^(r−1))²
E^(r) > 0
Greylevel clustering

Now update the cluster centres:
c1^(r) = (1/N1) Σ_{i=1..N1} g1^(r)(i)
c2^(r) = (1/N2) Σ_{i=1..N2} g2^(r)(i)
Finally update the cost function:
E1^(r) = (1/N1) Σ_{i=1..N1} (g1^(r)(i) − c1^(r))² + (1/N2) Σ_{i=1..N2} (g2^(r)(i) − c2^(r))²
Greylevel clustering

It is easy to show that
E^(r+1) ≤ E1^(r) ≤ E^(r)
Since E^(r) > 0, we conclude that the algorithm must converge. But what does the algorithm converge to?
Greylevel clustering

E1 is simply the sum of the variances within each cluster, which is minimised at convergence.
This gives sensible results for well separated clusters, and similar performance to thresholding.
Greylevel clustering

[Figure: converged cluster centres c1 and c2 partition the greylevels into groups g1 and g2]
Relaxation labelling

All of the segmentation algorithms we have considered thus far have been based on the histogram of the image.
This ignores the greylevels of each pixel's neighbours, which will strongly influence the classification of each pixel: objects are usually represented by a spatially contiguous set of pixels.
Relaxation labelling

The following is a trivial example of a likely pixel misclassification:
[Figure: a single pixel labelled Object isolated inside a Background region]
Relaxation labelling

Relaxation labelling is a fairly general technique in computer vision which is able to incorporate constraints (such as spatial continuity) into image labelling problems.
We will look at a simple application to greylevel image segmentation; it could be extended to colour/texture/motion segmentation.
Relaxation labelling

Assume a simple object/background image.
p(i) is the probability that pixel i is a background pixel.
(1 − p(i)) is the probability that pixel i is an object pixel.
Relaxation labelling

Define the 8-neighbourhood of pixel i as {i1, i2, …, i8}:
i1 i2 i3
i4  i i5
i6 i7 i8
Relaxation labelling

Define consistencies cs and cd.
Positive cs and negative cd encourage neighbouring pixels to have the same label.
Setting these consistencies to appropriate values will encourage spatially contiguous object and background regions.
Relaxation labelling

We assume again a bi-modal object/background histogram with maximum greylevel gmax.
[Figure: object and background peaks in the range 0 to gmax]
Relaxation labelling

We can initialize the probabilities:
p^(0)(i) = g(i) / gmax
Our relaxation algorithm must drive the background pixel probabilities p(i) to 1 and the object pixel probabilities to 0.
Relaxation labelling

We want to take into account:
the neighbouring probabilities p(i1), p(i2), …, p(i8);
the consistency values cs and cd.
We would like our algorithm to saturate, such that p(i) ~ 1 (or 0).
We can then convert the probabilities to labels by multiplying by 255.
Relaxation labelling

We can derive the equation for relaxation labelling by first considering a neighbour i1 of pixel i.
We would like to evaluate the contribution to the increment in p(i) from i1. Let this increment be q(i1).
We can evaluate q(i1) by taking into account the consistencies.
Relaxation labelling

We can apply a simple decision rule to determine the contribution to the increment q(i1) from pixel i1:
If p(i1) > 0.5, the contribution from pixel i1 increments p(i).
If p(i1) < 0.5, the contribution from pixel i1 decrements p(i).
Relaxation labelling

Since cs > 0 and cd < 0, it's easy to see that the following expression for q(i1) has the right properties:
q(i1) = cs p(i1) + cd (1 − p(i1))
We can now average all the contributions from the 8-neighbours of i to get the total increment to p(i):
δp(i) = (1/8) Σ_{h=1..8} [cs p(ih) + cd (1 − p(ih))]
Relaxation labelling

Easy to check that −1 < δp(i) < 1 for −1 < cs, cd < 1.
We can update p(i) as follows:
p^(r)(i) ~ p^(r−1)(i) (1 + δp(i))
This ensures that p(i) remains positive, and is the basic form of the relaxation equation.
Relaxation labelling

We need to normalize the probabilities p(i), as they must stay in the range {0..1}.
After every iteration, p^(r)(i) is rescaled to bring it back into the correct range.
Remember our requirement that likely background pixel probabilities are driven to 1.
Relaxation labelling

One possible approach is to use a constant normalisation factor:
p^(r)(i) = p^(r)(i) / max_i p^(r)(i)
In the following example, the central background pixel probability may get stuck at 0.9 if max(p(i)) = 1:
0.9 0.9 0.9
0.9 0.9 0.9
0.9 0.9 0.9
Relaxation labelling

The following normalisation equation has all the right properties, and can be derived from the general theory of relaxation labelling:
p^(r)(i) = p^(r−1)(i)(1 + δp(i)) / [p^(r−1)(i)(1 + δp(i)) + (1 − p^(r−1)(i))(1 − δp(i))]
It's easy to check that 0 < p^(r)(i) < 1.
Relaxation labelling

We can check that this normalisation equation has the correct saturation properties:
When p^(r−1)(i) = 1, p^(r)(i) = 1.
When p^(r−1)(i) = 0, p^(r)(i) = 0.
When p^(r−1)(i) = 0.9 and δp(i) = 0.9, p^(r)(i) = 0.994.
When p^(r−1)(i) = 0.1 and δp(i) = −0.9, p^(r)(i) = 0.012.
We can see that p(i) converges to 0 or 1.
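Putting the update and normalisation together, a MATLAB sketch of the whole iteration (consistency values, iteration count and image are illustrative; borders are handled crudely by zero padding):

g = double(imread('coins.png'));
p = g / max(g(:));                        % p0(i) = g(i)/gmax
cs = 0.5; cd = -0.5;                      % consistencies
for r = 1:20
    nb = conv2(p, [1 1 1; 1 0 1; 1 1 1]/8, 'same');  % mean neighbour probability
    dp = cs*nb + cd*(1 - nb);             % average increment (the sum is linear in p(ih))
    num = p .* (1 + dp);
    p = num ./ (num + (1 - p) .* (1 - dp));          % normalisation equation
end
labels = round(p * 255);                  % convert probabilities to labels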
Relaxation labelling

Algorithm performance on the high noise image, compared with thresholding:
[Figure: high noise circle image; optimum threshold; relaxation labelling, 20 iterations]
Relaxation labelling

The following is an example of a case where the algorithm has problems due to the thin structure in the clamp image:
[Figure: original clamp image; clamp image with noise added; segmented clamp image after 10 iterations]
Relaxation labelling

Applying the algorithm to normal greylevel images, we can see a clear separation into light and dark areas:
[Figure: original; 2 iterations; 5 iterations; 10 iterations]
Relaxation labelling

The histogram of each image shows the clear saturation to 0 and 255:
[Figure: histograms h(i) of the original image and after 2, 5 and 10 iterations]
The Expectation/Maximization (EM)
algorithm
In relaxation labelling we have seen that we represent the probability that a pixel has a certain label.
In general we may imagine that an image comprises L segments (labels).
Within segment l, the pixels (feature vectors) have a probability distribution represented by p_l(x | θ_l).
θ_l represents the parameters of the data in segment l:
mean and variance of the greylevels;
mean vector and covariance matrix of the colours;
texture parameters.
The Expectation/Maximization (EM)
algorithm

[Figure: an image comprising segments labelled 1, 5 and 3]
The Expectation/Maximization (EM)
algorithm
Once again a chicken and egg problem arises.
If we knew θ_l : l = 1..L, then we could obtain a labelling for each x by simply choosing the label which maximizes p_l(x | θ_l).
If we knew the label for each x, we could obtain θ_l : l = 1..L by using a simple maximum likelihood estimator.
The EM algorithm is designed to deal with this type of problem, but it frames it slightly differently: it regards segmentation as a missing (or incomplete) data estimation problem.
The Expectation/Maximization (EM)
algorithm
The incomplete data are just the measured pixel greylevels or feature vectors. We can define a probability distribution of the incomplete data as p_i(x; θ_1, θ_2, …, θ_L).
The complete data are the measured greylevels or feature vectors plus a mapping function f(.) which indicates the labelling of each pixel.
Given the complete data (pixels plus labels) we can easily work out estimates of the parameters θ_l : l = 1..L, but from the incomplete data no closed form solution exists.
The Expectation/Maximization (EM)
algorithm
Once again we resort to an iterative strategy and hope that we get convergence. The algorithm is as follows:
Initialize an estimate of θ_l : l = 1..L
Repeat
  Step 1 (E step): obtain an estimate of the labels based on the current parameter estimates.
  Step 2 (M step): update the parameter estimates based on the current labelling.
Until convergence
The Expectation/Maximization (EM)
algorithm
A recent approach to applying EM to image segmentation is to assume the image pixels or feature vectors follow a mixture model. Generally we assume that each component of the mixture model is a Gaussian: a Gaussian mixture model (GMM).
p(x | Θ) = Σ_{l=1..L} π_l p_l(x | θ_l)
p_l(x | θ_l) = (1 / ((2π)^(d/2) det(Σ_l)^(1/2))) exp(−(1/2)(x − μ_l)ᵀ Σ_l⁻¹ (x − μ_l))
Σ_{l=1..L} π_l = 1
The Expectation/Maximization (EM)
algorithm
Our parameter space for the distribution now includes the mean vectors and covariance matrices for each component in the mixture, plus the mixing weights:
Θ = {π_1, μ_1, Σ_1, ……, π_L, μ_L, Σ_L}
We choose a Gaussian for each component because the ML estimate of each parameter in the M-step becomes linear.
The Expectation/Maximization (EM)
algorithm
Define a posterior probability P(l | x_j, θ_l) as the probability that pixel j belongs to region l, given the value of the feature vector x_j.
Using Bayes' rule we can write the following equation:
P(l | x_j, θ_l) = π_l p_l(x_j | θ_l) / Σ_{k=1..L} π_k p_k(x_j | θ_k)
This is effectively the E-step of our EM algorithm, as it allows us to assign probabilities to each label at each pixel.
The Expectation/Maximization (EM)
algorithm
The M step simply updates the parameter estimates using ML estimation:
π_l^(m+1) = (1/n) Σ_{j=1..n} P(l | x_j, θ_l^(m))
μ_l^(m+1) = Σ_{j=1..n} x_j P(l | x_j, θ_l^(m)) / Σ_{j=1..n} P(l | x_j, θ_l^(m))
Σ_l^(m+1) = Σ_{j=1..n} P(l | x_j, θ_l^(m)) (x_j − μ_l^(m)) (x_j − μ_l^(m))ᵀ / Σ_{j=1..n} P(l | x_j, θ_l^(m))
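A sketch of the complete E/M loop for a two-component 1-D Gaussian mixture of greylevels (initial guesses and iteration count are illustrative; the multivariate case replaces the scalar variances with covariance matrices as above):

x = double(imread('coins.png')); x = x(:).';   % 1 x n row of greylevels
n = numel(x); L = 2;
mu = [60 160]; s2 = [400 400]; w = [0.5 0.5];  % means, variances, mixing weights
for m = 1:50
    p = zeros(L, n);
    for l = 1:L                                % E step: Bayes' rule
        p(l,:) = w(l) * exp(-(x - mu(l)).^2 / (2*s2(l))) / sqrt(2*pi*s2(l));
    end
    P = p ./ sum(p, 1);                        % posterior P(l | xj)
    Nl = sum(P, 2).';                          % M step: ML updates
    w = Nl / n;
    mu = (P * x.').' ./ Nl;
    s2 = sum(P .* (x - mu.').^2, 2).' ./ Nl;
end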
Boundary Characteristics for Histogram Improvement and Local Thresholding
Light object on a dark background:
s(x,y) = 0 if ∇f < T
s(x,y) = + if ∇f ≥ T and ∇²f ≥ 0
s(x,y) = − if ∇f ≥ T and ∇²f < 0
The gradient gives an indication of whether a pixel is on an edge.
The Laplacian can yield information regarding whether a given pixel lies on the dark or the light side of an edge.
All pixels that are not on an edge are labeled 0; all pixels on the dark side of an edge are labeled +; all pixels on the light side of an edge are labeled −.
Example
Region-Based Segmentation - Region
Growing

Start with a set of seed points; grow by appending to each seed those neighbors that have similar properties, such as specific ranges of gray level.
(In this example: select all seed points with gray level 255.)
Region Growing
Criteria:
1. The absolute gray-level difference between any pixel and the seed has to be less than 65.
2. The pixel has to be 8-connected to at least one pixel in that region (if it is connected to more than one region, the regions are merged).
Corner detection
Corners contain more edges than lines.
A point on a line is hard to match; a corner is easier.
Edge Detectors Tend to Fail at
Corners
Finding Corners

Intuition:
Right at a corner, the gradient is ill defined.
Near a corner, the gradient has two different values.
Formula for Finding Corners
We look at the matrix of gradient products (gradient with respect to x, times gradient with respect to y), summed over a small region around the hypothetical corner:
C = Σ [ Ix²   IxIy ]
      [ IxIy  Iy²  ]
The matrix is symmetric. WHY THIS?


First, consider the case where:
C = [ λ1  0 ]
    [ 0  λ2 ]
This means all gradients in the neighborhood are (k, 0) or (0, c) or (0, 0) (or the off-diagonals cancel).
What is the region like if:
1. λ1 = 0?
2. λ2 = 0?
3. λ1 = 0 and λ2 = 0?
4. λ1 > 0 and λ2 > 0?
General Case:
From linear algebra, it follows that, because C is symmetric,
C = R⁻¹ [ λ1  0 ] R
        [ 0  λ2 ]
with R a rotation matrix, so every case is like one on the last slide.
So, to detect corners
Filter the image.
Compute the magnitude of the gradient everywhere.
Construct C in a window.
Use linear algebra to find λ1 and λ2.
If they are both big, we have a corner.
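A sketch of these steps, using the closed form for the eigenvalues of the 2×2 symmetric matrix C (window size and threshold are illustrative; checkerboard is an Image Processing Toolbox helper; thresholding the smaller eigenvalue is the Shi-Tomasi variant of this idea):

f = checkerboard(10);                             % synthetic test image
Ix = conv2(f, [-1 0 1; -2 0 2; -1 0 1], 'same');  % gradient w.r.t. x (Sobel)
Iy = conv2(f, [-1 -2 -1; 0 0 0; 1 2 1], 'same');  % gradient w.r.t. y (Sobel)
wdw = ones(5);                                    % summation window
Sxx = conv2(Ix.^2, wdw, 'same');                  % entries of C per pixel
Syy = conv2(Iy.^2, wdw, 'same');
Sxy = conv2(Ix.*Iy, wdw, 'same');
tr = Sxx + Syy;  dt = Sxx.*Syy - Sxy.^2;          % trace and determinant of C
lmin = tr/2 - sqrt(max(tr.^2/4 - dt, 0));         % smaller eigenvalue
corners = lmin > 0.5 * max(lmin(:));              % both eigenvalues big => corner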
Image Segmentation
Segmentation divides an image into its
constituent regions or objects.
Segmentation of images is a difficult task in
image processing. Still under research.
Segmentation allows to extract objects in
images.
Segmentation is unsupervised learning.
Model based object extraction, e.g., template
matching, is supervised learning.
What it is useful for
After successfully segmenting an image, the contours of objects can be extracted using edge detection and/or border following techniques.
The shape of objects can then be described.
Based on shape, texture, and color, objects can be identified.
Segmentation Algorithms
Segmentation algorithms are based on one of two basic properties of color, gray values, or texture: discontinuity and similarity.
The first category partitions an image based on abrupt changes in intensity, such as edges in an image.
The second category partitions an image into regions that are similar according to a set of predefined criteria; the histogram thresholding approach falls under this category.
Domain spaces
spatial domain (row-column (rc) space)
histogram spaces
color space
texture space
other complex feature spaces
Clustering in Color Space
1. Each image point is mapped to a point in a color space, e.g.:
Color(i, j) = (R(i, j), G(i, j), B(i, j))
This is a many-to-one mapping.
2. The points in the color space are grouped into clusters.
3. The clusters are then mapped back to regions in the image.
Examples
Original pictures segmented pictures

Mnp: 30, percent 0.05, cluster number 4

Mnp : 20, percent 0.05, cluster number 7


Displaying objects in the
Segmented Image
The objects can be distinguished by
assigning an arbitrary pixel value or
average pixel value to the pixels
belonging to the same clusters.
Segmentation by Thresholding
Suppose that the gray-level histogram
corresponds to an image f(x,y) composed of
dark objects on the light background, in
such a way that object and background
pixels have gray levels grouped into two
dominant modes. One obvious way to extract
the objects from the background is to select
a threshold T that separates these modes.
Then any point (x,y) for which f(x,y) < T is
called an object point, otherwise, the point is
called a background point.
Gray Scale Image Example

Image of a Finger Print with light background


Histogram
Segmented Image

Image after Segmentation


In MATLAB, histograms for images can be constructed using the imhist command.

I = imread('pout.tif');
figure, imshow(I);
figure, imhist(I) %look at the hist to get a threshold, e.g., 110
BW=roicolor(I, 110, 255); % makes a binary image
figure, imshow(BW) % all pixels in (110, 255) will be 1 and white
% the rest is 0 which is black

roicolor returns a region of interest selected as those pixels in I that match the values in the gray level interval.
BW is a binary image with 1's where the values of I match the values of the interval.
Thresholding Bimodal Histograms
Basic Global Thresholding:
1) Select an initial estimate for T.
2) Segment the image using T. This will produce two groups of pixels: G1, consisting of all pixels with gray level values > T, and G2, consisting of pixels with values <= T.
3) Compute the average gray level values mean1 and mean2 for the pixels in regions G1 and G2.
4) Compute a new threshold value: T = (1/2)(mean1 + mean2).
5) Repeat steps 2 through 4 until the difference in T in successive iterations is smaller than a predefined parameter T0.
Gray Scale Image - bimodal

Image of rice with black background


Segmented Image

Image histogram of rice Image after segmentation


Basic Adaptive Thresholding:

Images having uneven illumination are difficult to segment using a single histogram. This approach divides the original image into subimages and applies the thresholding process to each of the subimages.
Multimodal Histogram
If there are three or more dominant modes in the image histogram, the histogram has to be partitioned by multiple thresholds.
Multilevel thresholding classifies a point (x,y) as belonging to one object class if T1 < f(x,y) <= T2, to the other object class if f(x,y) > T2, and to the background if f(x,y) <= T1.
Thresholding multimodal histograms
A method based on Discrete Curve Evolution can be used to find thresholds in the histogram.
The histogram is treated as a polyline and is simplified until a few vertices remain.
Thresholds are determined by vertices that are local minima.
Discrete Curve Evolution (DCE)
It yields a sequence: P = P0, …, Pm.
Pi+1 is obtained from Pi by deleting the vertices of Pi that have minimal relevance measure:
K(v, Pi) = |d(u,v) + d(v,w) − d(u,w)|
[Figure: a polyline vertex v with neighbours u and w, before and after deletion]
Gray Scale Image - Multimodal

Original Image of lena


Multimodal Histogram

Histogram of lena
Segmented Image

Image after segmentation: we get an outline of her face, hat, shadow, etc.
Color Image - bimodal

Colour Image having a bimodal histogram


Histogram

Histograms for the three colour spaces


Segmented Image

Segmented image, skin color is shown


Split and Merge
The goal of image segmentation is to find regions that represent objects or meaningful parts of objects. Major problems of image segmentation are a result of noise in the image.
An image domain X must be segmented into N different regions R(1), …, R(N).
The segmentation rule is a logical predicate of the form P(R).
Split and Merge
Image segmentation with respect to predicate P partitions the image X into subregions R(i), i = 1, …, N, such that:
X = ∪_{i=1..N} R(i)
R(i) ∩ R(j) = ∅ for i ≠ j
P(R(i)) = TRUE for i = 1, 2, …, N
P(R(i) ∪ R(j)) = FALSE for adjacent i ≠ j
Split and Merge
The segmentation property is a logical predicate of the form P(R, x, t):
x is a feature vector associated with region R;
t is a set of parameters (usually thresholds).
A simple segmentation rule has the form:
P(R) : I(r,c) < T for all (r,c) in R
Split and Merge
In the case of color images, the feature vector x can be the three RGB image components (R(r,c), G(r,c), B(r,c)).
A simple segmentation rule may have the form:
P(R) : (R(r,c) < T(R)) && (G(r,c) < T(G)) && (B(r,c) < T(B))
Region Growing (Merge)
A simple approach to image segmentation is
to start from some pixels (seeds) representing
distinct image regions and to grow them,
until they cover the entire image
For region growing we need a rule describing
a growth mechanism and a rule checking the
homogeneity of the regions after each growth
step
Region Growing
The growth mechanism: at each stage k and for each region Ri(k), i = 1,…,N, we check if there are unclassified pixels in the 8-neighbourhood of each pixel of the region border.
Before assigning such a pixel x to a region Ri(k), we check that the region homogeneity P(Ri(k) ∪ {x}) = TRUE still holds.
Region Growing Predicate
The arithmetic mean m and standard deviation std of a region R having n = |R| pixels:
m(R) = (1/n) Σ_{(r,c)∈R} I(r,c)
std(R) = sqrt( (1/(n−1)) Σ_{(r,c)∈R} (I(r,c) − m(R))² )
The predicate
P: |m(R1) − m(R2)| < k·min{std(R1), std(R2)}
is used to decide if the merging of the two regions R1, R2 is allowed, i.e., if |m(R1) − m(R2)| < k·min{std(R1), std(R2)}, the two regions R1, R2 are merged.
Split
The opposite approach to region growing is
region splitting.
It is a top-down approach and it starts with
the assumption that the entire image is
homogeneous
If this is not true, the image is split into four
sub images
This splitting procedure is repeated
recursively until we split the image into
homogeneous regions
Split
If the original image is square N × N, having dimensions that are powers of 2 (N = 2^n):
all regions produced by the splitting algorithm are squares having dimensions M × M, where M is a power of 2 as well.
Since the procedure is recursive, it produces an image representation that can be described by a tree whose nodes have four sons each. Such a tree is called a quadtree.
Split
Quadtree:
[Figure: the image is split into quadrants R0, R1, R2, R3; R0 is further split into R00, R01, R02, R03]
Split
A disadvantage of splitting techniques is that they may create regions that are adjacent and homogeneous but not merged.
The split and merge method is an iterative algorithm that includes both splitting and merging at each iteration:
Split / Merge
If a region R is inhomogeneous (P(R) = FALSE), then it is split into four subregions.
If two adjacent regions Ri, Rj are homogeneous (P(Ri ∪ Rj) = TRUE), they are merged.
The algorithm stops when no further splitting or merging is possible.
Split / Merge
The split and merge algorithm
produces more compact regions than
the pure splitting algorithm
Applications
3D Imaging : A basic task in 3-D image processing
is the segmentation of an image which classifies
voxels/pixels into objects or groups. 3-D image
segmentation makes it possible to create 3-D
rendering for multiple objects and perform
quantitative analysis for the size, density and other
parameters of detected objects.
Several applications in the field of Medicine like
magnetic resonance imaging (MRI).
Results Region grow
Results Region Split
Results Region Split and
Merge
Introduction
All pixels belong to a region
Object
Part of object
Background
Find region
Constituent pixels
Boundary
Watersheds of Gradient
Magnitude
Compare geographical watersheds
Divide landscape into catchment basins
Edges correspond to watersheds
Algorithm
Locate local minima
Flood image from these points
When two floods meet
Identify a watershed pixel
Build a dam
Continue flooding
Example
[Figure: watersheds, local minima, watershed points, and a dam]
Image Segmentation
Background
First-order derivative:
∂f/∂x = f′(x) = f(x+1) − f(x)
Second-order derivative:
∂²f/∂x² = f(x+1) + f(x−1) − 2f(x)
Characteristics of First and Second Order
Derivatives
First-order derivatives generally produce thicker edges in an image.
Second-order derivatives have a stronger response to fine detail, such as thin lines, isolated points, and noise.
Second-order derivatives produce a double-edge response at ramp and step transitions in intensity.
The sign of the second derivative can be used to determine whether a transition into an edge is from light to dark or dark to light.
Detection of Isolated Points
The Laplacian:
∇²f(x,y) = ∂²f/∂x² + ∂²f/∂y²
= f(x+1,y) + f(x−1,y) + f(x,y+1) + f(x,y−1) − 4f(x,y)
g(x,y) = 1 if |R(x,y)| ≥ T, 0 otherwise, where R = Σ_{k=1..9} wk zk
Image Segmentation

Image segmentation divides an image into regions that are connected and have some similarity within the region and some difference between adjacent regions.
The goal is usually to find individual objects in an image.
For the most part there are fundamentally two kinds of approaches to segmentation: discontinuity and similarity.
Similarity may be due to pixel intensity, color or texture.
Differences are sudden changes (discontinuities) in any of these, but especially sudden changes in intensity along a boundary line, which is called an edge.
Detection of Discontinuities

There are three kinds of discontinuities of intensity: points, lines and edges.
The most common way to look for discontinuities is to scan a small mask over the image. The mask determines which kind of discontinuity to look for.
R = w1 z1 + w2 z2 + … + w9 z9 = Σ_{i=1..9} wi zi
Detection of Discontinuities
Point Detection

|R| ≥ T, where T is a nonnegative threshold
Detection of Discontinuities
Line Detection

Only slightly more common than point detection is finding one pixel wide lines in an image.
For digital images, the only three-point straight lines are horizontal, vertical, or diagonal (±45°).
Detection of Discontinuities
Line Detection
Detection of Discontinuities
Edge Detection
Detection of Discontinuities
Edge Detection
Detection of Discontinuities
Edge Detection
Detection of Discontinuities
Edge Detection
Detection of Discontinuities
Gradient Operators

First-order derivatives:
The gradient of an image f(x,y) at location (x,y) is defined as the vector:
∇f = [Gx, Gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
The magnitude of this vector: ∇f = mag(∇f) = [Gx² + Gy²]^(1/2)
The direction of this vector: α(x,y) = tan⁻¹(Gy/Gx)
The direction of an edge is perpendicular to the direction of the gradient vector (rotated by 90°).
Basic Edge Detection by Using First-Order Derivative
Edge normal: ∇f = grad(f) = [gx, gy]ᵀ = [∂f/∂x, ∂f/∂y]ᵀ
Edge unit normal: ∇f / mag(∇f)
In practice, the magnitude is sometimes approximated by
mag(∇f) = |∂f/∂x| + |∂f/∂y|  or  mag(∇f) = max(|∂f/∂x|, |∂f/∂y|)
Detection of Discontinuities
Gradient Operators

Roberts cross-gradient operators

Prewitt operators

Sobel operators
Detection of Discontinuities
Gradient Operators

Prewitt masks for


detecting diagonal edges

Sobel masks for


detecting diagonal edges
Detection of Discontinuities
Gradient Operators: Example

∇f ≈ |Gx| + |Gy|
Detection of Discontinuities
Gradient Operators: Example
Detection of Discontinuities
Gradient Operators: Example
Detection of Discontinuities
Gradient Operators

Second-order derivatives: (the Laplacian)
The Laplacian of a 2D function f(x,y) is defined as
∇²f = ∂²f/∂x² + ∂²f/∂y²
Two forms are used in practice: [Figure: 4-neighbour and 8-neighbour Laplacian masks]
Detection of Discontinuities
Gradient Operators

Consider the function (a Gaussian):
h(r) = −e^(−r²/2σ²), where r² = x² + y² and σ is the standard deviation.
The Laplacian of h is
∇²h(r) = −[(r² − σ²)/σ⁴] e^(−r²/2σ²)   (the Laplacian of a Gaussian, LoG)
The Laplacian of a Gaussian is sometimes called the Mexican hat function. It can also be computed by smoothing the image with the Gaussian smoothing mask, followed by application of the Laplacian mask.
Detection of Discontinuities
Gradient Operators
Detection of Discontinuities
Gradient Operators: Example

[Figure: Sobel gradient; Gaussian smoothing function; Laplacian mask]
Detection of Discontinuities
Gradient Operators: Example
Edge Linking and Boundary Detection
Local Processing

Two properties of edge points are useful for edge linking:
the strength (or magnitude) of the detected edge points;
their directions (determined from gradient directions).
This is usually done in local neighborhoods. Adjacent edge points with similar magnitude and direction are linked.
For example, an edge pixel with coordinates (x0,y0) in a predefined neighborhood of (x,y) is similar to the pixel at (x,y) if
|∇f(x,y) − ∇f(x0,y0)| ≤ E, with E a nonnegative threshold, and
|α(x,y) − α(x0,y0)| < A, with A a nonnegative angle threshold.
Edge Linking and Boundary Detection
Local Processing: Example

In this example,
we can find the
license plate
candidate after
edge linking
process.
Edge Linking and Boundary Detection
Global Processing via the Hough Transform

Hough transform: a way of finding edge points in an image that lie along a straight line.
Example: xy-plane vs. ab-plane (parameter space), using yi = a·xi + b.
Edge Linking and Boundary Detection
Global Processing via the Hough Transform

The Hough transform consists of finding all pairs of values of ρ and θ which satisfy the equation
x cos θ + y sin θ = ρ
for each point (x,y). These are accumulated in what is basically a 2-dimensional histogram.
When plotted, these pairs of ρ and θ look like a sine wave. The process is repeated for all appropriate (x,y) locations.
Edge Linking and Boundary Detection
Hough Transform Example
The intersections of the curves correspond to the points 1,3,5; 2,3,4; and 1,4.
Edge Linking and Boundary Detection
Hough Transform Example
Thresholding

Assumption: the range of intensity levels covered by objects of interest is different from the background.
g(x,y) = 1 if f(x,y) > T; g(x,y) = 0 if f(x,y) ≤ T
Single threshold vs. multiple thresholds.
Thresholding
The Role of Illumination
f(x,y) = i(x,y) · r(x,y)
[Figure: reflectance r(x,y) (a), illumination i(x,y) (c), their product (d), and the corresponding histograms (e)]
Thresholding
Basic Global Thresholding
Thresholding
Basic Global Thresholding
Thresholding
Basic Adaptive Thresholding
Thresholding
Basic Adaptive Thresholding

How to solve this problem?


Thresholding
Basic Adaptive Thresholding

Answer: subdivision
Thresholding
Optimal Global and Adaptive Thresholding

This method treats pixel values as probability density functions.
The goal of this method is to minimize the probability of misclassifying pixels as either object or background. There are two kinds of error:
mislabeling an object pixel as background, and
mislabeling a background pixel as object.
Thresholding
Use of Boundary Characteristics
Thresholding
Thresholds Based on Several Variables

Color image
Region-Based Segmentation

Edges and thresholds sometimes do not give good results for segmentation.
Region-based segmentation is based on the connectivity of similar pixels in a region:
each region must be uniform;
connectivity of the pixels within the region is very important.
There are two main approaches to region-based segmentation: region growing and region splitting.
Region-Based Segmentation
Basic Formulation

Let R represent the entire image region. Segmentation is a process that partitions R into subregions R1, R2, …, Rn, such that
(a) ∪_{i=1..n} Ri = R
(b) Ri is a connected region, i = 1, 2, …, n
(c) Ri ∩ Rj = ∅ for all i and j, i ≠ j
(d) P(Ri) = TRUE for i = 1, 2, …, n
(e) P(Ri ∪ Rj) = FALSE for any adjacent regions Ri and Rj
where P(Rk) is a logical predicate defined over the points in set Rk.
For example: P(Rk) = TRUE if all pixels in Rk have the same gray level.
Region-Based Segmentation
Region Growing
Region-Based Segmentation
Region Growing

Fig. 10.41 shows the histogram of Fig. 10.40(a). It is difficult to segment the defects by thresholding methods; applying region growing methods is better in this case.
[Figure 10.40(a) and Figure 10.41]
Region-Based Segmentation
Region Splitting and Merging

Region splitting is the opposite of region growing.
First there is a large region (possibly the entire image).
Then a predicate (measurement) is used to determine if the region is uniform.
If not, the method requires that the region be split into two regions.
Then each of these two regions is independently tested by the predicate (measurement).
This procedure continues until all resulting regions are uniform.
Region-Based Segmentation
Region Splitting

The main problem with region splitting is determining where to split a region.
One method to divide a region is to use a quadtree structure.
Quadtree: a tree in which each node has exactly four descendants.
Region-Based Segmentation
Region Splitting and Merging

The split and merge procedure:
Split into four disjoint quadrants any region Ri for which P(Ri) = FALSE.
Merge any adjacent regions Rj and Rk for which P(Rj ∪ Rk) = TRUE. (The quadtree structure may not be preserved.)
Stop when no further merging or splitting is possible.
Segmentation by Morphological Watersheds

The concept of watersheds is based on visualizing an image in three dimensions: two spatial coordinates versus gray levels.
In such a topographic interpretation, we consider three types of points:
(a) points belonging to a regional minimum;
(b) points at which a drop of water would fall with certainty to a single minimum;
(c) points at which water would be equally likely to fall to more than one such minimum.
The principal objective of segmentation algorithms based on these concepts is to find the watershed lines.
Watershed Segmentation Algorithm
Visualize an image in 3D: spatial coordinates and gray levels.
In such a topographic interpretation, there are 3 types of points:
Points belonging to a regional minimum

Points at which a drop of water would fall to a single


minimum. (The catchment basin or watershed of that
minimum.)
Points at which a drop of water would be equally likely to fall
to more than one minimum. (The divide lines or watershed
lines.)
Watershed lines
Watershed Segmentation
Algorithm
The objective is to find the watershed lines. The idea is simple:
Suppose that a hole is punched in each regional minimum and that the entire topography is flooded from below by letting water rise through the holes at a uniform rate.
When rising water in distinct catchment basins is about to merge, a dam is built to prevent merging. These dam boundaries correspond to the watershed lines.
Segmentation by Morphological Watersheds
Example
Segmentation by Morphological Watersheds
Example
The watershed algorithm is often applied to the gradient of an image, rather than to the image itself.
Regional minima of catchment basins correlate nicely with the small values of the gradient corresponding to the objects of interest; boundaries are highlighted as the watershed lines.
Watershed Segmentation Algorithm
Start with all pixels with the lowest
possible value.
These form the basis for initial watersheds
For each intensity level k:
For each group of pixels of intensity k
If adjacent to exactly one existing region, add these
pixels to that region
Else if adjacent to more than one existing regions,
mark as boundary
Else start a new region
Watershed Segmentation Algorithm
Watershed algorithm might be used on the gradient image instead
of the original image.
Watershed Segmentation Algorithm

Due to noise and other local irregularities of the gradient,


oversegmentation might occur.
Watershed Segmentation Algorithm

A solution is to limit the number of regional minima. Use


markers to specify the only allowed regional minima.
The Use of Markers
Internal markers are used to limit the number of regions by specifying the objects of interest:
like seeds in the region growing method;
can be assigned manually or automatically;
regions without markers are allowed to be merged (no dam is to be built).
External markers: those pixels we are confident belong to the background.
Watershed lines are typical external markers, and they belong to the same (background) region.
Watershed Segmentation Algorithm

A solution is to limit the number of regional minima. Use


markers to specify the only allowed regional minima. (For
example, gray-level values might be used as a marker.)
Segmentation by Morphological Watersheds
Example
The Use of Motion in Segmentation

ADI: accumulative difference image


The Use of Motion in Segmentation

MATLAB Example
A: original image f.
B: direct watershed transform result using the following commands (g is the gradient image of A):
L = watershed(g);
wr = (L == 0);
C: all of the regional minima of g, using
rm = imregionalmin(g);
D: internal markers, obtained by
im = imextendedmin(g, 2);
fim = f;
fim(im) = 175;
E: external markers, using
Lim = watershed(bwdist(im));
em = (Lim == 0);
F: modified gradient image obtained from the internal and external markers:
g2 = imimposemin(g, im | em);
G: final segmentation result:
L2 = watershed(g2);
f2 = f;
f2(L2 == 0) = 255;
