Question:
How do we perceive the three-dimensional properties of the world when the images on our retinas are only two-dimensional?
Stereo is not the entire story!
Correspondence between left and right images
Surveillance/Security: detect and recognize faces
Mapping: recover as-built models
Articulated Body Fitting
The visible spectrum spans roughly 400 nm to 700 nm.
The system consists of many thousands of sensors.
$R = \frac{vT}{2}$
where R = range (m), v = wave propagation velocity (m/s), T = round-trip time (s)
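As a quick worked example of R = vT/2, a minimal sketch assuming the propagation velocity is the speed of light:

```python
# A minimal sketch: range from round-trip time, R = v*T/2.
def tof_range(round_trip_time_s: float, velocity_m_s: float = 3.0e8) -> float:
    """Range in metres; the factor 1/2 accounts for the out-and-back path."""
    return velocity_m_s * round_trip_time_s / 2.0

# Example: a pulse returning after 200 ns corresponds to a 30 m target.
print(tof_range(200e-9))  # 30.0
```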
Histogram: select a threshold T (histogram of number of pixels vs. gray value).
Create binary image:
I(x,y) < T -> O(x,y) = 0
I(x,y) > T -> O(x,y) = 1
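A minimal sketch of this rule, assuming the image is a 2-D NumPy array (pixels equal to T are assigned to the foreground here, an arbitrary choice):

```python
import numpy as np

# Global thresholding: 0 where I(x,y) < T, 1 elsewhere.
def threshold(image: np.ndarray, T: float) -> np.ndarray:
    return (image >= T).astype(np.uint8)

I = np.array([[10, 11, 99], [9, 2, 100]])
print(threshold(I, T=50))  # [[0 0 1], [0 0 1]]
```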
Threshold Value vs. Histogram
Thresholding usually involves analyzing the histogram.
Different image features give rise to distinct features in a histogram (bimodal).
In general the histogram peaks corresponding to two features will overlap.
[Figure: example 6x6 gray-level image with two pixel populations (values near 0-2 and near 9-11), giving a bimodal histogram]
Multi-level Thresholding
A point (x,y) belongs to:
i. an object class if T1 < f(x,y) <= T2
ii. another object class if f(x,y) > T2
iii. background if f(x,y) <= T1
T depends on:
only f(x,y): only on gray-level values -> global threshold
both f(x,y) and p(x,y): on gray-level values and their neighbors -> local threshold
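A minimal sketch of the two-threshold rule above, assuming a NumPy gray-level image (labels 0/1/2 are an illustration choice):

```python
import numpy as np

# Two-threshold (multi-level) classification into background and two classes.
def multilevel_threshold(f: np.ndarray, T1: float, T2: float) -> np.ndarray:
    labels = np.zeros_like(f, dtype=np.uint8)   # background: f <= T1
    labels[(f > T1) & (f <= T2)] = 1            # object class 1
    labels[f > T2] = 2                          # object class 2
    return labels
```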
Solution: smooth first, then apply the Laplacian of Gaussian operator.
Laplacian of Gaussian
$\nabla^2 f = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2}$
Gradient magnitude (non-linear): $\|\nabla f\| = \sqrt{G_x^2 + G_y^2}$
[Figure: image axes X and Y with the origin marked; the edge direction is at -90° to the gradient direction]
$E(x) = \frac{d}{dx}\big( I(x) * G(x) \big)$ (1-D), $\quad E(x,y) = \nabla\big( I(x,y) * G(x,y) \big)$ (2-D)
Mark edges where the absolute value $|E(x)| > Th$, or the magnitude $\|E(x,y)\| > Th$.
Algorithm CANNY_ENHANCER
The input is image I; G is a zero-mean Gaussian filter (std = sigma)
1. J = I * G (smoothing)
2. For each pixel (i,j): (edge enhancement)
   Compute the image gradient
   $\nabla J(i,j) = (J_x(i,j), J_y(i,j))$
   Estimate edge strength
   $e_s(i,j) = (J_x^2(i,j) + J_y^2(i,j))^{1/2}$
   Estimate edge orientation
   $e_o(i,j) = \arctan(J_y(i,j) / J_x(i,j))$
The outputs are images Es and Eo.
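A minimal sketch of CANNY_ENHANCER, assuming NumPy/SciPy; np.gradient's central differences stand in for whatever derivative filter the original used, and arctan2 replaces arctan(Jy/Jx) to keep the full angle range:

```python
import numpy as np
from scipy import ndimage

# Gaussian smoothing followed by gradient-based edge enhancement.
def canny_enhancer(I: np.ndarray, sigma: float = 1.0):
    J = ndimage.gaussian_filter(I.astype(float), sigma)  # step 1: J = I * G
    Jy, Jx = np.gradient(J)                              # step 2: image gradient
    Es = np.hypot(Jx, Jy)                                # edge strength
    Eo = np.arctan2(Jy, Jx)                              # edge orientation
    return Es, Eo
```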
Edge ENHANCER
[Figure: edge-strength profiles thresholded at Th]
Assume the marked point is an edge point. Then we construct the tangent to the edge curve (which is normal to the gradient at that point) and use this to predict the next points (here either r or s).
A thermocouple measuring circuit with a heat source, cold junction and a measuring instrument.
[Figure: sinusoidal waveforms v1 and v2 versus time; v1 leads v2 by 90°]
[Figure: optical encoder disk with numbered sectors of alternating transparent and opaque segments]
Binary Code
If there are n tracks, there are n pick-off elements and the disk is divided into $2^n$ sectors. If n = 16:
$\frac{360^\circ}{2^{16}} \approx 0.0055^\circ$
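A one-line check of the resolution formula:

```python
# Angular resolution of an n-track absolute encoder: 360 / 2**n degrees.
def encoder_resolution_deg(n_tracks: int) -> float:
    return 360.0 / (2 ** n_tracks)

print(encoder_resolution_deg(16))  # ~0.0055 degrees
```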
Boundary as a sequence of straight lines (4- or 8-connectedness)
See: Robust Analysis of Feature Space: Color Image Segmentation, by D. Comaniciu and P. Meer, CVPR 1997, pp. 750-755.
Iterative Mode Search
1. Initialize random seed, and fixed window
2. Calculate the center of gravity of the window (the mean)
3. Translate the search window to the mean
4. Repeat Step 2 until convergence
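A minimal sketch of this mode search over a 2-D point set; the fixed circular window of radius `radius` is an illustration choice:

```python
import numpy as np

# Iterative mode search (mean shift) with a fixed circular window.
def mean_shift_mode(points, seed, radius=1.0, tol=1e-4):
    points = np.asarray(points, dtype=float)
    center = np.asarray(seed, dtype=float)
    while True:
        # Step 2: center of gravity (mean) of the points inside the window
        inside = np.linalg.norm(points - center, axis=1) <= radius
        if not inside.any():
            return center  # empty window: stop where we are
        mean = points[inside].mean(axis=0)
        # Steps 3-4: translate the window to the mean, repeat until convergence
        if np.linalg.norm(mean - center) < tol:
            return mean
        center = mean
```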
Fundamental Definitions: Erosion and Dilation
Step 2: Perform dilation on the segment left in step 1.
Output: the output of opening (erosion followed by dilation).
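A minimal sketch with SciPy's morphology operators, showing opening as erosion followed by dilation on a toy binary image (the shapes are invented for illustration):

```python
import numpy as np
from scipy import ndimage

# Opening removes small foreground specks while preserving larger segments.
binary = np.zeros((8, 8), dtype=bool)
binary[2:6, 2:6] = True   # a solid square (survives opening)
binary[0, 7] = True       # an isolated pixel (removed by opening)

eroded = ndimage.binary_erosion(binary)   # step 1: erosion
opened = ndimage.binary_dilation(eroded)  # step 2: dilation of what is left
assert np.array_equal(opened, ndimage.binary_opening(binary))
```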
Using: $\frac{\partial E}{\partial a} = 0, \ \frac{\partial E}{\partial b} = 0, \ \frac{\partial E}{\partial c} = 0, \ \frac{\partial E}{\partial d} = 0$ over the N points $i$.
Hough Transforms exploit the fact that a large analytic curve may
encompass many pixels in image space, but be characterized by
only a few parameters.
Disadvantage:
HTs can only detect lines or curves that are analytically specifiable, or that can be represented in a template-like form (GHT, Ballard).
Even for the GHT, the implementation is a bit awkward, and you have to know what you're looking for. So the Hough Transform is primarily a hypothesize-and-test tool.
[Figure: a point $(x_i, y_i)$ in image space and its image in the $(m, b)$ parameter space]
Equation of line: $y = mx + b$, so $y_i = m x_i + b$, or $b = -x_i m + y_i$.
Find: $(m, b)$. Consider point $(x_i, y_i)$.
[Figure: the point maps to the line $b = -x_i m + y_i$ in the Hough (parameter) space]
Connection between image (x,y) and Hough (m,b) spaces
A line in the image corresponds to a point in Hough space.
To go from image space to Hough space: given a set of points (x,y), find all (m,b) such that y = mx + b.
What does a point (x0, y0) in the image space map to?
Answer: the solutions of b = -x0 m + y0; this is a line in Hough space. (Slide credit: Steve Seitz)
A second point (x1, y1) maps to the line b = -x1 m + y1; the intersection of the two lines in Hough (parameter) space gives the (m, b) of the image-space line through both points.
How can we use this to find the most likely parameters (m,b) for the most prominent line in the image space?
Let each edge point in image space vote for a set of possible parameters in Hough space.
Accumulate votes in a discrete set of bins; the parameters with the most votes indicate the line in image space.
Solution: parameterize lines by (d, theta), in image row/column coordinates with origin [0,0]:
d: perpendicular distance from the line to the origin
theta: angle the perpendicular makes with the x-axis
$x \cos\theta + y \sin\theta = d$
A point in image space maps to a sinusoid segment in Hough space. (Slides: Kristen Grauman)
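A minimal voting sketch in the (d, theta) plane, assuming NumPy; the accumulator size and binning are illustration choices:

```python
import numpy as np

# Each point (x, y) votes along the sinusoid d = x*cos(theta) + y*sin(theta).
def hough_lines(points, d_max: float, n_theta: int = 180, n_d: int = 200):
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    accumulator = np.zeros((n_d, n_theta), dtype=int)
    for x, y in points:
        d = x * np.cos(thetas) + y * np.sin(thetas)
        d_bins = np.round((d + d_max) / (2 * d_max) * (n_d - 1)).astype(int)
        valid = (d_bins >= 0) & (d_bins < n_d)
        accumulator[d_bins[valid], np.arange(n_theta)[valid]] += 1
    # The bin with the most votes gives the most prominent line.
    i, j = np.unravel_index(accumulator.argmax(), accumulator.shape)
    d_best = i / (n_d - 1) * 2 * d_max - d_max
    return d_best, thetas[j]

# Example: points on the line y = x (theta = 3*pi/4, d = 0).
pts = [(t, t) for t in range(20)]
print(hough_lines(pts, d_max=30.0))
```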
Hough Transform in the (d, theta) plane
Note: the slope m is unbounded (it goes to infinity for vertical lines), which the (d, theta) parameterization avoids.
Each image-space point $(x_i, y_i)$ votes into a large accumulator: more memory and computations.
Equation of circle: $(x_i - a)^2 + (y_i - b)^2 = r^2$
Given edge location $(x_i, y_i)$ and edge direction $\theta_i$:
$a = x - r\cos\theta$
$b = y - r\sin\theta$
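A minimal voting sketch under these equations, assuming each edge point carries its gradient direction theta (accumulator size and rounding are illustration choices):

```python
import numpy as np

# Vote for circle centres (a, b) of known radius r using the edge direction.
def vote_circle_centres(edges, r: float, shape):
    """edges: iterable of (x, y, theta), theta being the edge direction."""
    acc = np.zeros(shape, dtype=int)
    for x, y, theta in edges:
        a = int(round(x - r * np.cos(theta)))  # a = x - r cos(theta)
        b = int(round(y - r * np.sin(theta)))  # b = y - r sin(theta)
        if 0 <= a < shape[0] and 0 <= b < shape[1]:
            acc[a, b] += 1
    return acc  # the peak of acc is the most likely circle centre
```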
If min(λ1, λ2) > T, there is a corner!
Intensity scaling I -> a·I rescales the response R, so the threshold must be chosen relative to it.
The auto-correlation describes the relations between neighboring pixels.
Equivalently, we can analyze the power spectrum of the window: we apply a Fourier Transform in small windows.
Analyzing the power spectrum:
Periodicity: the energy of different frequencies.
Directionality: the energy of slices in different directions.
Simplest Texture Discrimination
Compare histograms:
Divide intensities into discrete ranges.
Count how many pixels fall in each range.
$\chi^2(h_i, h_j) = \frac{1}{2} \sum_m \frac{[h_i(m) - h_j(m)]^2}{h_i(m) + h_j(m)}$ (Malik)
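A minimal sketch of the chi-squared histogram distance above, assuming NumPy (the `eps` guard and random test data are illustration choices):

```python
import numpy as np

# Chi-squared distance between two intensity histograms.
def chi2_distance(h_i: np.ndarray, h_j: np.ndarray, eps: float = 1e-10) -> float:
    # eps guards against bins that are empty in both histograms
    return 0.5 * np.sum((h_i - h_j) ** 2 / (h_i + h_j + eps))

a = np.histogram(np.random.rand(1000), bins=16, range=(0, 1))[0]
b = np.histogram(np.random.rand(1000), bins=16, range=(0, 1))[0]
print(chi2_distance(a, b))
```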
More Complex Discrimination
Histogram comparison is very limiting:
Every pixel is independent.
Everything happens at a tiny scale.
Second-order statistics (or co-occurrence matrices)
Example: the co-occurrence matrix of I(x,y) and I(x+1,y).
Normalize the matrix to get probabilities.
[Figure: a 4x4 image I and three different co-occurrence matrices for I: C(0,1), C(1,0), and C(1,1)]
This normalizes the co-occurrence values to lie between zero and one and allows them to be thought of as probabilities in a large matrix.
[Figure: image patch and its normalized co-occurrence values, e.g. Nd(i,j) = 1/25]
Instead, numeric features are computed from the co-occurrence matrix that can be used to represent the texture more compactly.
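A minimal sketch for a displacement d = (dx, dy), using the 4x4 example image from the figure caption above; normalizing by the total count is one common choice for turning entries into probabilities:

```python
import numpy as np

# Grey-level co-occurrence matrix for displacement d = (dx, dy);
# d = (1, 0) compares I(x, y) with I(x+1, y).
def cooccurrence(I: np.ndarray, dx: int, dy: int, levels: int) -> np.ndarray:
    C = np.zeros((levels, levels), dtype=float)
    rows, cols = I.shape
    for y in range(rows - dy):
        for x in range(cols - dx):
            C[I[y, x], I[y + dy, x + dx]] += 1
    return C / C.sum()  # normalized: entries sum to one

I = np.array([[0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 2, 2, 2],
              [2, 2, 3, 3]])
print(cooccurrence(I, dx=1, dy=0, levels=4))
```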
Example: $\det\begin{pmatrix}2 & 5 \\ 3 & 1\end{pmatrix} = 2 \cdot 1 - 5 \cdot 3 = 2 - 15 = -13$
Matrices
Inverse: A must be square
$A_{n \times n}^{-1} A_{n \times n} = A_{n \times n} A_{n \times n}^{-1} = I$
$\begin{pmatrix}a_{11} & a_{12} \\ a_{21} & a_{22}\end{pmatrix}^{-1} = \frac{1}{a_{11}a_{22} - a_{21}a_{12}} \begin{pmatrix}a_{22} & -a_{12} \\ -a_{21} & a_{11}\end{pmatrix}$
Example:
$\begin{pmatrix}6 & 2 \\ 1 & 5\end{pmatrix}^{-1} = \frac{1}{28}\begin{pmatrix}5 & -2 \\ -1 & 6\end{pmatrix}$
Check:
$\begin{pmatrix}6 & 2 \\ 1 & 5\end{pmatrix} \cdot \frac{1}{28}\begin{pmatrix}5 & -2 \\ -1 & 6\end{pmatrix} = \frac{1}{28}\begin{pmatrix}28 & 0 \\ 0 & 28\end{pmatrix} = \begin{pmatrix}1 & 0 \\ 0 & 1\end{pmatrix}$
2D Vector
$v = (x_1, x_2)$
Magnitude: $\|v\| = \sqrt{x_1^2 + x_2^2}$
If $\|v\| = 1$, $v$ is a UNIT vector.
$\frac{v}{\|v\|} = \left(\frac{x_1}{\|v\|}, \frac{x_2}{\|v\|}\right)$ is a unit vector.
Orientation: $\theta = \tan^{-1}\left(\frac{x_2}{x_1}\right)$
Vector Addition
$v + w = (x_1, x_2) + (y_1, y_2) = (x_1 + y_1, x_2 + y_2)$
Vector Subtraction
$v - w = (x_1, x_2) - (y_1, y_2) = (x_1 - y_1, x_2 - y_2)$
Scalar Product
$av = a(x_1, x_2) = (ax_1, ax_2)$
Inner (dot) Product
$v \cdot w = (x_1, x_2) \cdot (y_1, y_2) = x_1 y_1 + x_2 y_2$
$v \cdot w = 0 \iff v \perp w$
Orthonormal Basis
$i = (1, 0)$, $\|i\| = 1$
$j = (0, 1)$, $\|j\| = 1$
$i \cdot j = 0$
$v = (x_1, x_2) = x_1 i + x_2 j$
Vector Product Computation
$i = (1,0,0)$, $j = (0,1,0)$, $k = (0,0,1)$, with $\|i\| = \|j\| = \|k\| = 1$ and $i \cdot j = i \cdot k = j \cdot k = 0$
$u = v \times w$, where $v = (x_1, x_2, x_3)$ and $w = (y_1, y_2, y_3)$:
$u = \begin{vmatrix} i & j & k \\ x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \end{vmatrix} = (x_2 y_3 - x_3 y_2)i + (x_3 y_1 - x_1 y_3)j + (x_1 y_2 - x_2 y_1)k$
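The same operations in NumPy, as a quick sanity check (the values are arbitrary):

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

print(np.linalg.norm(v))        # magnitude ||v||
print(v / np.linalg.norm(v))    # unit vector v / ||v||
print(v + w, v - w, 2.0 * v)    # addition, subtraction, scalar product
print(np.dot(v, w))             # inner product: 1*4 + 2*5 + 3*6 = 32
print(np.cross(v, w))           # vector product: (-3, 6, -3)
```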
Coordinate Systems
World Coords. (Xw, Yw, Zw) -> Camera Coords. (x, y, z) -> Film Coords. (x, y) -> Image Coords. (u, v)
Caution!
Changing a coordinate system is equivalent to applying the inverse transformation to the point coordinates.
Reverse Rotations
Q: How do you undo a rotation of $R(\theta)$?
A: Apply the inverse of the rotation: $R^{-1}(\theta) = R(-\theta)$
3D Rotation of Coordinate Systems
Rotation around the coordinate axes, clockwise:
$R_x(\alpha) = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\alpha & \sin\alpha \\ 0 & -\sin\alpha & \cos\alpha \end{pmatrix}$
$R_y(\beta) = \begin{pmatrix} \cos\beta & 0 & -\sin\beta \\ 0 & 1 & 0 \\ \sin\beta & 0 & \cos\beta \end{pmatrix}$
$R_z(\gamma) = \begin{pmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}$
3D Translation of Coordinate Systems
$T_t = \begin{pmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix}$ (homogeneous coordinates)
Example
[Figure: world frame W (Xw, Yw, Zw), camera frame C (xc, yc, zc), and a point P, with offsets of 4, 10, and 6 units between the frames]
Translate W to C:
$T = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{pmatrix}$
Relationship in Perspective Projection
World to camera:
Camera: $P = (X, Y, Z)^T$; World: $P_w = (X_w, Y_w, Z_w)^T$; Transform: $(R, T)$
$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = R\left(\begin{pmatrix} X_w \\ Y_w \\ Z_w \end{pmatrix} - T\right) \quad (1)$
Relationship in Perspective Projection
Camera to image:
Camera: $P = (X, Y, Z)^T$; Image: $p = (x, y)^T$
$x = f\frac{X}{Z}, \quad y = f\frac{Y}{Z} \quad (3)$
Relationship in Perspective Projection
World to frame:
$x = -(x_{im} - o_x)\, s_x, \quad y = -(y_{im} - o_y)\, s_y$
If we let $f_x = f/s_x$ and $\alpha = s_y/s_x$, we now have 4 independent intrinsic parameters: $o_x$, $o_y$, $f_x$, and $\alpha$.
$f_x$: focal length expressed in the effective horizontal pixel size.
$\alpha$: aspect ratio, the pixel deformation introduced by the acquisition process.
Thus, we have the full world-to-image projection equations ((6) and (7), referenced below).
Relationship in Perspective Projection
Question:
Of the three coordinate systems (world, camera, and image), which one cannot be accessed?
Answer:
The camera coordinate system: world points can be measured and image points read from pixels, but camera coordinates are internal to the model.
This suggests that, given a sufficient number of pairs of 3-D world points and their corresponding image points, we can try to solve (6) and (7) for the unknown parameters.
Pinhole Camera Model (World Coordinates)
[Figure: world frame (Xw, Yw, Zw), camera frame (X, Y, Z) with focal length f, and image point p = (x, y) of world point P, related by (R, T)]
$P = \begin{pmatrix} R & T \end{pmatrix} P_w = M_{ext} P_w$
$p = M_{int} P = M_{int} M_{ext} P_w$
Camera Model Summary
Geometric Projection of a Camera
Pinhole camera model
Perspective projection
Weak-Perspective Projection
Camera Model Summary
Camera Parameters
Extrinsic parameters (R, T): 6 DOF (degrees of freedom)
$P = \begin{pmatrix} R & T \end{pmatrix} P_w = M_{ext} P_w$
Intrinsic parameters: $f, o_x, o_y, s_x, s_y$
$p = M P_w$, where $M = M_{int} M_{ext}$ is 3x4; the extrinsic part carries the 6 DOF.
$\begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = M_{int} M_{ext} \begin{pmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{pmatrix}, \quad M_{int} = \begin{pmatrix} f/s_x & 0 & o_x & 0 \\ 0 & f/s_y & o_y & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}, \quad M_{ext} = \begin{pmatrix} r_{11} & r_{12} & r_{13} & T_x \\ r_{21} & r_{22} & r_{23} & T_y \\ r_{31} & r_{32} & r_{33} & T_z \\ 0 & 0 & 0 & 1 \end{pmatrix}$
$x_{im} = x_1 / x_3, \quad y_{im} = x_2 / x_3$
The Calibration Problem
Direct Parameter Calibration Summary
Algorithm (pp. 130-131):
1. Measure N 3D coordinates (Xi, Yi, Zi)
2. Locate their corresponding image points (xi, yi) (edge, corner, Hough)
3. Build the matrix A of a homogeneous system Av = 0
4. Compute the SVD of A; the solution is v
5. Determine the aspect ratio α and the scale |γ|
6. Recover the first two rows of R and the first two components of T, up to a sign
7. Determine the sign of γ by checking the projection equation
8. Compute the 3rd row of R by vector product, and enforce the orthogonality constraint by SVD
9. Solve for Tz and fx using least squares and SVD; then fy = fx / α
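Step 4 uses the standard null-space-via-SVD construction; a minimal sketch assuming NumPy (the example matrix is invented for illustration):

```python
import numpy as np

# The least-squares solution of A v = 0 with ||v|| = 1 is the right singular
# vector of A associated with its smallest singular value.
def solve_homogeneous(A: np.ndarray) -> np.ndarray:
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1]  # last row of V^T = singular vector for the smallest sigma

# Example: a rank-deficient A whose null space contains (1, -1, 0)/sqrt(2).
A = np.array([[1.0, 1.0, 0.0],
              [2.0, 2.0, 1.0],
              [3.0, 3.0, 2.0]])
v = solve_homogeneous(A)
print(v, np.allclose(A @ v, 0, atol=1e-10))
```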
The Calibration Problem
Step 2: Estimate ox and oy
The computation of ox and oy will be based on the following theorem:
Orthocenter Theorem: Let T be the triangle on the image plane defined by the three vanishing points of three mutually orthogonal sets of parallel lines in space. The image center (ox, oy) is the orthocenter of T.
Estimating the Image Center
Vanishing points: due to perspective, all parallel lines in 3D space appear to meet in a point on the image, the vanishing point, which is the common intersection of all the image lines.
[Figure: three vanishing points VP1, VP2, VP3 forming a triangle; its altitudes (e.g. h1) meet at the orthocenter, the image center (ox, oy)]
Guidelines for Calibration
Pick a well-known technique (or a few).
Design and construct calibration patterns (with known 3D geometry).
Decide which parameters you want to find for your camera.
Run the algorithms on ideal simulated data:
You can either use the data of the real calibration pattern or use computer-generated data.
Define a virtual camera with known intrinsic and extrinsic parameters.
Generate 2D points from the 3D data using the virtual camera.
Run the algorithms on the 2D-3D data set.
Add noise to the simulated data to test robustness.
Run the algorithms on the real data (images of the calibration target):
If successful, you are all set.
Otherwise:
Check how you select the distribution of control points.
Check the accuracy of the 3D and 2D localization.
Check the robustness of your algorithms again.
Develop your own algorithms. NEW METHODS?
Finding the disparity map
Inputs:
Left image Il
Right image Ir
Parameters that must be chosen:
Correlation window size 2W+1
Search window size
Similarity measure
CORR_MATCHING Algorithm
Let pl and pr be pixels in Il and Ir.
Let R(pl) be the search window in Ir associated with pl.
Let d be the displacement between pl and a point in R(pl).
[Figure: a (2W+1) x (2W+1) correlation window centered at pl, displaced by d within R(pl)]
For each pixel pl = [i,j] in Il do:
compute the similarity C(d) for every displacement d in R(pl);
the disparity at pl is the vector d with the best C(d) over R(pl) (max. correlation, or min. SSD).
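A minimal SSD version of CORR_MATCHING, assuming rectified images so the search window lies along the same scanline; window size W and search range d_max are the parameters listed above:

```python
import numpy as np

# Block matching: SSD over a (2W+1)x(2W+1) window, minimized over the search range.
def disparity_map(Il: np.ndarray, Ir: np.ndarray, W: int = 3, d_max: int = 32):
    rows, cols = Il.shape
    disp = np.zeros((rows, cols), dtype=int)
    for i in range(W, rows - W):
        for j in range(W + d_max, cols - W):
            patch = Il[i - W:i + W + 1, j - W:j + W + 1].astype(float)
            best_d, best_ssd = 0, np.inf
            for d in range(d_max + 1):  # search window on the same scanline
                cand = Ir[i - W:i + W + 1, j - d - W:j - d + W + 1].astype(float)
                ssd = np.sum((patch - cand) ** 2)  # similarity measure: SSD
                if ssd < best_ssd:
                    best_ssd, best_d = ssd, d
            disp[i, j] = best_d  # disparity with minimum SSD
    return disp
```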
Face Detector
Rectangle (Haar-like) features:
Edge features
Line features
Center features
[Figure: example rectangle features overlaid on a face; a feature's value is the sum of pixels under the white rectangles minus the sum under the dark ones, with features responding to regions such as the nose and mouth]
H
P and R represent the distance of the sampling points from the center pixel
and the number of the sampling points to be used, respectively.
Feature Extraction & Matching
Texture-based Face Recognition: Uniform LBP
The LBP algorithm was further modified to deal with textures at different scales and to use neighborhoods of different sizes.
Example: LBP(8,2) histogram.
The input image is 60x60 pixels; it is divided into 6x6 = 36 regions with a window size of 10x10.
Thus, a (6x6) x 59 = 2124-element vector represents the histogram values of all the labels in the sub-images, and this vector contains all the useful information in the image.
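A sketch with scikit-image, assuming its "nri_uniform" mapping corresponds to the 59-label uniform LBP(8,2) used above:

```python
import numpy as np
from skimage.feature import local_binary_pattern

# Uniform LBP(P=8, R=2) histograms per 10x10 sub-window of a 60x60 image,
# concatenated into one descriptor (36 windows x 59 bins = 2124 values).
def lbp_descriptor(image: np.ndarray, P: int = 8, R: float = 2.0) -> np.ndarray:
    lbp = local_binary_pattern(image, P, R, method="nri_uniform")  # 59 labels for P=8
    hists = []
    for i in range(0, 60, 10):
        for j in range(0, 60, 10):
            window = lbp[i:i + 10, j:j + 10]
            hists.append(np.histogram(window, bins=59, range=(0, 59))[0])
    return np.concatenate(hists)

print(lbp_descriptor(np.random.randint(0, 256, (60, 60))).shape)  # (2124,)
```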
Level of semantics:
action (walking, pointing, etc.) -> activity (watching TV, drinking tea, etc.) -> event (a volleyball game, a party, etc.)
Features: shape features, local features, motion features, deep-learned features
Representations: Bag of Features, Fisher Vector
Classifiers: Support Vector Machine, Extreme Learning Machine
Bag of Features
feature detection -> codewords dictionary
feature detection -> image representation (histogram over the codewords)
Gradient computation
$G_x = I * [-1\ 0\ 1]$
$G_y = I * [-1\ 0\ 1]^T$
Magnitude: $G = \sqrt{G_x^2 + G_y^2}$
Angle: $\theta = \arctan\left(\frac{G_y}{G_x}\right)$, quantized into orientation bins
[Figure: example 6x6 patch of pixel intensities with the resulting Gx, Gy, magnitude, and orientation-bin values]
Range Image
A special class of digital images: each pixel of a range image expresses the distance between a known reference frame and a visible point in the scene.
Time-of-flight:
advantage: fast
disadvantage: high cost
Structured light:
advantage: simplicity and low cost
disadvantage: specular reflection
Disadvantages:
sensors are expensive
not always eye-safe
produce a lot of 3D data points which must be processed
Advantages:
simple image acquisition, even more so with the arrival of digital cameras
scale independence: the same camera works for small and large objects
Challenges:
correspondence between points in the 2D images
Based on trigonometry: when a base and two angles are known, the 3rd point can be calculated.
Setup: CCD camera, computer, laser projector, laser stripe, translation stage.
[Figure: side and top views of the laser-triangulation geometry, with the optical center, focal length f, working distance W, displacements D1 and D2, object height H, and incident angle theta]
$D_1 = D_2 \cdot \frac{W}{f}$
$H = D_1 \cdot \tan(\theta)$
f: focal length
D2: laser stripe displacement in the image
W: working distance
D1: laser stripe displacement (in the scene)
H: object height
theta: incident angle
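A minimal sketch combining the two relations above; the numbers in the example are invented for illustration:

```python
import numpy as np

# Object height from the laser-stripe displacement seen in the image,
# using D1 = D2 * W / f and H = D1 * tan(theta).
def object_height(d2_image_m: float, focal_m: float,
                  working_dist_m: float, incident_angle_rad: float) -> float:
    d1_scene = d2_image_m * working_dist_m / focal_m  # stripe shift in the scene
    return d1_scene * np.tan(incident_angle_rad)      # height of the object

# Hypothetical numbers: 0.5 mm shift on the sensor, f = 25 mm, W = 0.5 m,
# 45-degree incident angle -> H = 0.01 m (10 mm).
print(object_height(0.5e-3, 25e-3, 0.5, np.deg2rad(45.0)))
```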