Bob Bolles
Chris Eveland
Approach
consider: template-based tracking
maintain template of object, $T(x, y)$
correlation used to update object position $(x_p, y_p)$
template is recursively updated to handle changing object appearance:
$$T(x, y) \leftarrow \alpha\, T(x, y) + (1 - \alpha)\, I(x + x_p,\; y + y_p)$$
limitations/problems
1) object initialization/detection
2) template drift
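A minimal sketch of the recursive template update described on this slide, assuming the template and image are lists of rows of gray levels and that the template origin sits at the tracked position. The function name `update_template`, the blending weight `alpha`, and the sample values are illustrative assumptions, not taken from the slides.

```python
def update_template(template, image, xp, yp, alpha=0.9):
    """Return a new template blended with the image patch at offset (xp, yp).

    alpha close to 1 keeps the old template; alpha close to 0 follows the
    current image. Both the name and the default value are assumptions.
    """
    updated = []
    for y, t_row in enumerate(template):
        row = []
        for x, t in enumerate(t_row):
            patch_value = image[y + yp][x + xp]
            row.append(alpha * t + (1.0 - alpha) * patch_value)
        updated.append(row)
    return updated


# Toy data: a 2x2 template tracked at position (1, 1) in a 3x3 image.
template = [[10.0, 10.0], [10.0, 10.0]]
image = [[0.0, 0.0, 0.0], [0.0, 20.0, 20.0], [0.0, 20.0, 20.0]]
new_template = update_template(template, image, xp=1, yp=1, alpha=0.5)
```

Because every update mixes in whatever pixels lie under the template, a slightly wrong position estimate pulls background into the template, which is the drift problem listed above.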
[Figure panels: left image, background, disparities, foreground]
Approach
[Block diagram: stereo and left intensity inputs; background init; background subtraction; foreground; detection; tracking; person templates]
intensity and "support" templates are recursively updated
Kalman filtering on person location in 3D
person templates used to avoid drift
Related Work
Companies
Universities
Hardware
two CMOS cameras
low power (150 mW), inexpensive ($100 components)
adjustable baseline: 2.7'' to 6.2'' in 1'' increments
another version with DSP processing onboard
Software
stereo algorithm is area correlation based
optimized C and MMX code
20 Hz on 320x240 image, 24 disparities, 400 MHz Pentium II
[Figure: left and right images]
notation:
$d(x, y)$: current disparities
$d_0(x, y)$: background disparity estimate
Background subtraction
look for disparities closer than background
$$f(x, y) = \begin{cases} d(x, y) & \text{if } d(x, y) - d_0(x, y) > \text{thresh, or } \ldots \\ \ldots & \text{otherwise} \end{cases}$$
[Figure panels: left image; background $d_0(x, y)$; disparities $d(x, y)$; foreground $f(x, y)$]
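The background test can be sketched as follows, assuming disparity maps stored as lists of rows and a pixel counted as foreground when its disparity exceeds the background disparity by more than a threshold (a larger disparity means a closer surface). Marking non-foreground pixels with `None` is an assumption made for this sketch, and any second condition in the slide's rule is not modeled.

```python
def foreground_disparities(d, d0, thresh):
    """Keep disparities that are closer than the background by more than thresh.

    d      : current disparity map (list of rows)
    d0     : background disparity estimate (same shape)
    thresh : minimum disparity difference to count as foreground
    Pixels that fail the test are set to None (an assumed marker).
    """
    return [
        [dv if dv - d0v > thresh else None for dv, d0v in zip(d_row, d0_row)]
        for d_row, d0_row in zip(d, d0)
    ]


# Toy 2x2 example: the right column is well in front of the background.
d = [[5, 12], [6, 20]]
d0 = [[5, 5], [6, 6]]
f = foreground_disparities(d, d0, thresh=3)
```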
Handling scale
idea: range info from stereo can be used to fix scale of
processing
avoid search over scale parameter
person width is proportional to disparity
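[Figure: pinhole projection with center of projection (COP), focal length $f$, depth $z$, person width $w$, and image width $w'$]
$$\frac{w}{z} = \frac{w'}{f} \quad\Rightarrow\quad z\,w' = f\,w = \text{const}$$
$$z = \frac{b f}{d} \quad\Rightarrow\quad w' = d K$$
d: disparity
b: baseline
K: constant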
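Substituting the stereo depth relation (depth = baseline × focal length / disparity) into the projection relation gives an image width equal to disparity times K, with K equal to the physical width divided by the baseline, so the focal length cancels. The sketch below evaluates this; the function name, the 0.5 m shoulder width, and the 0.125 m baseline are assumed values for illustration only.

```python
def expected_width_px(disparity, person_width, baseline):
    """Expected image width of a person at the given disparity.

    person_width and baseline must share the same unit (meters here);
    the result is in the same unit as disparity (pixels).
    """
    k = person_width / baseline  # the constant K relating width to disparity
    return k * disparity


# Assumed values: 0.5 m wide person, 0.125 m baseline.
w_near = expected_width_px(24, person_width=0.5, baseline=0.125)
w_far = expected_width_px(12, person_width=0.5, baseline=0.125)
```

Halving the disparity halves the expected width, so a template size can be chosen directly from the measured disparity without searching over scale.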
Detection
[Flowchart]
foreground f(x,y) -> histogram of count vs. disparity
another peak? no: exit; yes: threshold disparities to get layer(x,y)
correlate layer(x,y) with person template
found person? yes: remove person from layer(x,y) and correlate again; no: look for another peak
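The layering part of this detection loop can be sketched as below, assuming a foreground map in which `None` marks non-foreground pixels. This is a simplification: it removes an entire depth layer after each peak instead of correlating with a person template and removing one person at a time. The names `min_count` and `half_width` are hypothetical parameters.

```python
from collections import Counter


def disparity_histogram(f):
    """Count foreground pixels per disparity value (None is skipped)."""
    return Counter(v for row in f for v in row if v is not None)


def extract_layer(f, peak, half_width):
    """Keep pixels whose disparity lies within half_width of the peak."""
    return [
        [v if v is not None and abs(v - peak) <= half_width else None for v in row]
        for row in f
    ]


def depth_layers(f, min_count, half_width):
    """Return (peak, layer) pairs, strongest histogram peak first."""
    remaining = [row[:] for row in f]
    layers = []
    while True:
        hist = disparity_histogram(remaining)
        if not hist:
            break  # no foreground pixels left
        # Strongest peak; ties go to the larger (closer) disparity.
        peak, count = max(hist.items(), key=lambda kv: (kv[1], kv[0]))
        if count < min_count:
            break  # no further significant peak
        layer = extract_layer(remaining, peak, half_width)
        layers.append((peak, layer))
        # Remove the pixels assigned to this layer before the next pass.
        remaining = [
            [None if lv is not None else rv for rv, lv in zip(r_row, l_row)]
            for r_row, l_row in zip(remaining, layer)
        ]
    return layers


# Toy foreground: a near object (disparity about 20) and a far one (8).
f = [[None, 20, 20, None], [8, 21, 20, 8], [8, None, 20, 8]]
layers = depth_layers(f, min_count=3, half_width=1)
peaks = [peak for peak, _ in layers]
```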
Detection example
during detection, extract intensity and support template from layer(x,y)
[Figure: left and right images; image coordinates (x, disparity) mapped to a 3D top view (X, Z)]
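The mapping from image coordinates (x, disparity) to the top view (X, Z) can be sketched with the standard stereo relations, assuming x is measured in pixels from the image center and the focal length is given in pixels. The function name and the numeric values are illustrative assumptions.

```python
def image_to_top_view(x, disparity, focal_length_px, baseline):
    """Convert an image column offset and a disparity to ground-plane (X, Z).

    x               : horizontal pixel offset from the image center (assumed)
    disparity       : stereo disparity in pixels, must be positive
    focal_length_px : focal length in pixels
    baseline        : camera separation; X and Z come out in the same unit
    """
    if disparity <= 0:
        raise ValueError("disparity must be positive")
    z = baseline * focal_length_px / disparity  # depth from disparity
    big_x = x * z / focal_length_px             # lateral offset by similar triangles
    return big_x, z


# Assumed values: 400 px focal length, 0.125 m baseline.
X, Z = image_to_top_view(x=80, disparity=20, focal_length_px=400, baseline=0.125)
```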
Tracking Steps
prediction
predict Kalman filter (X, Z)
predict person disparity
segmentation
select foreground layer around predicted disparity
localization
correlate gray level template against left image, weighted by support template [coarse localization]
correlate head/torso shape template against segmented foreground layer [re-centering step that addresses template drift]
update
Kalman filter
recursive update of intensity and support templates
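The coarse localization step can be illustrated with a support-weighted match score. The slides do not state which correlation measure is used, so this sketch substitutes a weighted sum of absolute differences (lower is better); the function names and the toy data are assumptions.

```python
def weighted_sad(template, support, image, xp, yp):
    """Support-weighted mean absolute difference at offset (xp, yp)."""
    total = 0.0
    weight = 0.0
    for y, (t_row, s_row) in enumerate(zip(template, support)):
        for x, (t, s) in enumerate(zip(t_row, s_row)):
            total += s * abs(t - image[yp + y][xp + x])
            weight += s
    return total / weight if weight > 0 else float("inf")


def localize(template, support, image, candidates):
    """Return the candidate (xp, yp) with the lowest weighted score."""
    return min(candidates, key=lambda p: weighted_sad(template, support, image, p[0], p[1]))


# One-row toy case: the template's second pixel is background (support 0).
template = [[50, 90]]
image = [[50, 10, 60, 90]]
candidates = [(0, 0), (1, 0), (2, 0)]

weighted_best = localize(template, [[1.0, 0.0]], image, candidates)
unweighted_best = localize(template, [[1.0, 1.0]], image, candidates)
```

With the support weights, the background pixel stored in the template is ignored and the match stays on the person pixel at offset 0; with uniform weights the stale background value pulls the match to offset 2.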
Tracking Videos
recursive template update
running
Tracking Videos
TR = tracking rate, FP = false positive rate
TR     FP      MTD
96%    0%      6.0
98%    0%      4.0
96%    0%      10.0
89%    10%     2.5
92%    6%      11.0
86%    0%      9.0
79%    3%      7.7
85%    2%      5.0
84%    4%      5.8
78%    1.3%    6.6
69%    5.6%    7.0
68%    3.2%    5.4
70%    6.7%    6.2
unweighted correlation
results:
mean tracking rate (TR) drops 4%
mean false positive rate (FP) increases from 3% to 10%
(qualitative) template drift causes people to be lost and re-detected
Conclusion
Stereo is an effective segmentation tool:
detection: provides a foreground layer divided into different depth layers
tracking: helps to avoid template drift by focusing on foreground pixels at the object's depth