
Tracking of polyhedral objects in a range data sequence

F. Ducatez and F. Dufrenois
Laboratoire d'Analyse des Systèmes du Littoral
50 rue Ferdinand Buisson, 62228 Calais (France)
Franck.Dufrenois@lasl-gw.univ-littoral.fr
June 20, 2002

Abstract
Although segmentation and 3D reconstruction have been widely studied on images produced by laser telemetry, recovering the motion of a 3D structure from such a sensor remains largely an open problem. The time required for the acquisition phase is still significant, which increases the risk of observing motions with large amplitudes. As a result, most of the tracking procedures used in passive vision systems fail. In this article, we address this problem in the case of a polyhedral structure undergoing an affine motion. The proposed estimator is based on the principle of Lagrangian dynamics [1]. The robustness of our tracking process relies on the integration of an algebraic distance-based measure between the sides of the object and the range regions. Preliminary experiments with real data from the Odetics laser range finder are then presented.

Key Words:
Affine motion, Lagrangian dynamics, polyhedral objects, laser telemetry

1 Introduction
In this article, we present an original physics-based framework for tracking polyhedral structures in a sequence of depth images. Since the founding work of Terzopoulos [1], these force-based models have been widely developed, both in theory and in application, and have proved useful for simultaneously solving reconstruction, segmentation and tracking problems in imagery (see [2] for a review). Although they are widespread in passive vision, their use in depth images has often been limited to the 3D reconstruction problem [3]. In order to deal with the tracking of objects made of facets, we examine several points:

- First of all, we must transform the information extracted from the images into one or several adequate attraction potential energy maps. In our application, this information is fully determined by the surfaces and the boundaries of the homogeneous regions in the depth images. We thus define an attraction potential combining these two kinds of information.
- The definition of the region attraction potential results from matching the visible faces of the model with surfaces within the range images. Overlaps and occlusions make this operation difficult. Therefore, at every step of the tracking process, we propose to minimize a global likeness measure between surfaces.
- Another important aspect is to take into account the geometry of the sensor in the definition of the generalized forces. The projective model of the laser camera is similar to a pseudo-spherical transformation and is thus different from the classical perspective, orthographic or stereoscopic models used in passive vision.

2 Odetics laser and convention
This sensor produces a depth image that associates a distance measure w with each row number u and column number v. This image, which is defined in the image frame, can be mapped into the camera frame by the pseudo-spherical passage matrix as follows [4]:

(1)

The quantities appearing in (1) include the intrinsic parameters of the depth camera and the pair of coordinates of the image center. By convention, we distinguish the object frame, the absolute reference frame and the camera frame, together with the passage matrix between the absolute frame and the camera frame. This matrix is obtained during a classical calibration phase requiring the estimation of 12 parameters.

3 Motion modeling
3.1 Parametrization of the model
This study deals with the tracking of polyhedral objects made of N sides and M vertices. We suppose that at each instant the object undergoes an affine transformation composed of a translation, a rotation and a scale change. We adopt a quaternion representation of the rotation: updating a quaternion at each time step is easier than directly updating a rotation matrix and ensuring that it remains orthogonal. The position of the object at an instant t can then be written in homogeneous coordinates. Object motion thus derives from the composition of these three transformations, whose parameters can be gathered in a single vector. This vector contains the degrees of freedom, or generalized coordinates, of the model.
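As an illustration of this parametrization, the following sketch applies a translation, a quaternion rotation and a scale change to one vertex. It is a minimal example and not the authors' implementation: the function names, the (w, x, y, z) quaternion ordering and the composition order X(t) = s R X(0) + c are assumptions made for the example. Renormalizing the quaternion before building the matrix is what keeps the rotation orthogonal.

```python
import math

def normalize(q):
    """Rescale a quaternion (w, x, y, z) to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)

def rotation_matrix(q):
    """3x3 rotation matrix of the quaternion q = (w, x, y, z), normalized first."""
    w, x, y, z = normalize(q)
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def transform(vertex, translation, q, scale):
    """Apply X(t) = scale * R(q) * X(0) + translation (assumed composition order)."""
    R = rotation_matrix(q)
    return tuple(
        scale * sum(R[i][j] * vertex[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

# Hypothetical generalized coordinates: translation, quaternion, scale.
half = math.radians(22.5) / 2          # rotation of 22.5 degrees about the z axis
q = (2 * math.cos(half), 0.0, 0.0, 2 * math.sin(half))  # deliberately not unit length
moved = transform((1.0, 0.0, 0.0), (0.0, 0.0, 2.0), q, 1.5)
```

Because the quaternion is renormalized inside `rotation_matrix`, numerical drift accumulated over successive updates does not distort the object.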

3.2 Dynamics
The equations of motion of the system characterize the evolution in time of the vector of generalized coordinates [3]:

(2)

where the quantities in (2) respectively represent the mass matrix M, the viscosity matrix C and the generalized forces of the object, which derive from an attraction potential V. In most cases, C is taken proportional to M (Rayleigh hypothesis). M is obtained from a kinetic analysis in terms of L, the Jacobian matrix of the object, which maps the generalized velocity vector into the 3D velocity vector of the model:

(3)

where G is the 3×4 rotational matrix, and where the dual 3×3 matrix of the position vector X(0) also appears.

Figure 1: Steps in the creation of the attraction potential V. (a) Range image. (b) Detection of the jump and crease edges. (c) Contour-based potential. (d) Region-based potential.

4 Attraction potential V
In order to recover the pose and the motion of an object in a range sequence, we propose to integrate two kinds of information into our dynamic scheme: contour information and region information. In a first step, we extract two maps through a local approximation of the image surface by a fourth-order bivariate polynomial:

- The first map integrates both the jump edges and the crease edges of the range images in the form of a binary contour image I. These features are extracted by looking for the zero crossings of the second and third directional derivatives of the polynomial [5].
- The second map gives a partition of the range image into a set of C disjoint regions, parametrized by the vector gathering the coefficients of the algebraic equations of each surface. This operation is carried out by a region growing algorithm [4].

From the first binary map, we then set up the contour-based attraction potential, which results from the projection of the contours into the (u,v) plane (the image frame) of the range image. This potential is a function of the distance between a pixel of the binary image and a vertex of a visible boundary of the model. The distance map d(I) corresponds to the classic Chamfer distance and is quickly obtained by a two-pass convolution of the image I.

The region-based attraction potential is defined by a distance measure between a visible side of the model and a region correctly selected among the initial partition of the image (figure 1). In order to obtain a positive convex attraction potential, we have chosen g as the classic exponential function.

This distance d corresponds to the normalized algebraic distance between a surface and a 3D vertex of a side of the object:

(4)
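For a planar range surface, the normalized algebraic distance reduces to the usual point-to-plane distance. The sketch below assumes that each surface is a plane a x + b y + c z + d = 0 and that the distance is taken unsigned; both choices, like the function name, are assumptions made for illustration and may differ from the exact form of equation (4).

```python
import math

def normalized_algebraic_distance(plane, vertex):
    """Return |a*x + b*y + c*z + d| / sqrt(a^2 + b^2 + c^2) for a plane (a, b, c, d)."""
    a, b, c, d = plane
    x, y, z = vertex
    norm = math.sqrt(a * a + b * b + c * c)
    if norm == 0.0:
        raise ValueError("degenerate plane: (a, b, c) must be non-zero")
    return abs(a * x + b * y + c * z + d) / norm

# Hypothetical example: the plane z = 2 written as 0x + 0y + 3z - 6 = 0.
distance = normalized_algebraic_distance((0.0, 0.0, 3.0, -6.0), (1.0, 4.0, 5.0))
```

Dividing by the norm of (a, b, c) makes the result independent of how the plane coefficients are scaled, which is the point of normalizing the algebraic distance.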

The surface which will be associated with the visible side is the one that minimizes a global likeness measure, given by:

(5)

The quantities involved, which have been estimated both on the visible sides of the object and on the range surfaces, represent respectively the unit normal vector, the center and the projected area. This measure takes into account simultaneously the scalar product of the normals (first term), the distance between the centers (second term) and the overlap ratio (third term) of the two associated surfaces. The last term makes it possible to eliminate some ambiguous cases, for example when two visible sides of the model are associated with the same range region.

Both potentials are combined in the tracking process in a balanced way. In order to decrease the sensitivity to initialization and noise, the influence of the region potential dominates first. Then, the contour potential takes over. The transition between them takes place when the region potential has reached a predetermined threshold. Thus, the total potential can be written as the combination of both potentials according to the relation:

(6)

which involves the classical step function.
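The Chamfer distance map d(I) used by the contour-based potential in this section can be computed in two raster passes over the binary contour image. The sketch below is a minimal version under stated assumptions: the 3-4 integer weights, the function name and the convention that contour pixels have value 1 are choices made for the example, not details given in the text.

```python
INF = float("inf")

def chamfer_distance(binary, a=3, b=4):
    """Two-pass 3-4 chamfer distance to the nearest contour pixel (value 1)."""
    rows, cols = len(binary), len(binary[0])
    d = [[0 if binary[r][c] else INF for c in range(cols)] for r in range(rows)]

    # Forward pass: propagate from the top-left using already visited neighbours.
    for r in range(rows):
        for c in range(cols):
            for dr, dc, w in ((0, -1, a), (-1, 0, a), (-1, -1, b), (-1, 1, b)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    d[r][c] = min(d[r][c], d[rr][cc] + w)

    # Backward pass: propagate from the bottom-right with the mirrored mask.
    for r in range(rows - 1, -1, -1):
        for c in range(cols - 1, -1, -1):
            for dr, dc, w in ((0, 1, a), (1, 0, a), (1, 1, b), (1, -1, b)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    d[r][c] = min(d[r][c], d[rr][cc] + w)
    return d

# Hypothetical 5x5 contour image with a single contour pixel at the center.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 1
dist = chamfer_distance(image)
```

With these weights, a horizontal or vertical step costs 3 and a diagonal step costs 4, so dividing the map by 3 gives an approximation of the Euclidean distance in pixels.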
5 External generalized forces
In our study, the external forces derive from the attraction potential of the image. These forces must be converted into generalized forces and brought back to the absolute reference frame. A kinetic analysis makes it possible to connect the velocity of an image pixel (image frame) with the generalized velocity (absolute reference frame):

(7)

which involves the image Jacobian matrix. The transformation is defined by:

(8)

The region generalized forces estimated in the range images are brought back to the absolute reference frame and are expressed as follows:

(9)

Equation (9) involves the set of the visible sides of the object. Following the same principle, the contour generalized forces are calculated on the visible boundaries of the object within the (u,v) plane of the range image (image frame) and brought back into the absolute reference frame.

6 Preliminary results
We have carried out first tests on a real image sequence acquired with an Odetics laser range finder (128×128, 8-bit pixels). The four intrinsic parameters of the camera, given by the manufacturer, are 0.008181, 0.008176, 0.0053 and 10. The threshold is fixed to 15. The image sequence chosen for the demonstration is the c sequence, which is available in the University of South Florida Range Image Database. This sequence illustrates the pure rotation of a polyhedral object. The amplitude of the motion between consecutive images is large and approximately constant, equal to 22.5 degrees. Figure 2 presents the tracking results for three frames of the range sequence. Despite the large motion amplitude and the loss of some sides during rotation, the tracking remains robust.

Figure 2: Illustration of the tracking results on three frames of the sequence, panels (a), (b) and (c). In white, the model at t-1; in black, the result of the fitting.

7 Conclusion
We have presented a physics-based tracking process for the affine motion of polyhedral structures. We have shown how to integrate the region information of the range image into the formulation of the attraction potential. This measure takes into account the polyhedral nature of the range objects and, in this case, is more suitable than the classic methods using spring forces with a nearest-node stage. Of course, the contour and region extraction steps must be carried out with great care, because these operations can affect the course of the fitting and hence the tracking. In future work, we plan to make the initialization step automatic by adopting a SAG (surface adjacency graph) representation of the range image. Another point to be resolved arises when only one range surface is available during the fitting, because this causes instabilities in the recovery of the pose.

References
[1] M. Kass, A. Witkin, and D. Terzopoulos, Snakes: active contour models, International Journal of Computer Vision, vol. 1, no. 4, pp. 321-331, 1988.
[2] F. Dufrenois, Etude d'un modèle déformable de Fourier pour la segmentation et le suivi d'objets 2D et 3D, Traitement du signal, vol. 17, no. 2, pp. 153-178, 2000.
[3] S. J. Dickinson, D. Metaxas, and A. Pentland, The role of model-based segmentation in the recovery of volumetric parts from range data, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 3, pp. 259-267, 1997.
[4] A. Hoover, G. Jean-Baptiste, X. Jiang, et al., An experimental comparison of range image segmentation algorithms, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 18, no. 7, pp. 673-689, 1996.
[5] A. Davignon, Détection des arêtes dans les images de distance, Traitement du signal, vol. 9, no. 1, pp. 45-56, 1992.
