Abstract: TBD but mention transputers, GR6, Marr, vision, Kalman Filtering etc. Note that
Mark Orr will probably take over this project...
1. Introduction
The goal of this work is to propose and implement an object recognition methodology using
existing quantitative and qualitative techniques. Furthermore, no use is made of ad hoc methods;
our aim is to formulate a computational theory. See Marr '82, who expounds at great
length on the importance of a computational theory. Because of the complexity of object
recognition we state simplifications (Section 2) which constrain the task to a manageable size
whilst retaining the features pertinent to ARRL. Many of the initial simplifications will be
withdrawn in the future as development continues.
For our purposes, we define object recognition to be the task of identifying a data object by
comparing it with model objects, while simultaneously computing a geometrical
transformation which takes the model object to the data object. The result is a logical statement
which can be inserted into the logical world model.
As proposed by Marr and as adopted by ARRL, data objects will be composed of collections of
surfaces (initially planar) existing in a surface world model. A sister project, Data Fusion, is
responsible for constructing and maintaining this world model. Determining which surfaces
belong to data objects is initially performed by data fusion and subsequently augmented during
object recognition.
. Surface connectivity within object models will be provided by the model maker.
3. Objectives
We wish to implement a flexible set of tools which can be used to develop and evaluate object
recognition methodologies. Thus we do not implement an object recognition program; rather, we
supply a toolkit for constructing such programs easily. Included in these tools must be a variety
of Man-Machine interfaces enabling fast, efficient debugging and evaluation of both the tools and
the resulting applications. Having implemented these tools we wish to construct and evaluate
various object recognition schemes. Finally, a computational theory and implementation will
[Figure: block diagram showing Sensor Systems, Grid, Motion Planning, Object Recognition, Situation Recognition and Situations.]
4. Approach
Based largely around the numerical work of Olivier Faugeras and the ideas of Bob Fisher, we
decompose object recognition into the following tasks: matching, localization and verification.
During matching we compare surfaces of data objects against surfaces of model objects, resulting
in groups of potentially corresponding surfaces, which we call correspondence hypotheses.
Localization takes these hypotheses and attempts to compute a transform from model to data for
each in turn, resulting in transformation hypotheses. Verification applies these transforms and
compares the data object to the transformed model objects. This is our overall approach; there
are many ways to achieve each individual task and many ways to impose a control strategy,
which is why we choose to develop a toolkit. Justification of the overall approach comes from
the statement that it is a generalisation of many recent object recognition systems and that it sat-
Figure 2: Approach. [Diagram; recoverable labels: Data & Model Objects, Matching.]
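The decomposition into matching, localization and verification can be sketched as a small pluggable pipeline. This is only an illustration of the toolkit idea under stated assumptions: the function name `recognise` and the three callables passed to it are hypothetical and do not correspond to any function in the report's software.

```python
from typing import Any, Callable, Iterable, List, Optional, Tuple


def recognise(
    match: Callable[[Any, Any], Iterable[Any]],
    localize: Callable[[Any], Optional[Any]],
    verify: Callable[[Any, Any, Any], bool],
    model: Any,
    data: Any,
) -> List[Tuple[Any, Any]]:
    """Run the three tasks in order and keep the hypotheses that survive.

    All three callables are hypothetical stand-ins for toolkit components.
    """
    survivors = []
    for hypothesis in match(model, data):      # correspondence hypotheses
        transform = localize(hypothesis)       # transformation hypothesis, or None
        if transform is None:
            continue                           # localization rejected the hypothesis
        if verify(transform, model, data):     # compare transformed model with data
            survivors.append((hypothesis, transform))
    return survivors
```

Because each stage is passed in as a function, different matching, localization or verification schemes can be swapped without changing the control loop, which is the motivation given for building a toolkit rather than a single program.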
. The approach must facilitate incremental recognition, i.e. the ability to update
identification and location by using the current identification/location and new data
surfaces. This is in preference to having to recalculate using new and old surfaces.
Thus as we see more surfaces of an object we revise its identification and
location incrementally.
. The approach must facilitate localization of objects, even if they are unidentifiable.
This follows from the fact that all objects must be manipulable by the robot,
regardless of whether or not it knows what they are, and from the
observation that in the real world it is infeasible to assume that all objects can be
identified.
Surface Worlds
One of our simplifications is that data clusters and object models have the same data structure
and as such both reside in surface worlds. A surface world is used for the following purposes:-
. To contain surface images loaded from RD8 (and other sensor systems).
Each surface world is organised as a collection of planar surfaces, planar boundaries, planar
joints and clusters. Here we include models under the term clusters. Each of these features is
represented as a node in a network. Nodes refer to each other by name, each name being unique.
Figure 3 depicts the structure of a cluster.
C - Cluster node
S - Planar Surface node
B - Planar Boundary node
J - Planar Joint node
[Figure 3: structure of a cluster, with a possible visualisation.]
In fact the cluster will have twice as many nodes as this since both sides of each surface are
visible. Note that clusters cannot be hierarchical at present. Section ?.? describes in detail the
implementation of surface worlds, which provides generalised network searching algorithms as
well as surface world construction facilities. This software will henceforth be referred to as
swtool.
Input Imagery
At present, images are used from the results of RD8. Many images can reside in a single surface
world. Facilities are available for clustering, ie. given a collection of surfaces, boundaries and
We have made the decision to use a numerical technique, Kalman filtering, as a sound
mathematical basis for data fusion and object localization. We chose it because it is well tried and
tested, its mathematics are well known, it incorporates error measures (essential for us), it is
iterative (in that partial results can be formed) and because Faugeras used it! Evaluation has
confirmed our choice. As previously mentioned, object localization is the process of finding a
transform from a model object to a data cluster. A simple combinatorial algorithm has been
implemented. It is known that in order to find such a transform, comprising a rotation and
translation, we need 3 pairs of corresponding surfaces from the object model and data cluster. Our
approach, then, is to generate all such tuples of pairs (matching) and try to localize each in turn.
However, a pruning process discards any obviously inconsistent pairings before localization as
follows:-
. If any two of these surfaces are parallel or anti-parallel then prune.
Here parallel means that two surfaces have the same normal, and anti-parallel means that
two surfaces have opposite normals. These two rules discard most of the potential pairings.
The resulting correspondence hypotheses are passed to a Kalman filter which attempts to
extract a transformation from each. A further pruning process is implicit in the Kalman filter,
which inherently rejects any hypotheses which are numerically incompatible. The result is a
collection of transformation hypotheses, each composed of an axis of rotation k, an angle of
rotation \theta about this axis and a covariance matrix S. The covariance matrix indicates the degree of
confidence we have in the result.
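The parallel/anti-parallel pruning rule stated above can be sketched as follows. This is a minimal illustration, assuming surface normals are unit vectors held as 3-tuples; the function names and the tolerance `tol` are hypothetical and are not taken from the report's software.

```python
from itertools import combinations


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_parallel_or_anti_parallel(n1, n2, tol=1e-6):
    """True when unit normals n1 and n2 are equal or opposite, within tol."""
    return abs(abs(dot(n1, n2)) - 1.0) < tol


def should_prune(normals, tol=1e-6):
    """True if any two normals in the tuple are parallel or anti-parallel."""
    return any(
        is_parallel_or_anti_parallel(a, b, tol)
        for a, b in combinations(normals, 2)
    )
```

A tuple of three surfaces would be discarded before localization whenever `should_prune` returns True for its normals; in practice the tolerance would have to reflect the noise on the measured normals.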
Software has been developed to implement this technique as a breadth first generate and test al-
gorithm. It will localize a single data cluster given an object model.
Object Verification
. Exact match.
. Failure.
Comparison is based on the so-called nearest neighbour standard filter (NNSF). Closely related
to the Kalman filter, this technique provides us with a sound basis for matching features, for
example, testing a value against a threshold criterion. This replaces the ad hoc |x-y| < t. In this
instance we compare planar equations.
A simple algorithm has been implemented. For each surface in a given transformation hypoth-
Control strategies across collections of data clusters and object models have not been
implemented. Simple combinatorial algorithms could be trivially generated above the existing
cluster-model algorithms, although the cumulative effects of combinatorial searches will soon
impose an upper limit on the complexity of potential world models and model databases. A better
approach might be to extend the existing algorithms to have better and wider searching
mechanisms and to employ some sort of invocation methodology; see Fisher & Orr.
5. Theory
For an explanation of the nomenclature used in the following sections see Appendix I.
Note: A vector is assumed to be a column matrix when used in matrix calculations unless
explicitly transposed notationally.
Note: A lot of this theory involves some horrible looking equations; however, their appearance
usually masks their underlying simplicity, which one does well to remember at all times.
fact the same physical surface and in the past efforts have been made to canonicalise these two
representations (so Kalman filtering works). However, we make use of the distinction to model
'both sides' of a surface. All visible surface sides (henceforth called surfaces) are included in
the model. For example, consider the simple illustration below. This model has 4 surfaces and
one planar joint. The viewing angle will determine if this joint is concave or convex. All 4
surfaces must be observed to completely verify the identity of the object. This might seem
overcomplicated. However, consider textured or coloured surfaces and the reasoning becomes obvious.
Thus we do not have to canonicalise {n, d} and {-n, -d} since they represent different surfaces.
Planar Surface Transformations
The following equations transform a plane represented by {n, d} into a plane {n', d'} by applying
the rotation matrix R and translation t. Object localization and object modelling are founded on
these two very important equations:-
    n' = R n
    d' = d + (R n) \cdot t
       = d + n' \cdot t
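The two plane transformation equations can be checked numerically with a short sketch. It assumes the convention that a plane {n, d} is the set of points p with n . p = d and that points transform as p' = R p + t, which is consistent with the equations above; the function names are hypothetical.

```python
def mat_vec(R, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(R[i][j] * v[j] for j in range(3)) for i in range(3)]


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def transform_plane(n, d, R, t):
    """Apply n' = R n and d' = d + n' . t to the plane {n, d}."""
    n_new = mat_vec(R, n)
    d_new = d + dot(n_new, t)
    return n_new, d_new


def transform_point(p, R, t):
    """Apply p' = R p + t to a point (assumed convention)."""
    rotated = mat_vec(R, p)
    return [rotated[i] + t[i] for i in range(3)]
```

A point lying on the original plane should lie on the transformed plane after the same rotation and translation, which gives a simple consistency check on any implementation.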
Rotation Representations
We represent rotation in two ways. Firstly as a traditional 3x3 matrix R (as above) and secondly
by a vector r, representing an axis and angle of rotation. r encapsulates the axis k and angle \theta
by the equation r = k\theta. Note that there are two numerically different, but functionally
equivalent, ways of representing a particular rotation; these are:-
    k\theta
    -k(2\pi - \theta)
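The equivalence of the two forms can be demonstrated with a small sketch. It uses the standard Rodrigues rotation formula for rotating a vector about a unit axis, which is not necessarily the exact form given by Altmann; the function names are hypothetical.

```python
import math


def rotate(r, v):
    """Rotate v by the rotation vector r = k * theta (Rodrigues rotation formula)."""
    theta = math.sqrt(sum(c * c for c in r))
    if theta == 0.0:
        return list(v)                       # zero rotation leaves v unchanged
    k = [c / theta for c in r]               # unit axis
    k_dot_v = sum(a * b for a, b in zip(k, v))
    k_cross_v = [
        k[1] * v[2] - k[2] * v[1],
        k[2] * v[0] - k[0] * v[2],
        k[0] * v[1] - k[1] * v[0],
    ]
    c, s = math.cos(theta), math.sin(theta)
    return [v[i] * c + k_cross_v[i] * s + k[i] * k_dot_v * (1.0 - c) for i in range(3)]


def alternative_form(r):
    """Return -k(2*pi - theta), the other representation of the same rotation."""
    theta = math.sqrt(sum(c * c for c in r))
    if theta == 0.0:
        raise ValueError("the axis is undefined for a zero rotation")
    return [-(c / theta) * (2.0 * math.pi - theta) for c in r]
```

Both vectors rotate any point identically, which is why a filter that estimates r needs a canonical choice between them.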
We use r in our Kalman formulation since it is more compact than R, requiring only a 3x3
covariance matrix. We can transform points and planes using r directly or by converting r into R.
To rotate a point or position vector v into v' using r we employ the equation given in Altmann
1986, p. 163:-
Note that there are other methods - see Appendix II. Adding translation t we have:-
v' = S(r, v)+t
    Z = \begin{bmatrix} 0 & -k_z & k_y \\ k_z & 0 & -k_x \\ -k_y & k_x & 0 \end{bmatrix}
In computing the Jacobians necessary for the Kalman formulation (see later), we need to
differentiate an applied rotation Rn with respect to r. We do this by defining a function K(R,n) which can
be obtained by a combination of the chain rule and Rodrigues' equation.
    K(R, n) = \frac{\partial (R n)}{\partial r}
From which we can see that K(R,n) is a 3x3 matrix. H is the anti-symmetric matrix, defined:-
    H = \begin{bmatrix} 0 & -r_z & r_y \\ r_z & 0 & -r_x \\ -r_y & r_x & 0 \end{bmatrix}
s(o)=# h@)-L#
Elementary differentiation gives: -
Note that when performing Kalman filtering we must decide on a canonical form for r otherwise
two rotationally equivalent values of r will be interpreted differently. We chose to canonicalise
The justification for using a Kalman filter was stated in section 3; here we show how such a filter
can be implemented for solving object localization; that is, finding a transformation which maps
a model coordinate frame to a data object/cluster. We must express the transformation in the
form f(x, a) = 0, where x is an observation vector and a the state vector. To perform
localization our observations will be equations of the data and model object surfaces, and our state - the
result of the filter - will be the best rotation and translation satisfying these pairs of equations.
Note that three good pairs of equations are required to compute a translation and two for
rotation, good being defined as not parallel or anti-parallel. Thus we set x = [n^t n'^t d d']^t,
a = [r^t t^t]^t and use the standard transformation equations stated previously for f,
    n' - R n = 0
    d' - d - (n' \cdot t) = 0
The filter requires the Jacobians \partial f / \partial a and \partial f / \partial x.
Firstly then, differentiating with respect to a, we have the 4x6 matrix:-
    \frac{\partial f}{\partial a} = \begin{bmatrix} -K(R,n) & 0 \\ 0 & -n'^t \end{bmatrix}
Then differentiating with respect to x we have the 4x8 matrix:-
Covariance matricies for each observation pair and initial state must be specified. Each obser-
o orot -o2oo
n, - nrn, - nyd
The values for the covariances on data equations will come from RD8. The values for the
covariances on model equations can be set to 0.0, i.e. we assume that they are exactly correct. In
practice it is usual to set a very small figure, say 0.001, instead of 0.0. This is because of the limits
of the implementation. In the future, covariances on models could be utilised in a useful manner;
see ??. Note also that in practice RD8 does not supply covariances, which are presently guessed
by the recognition software (in load-rd8). Large values are needed for the initial state
covariance values, the actual value being fairly arbitrary; 100.0 usually suffices. It has been found
in practice that if a good estimate of the initial state a_0 is not available (probably the case), then
a value of a_0 = [0^t 0^t]^t is better than a random guess. The reason for this is not known.
Again a value of 0.01 or 0.001 should be used in place of 0.0 to keep the implementation happy.
W_{xi} and W_{ai} are the weights of the current observation and last estimate respectively:-
    W_{ai} = \frac{\partial f}{\partial a}(x_i, a_{i-1}) S_{i-1} \frac{\partial f}{\partial a}(x_i, a_{i-1})^t
There are many references to verify these equations, for example see ??.
Applying Transformations
If the plane has no associated covariance matrix then we can simply use the equations ? and ?.
If, however, the plane has an associated covariance matrix then we must update this as well as
the normal and distance parameters. We can do this by using a Kalman filter similar to the one
described above, changing the definition of x and a. Note that we only compute a_1 and S_1 since
we believe the observation (we would have computed the transformational part of the observation
using the filter described above). Our observation will be the existing plane equation/covariance
matrix and the transformation {r, t}. The initial state is arbitrary and the result is the new plane
equation/covariance matrix. The same definition of f(x,a)=0 can be used together with the new
values for x and a: x = [n^t d r^t t^t]^t, a = [n'^t d']^t. We must therefore calculate the Jacobians
corresponding to this new formulation, which can then be used in the standard way. We have:-
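    \frac{\partial f}{\partial a} = \begin{bmatrix} I & 0 \\ -t^t & 1 \end{bmatrix}   and   \frac{\partial f}{\partial x} = \begin{bmatrix} -R & 0 & -K(R,n) & 0 \\ 0 & -1 & 0 & -n'^t \end{bmatrix}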
The observation covariance matrix will be 10x10 and the state covariance matrix 4x4. If we
Applying the filter gives {n' ,d' ,C(n' d')} from {n,d,C(n,d)}.
Notice that because of the order of the parameters in x = [n^t n'^t d d']^t our observation covariance
matrices become somewhat complicated and unnatural. Unnatural because the covariance matrix
of a plane {n, d} will be held as a 4x4 matrix. A more natural formulation, then, is to set
x = [n'^t d' n^t d]^t, a remaining unchanged. We then have:-
The iterated Kalman filter is a refinement of the above recursive formulation and can be used
when the initial estimate of state is particularly bad (or 0), i.e. we have no idea what it should
be at all. The result a_i can be refined within each recursive step by recomputing it a number of
times until successive estimates are within a threshold value t_a of each other, or until the
operation has been performed a set number of times n_0. Note that the covariance matrices S_{i-1} and L_i
are used unchanged throughout the calculation. Thus we perform:-
    a_i = a_{i-1} - G f(x_i, a_{i-1})   until   |a_i - a_{i-1}| \le t_a   or   i = n_0
Outlier Detection
During Kalman filtering we can detect those observations which do not lie within a specified
distance of f(x,a)=0. In practice this means that we can eliminate those observations which are
statistically detected to be of no use. We make use of the nearest neighbour standard filter
(NNSF) and compute the generalised Mahalanobis distance d. It is known that d has a \chi^2 distribution.
    Q_i = W_{xi} + W_{ai}
d has q degrees of freedom, where q = rank(Q). Object localization gives q = 4 (for either
formulation given). d can now be tested against, say, the \chi^2 hypothesis value e for 4 degrees of
freedom at a chosen confidence level, rejecting observations where d >= e. Note that Q_i^{-1} is
calculated during localization, so outlier detection is a small overhead.
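The outlier test can be sketched as below. The sketch assumes d = f^t Q^{-1} f, the same quadratic form the report gives for feature matching, and a threshold of 9.488, which is the 95% point of the \chi^2 distribution with 4 degrees of freedom; the confidence level, the function names and the list-of-rows matrix layout are all assumptions made for illustration.

```python
def solve(Q, f):
    """Solve Q y = f by Gaussian elimination with partial pivoting."""
    n = len(f)
    A = [[float(x) for x in Q[i]] + [float(f[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("Q is singular")
        A[col], A[pivot] = A[pivot], A[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for j in range(col, n + 1):
                A[row][j] -= factor * A[col][j]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (A[i][n] - sum(A[i][j] * y[j] for j in range(i + 1, n))) / A[i][i]
    return y


def mahalanobis(f, W_x, W_a):
    """Return d = f^t Q^{-1} f with Q = W_x + W_a (assumed form of d)."""
    n = len(f)
    Q = [[W_x[i][j] + W_a[i][j] for j in range(n)] for i in range(n)]
    y = solve(Q, f)                 # y = Q^{-1} f without forming the inverse
    return sum(fi * yi for fi, yi in zip(f, y))


CHI2_95_4DOF = 9.488  # assumed confidence level: 95%, 4 degrees of freedom


def is_outlier(f, W_x, W_a, threshold=CHI2_95_4DOF):
    """Reject the observation when d >= threshold."""
    return mahalanobis(f, W_x, W_a) >= threshold
```

Here f is the residual f(x_i, a_{i-1}) evaluated for one observation; solving the linear system avoids an explicit matrix inverse, although an implementation that already holds Q^{-1} from localization would simply reuse it.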
Matching Features
We make use of the Mahalanobis distance to match two features described by vectors or scalars
and having associated covariance matrices, for example matching planar surface equations
during object verification. In this instance we make use of the equations above and simply
formulate f(x,a)=0 as x-a. It does not matter which vector is defined as x and similarly for a.
Additionally there is no concept of current observation and last state and the i suffixes can be
dropped. To avoid confusion we shall define our equations as matching vector v and vector v'.
We have then, adapted from Faugeras '86:-
    \frac{\partial f}{\partial v'}(v', v) = I   and   \frac{\partial f}{\partial v}(v', v) = -I
    d = [v' - v]^t Q^{-1} [v' - v]
    Q = C(v') + C(v)
Of course v' and v must have the same length and C(v') and C(v) must have the same rank. If
the length of v' and v is 1, i.e. they are both scalar quantities, then we can simplify to:-
    d = \frac{(v' - v)^2}{\sigma_{v'}^2 + \sigma_v^2}
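The scalar case can be sketched in a few lines. The threshold 3.841 is the 95% point of the \chi^2 distribution with 1 degree of freedom; that confidence level and the function names are assumptions chosen for illustration.

```python
CHI2_95_1DOF = 3.841  # assumed confidence level: 95%, 1 degree of freedom


def scalar_distance(v_prime, v, var_prime, var):
    """Return d = (v' - v)^2 / (var' + var) for two scalar measurements."""
    return (v_prime - v) ** 2 / (var_prime + var)


def scalars_match(v_prime, v, var_prime, var, threshold=CHI2_95_1DOF):
    """Accept the match when the distance is below the chi-squared threshold."""
    return scalar_distance(v_prime, v, var_prime, var) < threshold
```

Unlike the ad hoc test |x-y| < t, the acceptance region here widens automatically as the variances of the two measurements grow.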
Merging Measurements
We often wish to merge two measurements into a single quantity, for example merging two state
estimates obtained from independent observations. We again turn to the Kalman filter for the
solution. Obviously we formulate f(x,a)=0 as x_1 - x_2 = 0, where x = x_1 and a = x_2 are the two
measurements we wish to merge. Again the Jacobians are I and -I for x and a respectively. If the
covariance matrices of x_1 and x_2 are C_1 and C_2 respectively then the merged result {x_3, C_3} can
be found by substituting into the standard Kalman equations:-
These equations, written in slightly different form, can be verified in Smith and Cheeseman ??.
Note that the apparent asymmetry in the equations is of no consequence; the same result is
formed for both formulations of x and a. We have once again dropped the suffix i since there is
no temporal dependency.
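For scalar measurements the merge reduces to a few lines. The sketch below uses the standard Kalman gain form x_3 = x_1 + K(x_2 - x_1), C_3 = C_1 - K C_1 with K = C_1 / (C_1 + C_2); this is the textbook result for independent measurements and is assumed, not confirmed, to match the report's own equations. The function name is hypothetical.

```python
def merge(x1, c1, x2, c2):
    """Merge two independent scalar measurements with variances c1 and c2."""
    gain = c1 / (c1 + c2)          # weight given to the second measurement
    x3 = x1 + gain * (x2 - x1)     # merged value lies between x1 and x2
    c3 = c1 - gain * c1            # equals c1 * c2 / (c1 + c2)
    return x3, c3
```

Swapping the roles of the two measurements gives the same merged value and variance, which illustrates why the apparent asymmetry in the equations is of no consequence.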
6. Source Location
The source code for GR5:Object Recognition is resident in the directory ~crj. See Mark Orr for
password. I am sure the code will move soon, so check with Dave Wheble as well as to whether
or not ~crj still exists. Anyway, let us assume that there is an object recognition root directory
which we will call '.'. The following subdirectories exist.
./bin
./GR5
GR5/models
GR5/src/clisp
    or-utils.lsp      General object recognition related utilities.
    init.lsp          Loaded by LISP on invocation.
    swtool.lsp        Tool for building surface worlds.
    rd8.lsp           Function for loading RD8 images into surface worlds.
    modeltool.lsp     Tool for building object models in surface worlds.
    swmmi.lsp         Menu based MMI into surface worlds.
    swdisplay.lsp     Functions for graphical display of surface worlds.
    cluster.lsp       Function for creating clusters in surface worlds.
    covariance.lsp    Functions for creating covariance matrices.
    rotation.lsp      Useful rotation utility functions.
    kalman-utils.lsp  Useful utility functions required by Kalman filtering.
    gkalman.lsp       Generalised Kalman filter.
    kalman.lsp        Kalman filter for localization.
    cmatch.lsp        Combinatorial matching algorithms.
    clocalize.lsp     Combinatorial localization algorithms.
    cverify.lsp       Combinatorial verification algorithms.
    kalman-test.lsp   Test program and MMI.
    mmi.lsp           Master MMI, executed at end of init.lsp file.
    sysmmi.lsp        Menu based MMI into implementation parameters.
    setup.lsp         Loaded at end of init.lsp
    setup.p           Loaded at end of init.lsp
./lib
lib/clisp
./docs
./RD8
7. Invocation
Login as crj.
Execute window env.      sunview
then (i)
Move to LISP source      cd GR5/src/clisp
Execute poplog env.      pwmtool clisp &
or (ii)
Execute script           execgrS &
See Mark Orr for password.
Software for creating and manipulating surface worlds resides in the file swmmi.lsp and is
commonly referred to as swtool: surface world tool. Two examples of use are in rd8.lsp and
modeltool.lsp.
A surface world is a collection of nodes. Each node has a unique name and nodes refer to each
other by name. Each surface world has a name and a list of currently existing surface worlds
can be found in the variable $$surface-worlds. There is a notion of the current surface world,
which is referenced by the variable $$current-world. Although surface worlds are used to
contain planar surface structures, swtool is written so that the number of types of node and their
characteristics can be easily extended. The following node types are currently used:-
Name Description
The name of each node is a prefix character followed by numerals rendering it unique. Prefix
characters are S, B, J and C respectively.
Looking at a typical cluster then, it has a name and a type, and references a list of planar surfaces. Each
of these surfaces references the cluster it belongs to, the planar joint of which it is part (if any)
and the boundary nodes which define its shape. Additionally a surface has the following
attributes: equation of plane, centre of gravity, area and sometimes a symbolic name such as
RECTANGLE. Each planar joint references the two surfaces of which it is comprised and
additionally has two attributes: joint angle and a joint character, which is CONCAVE or CONVEX. See
Fisher for a description of these characteristics. Each boundary node references the surface it
defines and an attribute whose value is a list of 3D coordinates defining its shape.
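The node network described above can be pictured with a small sketch. It is not the swtool implementation: the class `SurfaceWorld`, the method `create_node` and the attribute keys are hypothetical, chosen only to show nodes that carry prefixed unique names and refer to each other by name.

```python
class SurfaceWorld:
    """A named collection of nodes that reference each other by name."""

    def __init__(self, name):
        self.name = name
        self.nodes = {}        # node name -> attribute dictionary
        self._counters = {}    # prefix character -> last number issued

    def create_node(self, prefix, **attributes):
        """Create a node named prefix + number (for example S1) and return the name."""
        number = self._counters.get(prefix, 0) + 1
        self._counters[prefix] = number
        node_name = f"{prefix}{number}"
        self.nodes[node_name] = dict(attributes, type=prefix)
        return node_name


def build_example():
    """Build a two-surface cluster joined by one convex planar joint."""
    world = SurfaceWorld("demo")
    s1 = world.create_node("S", equation=((0.0, 0.0, 1.0), 0.0), area=1.0)
    s2 = world.create_node("S", equation=((1.0, 0.0, 0.0), 0.0), area=1.0)
    j1 = world.create_node("J", surfaces=[s1, s2], angle=90.0, character="CONVEX")
    c1 = world.create_node("C", surfaces=[s1, s2])
    for s in (s1, s2):
        world.nodes[s]["cluster"] = c1   # back-reference from surface to cluster
        world.nodes[s]["joint"] = j1     # back-reference from surface to joint
    return world
```

Storing references as names rather than object pointers mirrors the description in the text and keeps the network easy to print, search and extend with new node types.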
swtool provides functions to create and manipulate nodes in a flexible manner. It also provides
algorithms for finding node neighbours, both locally and transitively. One use of this is to
perform clustering. Consider a network defined without clusters; simply a collection of inter-related
surfaces, joints and boundaries. We can perform a simple clustering algorithm by collecting all
groups of surfaces which are transitively connected by convex planar joints - into clusters. A
clustering algorithm is provided in cluster.lsp, which uses a generic neighbourhood finding al-
gorithm parameterised to look for connected convex joints. Other clustering techniques can be
evaluated by using this function.
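Clustering by transitive connection through convex joints can be sketched as a connected-components search. This is an illustration only, not the algorithm in cluster.lsp: the function name and the representation of joints as (surface, surface, character) triples are assumptions.

```python
def cluster_by_convex_joints(surfaces, joints):
    """Group surfaces that are transitively connected by CONVEX joints.

    surfaces: list of surface names.
    joints:   list of (surface_a, surface_b, character) triples.
    Returns a list of clusters, each a sorted list of surface names.
    """
    neighbours = {s: set() for s in surfaces}
    for a, b, character in joints:
        if character == "CONVEX":          # concave joints do not connect surfaces
            neighbours[a].add(b)
            neighbours[b].add(a)

    clusters, seen = [], set()
    for start in surfaces:
        if start in seen:
            continue
        group, stack = [], [start]
        seen.add(start)
        while stack:                        # depth-first walk of one component
            current = stack.pop()
            group.append(current)
            for n in sorted(neighbours[current]):
                if n not in seen:
                    seen.add(n)
                    stack.append(n)
        clusters.append(sorted(group))
    return clusters
```

Replacing the test on the joint character with another predicate would give a different clustering technique, in the spirit of the parameterised neighbourhood search described above.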
A large number of functions are provided and unfortunately it is beyond the scope of this report
to describe each in detail. However, each is described briefly in swtool.lsp. In addition, please
refer to the two previously mentioned exemplars.
base.lsp provides an exemplar for model building. Models are clusters and reside in surface
worlds. modeltool.lsp provides a collection of algorithms for defining and transforming surfaces
and clusters. Facilities exist to progressively abstract a planar surface - the base component of
a model- into objects. The first stage is to create a surface world for the objects to reside in, then
compose descriptions of the required objects, which can then be instantiated into the surface
world. Descriptions can be parameterised enabling for example a cuboid (x,y,z) to be defined.
Basic surfaces are created with the function create-planar-surface and clusters with the function
create-cluster. The function obj-transform can be used to translate and rotate both individual
surfaces and clusters. Each function operates by supplying a list of keywords and associated
values. Some keywords are optional, resulting in default values being used.
create-planar-surface
Keywords:-
create-cluster
Keywords:-
world         As above.
components    List of surface node names forming the cluster. Must be supplied.
called        Reference name for each node. Must be supplied.
connectivity  List of surface pairs which are physically connected. Use reference names. Must be supplied.
The following example defines a model called cube, which takes 2 arguments: side, defining the
length of each side of the cube and having default value 1.0; and name, defining the symbolic name of
the cube, the default name being cube.
Notice that the definition is a LISP function and as such can have local variables etc.
Keywords:-
Rotation can be supplied in one of 3 forms: see modeltool.lsp for details. Translation is supplied
as a vector. Again, use base.lsp as an exemplar.
To perform object localization two clusters must be marked: one as the data cluster and the other
as the model cluster. This can be done by selecting the mark as model and mark as data menu
items. Note that models can be matched to models for evaluation purposes.
The menu item links is used to enter a menu of node types referenced by the current node.
A control panel enabling projection parameters to be altered cannot be used until the menu MMI
is terminated. To reinvoke use (mmi).
Used to evaluate the Kalman filter implementation the MMI operates as follows. Parameters are
selected for input to the filter. The filter is invoked (select go). Results are displayed indicating
its performance. The test algorithm operates as follows.
A number of infinite planes are created at random (minimum 3). A transformation is created at
random. The transformation is applied to each infinite plane. The resulting and original infinite
planes are corrupted with noise (possibly none). The transformation is corrupted with noise and
in a special case set to k=(0,0,0), \theta=0. The two sets of planes and the corrupted transformation
are then used to compute a new transformation - the result of the filter. Results and calculations
comparing this calculated transformation and the original are displayed.
System MMI
Not fully implemented; at present simply printing the value of global variables and performing
garbage collection.
Performing Localization
Mark two clusters as described above and select localization option. A surface world calledver-
Having localized a cluster as above, each of the resulting clusters can be verified against the
original data cluster in turn by selecting this menu option. Inspect the attribute verify of each
cluster to establish the result of the verification.
i. Using Rodrigues' equation and the anti-symmetric matrix Z formed from elements of k.
    R = I + Z \sin\theta + (1 - \cos\theta) Z^2,   Z = \begin{bmatrix} 0 & -k_z & k_y \\ k_z & 0 & -k_x \\ -k_y & k_x & 0 \end{bmatrix}
Sometimes written:-
    R = I + Z \sin\theta + 2 \sin^2(\theta/2) Z^2
ii. Using Rodrigues' equation and the anti-symmetric matrix H formed from elements of r.
    R = I + H \frac{\sin\theta}{\theta} + H^2 \frac{1 - \cos\theta}{\theta^2},   \theta = |r|
iii. By forming the matrix:-
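Method i can be sketched numerically as follows, building R from a unit axis k and angle \theta through the anti-symmetric matrix Z. The function names are hypothetical, and the second function uses the half-angle form only to confirm that the two ways of writing the equation agree.

```python
import math


def skew(k):
    """Anti-symmetric matrix Z formed from the elements of k."""
    return [
        [0.0, -k[2], k[1]],
        [k[2], 0.0, -k[0]],
        [-k[1], k[0], 0.0],
    ]


def mat_mul(A, B):
    """Product of two 3x3 matrices held as lists of rows."""
    return [[sum(A[i][m] * B[m][j] for m in range(3)) for j in range(3)] for i in range(3)]


def _combine(k, sin_term, square_term):
    """Return I + sin_term * Z + square_term * Z^2."""
    Z = skew(k)
    Z2 = mat_mul(Z, Z)
    return [
        [(1.0 if i == j else 0.0) + sin_term * Z[i][j] + square_term * Z2[i][j]
         for j in range(3)]
        for i in range(3)
    ]


def rotation_matrix(k, theta):
    """R = I + Z sin(theta) + (1 - cos(theta)) Z^2 for a unit axis k."""
    return _combine(k, math.sin(theta), 1.0 - math.cos(theta))


def rotation_matrix_half_angle(k, theta):
    """R = I + Z sin(theta) + 2 sin^2(theta / 2) Z^2, the alternative form."""
    return _combine(k, math.sin(theta), 2.0 * math.sin(theta / 2.0) ** 2)
```

Method ii follows from the same code by passing k = r / |r| and \theta = |r|, since H = \theta Z.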