Marianna Zichar
Attila Fazekas
I. INTRODUCTION
data are frequently used only for hand pixel extraction. If the
hand is properly segmented, the gestures can be recognised by
the shape of the hand contour and other geometric features.
Rens et al. proposed a new contour matching method in [16]
that uses the series of relative distances between the contour
points and the hand centre. They achieved 86-100% accuracy
on their own challenging 10-gesture dataset. Klompmaker et
al. developed an interaction framework for touch detection and
object interaction [5]. The fingertips are detected as the vertices
of the polygon approximating the hand contour. Yeo et al.
present a similar method, but they compute more features and
give several criteria to classify a polygon vertex as a fingertip
[21]. The authors of [17] used a convex shape decomposition
method combined with skeleton extraction to detect fingertips
and recognise the gesture. The accuracy of their method is between
94% and 97% on the Rens dataset. Detecting the fingertips of
half-closed and closed fingers requires other approaches, such as
geodesic maxima based fingertip detection [6] or 3D model fitting [7],
[12], but their computational costs are high.
III.
Our aim was to develop a cheap, vision-based, application-independent
mouse control system. User convenience was an important aspect, as was
keeping the number of gestures to memorize small. Our solution also
gives the user some freedom in how the gestures are presented.
II. RELATED WORK
Fig. 1. [Figure label: ~80 cm]
B. Controlling
Move cursor: The user can move the cursor in the way
described earlier. The Move gesture (Figure 2b) is a
good starting position, because the click gesture is easy
to form from it. Users can also control the cursor with
the index finger, or with an open palm if the fingers are closed.
Double click event: Two single-click gestures performed in sequence
within about a second trigger a double click.
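The double-click rule can be illustrated with a short sketch. It is not the authors' implementation: the class name `ClickTracker`, the exact 1.0 s window (the text says only "about a second"), and the returned event labels are assumptions made for the example.

```python
from typing import Optional

# Assumption: the text only says "about a second", so 1.0 s is a placeholder.
DOUBLE_CLICK_WINDOW_S = 1.0


class ClickTracker:
    """Turns a stream of single-click gesture timestamps into click events."""

    def __init__(self, window_s: float = DOUBLE_CLICK_WINDOW_S) -> None:
        self.window_s = window_s
        self._last_click: Optional[float] = None

    def on_click_gesture(self, timestamp_s: float) -> str:
        """Return "double" if this gesture follows the previous one within
        the window, otherwise "single"."""
        last = self._last_click
        if last is not None and timestamp_s - last <= self.window_s:
            # A double click consumes both gestures, so a third quick
            # gesture starts a new sequence.
            self._last_click = None
            return "double"
        self._last_click = timestamp_s
        return "single"
```

A caller would feed the tracker one timestamp per recognised click gesture and forward the returned event to the operating system's mouse interface.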
Fig. 2. Start (a), Move (b), Left button (c), Right button (d), Stop (e).
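$$H_c(P) = \begin{cases} 1, & \ldots \\ 2, & \ldots \\ 0, & \text{otherwise,} \end{cases}$$
[surviving condition fragment: "if $p_z \ldots p^h_z < 2r$ and $\ldots$"]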
C. Constraints
Let us consider a coordinate system where the X- and Y-axis
are parallel to the X- and Y-axis of the screen, and the
Z-axis is used to measure the depth. It is assumed that the
absolute value of the angle between the user's hand and the Y-axis is
less than 90°, although this constraint is used only by the cursor-moving
steps. The other constraints come from the features
of the sensor and the arrangement. The user has to sit far enough
from the device, the rotation of the hand around the X- and Y-axis
should be less than 30° and 20°, respectively, and the hand has to be
a foreground object. These restrictions correspond with those reported by
other authors. Methods tolerate rotation of the hand around the
Z-axis far better, because the gestures remain parallel to the
image plane [15].
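As a small illustration, the angular limits above can be collected into a single check. This is only a sketch: the function and constant names are hypothetical, the angles are assumed to be measured in degrees by some upstream pose estimator, and the distance and foreground constraints are left out because the text gives no numeric values for them.

```python
# Angular limits taken from the constraints described in the text (degrees).
MAX_ROT_X_DEG = 30.0       # rotation of the hand around the X-axis
MAX_ROT_Y_DEG = 20.0       # rotation of the hand around the Y-axis
MAX_ANGLE_TO_Y_DEG = 90.0  # angle between the hand and the Y-axis


def pose_within_constraints(rot_x_deg: float,
                            rot_y_deg: float,
                            angle_to_y_deg: float) -> bool:
    """Return True if a hand pose satisfies all three angular limits.

    The inputs are hypothetical estimates; how they are obtained is
    outside the scope of this sketch.
    """
    return (abs(rot_x_deg) < MAX_ROT_X_DEG
            and abs(rot_y_deg) < MAX_ROT_Y_DEG
            and abs(angle_to_y_deg) < MAX_ANGLE_TO_Y_DEG)
```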
IV.
where $P^h$ is the center point of the hand and $\|\cdot\|$ denotes the
Euclidean distance; $r$ and $r_2$ are given radii ($r < r_2$).
We set $r$ to 14 cm and $r_2$ to 17 cm based on a study of the
physical dimensions of the hand, namely hand length and the great span
(the distance between the extended thumb and little finger) [1].
2) Removing non-hand objects: Disturbing objects are
usually the other (non-control) hand of the user and/or other
objects on the table. When the hand is close to other objects,
their points can become hand candidate and also arm candidate points.
A whole object almost never becomes a hand candidate, because
it usually rests on the table, while the hand is in the
air. Therefore, we delete all components that contain
arm candidate points and do not contain the hand center point
(Figure 4c). Our algorithm consists of the following steps.
1)
CogInfoCom 2014 5th IEEE International Conference on Cognitive Infocommunications November 5-7, 2014, Vietri sul Mare, Italy
Fig. 3.
2)
3)
4)
$$H(P) = \begin{cases} H_c(P), & \text{if } P \in s_h \text{ or } \exists s : s \in S_c \wedge P \in s, \\ 0, & \text{otherwise.} \end{cases}$$
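The rule for removing non-hand objects (delete every component that contains arm candidate points but not the hand center point) can be sketched as follows. This is a minimal illustration and not the authors' algorithm: the label values `HAND = 1` and `ARM = 2`, the use of 4-connectivity, and the function name `filter_components` are assumptions.

```python
from collections import deque
from typing import List, Tuple

# Assumed label meanings; 0 is background.
HAND, ARM = 1, 2


def filter_components(labels: List[List[int]],
                      hand_center: Tuple[int, int]) -> List[List[int]]:
    """Zero out 4-connected components that contain arm candidate points
    but do not contain the hand center (given as a (row, col) pair)."""
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    result = [row[:] for row in labels]

    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] == 0 or seen[r0][c0]:
                continue
            # Collect one component with a breadth-first search.
            component = []
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                component.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] != 0 and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))

            has_arm = any(labels[r][c] == ARM for r, c in component)
            if has_arm and hand_center not in component:
                for r, c in component:
                    result[r][c] = 0
    return result
```

Components without arm candidate points are kept in this sketch, because the prose only asks for the removal of components that contain them.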
2) Fingertip detection of extended fingers: To detect fingertip
candidates, a well-known shape-based method [9] is
applied with a slight modification. First, the hand contour ($s_h$)
is coarsely approximated by a polygon
(epsilon = 15), so the resulting polygon contains only very
few vertices: fingertips, valleys between fingers, and a few other
extreme points. Let us describe the polygon as a point sequence
$$(s_h) = (P_0, P_1, \ldots, P_n, P_{n+1}),$$
where $P_1, \ldots, P_n$ are the polygon corners in clockwise order,
$P_0 = P_n$, and $P_{n+1} = P_1$. Then the method selects the
extreme points by
Fig. 4. Depth image of a user with a mug in the foreground (a), the hand
candidate point mask of (a) (b), and the final mask (c). (The figures contain
only the relevant parts of the image and masks.)
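The coarse polygon approximation of the hand contour used for fingertip candidates can be illustrated with a Ramer-Douglas-Peucker simplification. The text does not name the approximation algorithm, so this choice is an assumption, as are the function names; the sketch also treats the contour as an open polyline, whereas a real hand contour is closed.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _point_segment_distance(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    # Project p onto the segment and clamp to its end points.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def simplify_polyline(points: Sequence[Point], epsilon: float) -> List[Point]:
    """Keep only vertices that deviate more than epsilon from the chord
    between the end points, recursively (Ramer-Douglas-Peucker)."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    distances = [_point_segment_distance(p, first, last) for p in points[1:-1]]
    max_dist = max(distances)
    if max_dist <= epsilon:
        return [first, last]
    split = distances.index(max_dist) + 1  # index of the farthest vertex
    left = simplify_polyline(points[:split + 1], epsilon)
    right = simplify_polyline(points[split:], epsilon)
    return left[:-1] + right  # drop the duplicated split vertex
```

A larger epsilon removes more vertices, which matches the intent of keeping only fingertips, valleys, and a few other extreme points.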
C. Fingertip detection
1) Preparation of the hand mask: If the hand mask (H)
is made up of two or more components, in this step we
connect the separated ones to the hand blob ($s_h$). Figure
4c suggests connecting the two components at their closest
points, which may also belong to the contour of other fingers.
To avoid this error, we connect the segments $s$ and $s_h$ at the
points $P$ and $Q$ defined by the following formula:
$$\arg\min_{P \in \ldots(s_h),\; Q \in s} \|P, Q\|$$
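The arg min over point pairs can be sketched with a brute-force search. This is an illustration only: the function name `closest_pair` is hypothetical, and the candidate set for $P$ is passed in as an argument because the restriction applied to the hand blob is not recoverable from the text.

```python
import math
from typing import Iterable, Tuple

Point = Tuple[float, float]


def closest_pair(hand_candidates: Iterable[Point],
                 segment_points: Iterable[Point]) -> Tuple[Point, Point]:
    """Return the pair (P, Q) with the smallest Euclidean distance,
    with P taken from hand_candidates and Q from segment_points."""
    segment = list(segment_points)
    best = None
    best_dist = math.inf
    for p in hand_candidates:
        for q in segment:
            d = math.dist(p, q)
            if d < best_dist:
                best_dist = d
                best = (p, q)
    if best is None:
        raise ValueError("both point sets must be non-empty")
    return best
```

The search is quadratic in the number of points, which is acceptable for short contours; a spatial index would be the natural replacement for larger inputs.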
EXPERIMENTS
VI.
The publication was supported by the TAMOP-4.2.2.C-11/1/KONV-2012-0001
project. The project has been supported by the European Union,
co-financed by the European Social Fund.
First-time testers' control access times were 6.4 times
slower, while advanced users were only 2.4 times slower with
gestures than with the traditional mouse. The track bar was
the least popular control, although all the false selections
occurred on the scrollable list box.
TABLE I.

          Gesture-based mouse          Computer mouse
radius    speed (px/ms)   in/out       speed (px/ms)   in/out
          0.09            51           0.25            65
10        0.10            67           0.34            84
15        0.12            64           0.36            53
20        0.14            81           0.31            63
25        0.16            72           0.41            66
30        0.15            71           0.33            79
CONCLUSION
Fig. 7. The results of the GUI test: median of the elapsed time between the appearance of the controls and the successful click.