Gesture-Based Computer Mouse Using Kinect Sensor

CogInfoCom 2014 5th IEEE International Conference on Cognitive Infocommunications November 5-7, 2014, Vietri sul Mare, Italy

Szilvia Szeghalmy, Marianna Zichar, Attila Fazekas
Department of Computer Graphics and Image Processing
University of Debrecen
H-4010 Debrecen Pf. 12
Email: szeghalmy.szilvia@inf.unideb.hu, zichar.marianna@inf.unideb.hu, attila.fazekas@inf.unideb.hu
Abstract: This paper introduces a vision-based gesture mouse system that is largely independent of lighting conditions, because it uses only depth data for hand sign recognition. A Kinect sensor was used to develop the system, but other depth-sensing cameras are adequate as well, provided their resolution is similar to or better than that of the Kinect sensor. Our aim was to find a comfortable, user-friendly solution that can be used for a long time without fatigue. The system was implemented in C++, and two types of tests were performed. We investigated how fast the user can position the cursor and click on objects, and we examined which controls of graphical user interfaces (GUIs) are easy to use and which are difficult to use with our gesture mouse. Our system is precise enough to operate most elements of a traditional GUI efficiently, such as buttons, icons, and scrollbars. The accuracy achieved by advanced users is only slightly below what they achieve with a traditional mouse.
978-1-4799-7280-7/14/$31.00 © 2014 IEEE

I. INTRODUCTION

In Human-Computer Interaction (HCI), the mouse is still one of the most commonly used input devices. Its great benefit is that it lets users control all kinds of applications that have a GUI. However, this device cannot be used in several situations in public areas and/or by people with disabilities. Some gesture-based systems take over control of the mouse pointer and mouse events to solve this problem. This technique is rather popular among head- and eye-mouse systems [2], [18], but it has also appeared in some touch-free medical applications [4], [20] and, of course, in the computer gaming world [8], [13].

According to the definition of a new research field called cognitive infocommunications [22], HCI applications may be classified as CogInfoCom applications. This holds especially for the current research, because it has links with both infocommunications and the cognitive sciences [24]. Specifically, our gesture-based computer mouse belongs to inter-cognitive communication.

Our aim was to develop a cheap, vision-based, application-independent mouse control system. User convenience was an important aspect, as was keeping the number of gestures to memorize small. Our solution also gives the user some freedom in how gestures are presented.

II. RELATED WORK

Hand and fingertip detection and gesture recognition methods have been studied for several decades; methods designed for depth sensors are now in focus. Depth data are frequently used only for extracting hand pixels. If the hand is properly segmented, gestures can be recognised from the shape of the hand contour and other geometric features. Ren et al. proposed a new contour matching method in [16] that uses the series of relative distances between the contour points and the hand centre. They achieved 86-100% accuracy on their own challenging 10-gesture dataset. Klompmaker et al. developed an interaction framework for touch detection and object interaction [5]. The fingertips are detected as vertices of the polygon approximating the hand contour. Yeo et al. present a similar method, but they compute more features and give several criteria for classifying a polygon vertex as a fingertip [21]. The authors in [17] used a convex shape decomposition method combined with skeleton extraction to detect fingertips and recognise the gesture. Their method's accuracy is between 94% and 97% on Ren's dataset. Detecting half-closed and closed fingers requires other approaches, such as geodesic-maxima-based fingertip detection [6] or 3D model fitting [7], [12], but their computational costs are high.

III. OVERVIEW OF THE SYSTEM

Our vision-based mouse control system relies only on depth data, so lighting conditions hardly influence it. The depth data are provided by a depth sensor, namely the Kinect [11].

A. Arrangement of device for sitting position

The best arrangement is when the sensor is slightly below the screen. The sensor should be parallel to the monitor, but it may also be tilted slightly upwards towards the user. The user sits in front of the screen in a position where he can hold his hand inside the sensor's field of view (Figure 1).

Fig. 1. Arrangement of devices, where K denotes a depth sensor [19]. (Figure label: ~80 cm.)
B. Controlling

The user can move the cursor in a joystick-like way. If his hand or finger is in a vertical position, inclined slightly forward, the cursor does not move; otherwise the cursor keeps moving in the direction in which the hand is pointing. If the user tips his hand to the left or right, the cursor goes left or right. If he tilts it downward or upward, the cursor goes downward or upward as well. Each movement lasts until the hand starts to move in another direction.

The user can trigger the following events:

- Start and Stop: Gesture-based control starts with a hand wave (Figure 2a) and stops when the user extends all fingers (Figure 2e).
- Move cursor: The user can move the cursor in the way described above. The Move gesture (Figure 2b) is a good posture because the click gesture is easy to form from it. Users can also control the cursor with the index finger, or with an open palm if the fingers are held together.
- Single click event with left button: Initially, the hand is in the move posture; then the user extends his thumb and closes it again (Figure 2c).
- Double click event: Two single click gestures in sequence within about a second cause a double click.
- Single click event with right button: Initially, the hand is in the move posture; then the user opens the index and ring fingers and closes them again (Figure 2d).
- Hold button down: Initially, the hand is in the move posture; then the user presents the left button or the right button sign and holds his hand in this posture. After a few frames, the cursor can move as well.

Fig. 2. Start (a), Move (b), Left button (c), Right button (d), Stop (e).

C. Constraints

Let us consider a coordinate system where the X- and Y-axes are parallel to the X- and Y-axes of the screen, and the Z-axis is used to measure depth. It is assumed that the absolute value of the angle between the user's hand and the Y-axis is less than 90°, although this constraint is used only by the cursor-moving steps. The other constraints come from the features of the sensor and the arrangement: the user has to sit far enough from the device, the rotation of the hand around the X- and Y-axes should be less than 30° and 20°, respectively, and the hand must be a foreground object. These restrictions correspond to those reported by other authors. Methods tolerate rotation of the hand around the Z-axis far better, because the gestures remain parallel to the image plane [15].

IV. METHOD AND IMPLEMENTATION DETAILS

The main steps of our system are presented in Figure 3. In the following sections we describe these steps in detail.

Fig. 3. Main steps of our system.

A. Depth data and preprocessing

The Kinect sensor measures the distance of objects from the sensor with its infrared projector and infrared camera. We use the OpenNI2 SDK [13] and the NiTE Middleware Libraries [14] to extract the depth image at 640×480 resolution and to track the user's hands. We assigned a predefined gesture (wave) to an instance of the NiTE HandTracker class. Hand tracking starts when the given gesture is recognised. If tracking works and finds at least one hand, we retrieve the centre point of the hand nearest to the sensor, the depth image, and the corresponding real-world coordinates (the Kinect space, in Microsoft's terminology). The depth data are usually quite noisy, so we apply a 3×3 median filter to the depth image.

In this paper, $P = (p_x, p_y)$ denotes a pixel, $p_z$ denotes the depth value of $P$, and $c(P) = (c(P)_x, c(P)_y, c(P)_z)$ denotes the world coordinates corresponding to $P$.

B. Hand points detection

Many solutions consider a big (or the biggest) connected foreground component to be the hand [6], [10], [17], [21], but in the depth image the hand points often belong to multiple $N_8$-connected segments because of self-occlusion of the hand and fingers. In our experience, these small parts around the biggest segment are almost always fingertips. Thus, they play an important role in hand sign recognition.

In our previous work [19], we defined the hand points using the part of a sphere around the known hand centre. Since the hand is in the foreground during the control period, the method works well in most cases. We now complement the algorithm with a filtering step to make it more robust.

1) Determining the hand candidate points: First, we create a mask image (Figure 4b) by the following formula, where 1 denotes the hand candidate points, 2 denotes the arm candidate points, and 0 identifies the background:

$$H_c(P) = \begin{cases} 1, & \text{if } |p_z - p^h_z| < 2r \text{ and } \lVert c(P), c(P^h) \rVert < r, \\ 2, & \text{if } |p_z - p^h_z| < 2r \text{ and } r \le \lVert c(P), c(P^h) \rVert < r_2, \\ 0, & \text{otherwise,} \end{cases}$$

where $P^h$ is the centre point of the hand and $\lVert \cdot, \cdot \rVert$ denotes the Euclidean distance. The radii $r$ and $r_2$ are given ($r < r_2$). We set $r$ to 14 cm and $r_2$ to 17 cm based on a study of hand length and the great span (the distance between the extended thumb and little finger) [1].

2) Removing non-hand part objects: Disturbing objects are usually the user's other (non-control) hand and/or other objects on the table. When the hand is close to other objects, their points can become hand candidate and also arm candidate points. A whole object almost never becomes a hand candidate, because it is usually on the table while the hand is in the air. Therefore, we delete all components that contain arm candidate points and do not contain the hand centre point (Figure 4c). Our algorithm consists of the following steps.

1) Create the binary version of $H_c$, where the arm and hand candidate pixels form the foreground and the zero-valued pixels form the background.
2) Label the $N_8$-connected components ($C$) of this binary image. Let $S_c$ denote the set of clear hand components,
$$S_c = \{ s \mid s \in C \text{ and } \forall P \in s: H_c(P) = 1 \},$$
and let $S_m$ denote the other foreground components,
$$S_m = \{ s \mid s \in C \text{ and } s \notin S_c \text{ and } \forall P \in s: H_c(P) \ne 0 \}.$$
3) Find $s_h \in S_c \cup S_m$ containing the centre of the hand.
4) Use the following formula to determine the hand mask:
$$H(P) = \begin{cases} H_c(P), & \text{if } P \in s_h \text{ or } \exists s: s \in S_c \wedge P \in s, \\ 0, & \text{otherwise.} \end{cases}$$
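
As an illustration, the per-pixel rule behind the hand candidate mask $H_c$ of Section IV-B could be sketched in C++ as follows. This is only a sketch under stated assumptions, not the authors' implementation: the names `Point3`, `Label`, `distance3`, and `classifyCandidate` are hypothetical, all lengths are taken to be in centimetres, and the depth condition is read as an absolute difference between the pixel depth and the hand-centre depth.

```cpp
#include <cassert>
#include <cmath>

// World-space point; centimetres are an assumption of this sketch.
struct Point3 { double x, y, z; };

// Mask values as in the definition of H_c.
enum Label { Background = 0, HandCandidate = 1, ArmCandidate = 2 };

// Euclidean distance between two world-space points.
double distance3(const Point3& a, const Point3& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Classifies one pixel.
// pz, phz: depth of the pixel and of the hand centre.
// c, ch:   their world coordinates.
// r, r2:   inner and outer radius (14 cm and 17 cm in the text).
Label classifyCandidate(double pz, double phz,
                        const Point3& c, const Point3& ch,
                        double r = 14.0, double r2 = 17.0) {
    assert(r < r2);
    if (std::fabs(pz - phz) >= 2.0 * r) return Background;  // too far in depth
    const double d = distance3(c, ch);
    if (d < r) return HandCandidate;   // inside the inner sphere
    if (d < r2) return ArmCandidate;   // in the shell between r and r2
    return Background;
}
```

With the hand centre at depth 80, for instance, a point 15 cm away from the centre at the same depth lies between $r$ and $r_2$ and would be labelled as an arm candidate. Applying the function to every pixel yields the mask on which the connected-component filtering then operates.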
Fig. 4. Depth image of a user with a mug in the foreground (a), hand candidate mask of (a) (b), the final mask (c). (The figures contain only the relevant parts of the image and masks.)

C. Fingertip detection

1) Preparation of the hand mask: If the hand mask ($H$) is made up of two or more components, in this step we connect the separated ones to the hand blob ($s_h$). Figure 4c suggests connecting the two components at their closest points, but these may also belong to the contour of other fingers. To avoid this error, we connect the segments $s$ and $s_h$ at the points $P$ and $Q$ defined by

$$(P, Q) = \arg\min_{P \in \mathrm{poly}(s_h),\; Q \in s} \lVert P, Q \rVert,$$

where $\lVert \cdot, \cdot \rVert$ is the Euclidean distance and $\mathrm{poly}(s_h)$ denotes the corners of the polygon approximating the contour of $s_h$. The approximation is performed by the Douglas-Peucker method [3] with $\varepsilon = 10$. Finally, we draw a wide line between $P$ and $Q$ on the hand mask (Figure 5).

Fig. 5. Connection of separated hand segments: original mask (a), contour approximation (b), connected mask (c).

2) Fingertips detection of extended fingers: To detect fingertip candidates, a well-known shape-based method [9] is applied with a slight modification. First, the contour of the hand ($s_h$) is approximated by a polygon more coarsely ($\varepsilon = 15$), so the resulting polygon contains only a few vertices: fingertips, valleys between fingers, and a few other extreme points. Let us describe the polygon as a point sequence

$$\mathrm{poly}(s_h) = (P_0, P_1, \ldots, P_n, P_{n+1}),$$

where $P_1, \ldots, P_n$ are the polygon corners in clockwise order, $P_0 = P_n$, and $P_{n+1} = P_1$. Then the method selects the extreme points by

$$F_e = \{ P_i \mid P_i \in \mathrm{poly}(s_h) \cap \mathrm{hull}(s_h) \},$$

where $\mathrm{hull}(s_h)$ is the set of convex hull corners of the hand component (Figure 6a).

The next step is to classify the elements of $F_e$ into finger candidate (Figure 6b) and non-finger classes, based on the distance between the point and the hand centre and on the angle at the fingertip. The original method calculates the midpoint of two points located at a given distance before and after the candidate point along the contour. If the distance between the candidate and the midpoint is larger than a given limit, the candidate is a fingertip. We want to be more permissive, because we need to detect as a fingertip not only a single finger but also fingers held close together, as in our Move sign (Figure 2b). Thus, we check the angle of the polygon at this corner. More precisely, the set of candidates is defined by

$$F_c = \{ P_i \in F_e \mid \alpha(P_i) < 90^\circ \text{ and } \lVert c(P_i), c(P^h) \rVert < 50 \},$$

where $\alpha(P_i)$ is the angle between $\overrightarrow{P_i P_{i+1}}$ and $\overrightarrow{P_i P_{i-1}}$.

Fingertip detection methods often give false detections around the wrist. The authors in [6] define an ellipse around the wrist points and penalize a candidate if the path between the hand centre and the point contains ellipse points. Li [9] proposed detecting the bounding box of the hand and removing candidates that fall in its lower region. Our hand segmentation method reliably detects part of the forearm (except when the hand totally hides the forearm region), so we can easily eliminate the points located too close to the wrist region. The final set of fingertips (Figure 6c) is defined by

$$F = \{ P_i \in F_c \mid \forall P: H_c(P) = 2 \Rightarrow \lVert c(P_i), c(P) \rVert > 50 \}.$$
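
The angle criterion used for the fingertip candidates can be illustrated with a short C++ sketch. It covers only the corner-angle test on a closed polygon; the intersection with the convex hull, the distance to the hand centre, and the wrist filtering are left out. The names `Point2`, `cornerAngleDeg`, and `sharpCorners` are hypothetical, and the code is an assumption-laden illustration rather than the authors' implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Image-plane point (pixel coordinates).
struct Point2 { double x, y; };

// Angle in degrees at corner `cur` between the vectors cur->next and cur->prev.
double cornerAngleDeg(const Point2& prev, const Point2& cur, const Point2& next) {
    const double kPi = std::acos(-1.0);
    const double ax = next.x - cur.x, ay = next.y - cur.y;
    const double bx = prev.x - cur.x, by = prev.y - cur.y;
    const double dot = ax * bx + ay * by;
    const double cross = ax * by - ay * bx;
    // atan2 of |cross| and dot gives the unsigned angle in [0, 180] degrees.
    return std::atan2(std::fabs(cross), dot) * 180.0 / kPi;
}

// Returns the indices of the polygon corners whose angle is below maxAngleDeg.
// The polygon is treated as closed: the neighbours of the first and last
// corner wrap around.
std::vector<std::size_t> sharpCorners(const std::vector<Point2>& poly,
                                      double maxAngleDeg = 90.0) {
    std::vector<std::size_t> result;
    const std::size_t n = poly.size();
    if (n < 3) return result;  // no meaningful corners
    for (std::size_t i = 0; i < n; ++i) {
        const Point2& prev = poly[(i + n - 1) % n];
        const Point2& next = poly[(i + 1) % n];
        if (cornerAngleDeg(prev, poly[i], next) < maxAngleDeg) {
            result.push_back(i);
        }
    }
    return result;
}
```

On a five-corner outline with one tall, narrow spike and four obtuse corners, only the spike passes the test, which mirrors how a single extended finger (or several fingers held together) stands out from the rest of the approximating polygon.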
Fig. 6. Common points of approximating polygon (gray line) and convex hull (a), fingertip candidates (b), detected fingertips (grey circle) and removed candidates (black rectangle) (c).

D. Thumb and index finger recognition

If we have found a fingertip in the previous step, we need to recognise the thumb and the index finger. First, we compute the orientation of the hand mask ($H = 1$ or $H = 2$) with PCA. Then we rotate the fingertip points ($F$) clockwise around the hand centre by the angle between the eigenvector of the largest eigenvalue and the vertical axis. The fingertips are then sorted in increasing order of their x coordinate. Let $F'_1, \ldots, F'_n$ denote the rotated, sorted fingertips. Labelling steps:

1) Set the label of $F'_1$ (the leftmost fingertip) to thumb if the angle of the line between the hand centre and $F'_1$ from the Y-axis is larger than 60°.
2) Otherwise, set the label of $F'_1$ to index if the angle of the line between the hand centre and $F'_1$ from the Y-axis is larger than -20°.
3) Otherwise, if $F'_2$ exists, set its label to index if the angle of the line between the hand centre and $F'_2$ from the Y-axis is larger than -20°.

E. Control sign recognition

A simple rule-based solution can recognise the sign, since the necessary data (the number of fingertips and their labels) are already available.

- Move sign: only one fingertip is detected, and it is the index finger.
- Left button sign: two fingertips are detected, one is the index and the other is the thumb, and the distance between these fingers is larger than 6 cm.
- Right button sign: two fingertips are detected; one is the index, the other is not the thumb. The distance between the fingers is between 2.5 cm and 7 cm.
- Stop sign: the number of fingertips equals 5.
- Start sign: this gesture is pre-built in NiTE.

F. The 3D orientation of hand and fingers

To realize the joystick-like movement of the cursor, we need the 3D orientation, but the previous steps computed only a 2D one. If the fingers are almost parallel to the plane of the sensor, the orientation can be calculated easily; otherwise, it is easy to get a wrong result. The orientation of the full hand can be determined more precisely, but moving the whole hand requires more effort from the user. We resolved this trade-off by applying Weighted Principal Component Analysis (WPCA).

The orientation of the whole hand and forearm point cloud is computed with WPCA only if the index finger was found in the previous steps. The weights of the points are determined by their distance from the index finger: the weight at the index finger is five, and it decreases by one for every three centimetres until it reaches one.

G. Events

In a real-time control system, even one misclassification out of 100 can drive the user mad. A state machine can radically reduce this problem by ignoring most of the unexpected signs. For the sake of clarity, the figure does not show the stop sign and the unexpected events. The stop sign sets the machine state to Stopped immediately. A sequence of unexpected signs takes the machine to the Wait state from every state except Stopped.

Initially, the system is in the Stopped state, waiting for the start signal to go to the Alarm state. From this state, the system can be woken up easily with the move sign. These two states help us avoid involuntary movements. The Stopped state means that the user wants to set the cursor aside for the medium or long term, while the Alarm state is used when the user needs only a momentary pause.

Basically, the move sign makes the cursor move, but it is also possible to move the cursor while the mouse button is down. In the second case, the cursor does not start moving right away after the button down event, which prevents the cursor from moving while clicking.

The waiting cycle in the Left button up event state ensures that the cursor does not move if one left click is followed by another within about a second.

Since the click gesture is slower than traditional mouse clicking, we increase the double-click time of the operating system during the Alarm state. Here, we reduce the speed of the cursor as well. Naturally, we restore the original values when the user stops the system.
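
The rule-based sign recognition of Section IV-E, whose output is what the state machine consumes, can be sketched in C++ as follows. The thresholds (6 cm, 2.5 cm, 7 cm, five fingertips) come from the text; everything else is an assumption made for illustration: the names `Finger`, `Sign`, `Fingertip`, `tipDistance`, and `classifySign` are hypothetical, coordinates are taken to be world coordinates in centimetres, and the Start sign is omitted because it is handled by NiTE.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

enum class Finger { Thumb, Index, Other };
enum class Sign { Move, LeftButton, RightButton, Stop, Unknown };

// Labelled fingertip with world coordinates (centimetres assumed).
struct Fingertip {
    Finger label;
    double x, y, z;
};

// Euclidean distance between two fingertips.
double tipDistance(const Fingertip& a, const Fingertip& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}

// Maps the detected, labelled fingertips of one frame to a control sign.
Sign classifySign(const std::vector<Fingertip>& tips) {
    if (tips.size() == 5) return Sign::Stop;
    if (tips.size() == 1) {
        return tips[0].label == Finger::Index ? Sign::Move : Sign::Unknown;
    }
    if (tips.size() == 2) {
        const bool hasIndex = tips[0].label == Finger::Index ||
                              tips[1].label == Finger::Index;
        const bool hasThumb = tips[0].label == Finger::Thumb ||
                              tips[1].label == Finger::Thumb;
        if (!hasIndex) return Sign::Unknown;
        const double d = tipDistance(tips[0], tips[1]);
        if (hasThumb) {
            // Index plus thumb: left button if the fingers are far enough apart.
            return d > 6.0 ? Sign::LeftButton : Sign::Unknown;
        }
        // Index plus a non-thumb finger: right button within the distance band.
        return (d > 2.5 && d < 7.0) ? Sign::RightButton : Sign::Unknown;
    }
    return Sign::Unknown;  // any other fingertip count is unexpected
}
```

Frames classified as `Sign::Unknown` in this sketch correspond to the unexpected signs that the state machine is meant to absorb, so an occasional misdetection need not produce a cursor event.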
V. EXPERIMENTS

The most common evaluation measures for pointing devices are speed and accuracy. Because this section reports only the first experimental results, we do not consider other types of accuracy measures [23].

A. Speed test

We have developed a very simple application to measure how fast the user can position the cursor and click on an object. It is a simple panel on which a circle appears. The tester has to click on the circle as fast as he can. After each click, the size and position of the circle change randomly (even if the user clicks in the wrong place).

B. Graphical User Interface test

First several buttons (each requiring a single left-button click) and then several icons (each requiring a double click) appear on the screen, and the user has to click on the yellow one. The icons and the buttons appear in descending order of size (128×128, 64×64, 100×30, 32×32, 24×24, 16×16, 12×12). Then the marked item or value of different types of controls (list box, list box with scroll bar, combo box, radio button, check box, and track bar) has to be selected.

C. The Scenario

The positions of the screen and the Kinect were fixed (Figure 1). The users were asked to sit comfortably and could keep their elbows on the armrest during the test. Some figures and short texts helped the users learn the functions of the gesture mouse. Below the description there were some controls, each dedicated to a particular function. Users could try some of these elements, so the learning process was short (1-2 minutes) but interactive.

After that, the users did the GUI and speed tests with the gesture-based mouse and with the traditional computer mouse as well. (The order of the control devices always changed.) Finally, they were asked to rate the comfort of the controls and to share their experience.

Users could solve some exercises in different ways. Take a scrollable list box, for example: the tester could click the scroll bar several times to scroll down the list, or he could drag the thumb of the scroll bar down.

Nine volunteers performed the test; two of them were advanced users who had used the system on a number of occasions.
D. Results
First-time testers' control access times were 6.4 times slower, and the advanced users were only 2.4 times slower, with gestures than with the traditional mouse. The track bar was the least popular control, although all the false selections occurred with the scrollable list box.

Table I presents the results of the advanced users' speed tests. Figure 7 presents the outcome of the GUI tests.

TABLE I. The speed test results of two advanced users. The first column shows the radius of circles. The in and out columns contain the number of clicks inside and outside the circles.

radius | gesture-based mouse: speed (px/ms) | in | out | computer mouse: speed (px/ms) | in | out
[missing] | 0.09 | 51 | [missing] | 0.25 | 65 | [missing]
10 | 0.10 | 67 | [missing] | 0.34 | 84 | [missing]
15 | 0.12 | 64 | [missing] | 0.36 | 53 | [missing]
20 | 0.14 | 81 | [missing] | 0.31 | 63 | [missing]
25 | 0.16 | 72 | [missing] | 0.41 | 66 | [missing]
30 | 0.15 | 71 | [missing] | 0.33 | 79 | [missing]

Fig. 7. The results of the GUI test: median of the elapsed time between the appearance of the controls and the successful click.

VI. CONCLUSION

We have developed a vision-based gesture mouse system. The proposed method uses only depth data, so the system works even at night. We successfully combined a well-known shape-based method and a simple extrema-based solution to detect the fingertips needed for control. Based on the reliable detection of the hand points and the wrist-forearm region, we can filter out false fingertip candidates very effectively. In the on-line system some false detections always occur, but the state machine ensures the robustness of our system.

We performed tests to examine how easy it is to use common GUI elements with our system and how quickly users can click on certain objects on the screen. Each tester was able to handle even the smallest elements (12×12-pixel buttons), but most first-time testers found only the larger ones, at least 32×32 pixels, easy to use. Users said that controlling by gestures is a bit weird initially, but it becomes easy once you get the hang of it. Some practice resulted in much faster control. The advanced users reached slightly lower accuracy with the gesture-based system than with the traditional mouse (97.60% and 98.56% on the speed test).

ACKNOWLEDGEMENT

The publication was supported by the TAMOP-4.2.2.C-11/1/KONV-2012-0001 project. The project has been supported by the European Union, co-financed by the European Social Fund.

REFERENCES

[1] A. K. Agnihotri, B. Purwar, N. Jeebun, S. Agnihotri, Determination Of Sex By Hand Dimensions, The Internet Journal of Forensic Science, 1(2), (2006)
[2] M. Betke, J. Gips, P. Fleming, The camera mouse: visual tracking of body features to provide computer access for people with severe disabilities, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 10(1), pp. 1-10. (2002)
[3] D. Douglas, T. Peucker, Algorithms for the reduction of the number of points required to represent a digitized line or its caricature, The Canadian Cartographer, 10(2), pp. 112-122. (1973)
[4] C. Graetzel, T. Fong, S. Grange, C. Baur, A non-contact mouse for surgeon-computer interaction, Technology and Health Care, 12(3), pp. 245-257. (2004)
[5] F. Klompmaker, K. Nebe, A. Fast, dSensingNI: a framework for advanced tangible interaction using a depth camera, in Proceedings of the Sixth International Conference on Tangible, Embedded and Embodied Interaction, ACM, pp. 217-224. (2012)
[6] P. Krejov, R. Bowden, Multi-touchless: Real-time fingertip detection and tracking using geodesic maxima, in Proceedings of the 10th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), IEEE, pp. 1-7. (2013)
[7] N. Kyriazis, I. Oikonomidis, A. Argyros, A GPU-powered computational framework for efficient 3D model-based vision, ICS-FORTH, TR420, (2011)
[8] P. D. Le, V. H. Nguyen, Remote Mouse Control Using Fingertip Tracking Technique, in AETA 2013: Recent Advances in Electrical Engineering and Related Sciences, Springer Berlin Heidelberg, pp. 467-476. (2014)
[9] Y. Li, Hand gesture recognition using Kinect, in Proceedings of the 3rd International Conference on Software Engineering and Service Science (ICSESS), IEEE, pp. 196-199. (2012)
[10] H. Liang, J. Yuan, D. Thalmann, 3D fingertip and palm tracking in depth image sequences, in Proceedings of the 20th ACM International Conference on Multimedia, pp. 785-788. (2012)
[11] Microsoft Corporation (2012), MS Developer Network - Kinect Sensor, URL http://msdn.microsoft.com/en-us/library/hh438998.aspx, Accessed 26 August 2013
[12] I. Oikonomidis, N. Kyriazis, A. Argyros, Efficient model-based 3D tracking of hand articulations using Kinect, in British Machine Vision Conference, (2011)
[13] OpenNI (2013), OpenNI 2.0 Software Development Kit, URL http://www.openni.org/openni-sdk/, Accessed 25 July 2013
[14] PrimeSense (2013), Natural Interaction Middleware libraries version 2.2, URL http://www.primesense.com/solutions/nite-middleware/, Accessed 25 July 2013
[15] A. R. Sarkar, G. Sanyal, S. Majumder, Hand Gesture Recognition Systems: A Survey, International Journal of Computer Applications, 71(15), (2013)
[16] Z. Ren, J. Yuan, Z. Zhang, Robust hand gesture recognition based on finger-earth mover's distance with a commodity depth camera, in Proceedings of the 19th ACM International Conference on Multimedia, ACM, pp. 1093-1096. (2011)
[17] S. Qin, X. Zhu, Y. Yang, Y. Jiang, Real-time hand gesture recognition from depth images using convex shape decomposition method, Journal of Signal Processing Systems, 74(1), pp. 47-58. (2014)
[18] S.-P. Kim, J. D. Simeral, L. R. Hochberg, J. P. Donoghue, G. M. Friehs, M. J. Black, Point-and-Click Cursor Control With an Intracortical Neural Interface System by Humans With Tetraplegia, IEEE Transactions on Neural Systems and Rehabilitation Engineering, 19(2), pp. 193-203. (2011)
[19] Sz. Szeghalmy, M. Zichar, A. Fazekas, Comfortable mouse control using 3D depth sensor, in IEEE 4th International Conference on Cognitive Infocommunications, pp. 219-222. (2013)
[20] J. P. Wachs, M. Kölsch, H. Stern, Y. Edan, Vision-based hand-gesture applications, Communications of the ACM, 54(2), pp. 60-71. (2011)
[21] H. S. Yeo, B. G. Lee, H. Lim, Hand tracking and gesture recognition system for human-computer interaction using low-cost hardware, Multimedia Tools and Applications, pp. 1-29. (2013)
[22] P. Baranyi, A. Csapo, Definition and Synergies of Cognitive Infocommunications, Acta Polytechnica Hungarica, vol. 9, pp. 67-83. (2012)
[23] I. S. MacKenzie, T. Kauppinen, M. Silfverberg, Accuracy measures for evaluating computer pointing devices, in Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, ACM, pp. 9-16. (2001)
[24] G. Sallai, The Cradle of Cognitive Infocommunications, Acta Polytechnica Hungarica, vol. 9, no. 1, pp. 171-181. (2012)
