
VII Latin American Symposium on Circuits and Systems (LASCAS) 2016

Telepresence Using the Kinect Sensor and the NAO Robot
Jose Avalos, Student Member, IEEE, Sergio Cortez, Student Member, IEEE, Karina Vasquez, Student Member, IEEE,
Victor Murray, Senior Member, IEEE, and Oscar E. Ramos, Member, IEEE
Department of Electrical Engineering, Universidad de Ingenieria y Tecnologia - UTEC, Lima, Peru.
Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, New Mexico, USA.

Abstract—Several applications require that a telepresence system does not only transmit images, audio and video but also real-time motion. This work presents the implementation of a system that allows for telepresence using a humanoid robot and a motion capture device. The proposed methodology for motion imitation is fast and uses a low-cost motion acquisition sensor. The objective is to make the robot reproduce the motion of a person. In this way, remote actions can be executed through the robot, which also sends images and audio from the environment. The method has been applied to the humanoid robot called NAO, which has been able to reproduce human motions. This framework can also be used for different education purposes.

Keywords—NAO robot, Kinect sensor, robotic kinematics, telepresence

Fig. 1: Example of robot teleoperation. The real and simulated NAO robot follows the movements of the person.

I. INTRODUCTION

Telepresence is a resource that allows people to be connected from any place in order to virtually accomplish a specific task. An example is a videoconference in which both visual and audio channels are enabled to recreate a certain scenario. However, the real target of telepresence is to give the user control of all his faculties in any place at any time. This will improve communication and, as a result, it can be applied to any educational field.

Different methods have been applied and analyzed to achieve telepresence through the usage of robots. Suleiman et al. [1] present human motion imitation by a human-size humanoid robot considering inverse dynamics, where the physical properties of the robot constitute the main problem. The method by Cela et al. [2] is a low-cost system that consists in the use of accelerometers attached to the human body to detect the human joint motion; however, this method is limited to simple robots due to imprecise data. On the other hand, it is also possible to use visual sensors to acquire the human motion without needing to attach any special device to the person, as in [3], where a robot with eyes capable of capturing motion is presented. On this front, the Kinect sensor has also been used to directly detect human body joints [4] and to use the point cloud data, with some further processing, in a virtual environment [5]. In all these cases, the data recovered from the motion capture system is processed to recover the motion and is then sent to the robot. An important issue yet to be solved is the time needed to process the motion and to determine the most adequate joint configuration for the robot, which is usually not generated in real time due to computational complexity.

In this work, we propose a methodology to achieve real-time telepresence using a geometric approach for planar motion and an inverse kinematics approach for a complete 3D motion of a humanoid robot. The methods have been validated using both Kinect sensor v1 and v2 [6] controlling the NAO robot [7]. An example of the proposed teleoperation method is shown in Fig. 1, where the robot imitates the motion shown by the demonstrator in real time. The remainder of this paper is organized as follows. In Section II we present the methods to achieve motion imitation, which is crucial for teleoperation. Section III shows the experimental results of the two methods.

II. METHODS

To apply teleoperation to any robot-sensor system, we propose to divide the task into three main parts. The first one consists in moving the robot joints through inverse kinematic operations in the operational space or in the joint configuration space. The second part obtains compatible data from the motion capture device and processes it to make it suitable for motion imitation. The last part consists in creating a bridge between the robot and the motion capture device, which will allow the transmission of data in real time, achieving teleoperation. The process can be geared to work under a two or a three dimensional model.

A. Two dimensional teleoperation motion

1) Robot Motion Control: In this case, a joint-space based control system is more efficient and easier to implement, since the motion constrained to a plane is represented using joint angles. Therefore, the model will directly work in the joint

ISBN 978-1-4673-7835-2/16/$31.00 ©2016 IEEE    IEEE Catalog Number CFP16LAS-ART



space rather than in the Cartesian space. In two dimensions, specific angles relative to the motion of certain limbs will be the data sent to the robot. The program will receive joint angles which will constitute the motion targets.

2) Data acquisition: There are different ways and devices that act as motion capture systems to recover data from the human body [2]-[5]. Another example is a depth sensor such as the Kinect, which can capture specific joints of the human body, in Cartesian coordinates, related to the arms, legs, head and torso. These sensors usually come with an SDK (Software Development Kit) which can be used to transform Cartesian into spherical coordinates to obtain the desired joint angles. The proposed method is based on vector operations.

Fig. 2: Skeleton showing the important angles for the method. α represents the angle between the torso and the forearm. α + β represents the angle at the elbow, between the forearm and the arm.
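As an illustration of these vector operations, the sketch below computes the angle between two limb vectors from Cartesian joint positions using the dot product. This is a non-authoritative example, not the authors' code; the joint coordinates and the torso direction are made-up values.

```python
import numpy as np

def limb_angle(v1, v2):
    """Angle in degrees between two vectors, from the dot product."""
    cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

# Made-up joint positions (meters) in a shoulder-centered frame.
x_s = np.array([0.0, 0.0, 0.0])          # shoulder
x_e = np.array([0.0, -0.25, 0.0])        # elbow
x_h = np.array([0.2, -0.40, 0.0])        # hand
torso_down = np.array([0.0, -1.0, 0.0])  # downward torso direction

v_f = x_e - x_s   # shoulder-to-elbow vector ("forearm" in the text)
v_a = x_h - x_e   # elbow-to-hand vector ("arm" in the text)

alpha = limb_angle(torso_down, v_f)  # angle between torso and v_f
beta = limb_angle(v_f, v_a)          # elbow contribution relative to v_f
```

With these example coordinates the forearm hangs straight down (alpha near 0°) while the hand bends forward at the elbow.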
This method can be extrapolated to other limbs as long as the motion is constrained to a plane.

First, we analyze both arms of the person who will teleoperate the robot. Note that the inertial reference frame is a motionless torso, and local reference frames lie on each shoulder; that is, the position of each arm is represented with respect to its shoulder. The third component of these positions is not important, since we are using a two dimensional model (it is only included for mathematical purposes).

Let the sensor output describing the i-th joint position with respect to the sensor reference frame be represented in Cartesian coordinates as

    x_i = (x_i, y_i, z_i).    (1)

For instance, let the shoulder position be x_sj, the elbow position be x_ej and the hand position be x_hj, where j refers to either the left or right limb. Using these positions, let the vector referred to the forearm, v_fj, and the vector referred to the arm, v_aj, be defined as

    v_fj = x_ej − x_sj    (2)

and

    v_aj = x_hj − x_ej.    (3)

In the two-dimensional case, let the angle between the torso and the forearm be represented by α, as Fig. 2 shows. This angle can range from 0° to 180°, where 0° corresponds to the forearm close to the torso and 180° corresponds to the forearm fully raised. The angle at the elbow, between the forearm and the arm, is α + β, where β is obtained in a similar way as α. Note that the angle α + β ranges from 0° to 360°.

3) Data Transmission: Once the motion capture sensor is acquiring raw data, which is processed as in [4] to obtain the desired joint angles as described in the previous section, they can be sent to the robot. For these data to be transmitted to the robot, a bridge must be implemented to make the system work in real time. The implementation of the bridge depends upon the specific characteristics of the robot and the sensor to be used.

B. Three dimensional teleoperation motion

1) Robot Motion Control: Contrary to the planar motion described in section II-A, a control based on inverse kinematics or differential inverse kinematics is more suitable for a three dimensional motion. In this case, the robot control will be defined with respect to Cartesian coordinates instead of joint angles. Since the motion is now three dimensional, a full teleoperation can be achieved.

Let the robot joint angles be represented by the joint configuration vector q = (q_1, q_2, ..., q_n), where n is the number of degrees of freedom of the robot. Let a Cartesian position be x_i = (x_i, y_i, z_i). Forward kinematics relates the joint configuration to an operational space configuration, such as the Cartesian space, as

    x_i = f_i(q)    (4)

where f_i is the forward kinematics function. It can be seen in (4) that the Cartesian coordinates are a function that depends directly on the joint angles. This is the case of the joint control model described in section II-A. Since we are interested in the joint angles that will achieve a desired Cartesian position, (4) needs to be inverted as

    q = f_i⁻¹(x_i),    (5)

which is referred to as inverse kinematics. Due to the nonlinearities of (4), it is difficult to obtain the inverse map in (5) in a general case. One usual approach to solve this problem is through differential inverse kinematics, which acts as a linearization of the model around the working point. This relation is represented with a differential map which, in this case, corresponds to the Jacobian J_i = ∂f_i/∂q, so that

    ẋ_i = J_i q̇.    (6)

As (6) shows, the Jacobian relates the rate of change of the joint angles to the rate of change of the Cartesian coordinates. The system (6) is linear and needs to be solved for q̇. In the general case [2], this solution is given by

    q̇ = J_i# ẋ_i + (I − J_i# J_i) z,    (7)

where J_i# denotes the Moore-Penrose pseudoinverse of J_i, and z is an arbitrary vector which is projected onto the nullspace of J_i using (I − J_i# J_i). Note that if both q and x have the same size, leading to a square Jacobian, and if J_i is a nonsingular matrix, then the pseudoinverse becomes the usual inverse J_i# = J_i⁻¹. Using Euler integration, the joint velocities obtained with (7) can be transformed into joint positions as

    q_{k+1} = q_k + q̇_k ΔT    (8)
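A differential inverse kinematics step of this form, pseudoinverse solution with a nullspace term and Euler integration as in (7) and (8), can be sketched as follows. This is a minimal illustration using a toy two-link planar arm as a stand-in for a NAO kinematic chain; the gain, time step, and function names are assumptions, not taken from the paper.

```python
import numpy as np

def dik_step(q, x_dot, jacobian, z=None, dt=0.01):
    """One differential-IK step: solve for the joint velocities with the
    pseudoinverse as in (7), then integrate with Euler as in (8)."""
    J = jacobian(q)
    J_pinv = np.linalg.pinv(J)        # Moore-Penrose pseudoinverse J#
    q_dot = J_pinv @ x_dot
    if z is not None:                 # nullspace projection (I - J# J) z
        q_dot += (np.eye(q.shape[0]) - J_pinv @ J) @ z
    return q + dt * q_dot             # Euler integration (8)

# Toy 2-DOF planar arm with unit-length links.
def fk(q):
    """Forward kinematics as in (4): end-effector position."""
    return np.array([np.cos(q[0]) + np.cos(q[0] + q[1]),
                     np.sin(q[0]) + np.sin(q[0] + q[1])])

def jac(q):
    """Analytical Jacobian of fk."""
    s1, s12 = np.sin(q[0]), np.sin(q[0] + q[1])
    c1, c12 = np.cos(q[0]), np.cos(q[0] + q[1])
    return np.array([[-s1 - s12, -s12],
                     [ c1 + c12,  c12]])

q = np.array([0.3, 0.4])
x_des = np.array([1.2, 1.0])          # desired Cartesian position
for _ in range(200):                  # resolved-rate tracking loop
    q = dik_step(q, 5.0 * (x_des - fk(q)), jac)
```

After the loop, fk(q) has converged to the desired position; driving the loop with a velocity proportional to the position error is the usual resolved-rate scheme built on (6)-(8).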




where q_k is the joint configuration at time k. These joint values can then be directly sent to the robot.

2) Data acquisition: This part is similar to its counterpart in the two dimensional model. However, in this case the data processing refers only to the Cartesian coordinates obtained by the motion capture device instead of the joint angles. The constraint to be taken into account is that the positions relative to the human bones need to be normalized. The normalization depends on the specific robot that is used, since different robots have different link lengths.
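This bone normalization can be sketched as rescaling each measured bone vector to the corresponding robot link length while preserving its direction. The sketch below is an assumption-laden illustration: the link lengths in ROBOT_LINKS are hypothetical placeholders, not the actual NAO dimensions.

```python
import numpy as np

# Hypothetical robot link lengths in meters (actual NAO values differ).
ROBOT_LINKS = {"upper_arm": 0.105, "lower_arm": 0.056}

def normalize_bone(p_parent, p_child, robot_length):
    """Rescale a human bone vector so its length matches the robot link,
    keeping the direction measured by the sensor."""
    bone = np.asarray(p_child, dtype=float) - np.asarray(p_parent, dtype=float)
    return np.asarray(p_parent, dtype=float) + robot_length * bone / np.linalg.norm(bone)

# Example: a human elbow measured 0.30 m from the shoulder is projected
# onto the (hypothetical) robot upper-arm length.
shoulder = np.array([0.0, 0.0, 0.0])
elbow_h = np.array([0.0, -0.30, 0.0])
elbow_r = normalize_bone(shoulder, elbow_h, ROBOT_LINKS["upper_arm"])
```

The rescaled elbow position lies along the same direction as the measured one, but at the robot's link length from the shoulder.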
3) Data transmission: The data obtained with the motion capture device needs to be transmitted to the robot using a bridge which depends on the specific hardware. One option in this case is to use ROS (Robot Operating System) as a middleware for interconnecting the system.

III. RESULTS AND DISCUSSION

For the validation of the proposed methodology, we used the Kinect for Windows as a motion capture device. The reason was its very low price compared to other motion capture systems, and the fact that it is open source. This sensor allows the recovery of the data through its SDK. The humanoid robot that we used for the experiments is NAO from Aldebaran Robotics, which resembles a human in its kinematic structure. This allows us to reproduce anthropomorphic movements.

The Kinect for Windows v1 sensor detects depth based on structured light. Combining both depth-by-stereo and depth-by-focus techniques, the Kinect sensor provides a reconstruction of a 3D space. The algorithm used by Microsoft to recognize a person relies on the comparison of the currently obtained data with an internal database using a tree-type algorithm [8]. If the operation is successful, the person is properly recognized and represented as a skeleton. The function that achieves this recovery in the Kinect SDK is called Skeletal Tracking. It provides the position (in Cartesian coordinates) of the main joints of a human being. Contrary to its predecessor, the Kinect v2 is based on the principle of time-of-flight.

A. Results of the two-dimensional telepresence

This method uses the Kinect sensor v1 with its corresponding SDK v1.8, which can only be installed on Windows. The NAO robot is programmed using the interface provided by Aldebaran Robotics called NAOqi. This work uses C++ to program the robot motion.

The joint angular limits of the robot limbs do not correspond to the values used in section II-A2. For example, the joint angle corresponding to the shoulder neither begins at 0° nor ends at 180°, but ranges from −18° to 76° (LShoulderRoll), as Fig. 3 depicts. This figure also shows that the values for RElbowRoll range from 0° to 88.5° instead of the assumed ones. To solve this problem, a linear map relating these values was used. If the robot angular values exceed the joint limits, the robot stops moving due to its internal safety procedure. Another factor that leads to automatically stopping the motion is the nominal torque which appears when sharp movements are sent to NAO. For instance, if a sudden movement is sent to the robot, the program will end the motion execution as a safety measure. Because of this issue, the data from the Kinect are restricted to the robot joint limit values with the purpose of avoiding physical damage and undesired interruptions. This is similar to clamping these values as in [9]. The angles are multiplied by a factor obtained through experimental tests to achieve a better result. Likewise, tests were made to adjust the speed to reproduce the motion, avoiding interruptions due to torque or angular limits.

Fig. 3: NAO's arm design. Image from Aldebaran [10].

B. Results of the three-dimensional telepresence

To generate motion for the robot in three dimensions we used a control system based on inverse kinematics. Since we are interested in reproducing the motion shown by a human demonstrator, it is important to ensure that the reference frames of both the motion sensor and the robot are consistent. In the general case, these reference frames have a different orientation, as clearly presented in [4], where a Kinect v2 and a 7-DOF KUKA robotic arm were used. To solve this problem, it is necessary to use a rotation matrix that can be obtained from a prior calibration process. In our case, contrary to [4], we assume that the chest remains fixed throughout the motion; it is then straightforward to obtain the rotation matrix relating both frames by simple inspection.

The software for robot control has been mainly developed for Linux, since most robots use a Linux-based real-time embedded computer. We used ROS (Robot Operating System) in Linux as a middleware to get the acquired motion from the sensor, and to send the generated joint commands to the robot. Since the Kinect SDK needed to extract the human motion data has been developed only for Windows, we needed to send the data from a machine running Windows to a machine running Linux. This was achieved through a logical (non-physical) serial connection, using the rosserial package, which acted as a bridge between Windows and ROS in Linux. This connection scheme is shown in Fig. 4.

The human motion is then used as input to the control system, where differential inverse kinematics is used to generate the desired joint angles q_desired using (8). The NAOqi Bridge package acts as an interface between ROS and NAOqi, which controls the robot. It takes the desired joint angles that have been published to a ROS topic and sends them to NAOqi.




Fig. 4: Scheme showing the parts of the system used for the 3D motion, from the human motion acquisition (with the Kinect) to the humanoid robot execution.

Fig. 5: Example of the results of the three-dimensional telepresence system under three different positions.

Fig. 6: Comparison of Kinect v1 (red) and v2 (blue) in terms of smoothing in the motion acquisition. The x-axis is time and the y-axis is the position in x.

Similarly, it takes the measured joint angles q_measured coming from NAOqi and feeds them back by publishing them in a ROS topic that the controller can read. NAOqi is the ultimate interface between the robot and the host computer, which is normally used to control the NAO robot. The results of the three dimensional telepresence are shown in Fig. 5. The software we developed for this work is freely available¹.

In our experience, the Kinect v1 presented problems when some joints start overlapping with other joints. To analyze these problems, we chose a movement parallel to the ground where the arm starts fully extended and moves towards the torso. Fig. 6 shows the resulting path: we can see that the v2 maintains a smoother track than the v1. Therefore, we used the Kinect v1 only for the two dimensional approach, but we used the Kinect v2 for the three dimensional motion.

IV. CONCLUSIONS

A method for the human teleoperation of a humanoid robot for motion in two dimensions and three dimensions was presented in this paper. The proposed approach was validated using the NAO robot and the Kinect sensor. There exist different applications for telepresence. In particular, this work aims to use telepresence as a method for long-distance teaching with active participation of the users. Moreover, this work could be used as an interactive tool for teaching children with autism in order to improve their social abilities. This is a first step to achieve full telepresence. Future work will involve audio and video transmission, since the humanoid platform (NAO) possesses these sensors.

¹ https://github.com/utecrobotics/nao_kinect_teleop

REFERENCES

[1] W. Suleiman, E. Yoshida, F. Kanehiro, J.-P. Laumond, and A. Monin, "On human motion imitation by humanoid robot," in IEEE International Conference on Robotics and Automation (ICRA), 2008, pp. 2697-2704.
[2] A. Cela, J. J. Yebes, R. Arroyo, L. M. Bergasa, R. Barea, and E. Lopez, "Complete low-cost implementation of a teleoperated control system for a humanoid robot," Sensors, vol. 13, no. 2, pp. 1385-1401, 2013.
[3] S. Filiatrault and A.-M. Cretu, "Human arm motion imitation by a humanoid robot," in IEEE International Symposium on Robotic and Sensors Environments (ROSE), 2014, pp. 31-36.
[4] Y. Yang, H. Yan, M. Dehghan, and M. H. Ang, "Real-time human-robot interaction in complex environment using kinect v2 image recognition," in IEEE International Conference on Cybernetics and Intelligent Systems (CIS), 2015, pp. 112-117.
[5] H. Hisahara, S. Hane, H. Takemura, Y. Chin, T. Ogitsu, and H. Mizoguchi, "3D point cloud-based virtual environment for safe testing of robot control programs," in International Conference on Intelligent Systems, Modelling and Simulation (ISMS), 2015, pp. 24-27.
[6] H. Furntratt and H. Neuschmied, "Evaluating pointing accuracy on kinect v2 sensor," in International Conference on Multimedia and Human-Computer Interaction (MHCI), 2014, pp. 124.1-124.5.
[7] S. Shamsuddin, L. I. Ismail, H. Yussof, N. I. Zahari, S. Bahari, H. Hashim, and A. Jaffar, "Humanoid robot NAO: Review of control and motion exploration," in IEEE International Conference on Control System, Computing and Engineering (ICCSCE), 2011, pp. 511-516.
[8] J. Shotton, T. Sharp, A. Kipman, A. Fitzgibbon, M. Finocchio, A. Blake, M. Cook, and R. Moore, "Real-time human pose recognition in parts from single depth images," Communications of the ACM, vol. 56, no. 1, pp. 116-124, 2013.
[9] D. Raunhardt and R. Boulic, "Progressive clamping," in IEEE International Conference on Robotics and Automation (ICRA), 2007, pp. 4414-4419.
[10] Aldebaran Robotics, "NAO software 1.14.5 documentation," URL http://www.aldebaran-robotics.com, 2014.

