
Course SHS

Program in Cognitive Psychology


Spring 2007

Human-Robot Interaction
Social learning and skill acquisition
via teaching and imitation

Aude G. Billard

Learning Algorithms and Systems Laboratory - LASA


EPFL, Swiss Federal Institute of Technology
Lausanne, Switzerland

aude.billard@epfl.ch
A.G. Billard - SHS Program in Cognitive
Psychology - Spring 2007 http://lasa.epfl.ch
Calinon, S. and Billard, A. (2007) Incremental Learning of Gestures by Imitation in a Humanoid Robot. In Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction (HRI).
Gesture Recognition
How are actions perceived?
How is information parsed?

Imitation
Level of granularity: What is copied?
Should it copy the intention,
goal or dynamics of movement?

Motor Learning
How is information transferred
across multiple modalities?
Visuo-motor, audio-motor
[Figure: course overview diagram linking Gesture Recognition, Biological Inspiration, Learning by Imitation, Robotic Implementation and Motor Learning]
BIOLOGICAL INSPIRATION

Prior to building any capability in robots, we might want to understand how the equivalent capability works in humans and other animals.
Imitation Learning
Imitation Capabilities in Animals
Which species may exhibit imitation is still a main area of discussion and debate.

One differentiates true imitation from copying (flocking, schooling, following), stimulus enhancement, contagion or emulation.
Imitation Learning
Imitation Capabilities in Animals
Copying and Mimicry: Rats, Monkeys
Observer rats watch companion actor rats performing spatial tasks that differ according to the experimental requirements. After the observational training, surgical ablation is used to block any further learning.

Leggio et al, Brain Res. Protocols, 2003
Heyes, Trends in Cog. Sciences, 2001


Imitation Learning
Imitation Capabilities in Animals
The observer rats displayed exploration abilities that closely matched the previously observed behaviors.

Leggio et al, Brain Res. Protocols, 2003
Heyes, Trends in Cog. Sciences, 2001


Imitation Learning
Imitation Capabilities in Monkeys

Subjects who saw the Lever demonstrations tended to use a levering movement to pop open the lid, whereas subjects who viewed Poke, as well as the controls, did not display this behavior at all.

Whiten et al, Journal of Comparative Psychology, 1996
Imitation Learning
Imitation Capabilities in Animals
True imitation: Ability to learn new actions not part of the usual repertoire
The prerogative of humans only, and possibly great apes

Whiten & Ham, Advances in the Study of Behaviour, 1992
Savage & Rumbaugh, Child Devel, 1993
Imitation Learning
Imitation Capabilities in Animals
Complex imitation capabilities in Dolphins & Parrots. Large repertoire of imitation capabilities, demonstrating flexibility and generalization in different contexts.

Moore, Behaviour, 1999.
Herman, Imitation in Animals & Artifacts, MIT Press, 2002


Imitation Learning
Developmental Stages of Imitation
Innate Facial Imitation (newborns to 3 months)
Tongue and lip protrusion, mouth-opening, head movements, cheek and brow motion, eye blinking
Delayed imitation up to 24 hours
Imitation is mediated by a stored representation

Meltzoff & Moore, Early Development and Parenting, 1997
Meltzoff & Moore, Developmental Psychology, 1989
Imitation Learning
Developmental Stages of Imitation
Deferred and delayed imitation: 18 months (Piaget), 9-12 months (Meltzoff)
Deferred imitation of novel behavior
67% of the infants who saw the display reproduced the act after the week's delay, as compared to 0% of the control infants who had not seen the novel display.

Piaget, Play, Dreams and Imitation in Infancy, 1962;
Meltzoff, Body and the self, 1995
Imitation Learning
Goals and Intentions
Infants aged 14 months.
Children imitate a new action to achieve the same goal only if they consider it to be the most rational alternative.

Gergely, Bekkering, Király, Nature 415, 755, 2002
Imitation Learning
Goals and Intentions
18-month-old infants
Differentiate between human and machine demonstration
Attribute intentions only to the human
Learn from unsuccessful examples

Meltzoff, Dev. Psychol. 31, 1995.
Imitation Learning
Goals and Intentions
Imitation is hierarchical and goal-directed
Single-hand motions: accurate ipsilateral imitation, 48% substitution for crosslateral imitation
Two-hand motions: only 10% substitution for crosslateral imitation.
Two-phase motion eliminates mistakes
Adding constraints of hand gestures increases mistakes

Bekkering, Wohlschläger & Gattis, Quart. J. of Exp. Psych, 2000
Imitation Learning
Imitation in adults
Reaches highest level of complexity
Is present in all activities:
Social influence in establishing group norms; collective frame of reference, transmission of phobias
Imitation Learning
Imitation Capabilities in Adults
Movement observation influences movement execution
Priming process occurs involuntarily and is not under the actor's control.

Brass, Bekkering, Prinz, Acta Psychologica, 2001
Imitation Learning
Neural Correlates
Mirror Neuron System F5 Area of Monkey M1

Gallese et al, Brain, 1996. ; Rizzolatti et al, Cog. Brain Res., 1996
Imitation Learning
Neural Correlates
Mirror Neuron System: locus of visuo-motor transformation (STS, PM, Broca)

Iacoboni et al, Science 1999
Arbib, Billard, Iacoboni, Oztop, Neural Networks, 2000.
Imitation Learning in Animals
Take-Home Message
Range of imitative behaviors in animals
Increasing in complexity across species

Stages of development in children's imitation:
innate facial imitation
inferring goals
hierarchy of goals driving imitation (hand motion takes precedence over arm gesture and location in space)

Imitation in adulthood is influenced by movement observation, handedness, orientation of the demonstrator

The underlying neural mechanisms are not yet completely deciphered
A better understanding of those would help shed light on the different levels of imitation in animal behavior
Imitation Learning in Animals
Take-Home Message

Advantages: When is Imitation useful?

It is a powerful means of transferring skills

It speeds up the learning process by showing possible solutions or conversely by showing bad solutions
Imitation Learning in Animals
Take-Home Message

Disadvantages:
When is Imitation not useful?

Not appropriate: When a good solution for the teacher is not a possible solution for the learner

Disadvantageous: When it leads you into error, as with a bad teacher (e.g. phobia of spiders)
The Transfer Problem
[Figure: Demonstrator and Imitator, each described by seven joint variables (1-7) and a position x = (x1, x2, x3), with a "?" marking the unknown mapping between them]
What to imitate?

x: Same object, same target location
d: Same direction of motion
v: Same speed, same force
Same posture

[Figure: demonstrator and imitator, each with joint variables 1-7, position x = (x1, x2, x3), direction d and speed v]
How to Imitate?
The correspondence problem
[Figure: Demonstration and Imitation postures, with "?" for the mapping between them]

No solution (smaller range of motion)

Find the closest solution according to a metric
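One way to read "closest solution according to a metric" is sketched below. It assumes a weighted squared-error metric over joint angles and independent per-joint limits, in which case the closest feasible posture is found by clamping each joint. The function name, limits and weights are hypothetical and are not taken from the slides.

```python
def closest_feasible_posture(demo_angles, joint_limits, weights=None):
    """Return the posture inside joint_limits that is closest to demo_angles.

    Assumes a weighted squared-error metric and independent box limits,
    so each joint can be clamped separately.
    """
    if weights is None:
        weights = [1.0] * len(demo_angles)
    posture = [min(max(a, lo), hi) for a, (lo, hi) in zip(demo_angles, joint_limits)]
    cost = sum(w * (a - p) ** 2 for w, a, p in zip(weights, demo_angles, posture))
    return posture, cost


# Hypothetical demonstrated posture (radians) and a robot with a smaller range of motion.
demo = [1.8, -0.4, 0.9]
limits = [(-1.5, 1.5), (-1.0, 1.0), (0.0, 0.5)]
posture, cost = closest_feasible_posture(demo, limits)
print(posture, cost)
```

A different metric (for example one defined on hand position rather than joint angles) would generally give a different "closest" posture, which is why the choice of metric matters.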
Imitation Learning
Following an imitation mechanism
While following the teacher, the learner robot learns to associate a word with a meaning in terms of sensory inputs

Billard et al, ESANN 1997
Billard & Dautenhahn, Robotics & Autonomous Systems, 1998
Billard & Hayes, 1999, 2000
Imitation Learning
Following an imitation mechanism

Teaching a path in a maze
Demiris & Hayes, 1994, 1996

Teaching how to climb a hill
Dautenhahn, Robotics & Autonomous Systems, 1995

Teaching a path in the environment
Billard & Hayes, Adaptive Behavior, 1999
Moga, Gaussier, Applied Artificial Intelligence, 2000
Kaiser et al, Robotics & Autonomous Systems, 2002
Nicolescu & Mataric, AGENTS 2003

Teaching a vocabulary
Billard 1997, 1998, 1999
Vogt & Steels, ECAL, 1999
Imitation Learning
One-Shot Learning Methods
Segmentation of demonstration into primitives
Classification of gestures into predefined states (e.g. grasp, collision)
Built-in controller for producing sequences of states

Kuniyoshi et al, IEEE Trans. on Robotics and Automation, 1994.
Dillmann et al, Robotics & Autonomous Systems, 2001.
Ritter et al, Rev Neuroscience, 2003
Aleotti et al, Robotics & Autonomous Systems, 2004.
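The segmentation and classification steps can be illustrated with a minimal sketch. It assumes each sensor frame is a (gripper opening, contact force) pair and uses hand-set thresholds; the state names "grasp" and "collision" come from the slide, while "move", the thresholds and the data are hypothetical.

```python
from itertools import groupby


def classify_frame(gripper_opening, contact_force):
    """Map one sensor frame to a predefined state (thresholds are hypothetical)."""
    if contact_force > 5.0:
        return "collision"
    if gripper_opening < 0.02:
        return "grasp"
    return "move"


def segment(frames):
    """Collapse per-frame labels into a sequence of (state, n_frames) primitives."""
    labels = [classify_frame(opening, force) for opening, force in frames]
    return [(state, len(list(run))) for state, run in groupby(labels)]


# Hypothetical demonstration: (gripper opening in m, contact force in N) per frame.
frames = [(0.08, 0.0), (0.07, 0.1), (0.01, 0.5), (0.01, 0.6), (0.01, 7.2), (0.09, 0.0)]
primitives = segment(frames)
print(primitives)
```

A built-in controller would then replay the resulting state sequence, which is what makes learning from a single demonstration possible.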
Robot Programming by Demonstration
One-Shot Learning Methods

Sensors: Data Gloves, Fixed cameras, Speech processing


Actuators: Mobile robot, 7 DOF arm, 2 fingers Gripper

R. Dillmann, Robotics & Autonomous Systems 47:2-3, 109-116, 2004


Imitation Learning
One-Shot Learning Methods
Explicit teaching/learning:
- Reasoning about tasks
- Verbal instructions

Gesture Recognition:
For each sensor a context-dependent model based on background knowledge is provided: opening the refrigerator door, extracting the bottle and closing the door

Task Reproduction:
Store action sequences in a tree-like structure of macro-operators

R. Dillmann, Robotics & Autonomous Systems 47:2-3, 109-116, 2004
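A tree-like structure of macro-operators could look like the sketch below. The three top-level steps are the ones named on the slide; the class, the leaf actions and the traversal are illustrative assumptions, not the representation used in the cited work.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MacroOperator:
    name: str
    children: List["MacroOperator"] = field(default_factory=list)

    def primitives(self):
        """Depth-first traversal: leaves are elementary actions, in execution order."""
        if not self.children:
            return [self.name]
        actions = []
        for child in self.children:
            actions.extend(child.primitives())
        return actions


# Leaf actions are hypothetical; only the three macro steps appear in the source.
task = MacroOperator("fetch bottle", [
    MacroOperator("open refrigerator door",
                  [MacroOperator("grasp handle"), MacroOperator("pull door")]),
    MacroOperator("extract bottle",
                  [MacroOperator("grasp bottle"), MacroOperator("lift bottle")]),
    MacroOperator("close refrigerator door", [MacroOperator("push door")]),
])
print(task.primitives())
```

Storing the task as a tree lets a whole subtree (such as opening the door) be reused or replaced without touching the rest of the sequence.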
Imitation Learning
Robot Programming by Demonstration:
Grasping
Because of the large range of possible shapes, generalizing pre-programmed grasps to new and general objects is a rather hard task:
Orientation of the hand
Positioning of the fingers (correspondence problem!)
Tactile forces, stable object contact

Steil et al, Robotics & Autonomous Systems 47:2-3, 129-141, 2004
Imitation Learning
Robot Programming by Demonstration:
Grasping
(i) a naive imitation strategy, in which the observed joint angle trajectories (after their transformation into the three-finger geometry) were directly applied to control the fingers of the TUM hand during the grasp, until complete closure around the object

(ii) a strategy in which the visually observed hand posture is matched to the initial conditions of a power grip, a precision grip, a three-finger and two-finger grip, respectively, in order to identify the grip type.

Steil et al, Robotics & Autonomous Systems 47:2-3, 129-141, 2004
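Strategy (ii) can be pictured as a nearest-prototype match. The sketch below assumes each grip type has one prototype posture and that the match uses Euclidean distance; the prototype values and the three-number posture encoding are made up for illustration and do not come from Steil et al.

```python
import math

# Hypothetical initial-condition postures (finger flexion in radians) per grip type.
GRIP_PROTOTYPES = {
    "power": [1.2, 1.2, 1.2],
    "precision": [0.4, 0.4, 0.1],
    "three-finger": [0.7, 0.7, 0.7],
    "two-finger": [0.7, 0.7, 0.0],
}


def identify_grip(observed_posture):
    """Return the grip type whose prototype is nearest in Euclidean distance."""
    return min(GRIP_PROTOTYPES,
               key=lambda grip: math.dist(observed_posture, GRIP_PROTOTYPES[grip]))


print(identify_grip([0.65, 0.75, 0.05]))
```

Once the grip type is identified, the robot can execute its own pre-programmed version of that grip instead of copying the human joint trajectories directly, as in strategy (i).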


Robot Programming by Demonstration

Other related works are, e.g.:


Kuniyoshi et al, ICRA, 1994
Aleotti et al, Robotics & Autonomous Systems, 47:2-3, 153-167, 2004
Zhang & Roessler, Robotics & Autonomous Systems 47:2-3, 117-127, 2004

Imitation Learning
Learning of Dynamical Systems
Learning the optimal controller
Model of physical system (pendulum)
Reinforcement and locally weighted learning

Atkeson & Schaal, ICML, 1997.
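Locally weighted learning fits a simple model around each query point, weighting nearby samples more heavily. The following is a minimal one-dimensional sketch with a Gaussian kernel; the bandwidth and the sine-shaped training data are assumptions chosen for illustration, not the pendulum model from the cited paper.

```python
import math


def lwr_predict(x_query, xs, ys, bandwidth=0.3):
    """Locally weighted linear regression at x_query (1-D input, 1-D output)."""
    w = [math.exp(-0.5 * ((x - x_query) / bandwidth) ** 2) for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw  # weighted mean of inputs
    my = sum(wi * y for wi, y in zip(w, ys)) / sw  # weighted mean of outputs
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x_query - mx)


# Hypothetical samples of a nonlinear function (e.g. a torque term depending on an angle).
xs = [i * 0.1 for i in range(63)]
ys = [math.sin(x) for x in xs]
print(lwr_predict(1.0, xs, ys))
```

A smaller bandwidth follows the data more closely but needs denser samples; a larger one smooths more and biases the estimate.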
Imitation Learning
Learning of Dynamical Systems
Locally weighted learning
Learning primitives of the system

Ijspeert, Nakanishi, Schaal, ICRA01, NIPS02
Imitation Learning
Learning of Dynamical Systems
The learned trajectory is not sufficient to control the actual robot's walking pattern.
Phase resetting using foot contact information is necessary.
On-line adjustment of the phase of the CPG by sensory feedback from the environment is essential to achieve successful locomotion

Nakanishi et al, Robotics & Autonomous Systems, 47:2-3, 79-91, 2004.
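Phase resetting can be illustrated with a single phase oscillator standing in for the CPG. In this sketch the phase advances at a fixed rate and is set back to a reference value whenever a foot contact is detected; the oscillator frequency, contact time and reset value are hypothetical, and the real controller in Nakanishi et al. is considerably richer.

```python
import math


def simulate_cpg(contact_times, omega=2 * math.pi, dt=0.01, duration=2.0, reset_phase=0.0):
    """Integrate a phase oscillator; reset its phase whenever a foot contact is sensed."""
    phase, trace = 0.0, []
    pending = sorted(contact_times)
    steps = int(round(duration / dt))
    for i in range(steps):
        t = i * dt
        if pending and t >= pending[0]:
            phase = reset_phase  # sensory feedback overrides the internal clock
            pending.pop(0)
        trace.append((t, phase))
        phase = (phase + omega * dt) % (2 * math.pi)
    return trace


# Nominal step period is 1 s, but the foot touches down early, at 0.9 s.
trace = simulate_cpg([0.9], duration=1.5)
reset_times = [t for t, phase in trace if phase == 0.0]
print(reset_times)
```

Without the reset the oscillator would keep its own timing and drift out of step with the actual foot contacts.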
Imitation Learning in Robots
How to imitate?

Level 3: Learning primitives of motion
Level 2: Exact reproduction of trajectories
Level 1: One-shot learning
Level 0: Following an implicit imitation mechanism

[Figure axes: level of granularity, level of cognition]
IMITATION LEARNING VERSUS MOTOR LEARNING

Imitation learning - Programming by Demonstration:
A way to speed up learning, to reduce the search space
A way to share with the robots the same vocabulary of motor skills

Self Motor Learning - Reinforcement Learning:
To adapt to novel situations
To adapt the demonstrated motions to the robot's body: Correspondence problem (Nehaniv & Dautenhahn 1999)
Learning What to imitate

The robot should learn that the important feature in this task is that the queen should be moved 2 steps forward vertically
Learning How to Imitate

Once the robot has learned the rule of motion for the queen,
it can apply this rule for moving the queen from
locations not seen during the demonstrations
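One simple way to picture "learning what to imitate" in the queen example is to look for the quantity that stays constant across demonstrations. The sketch below assumes board squares are (column, row) pairs and compares the variance of absolute end squares with that of relative displacements; the demonstrations and the variance criterion are illustrative assumptions, not the method used in the course.

```python
from statistics import mean, pvariance

# Hypothetical demonstrations: (start square, end square) as (column, row) pairs.
demos = [((3, 0), (3, 2)), ((3, 1), (3, 3)), ((5, 2), (5, 4)), ((0, 4), (0, 6))]


def learn_invariant(demos):
    """Compare the spread of absolute end squares with that of relative moves."""
    ends = [end for _, end in demos]
    moves = [(e[0] - s[0], e[1] - s[1]) for s, e in demos]
    var_end = sum(pvariance(coord) for coord in zip(*ends))
    var_move = sum(pvariance(coord) for coord in zip(*moves))
    rule = tuple(mean(coord) for coord in zip(*moves))
    return rule, var_move, var_end


def apply_rule(rule, start):
    """Reproduce the learned displacement from a start square not seen before."""
    return (start[0] + rule[0], start[1] + rule[1])


rule, var_move, var_end = learn_invariant(demos)
print(rule, var_move, var_end)
print(apply_rule(rule, (7, 1)))
```

The displacement has zero variance while the end square varies, so the displacement is the feature worth imitating, and it transfers directly to new start squares.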
Transmitting human skills and knowledge to robots

Learning a Packaging Task
From Recognizing to Reproducing Gestures

From Recognizing to Reproducing Gestures

GMM/HMM Encoding: Recovers generalized signal by regression
Mixture of k Gaussians
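The regression step can be sketched as Gaussian Mixture Regression over a joint model of time t and a signal x. The example below assumes a mixture that has already been fitted (the two components and all their parameters are hypothetical) and only shows how the generalized signal E[x | t] is recovered from it.

```python
import math

# Hypothetical, already-fitted 2-component GMM over (t, x):
# (prior, mean_t, mean_x, var_t, cov_tx, var_x)
GMM = [
    (0.5, 0.25, 0.0, 0.02, 0.03, 0.06),
    (0.5, 0.75, 1.0, 0.02, -0.01, 0.03),
]


def gaussian_pdf(t, mean, var):
    """Density of a 1-D Gaussian."""
    return math.exp(-0.5 * (t - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)


def gmr(t, gmm=GMM):
    """Gaussian Mixture Regression: E[x | t] under the joint GMM."""
    resp = [prior * gaussian_pdf(t, mt, vt) for prior, mt, _, vt, _, _ in gmm]
    total = sum(resp)
    estimate = 0.0
    for h, (_, mt, mx, vt, ctx, _) in zip(resp, gmm):
        # Conditional mean of component k, weighted by its responsibility at t.
        estimate += (h / total) * (mx + ctx / vt * (t - mt))
    return estimate


print([round(gmr(t / 10), 3) for t in range(1, 10)])
```

Each component contributes a local linear trend, and the responsibilities blend those trends smoothly, which is what yields a single generalized trajectory from several demonstrations.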
Learning What to imitate?

GMM over 26 demonstrations
3D velocities of the end effector
Tracking of object by stereovision
Tracking of joint trajectories

Calinon, S., Guenter, F. and Billard, A. (2007) On Learning, Representing and Generalizing a Task in a Humanoid Robot. IEEE Transactions on Systems, Man and Cybernetics, 37:2. Part B. Special issue on robot learning by observation, demonstration and imitation.
Learning What to imitate?

Calinon, S., Guenter, F. and Billard, A. (2007) On Learning, Representing and Generalizing a Task in a Humanoid Robot. IEEE Transactions on Systems, Man and Cybernetics, 37:2. Part B. Special issue on robot learning by observation, demonstration and imitation.
Learning What to imitate?

[Figure panels: Correlations in the latent space of the two hands; Hands-Bucket Correlations]
Computing the Metric of Imitation Performance

Hands Paths

Joints Trajectories
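The slide names the two ingredients of the metric (hand paths and joint trajectories) but the formula itself is not preserved. As a hedged illustration, the sketch below assumes a weighted sum of squared deviations from reference trajectories; the function name, the weights and the data are hypothetical.

```python
def imitation_cost(hand_path, hand_ref, joint_traj, joint_ref, w_hand=1.0, w_joint=1.0):
    """Weighted sum of squared deviations in hand path and joint trajectories."""
    def sq_error(traj, ref):
        return sum((a - b) ** 2 for p, q in zip(traj, ref) for a, b in zip(p, q))

    return w_hand * sq_error(hand_path, hand_ref) + w_joint * sq_error(joint_traj, joint_ref)


# Hypothetical two-sample trajectories: 3-D hand positions and one joint angle.
hand_ref = [(0.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
hand_path = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
joint_ref = [(0.1,), (0.4,)]
joint_traj = [(0.1,), (0.2,)]
cost = imitation_cost(hand_path, hand_ref, joint_traj, joint_ref, w_hand=2.0, w_joint=0.5)
print(cost)
```

The relative weights decide whether reproducing the hand path or the joint configuration matters more for a given task.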

Adapting the Demonstration to Fit the Robot's Body

Minimizing H under the kinematics constraint: by Lagrange, we compute the optimal solution.
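The equations for this step are not preserved in the slide text. The derivation below is one plausible form, given purely as an assumption: H is taken as a quadratic cost on joint velocities and hand velocities relative to desired values (superscript d), with symmetric positive-definite weight matrices, and the kinematics constraint is taken as the Jacobian relation between the two. All symbols are introduced here for illustration.

```latex
\begin{aligned}
H &= (\dot{\theta}-\dot{\theta}^{d})^{\top} W^{\theta} (\dot{\theta}-\dot{\theta}^{d})
   + (\dot{x}-\dot{x}^{d})^{\top} W^{x} (\dot{x}-\dot{x}^{d}),
   \qquad \text{subject to } \dot{x} = J\,\dot{\theta} \\
L &= H + \lambda^{\top}(\dot{x} - J\dot{\theta}) \\
\frac{\partial L}{\partial \dot{\theta}} &= 2W^{\theta}(\dot{\theta}-\dot{\theta}^{d}) - J^{\top}\lambda = 0,
\qquad
\frac{\partial L}{\partial \dot{x}} = 2W^{x}(\dot{x}-\dot{x}^{d}) + \lambda = 0 \\
\dot{\theta} &= (W^{\theta} + J^{\top} W^{x} J)^{-1}\,(W^{\theta}\dot{\theta}^{d} + J^{\top} W^{x}\dot{x}^{d})
\end{aligned}
```

Eliminating the multiplier from the two stationarity conditions and substituting the constraint gives the last line: a compromise between the desired joint motion and the desired hand motion, balanced by the two weight matrices.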
1st Limitation of the System

Misses the bucket

Dynamic Adaptation of Gesture Reproduction

Dynamical system modulation to be robust to perturbations (novel context, obstacles, etc)
Dynamical System Modulation

Different initial conditions
Adaptation to sudden target displacement
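Why a dynamical system copes with both situations can be seen in a minimal sketch. The code below uses a generic critically damped point attractor in one dimension, which is an assumption made for illustration and not the modulated system of the cited work; the gains, time step and target schedule are hypothetical.

```python
def reach(start, target_at, steps=600, dt=0.01, stiffness=25.0):
    """Critically damped point attractor, integrated with semi-implicit Euler."""
    damping = 2.0 * stiffness ** 0.5
    x, v, path = start, 0.0, []
    for i in range(steps):
        target = target_at(i * dt)  # the target may change while moving
        a = stiffness * (target - x) - damping * v
        v += a * dt
        x += v * dt
        path.append(x)
    return path


def moving_target(t):
    """Target jumps from 1.0 to 1.5 after 3 seconds."""
    return 1.0 if t < 3.0 else 1.5


path_a = reach(0.0, moving_target)
path_b = reach(-0.5, moving_target)  # different initial condition
print(path_a[299], path_a[-1], path_b[-1])
```

Because the motion is generated by an attractor rather than replayed from a stored trajectory, a new start position or a displaced target only changes the input to the same system.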
Dynamic Adaptation of Gesture Reproduction

Adaptation to different contexts
Online adaptation to changes in the context

Hersch, M., Guenter, F., Calinon, S. and Billard, A. (2006) Learning Dynamical System Modulation for Constrained Reaching Tasks. In Proceedings of the IEEE-RAS International Conference on Humanoid Robots.
2nd Limitation of the System

If the novel situation differs significantly from the demonstrated one, then adapting the demonstrated trajectory is no longer sufficient to satisfy the task.

Need to relearn the task: Reinforcement Learning

Need to define a new metric: the reward
RL - To Adapt to Novel Situations

Reinforcement Learning: the episodic Natural Actor Critic (NAC) is applied to learn a new trajectory, so as to overcome the obstacle.
Gaussian Stochastic Policy
RL - To Adapt to Novel Situations

The robot is rewarded for reaching the obstacle, as well as staying close to the original demonstrated trajectory.
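The two reward terms and the Gaussian stochastic policy can be sketched as follows. This is not the Natural Actor Critic itself: it only shows how a trajectory might be sampled around the current policy mean and scored. The point to be reached is called `target` here, and the coefficients, trajectories and noise level are hypothetical.

```python
import random


def sample_policy(mean_params, sigma, rng):
    """Gaussian stochastic policy: perturb each trajectory parameter independently."""
    return [rng.gauss(m, sigma) for m in mean_params]


def episode_reward(trajectory, demonstrated, target, c_demo=1.0, c_target=5.0):
    """Higher is better: stay close to the demonstration and end near the target."""
    deviation = sum(abs(x - d) for x, d in zip(trajectory, demonstrated))
    miss = abs(trajectory[-1] - target)
    return -c_demo * deviation - c_target * miss


rng = random.Random(0)
demonstrated = [0.0, 0.25, 0.5, 0.75, 1.0]
candidate = sample_policy(demonstrated, sigma=0.05, rng=rng)
print(episode_reward(candidate, demonstrated, target=1.0))
```

An episodic policy-gradient method would repeat this sampling over many episodes and shift the policy mean toward the parameters that earned higher reward.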
RL - To Adapt to Novel Situations

In each case, at least 1 run fails to find a solution
Need to run the algorithm several times
Time consuming
INCREMENTAL LEARNING

The robot records the position of the objects, the position of the teacher's hands, the joint angles of the teacher's upper-body motion (motion sensors) and/or the joint angles of the robot's upper-body motion (kinesthetics).
INCREMENTAL LEARNING

[Movie: 6 demonstrations of moving the white Knight to catch the black King]
INCREMENTAL LEARNING

Trajectories of the hand with respect to the first and second object.
Left: the superimposed trajectories of the 6 demonstrations. Middle: Gaussian Mixture Model. Right: Gaussian Mixture Regression.
SUMMARY

Learning new tasks relies on various means of teaching the robots.

Imitation learning is useful insofar as it gives hints as to the optimal solution
The robot must however rely on generic skills of its own to adapt the demonstration to its own body and to the context

Learning of complex skills is overall relatively slow and must proceed incrementally
