
Oziel de Oliveira Carneiro

University of Southern California


Neural Model Write-Up
The movements performed during the use of sign language by experienced signers show continuous velocity and acceleration profiles (Jennings and Poizner, 1988), just like movements performed in other human activities such as writing, dancing and sports. Therefore, using a model of one of these movements as a basis for modeling signing seems appropriate. An effective brain model for writing is the AVITEWRITE model, proposed by Grossberg and Paine (2000), which suggests how the brain is able to properly learn and replicate the movements involved in handwriting.
Writing movements, however, are different from signing movements: writing happens in a 2D environment while signing happens in 3D. Signing also involves multiple hand shapes and multiple points of interest, as opposed to the single point used in writing. To accommodate these differences, the model also builds on reach and grasp models, which describe arm movement and hand shape control.
Learning to replicate movements can be seen as a cooperative effort of two brain systems: the ability to retrace a template path, presented either as a constant visual input (retracing letters on paper, for example) or as a memory input (an internal representation of the observed path), and the ability to identify movement patterns performed sequentially and to replicate that sequence. Retracing is modeled algorithmically in the AVITEWRITE model, and movement pattern identification is performed by the Mirror Neuron System (Oztop and Arbib, 2002).
Sign Language Analysis
As in any language, signs can be broken down into phonemes. This matters because, even though an actor producing a sign may not consciously think about the phonemes being used, properly reproducing the sign requires having all of its phonemes in the actor's repertoire. An example is the difficulty native English speakers have reproducing nasal phonemes from Latin languages, since these phonemes are not in their native repertoire. The model must therefore be able to properly learn and replicate a set of sign language phonemes, even if learning and repetition are directly tied to the signs that use them. So the first step is to analyze the sign language and determine what capabilities the model needs in order to properly learn and replicate signs.
Stokoe (1976) was the first to propose analyzing signs by parts, as opposed to as one whole entity. He defined signs as having three parts: a hand configuration, a location relative to the signer's body, and a movement pattern. However, the system proposed by Stokoe did not address the sequential characteristics of the sign or the non-manual component, and it had a limited set of symbols that was not enough to properly specify all the needed details of a sign (Valli and Lucas, 2000).
Later, Liddell and Johnson (1989) proposed a system called Movement-Hold, which breaks signs down into small sequential segments of two categories: (a) movement, segments that contain some change in the state of the body elements being used; and (b) hold, segments where the state of the body parts undergoes no change for a brief time period. Each segment is then described by smaller parts that can be clustered into two groups.
The first group describes the characteristics of the segment using five types of information: (a) major class of segment, which defines whether it is a movement or a hold segment; (b) contour, which describes the path the sign follows; (c) plane of contour, which indicates the plane where the contour happens; (d) quality, which indicates fine details of the segment, such as shortened, prolonged, or touching (as in brushing movements); and (e) local movements, which define movements happening at specific points (fingers, for example) as opposed to the hand as a whole.
The second group characterizes the articulation of the sign with nine descriptors: (a) hand configuration, describing the hand shape used (finger and thumb configurations) and the inclusion of the forearm in the sign; (b) location, identifying the place where the articulation will be located; (c) hand part, the part of the hand that anchors the hand to the location of the sign; (d) proximity, how near (or touching) the hand part is to the location; (e) spatial relationship, describing the direction in which the hand part is offset from the location; (f) facing hand part, indicating the part of the hand that will point at a specific location; (g) facing location, the target location of the pointing described in (f); (h) orientation hand part, identifying the part of the hand (different from the facing hand part) that will point to a specified plane (usually the ground); and (i) orientation plane, the plane used as reference for the orientation hand part.
Two-handed signs are described as two chains of segments, one for the strong hand (dominant hand) and the other for the weak hand. Non-manual signs are described directly after the manual signs are indicated.
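The Movement-Hold decomposition described above can be sketched as a small data structure. The field names and the example sign below are hypothetical illustrations for this write-up, not part of Liddell and Johnson's notation:

```python
from dataclasses import dataclass

@dataclass
class Segment:
    """One Movement-Hold segment (after Liddell and Johnson, 1989)."""
    major_class: str           # "movement" or "hold"
    contour: str = "straight"  # path the sign follows (movement segments)
    plane: str = "vertical"    # plane where the contour happens
    quality: str = ""          # e.g. "shortened", "prolonged", "brushing"
    local_movement: str = ""   # e.g. "finger wiggle"
    # articulation descriptors (second group, abbreviated)
    hand_config: str = ""      # hand shape, e.g. "5"
    location: str = ""         # place of articulation, e.g. "chin"
    hand_part: str = ""        # part of the hand anchored to the location
    proximity: str = ""        # e.g. "contact", "near"

# A one-handed sign as a chain of segments; a two-handed sign would
# carry one such chain per hand (strong and weak).
sign_example = [
    Segment("hold", hand_config="5", location="chin", proximity="contact"),
    Segment("movement", contour="straight", plane="horizontal"),
    Segment("hold", hand_config="5", location="neutral space"),
]

movement_count = sum(s.major_class == "movement" for s in sign_example)
```

A representation like this makes explicit which descriptors the motor model must be able to reproduce (hand configuration, location, contour) and which it may treat as free parameters.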
AVITEWRITE
The AVITEWRITE model (Grossberg and Paine, 2000) aims to replicate the process occurring in the brain while learning to write by retracing letters. The model also includes volitional control of speed and shape size, meaning that once a letter is learned, variations in shape and size can be induced. It also uses the concept of an area of attention, an error threshold expressing that, while retracing, the actor tries to keep his movements from diverging more than a certain distance from the template trace.
The model assumes that while trying to retrace a template curve, the brain uses a sequence of targets on the curve (selected one at a time), with reach movements performed between them. As the retracing of a certain shape is repeated multiple times, the brain starts to assimilate the pattern of neuromuscular activation needed to replicate the shape. Smoothness of the movement is a result of the brain structure. AVITEWRITE does not model the internal representations and transformations of the visual (target) and sensory (current hand position) inputs; it simply assumes that the brain has already represented both in the same positional mapping.
The model takes advantage of the hypothesis that motor control is organized in synergies, groups of neurons and muscles whose activities are coupled to perform a basic movement. Asynchronous and synchronous activations of these basic synergies lead to more complex movements; Figure 1 shows how synergies for vertical and horizontal movement could be activated to perform three different strokes. This idea simplifies the movement control process, since planning is done in terms of synergies, and the individual control of muscles and joints is done separately after the planning, reducing the degrees of freedom involved in the planning phase.
Figure 1: Activation of different synergies, one synergy controlling positive x (solid line) and another controlling positive y (dotted line), with different timing relationships. (a) and (c) illustrate asynchronous activation, and (b) synchronous activation.

AVITEWRITE is built as an extension of the VITEWRITE model (Bullock, Grossberg and Mannes, 1993), which is itself an extension of the VITE model (Bullock and Grossberg, 1989). VITE models reach-to-target movements as synchronous control of synergies. It includes volitional start of action and speed control and, like AVITEWRITE, assumes the target and hand positions are already represented in the same frame. No learning is involved in the VITE component used in the AVITEWRITE model. Figure 2 displays the block diagram for the VITE model.

Figure 2: VITE diagram. (a) The match interface continuously computes the difference between the target position vector (TPV) and the present position vector (PPV) and adds the difference vector to the PPV. (b) The difference vector is multiplied by the GO signal, which triggers execution and scales the rate of update of the PPV (Bullock and Grossberg, 1989).

The VITE model works by computing a Difference Vector (DV) between the Target Position Vector (TPV) and the Present Position Vector (PPV). The DV is multiplied by the GO signal, representing the volitional control signal for execution and speed, and then integrated into the PPV. GO is never negative, and since it multiplies DV it dictates the rate of integration into the PPV. To handle multiple synergies, the circuit just needs to be replicated for each synergy to be controlled. It is necessary to point out that for the circuit to properly produce movements with a bell-shaped velocity profile, the GO function has to be proportional to:

g(t) = α t^n / (β^n + t^n), if t > 0
g(t) = 0, if t ≤ 0
where n > 1, α = 1, β = 1.
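The resulting dynamics can be sketched in a few lines for a single synergy; the parameter values below (n, the integration rate, the time step) are illustrative choices, not values from the paper:

```python
def go_signal(t, n=2.0, alpha=1.0, beta=1.0):
    """Faster-than-linear volitional GO function for n > 1."""
    return alpha * t**n / (beta**n + t**n) if t > 0 else 0.0

def vite_reach(tpv, ppv0, dt=0.01, steps=400, rate=4.0):
    """One synergy: the difference vector DV = TPV - PPV is gated by GO
    and integrated into the present position PPV."""
    ppv = float(ppv0)
    positions, speeds = [], []
    for k in range(steps):
        dv = tpv - ppv                       # difference vector
        v = rate * go_signal(k * dt) * dv    # GO gates the integration rate
        ppv += v * dt
        positions.append(ppv)
        speeds.append(abs(v))
    return positions, speeds

pos, spd = vite_reach(tpv=1.0, ppv0=0.0)
```

Because GO starts at zero and rises while the remaining DV shrinks, the speed profile rises and then falls, giving the bell-shaped velocity profile described in the text.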
However, VITE can only perform synchronous activations of synergies. To address this issue, VITEWRITE was proposed by Bullock, Grossberg and Mannes (1993).
VITEWRITE uses the same integrate-to-target principle as VITE, but its circuitry is extended to allow both synchronous and asynchronous activation of synergies. This extension is designed around the principle that asynchronous activation of a second synergy happens at the peak of the velocity profile of the first activated synergy. Figure 3 shows the block diagram for the extended circuitry of the VITEWRITE model.

Figure 3: VITEWRITE model block diagram. Given a predefined Vector Plan (a set of DVp), the model iterates over them, scaling each by GRO (volitional size control); these are integrated to form the target position vector (TPV), which is compared with the present position vector (PPV) to compute the difference vector (DVm) that is scaled by the GO signal (volitional speed control) and integrated into the PPV. The DVm*GO signal also triggers the change in DVp.

The VITEWRITE model can perform predetermined asynchronous activations of different synergies. The predetermined pattern of movement is described as a set of difference position vectors (DVp) that are read out one at a time. At the start of the movement the first DVp is passed down from the Vector Plan and scaled by the GRO signal (volitional control of size). This scaled vector sets up a target by being integrated with the previous target position vector (TPV); before any movement happens, the TPV is set equal to the Present Position Vector (PPV). With the target set, the VITE circuit can be used to create the movement. The scaled Difference Vector (DVm) computed in the VITE circuit signals to the Vector Plan that a new DVp needs to be passed down; this happens when the velocity profile of the current movement reaches its peak. At that point the Vector Plan can either indicate a new target position, which leads to curved movements, or simply let the synergy finish its current movement before setting a new target, leading to a concatenation of straight strokes.
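This sequencing behavior can be sketched numerically. The only trigger implemented below is the rule that the next DVp is read out once the speed of the current movement passes its peak; the GO form and all constants are illustrative assumptions:

```python
import numpy as np

def go(t, n=2.0):
    """Faster-than-linear volitional GO signal (assumed form)."""
    return t**n / (1.0 + t**n) if t > 0 else 0.0

def vitewrite(plan, gro=1.0, dt=0.01, steps=1500, rate=4.0):
    """Launch each planning vector DVp (scaled by GRO) as an extension of
    the target; the next DVp is read out when the speed of the current
    movement passes its peak."""
    ppv = np.zeros(2)
    tpv = ppv + gro * np.asarray(plan[0], float)
    next_dvp = 1
    prev_speed = 0.0
    path = [ppv.copy()]
    for k in range(steps):
        dv = tpv - ppv                  # DVm in the diagram
        v = rate * go(k * dt) * dv      # scaled by the GO signal
        speed = float(np.linalg.norm(v))
        # velocity peak passed -> read out the next planning vector
        if next_dvp < len(plan) and speed < prev_speed:
            tpv = tpv + gro * np.asarray(plan[next_dvp], float)
            next_dvp += 1
        prev_speed = speed
        ppv = ppv + v * dt
        path.append(ppv.copy())
    return np.array(path)

# two planning vectors read out in sequence produce a curved stroke
path = vitewrite([[1.0, 0.0], [0.0, 1.0]])
```

Because the second DVp is launched while the first movement is still finishing, the two synergies overlap and the trajectory curves instead of tracing two separate straight strokes.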
VITEWRITE, however, is not able to learn the activation pattern that makes the synergies perform the desired stroke; it can only replicate the stroke as a set of sequential targets that guide the movement. To resolve this issue, the AVITEWRITE model was proposed by Grossberg and Paine (2000), using VITEWRITE and VITE as base principles and incorporating a learning mechanism. They propose that the memory for the learned sequence of synergy activations that forms a letter is stored in the cerebellum, and model it using a Purkinje Cell Spectrum.
The Purkinje Cell Spectrum is a series of phase-delayed Purkinje cell depolarizations triggered by the conditioned stimulus. In the presence of an unconditioned stimulus (the learning signal), these depolarizations undergo Long-Term Depression (LTD), which after enough learning experience is enough to remove the inhibition the Purkinje cells apply to the nuclear cells. An example of the Purkinje Cell Spectrum is shown in Figure 4; on the right side, some of the cells in the spectrum have already undergone LTD.

Figure 4: Purkinje Cell Spectrum. (a) Before Long-Term Depression and (b) after Long-Term Depression in some cells.
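A minimal numerical sketch of the spectrum and of LTD follows; the Gaussian shape of the depolarization bumps and the multiplicative depression rule are assumptions for illustration, not the paper's equations:

```python
import numpy as np

n_cells, t_steps = 10, 200
t = np.linspace(0.0, 1.0, t_steps)
centers = np.linspace(0.05, 0.95, n_cells)
# phase-delayed depolarization bumps, one row per Purkinje cell
bumps = np.exp(-((t[None, :] - centers[:, None]) ** 2) / 0.002)
weights = np.ones(n_cells)          # parallel-fiber synaptic strengths

def train(w0, teach_time, lr=0.4, trials=10):
    """The climbing-fiber teaching signal at teach_time depresses (LTD)
    the synapses of the cells depolarized at that phase."""
    w = w0.copy()
    idx = int(np.argmin(np.abs(t - teach_time)))
    for _ in range(trials):
        w -= lr * w * bumps[:, idx]
    return w

w_after = train(weights, teach_time=0.5)
inhibition_before = weights @ bumps   # Purkinje inhibition on nuclear cells
inhibition_after = w_after @ bumps    # inhibition released around t = 0.5
```

After training, the summed Purkinje inhibition drops selectively around the taught phase, releasing the nuclear cells at that moment of the movement while leaving the rest of the profile unchanged.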

In the AVITEWRITE model, each Purkinje Cell Spectrum is responsible for storing the activation pattern of one synergy during the performance of the movement for a specific letter. The circuit for the AVITEWRITE model is shown in Figure 5.

Figure 5: AVITEWRITE model block diagram. PC: Purkinje Cell; mf: mossy fiber; cf: climbing fiber.

The learning phase in the AVITEWRITE model works through retracing a template curve. The model uses an algorithm to select targets along the presented path and then performs a straight reach movement toward each target. The algorithm constrains each target to be reachable without breaking a threshold distance from the original template.
As the targets are reached, the Purkinje Cell Spectrum slowly learns through LTD the pattern of activation for each synergy. The Purkinje Cell Spectrum output is then summed into R. R then passes through a buffer, which works as a discrete sampler, to become WM. When WM is strong enough, it suppresses the activity coming from the reactive system, and the movement is performed based only on the memory information provided by WM. The Difference Vector (DVs) is computed as the sum of WM and the reactive DVvis, and is then scaled by the GRO signal (volitional size control). The DVs is then integrated into the PPV at the integration rate indicated by the GO signal (volitional speed control), and is also integrated in a forward model of the target for comparison with the current PPV, which gates the read-out of the Working Memory Buffer.
If the memory-driven movement breaks the threshold distance from the template curve, the reactive system takes control again and updates the cerebellum module to perform better, using the error signal (TPV-PPV). After enough repetition, the memory is able to control the movement independently.
Cerebellum Model in Prism Adaptation
Arbib, Schweighofer and Thach (1994) modeled the role of the cerebellum in adaptation to wedge-prism spectacles during a throwing-to-target task. The task consists of throwing a dart or a ball at a target, initially with no prisms, then with prisms, and then without the prisms again. Results from human experiments showed that the brain adapts to the presence of the prisms, so that subjects become able to hit the target while wearing them; but when the prisms are removed, subjects show an error opposite to the initial error observed when the prisms were first introduced, with similar amplitude, before adapting to perform an accurate throw again.
It was also shown that when the throwing mechanics were changed right after adaptation to the prisms was complete and the prisms were removed, subjects either made no error or needed only a few throws to become accurate again; when the throwing mechanics were switched back to those used during adaptation, the errors reappeared. This indicates that the correction applies to the aiming process for a specific throwing mechanic, suggesting that aiming is coupled with the type of throw. The cerebellum thus acts as a rotational bias on the aim of a specific type of throw, compensating for the effect of the prisms.
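The adaptation and aftereffect pattern can be reproduced with a one-line learning rule, treating the cerebellar correction as a single scalar bias on the aim; this is a deliberate simplification of the model, with illustrative numbers:

```python
def simulate_throws(prism_shift, bias, trials=30, lr=0.3):
    """Throw repeatedly at a target at 0; the prisms displace the landing
    point and the visual error drives the cerebellar bias update."""
    errors = []
    for _ in range(trials):
        error = prism_shift + bias      # landing point minus target (0)
        bias -= lr * error              # climbing-fiber-driven correction
        errors.append(error)
    return errors, bias

# baseline -> prisms on (adaptation) -> prisms off (opposite aftereffect)
e0, b = simulate_throws(prism_shift=0.0, bias=0.0)
e1, b = simulate_throws(prism_shift=15.0, bias=b)
e2, b = simulate_throws(prism_shift=0.0, bias=b)
```

The first throw with prisms shows the full displacement, the errors then decay as the bias adapts, and removing the prisms produces an error of opposite sign and similar amplitude, matching the human results described above.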
The model assumes some simplifications of the throwing process: (a) the accuracy of the throw depends only on the horizontal position of the shoulder; (b) the throwing mechanics are distinguished by the vertical position of the shoulder (overarm or underarm throws); (c) positions are all encoded in the same reference frame. The model's neural network is displayed in Figure 6.
Figure 6: Arbib, Schweighofer and Thach (1994) cerebellum model for prism adaptation. Mossy fibers carry the desired shoulder position (before adaptation) for aiming and the knowledge of the presence of the prisms. Arrow connectors indicate excitatory input, and circle connectors inhibitory input. The visual feedback encodes the learning signal as how far off target the throw was (to the left or to the right, and how large the error was). The output of the dentate cells provides the correction signal for the throwing movement.

The model works by receiving input from the mossy fibers encoding the knowledge of whether the prisms are present and the desired position of the shoulder before adaptation; this is stored in the granule cells, which are regularized by the Golgi cell. The visual feedback from the throw is input to the inferior olive cells, and it indicates how far off target the throw was (the side and magnitude of the error). The Purkinje cells receive weighted input from the granule cells, and the learning signal through the climbing fibers from the inferior olive cells. The Purkinje cells then apply inhibition to the dentate cells, which output the throwing correction signal. This cerebellum model is embedded in a larger model of the throwing procedure, displayed in Figure 7.

Figure 7: Brain model for the throwing process proposed by Arbib, Schweighofer and Thach (1994). The planning of the aiming process is done based on the gaze direction and passed to the premotor cortex, which also receives the corrective signal from the cerebellum; these two are combined and the corrected positioning is passed to the motor cortex, which moves the shoulder to the correct position. The result of the throw is then used as the teaching signal for the adaptive mechanism in the cerebellum.

The schematic of the model clearly places the learned correction in the cerebellum and not in the premotor cortex, based on experimental results showing that subjects with cerebellar lesions are able to properly aim when no prisms are involved but are not able to adapt when the prisms are on.
General Cerebellar Module
Most models of the cerebellum focus on the specific role it plays while the brain performs a particular task, so their cerebellar modules tend to be somewhat simplistic, such as the Purkinje Cell Spectrum used in the AVITEWRITE model. In an attempt to create a more generic module able to explain the cerebellum's different behaviors (on-line control, adaptation and timing) across tasks, Spoelstra (1999) proposed the module shown in Figure 8.

Figure 8: Cerebellar General Module proposed by Spoelstra (1999). Color filled connections indicate inhibitory input, and
white connections indicate excitatory input.

As before, the mossy fibers carry the circuit inputs that trigger activity in the module, and the Golgi cell regularizes activity in the granule cells, as in Arbib, Schweighofer and Thach (1994). The Purkinje cells are activated by the granule cell output through the parallel fibers and in turn inhibit the nuclear cells, which are also driven by the mossy fiber input; the nuclear cell activity is the output of the module. This output is also compared with the desired output (when present) in the inferior olive, whose signal is passed through the climbing fibers to the Purkinje cells as a learning signal.
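A toy version of this circuit can be sketched as follows. The layer sizes, the rectified random mossy-to-granule expansion, and the delta-rule stand-in for climbing-fiber learning are all assumptions for illustration, not Spoelstra's equations:

```python
import numpy as np

rng = np.random.default_rng(0)

class CerebellarModule:
    """Toy general module: mossy fibers drive granule cells (with
    Golgi-like normalization), Purkinje cells inhibit the nuclear output,
    and the inferior olive compares output with a desired value to train
    the Purkinje weights."""

    def __init__(self, n_mossy=2, n_granule=50, lr=0.1):
        self.expansion = rng.normal(size=(n_granule, n_mossy))
        self.w = np.zeros(n_granule)    # parallel-fiber -> Purkinje weights
        self.lr = lr

    def granule(self, mf):
        g = np.maximum(self.expansion @ mf, 0.0)   # granule cell activity
        return g / (1.0 + g.sum() / 10.0)          # Golgi regularization

    def output(self, mf):
        purkinje = self.w @ self.granule(mf)       # inhibition on nuclei
        return mf.sum() - purkinje                 # nuclear cell output

    def learn(self, mf, desired):
        error = self.output(mf) - desired          # inferior olive signal
        self.w += self.lr * error * self.granule(mf)
        return error

mod = CerebellarModule()
mf = np.array([1.0, 0.5])
errs = [abs(mod.learn(mf, desired=0.2)) for _ in range(200)]
```

Repeated trials shape the Purkinje inhibition until the nuclear output matches the desired value, which is the common mechanism behind the three applications discussed next.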
Spoelstra then replicates the throwing task used in the Arbib, Schweighofer and Thach (1994) model, but now using the general cerebellar module. The output of the cerebellar module is also routed differently: it is directly connected to muscle control rather than to the premotor cortex, similarly to what is done in AVITEWRITE. The model was able to qualitatively reproduce experimental data from humans performing the task.
Next, Spoelstra uses the cerebellar module as a feed-forward controller applying a corrective signal to the feedback controller in reaching tasks, in the presence and absence of a disturbing force field. This speeds up the reach, since it no longer depends solely on the feedback signal, which has delays, but also on a state estimate provided by the cerebellar module. As the feedback controller is considered to reside in the premotor and motor cortex (Fagg and Arbib, 1998), the cerebellar output is directly connected to the spinal cord along with the feedback controller output to perform the movements. The learning signal for the cerebellar module in this configuration is derived from muscular feedback. The model was able to properly replicate several reaching tasks, even at fast reaching speeds.
The last configuration of the cerebellar module is its use as a feed-forward estimator and temporal coordinator in reach and grasp tasks. In this model, the cerebellar module's estimate is fed to the feedback controller in the premotor cortex for trajectory generation, and its temporal coordination capabilities are used to properly synchronize the grasp and reach modules. The results showed that including the cerebellar module considerably reduced the overshooting and oscillation present with delayed state feedback.
Ultimately, the author proposes a final theory of cerebellar connectivity in the control of reach and grasp, summarized in Figure 9.

Figure 9: Summary of theory of Cerebellum function during control of reach and grasp movements proposed by Spoelstra
(1999).

In the proposed theory the cerebellum is divided into three sub-modules. The vermis contributes to postural control, the ability to keep limbs in a static position. The intermediate cerebellum functions as an inverse dynamics controller, creating the ability to perform fast movements and adaptive correction under perturbed conditions. And the lateral cerebellum acts as a state estimator, improving the performance of the control signal going through the premotor and motor cortex.
Model Outline
Since the motor control of one arm and hand is being modeled, there is no need to provide machinery for the non-manual phonemes. But the model will need to properly replicate hand shape, hand location, hand orientation and the trajectory the hand performs during the movement phases. This representation also allows us to identify the elements of motor control that don't have sign-related constraints, such as the elbow and shoulder; therefore, when learning or performing a sign, the model will have a certain freedom in using the redundant degrees of freedom of the human arm.
The next step in development is to sketch a controller diagram representing the overall dynamics the model will try to replicate. This will facilitate the identification of the brain regions that need to be modeled and their roles. This diagram is designed using the AVITEWRITE model as a starting point. The model will thus reproduce the following control diagram:

Figure 10: Control diagram for the proposed model.

The first two modules to be developed need to be the hand and arm controllers, as their proper function is essential for the learning and performance of the other modules. As reaching models indicate (Fagg and Arbib, 1998), the primary motor cortex is responsible for direct control of the hands, with feedback provided by the sensory area S1 and by visual feedback.
Target replication can be modeled following two distinct paths. The first is to follow the way AVITEWRITE models it, as an algorithmic module. The algorithm picks the furthest target on the template trace that it is able to reach without breaking the error threshold, defined as a distance to the original trace. Reach models identify the premotor cortex as the area responsible for target selection during movement planning. So one way to model the target replication algorithm with neuron models is to create a subnetwork that learns to trace a given path (already broken into discrete points). The learning could be implemented using reinforcement learning, where the error and the number of points used serve as punishment measures, since the idea of the tracing algorithm is to perform the minimum sequence of reach movements (straight movements) sufficient to replicate the shape without breaking the error threshold. This module would be trained before the rest of the model, since its proper functioning is needed for proper training of the long-term memory.
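The furthest-reachable-target rule can be sketched directly as a greedy algorithm; the template is assumed to be a list of 2D points, and the threshold plays the role of the area of attention:

```python
import numpy as np

def point_segment_dist(p, a, b):
    """Distance from point p to the segment a-b."""
    ab, ap = b - a, p - a
    t = np.clip(ap @ ab / (ab @ ab + 1e-12), 0.0, 1.0)
    return float(np.linalg.norm(p - (a + t * ab)))

def select_targets(template, threshold):
    """From the current point, pick the furthest template point reachable
    by a straight stroke whose deviation from the template stays under
    the threshold; repeat until the end of the trace."""
    template = np.asarray(template, float)
    targets, i = [], 0
    while i < len(template) - 1:
        j = i + 1
        # extend the straight stroke while every skipped point stays close
        while j + 1 < len(template) and all(
            point_segment_dist(template[k], template[i], template[j + 1])
            <= threshold for k in range(i + 1, j + 1)):
            j += 1
        targets.append(j)
        i = j
    return targets

# a right-angle template forces a target at the corner
corner = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (1.0, 0.5), (1.0, 1.0)]
targets = select_targets(corner, threshold=0.05)
```

Collinear points are skipped (a straight segment collapses to a single target), while a corner forces an intermediate target, which is exactly the minimal-sequence behavior described above.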
Another way to model target replication is to use Mirror Neuron System models (Oztop and Arbib, 2002), which model the ability of the brain to identify and mimic movement patterns. In this case the trace would be the visual memory of someone performing the desired sign, and the module would identify the patterns and transform this information into motor controls that properly replicate the sign. Learning then happens through properly associating the sequence of identified movement patterns (hand configuration, path of movement, and so on) with the corresponding sign.
The long-term memory module is responsible for learning the sign being replicated by the target replication module while associating it with its identification. Once a sign is learned, the memory should be able to, given a sign identification, pass as input to the motor controllers the correct sequence of activations that will trigger the correct movements from the effectors.
The Sign Generator will not be modeled here; it will directly replicate the identification coding that the MNS uses during learning of a sign, to trigger the long-term memory to perform the desired sign.

References
Arbib, Michael A., Nicolas Schweighofer, and W. Thach. "Modeling the role of cerebellum in prism
adaptation." From Animals to Animats 3. Proceedings of the Third International Conference on
Simulation of Adaptive Behavior. 1994.
Bullock, Daniel, and Stephen Grossberg. "VITE and FLETE: Neural modules for trajectory formation
and postural control." Volitional action (1989): 253-298.
Bullock, Daniel, Stephen Grossberg, and Christian Mannes. "The VITEWRITE model of handwriting
production." CAS/CNS Technical Report Series 011 (1993).
Fagg, Andrew H., and Michael A. Arbib. "Modeling parietal-premotor interactions in primate control of grasping." Neural Networks 11.7 (1998): 1277-1303.
Grossberg, Stephen, and Rainer W. Paine. "A neural model of cortico-cerebellar interactions during
attentive imitation and predictive learning of sequential handwriting movements." Neural
Networks 13.8 (2000): 999-1046.
Jennings, Peggy J., and Howard Poizner. "Computer-graphic modeling and analysis II: three-dimensional reconstruction and interactive analysis." Journal of Neuroscience Methods 24.1 (1988): 45-55.
Liddell, Scott K., and Robert E. Johnson. "American Sign Language: The Phonological Base." Sign
language studies 64 (1989): 195-278.
Oztop, Erhan, and Michael A. Arbib. "Schema design and implementation of the grasp-related mirror
neuron system." Biological cybernetics 87.2 (2002): 116-140.
Spoelstra, Jacob. Cerebellar learning of internal models for reaching and grasping: Adaptive control in
the presence of delays. Diss. University of Southern California, 1999.
VB Comments
1. Find a few signs and decompose them. Show which ones you are going to be able to handle, which ones you won't tackle (and why, etc.). Minimal pairs: pairs of signs that differ only in one phonemic dimension.
2. Think about (multiple) ways to encode those signs, i.e. motion capture (if so, how many sensors, where should they be positioned, how do you handle the hand? (glove), etc.), arm model, etc. Are there existing databases? Or look for papers/labs that did motion capture (or whatever) and we can email them asking for their data!
3. Model: Make sure you incorporate more of the AVITEWRITE discussion to anchor your own model (VITE -> VITEWRITE -> AVITEWRITE -> OZIEL'S MODEL), while also showing what is insufficient in Grossberg's work or what does not fit our purpose:
a. We don't want to give it a trajectory: start with FARS, MNS -> discuss these two models.
b. Problem of the body location of the sign: the trajectory needs to be properly located with respect to some body parts. How do you incorporate that in your model?
4. Break down the model into more precise subparts: give more specs!
