A Brief History of Motion Capture for Computer Character Animation

David J. Sturman
104, av. du Président Kennedy
75016 Paris France
Reference: "Character Motion Systems", SIGGRAPH 94: Course 9
The use of motion capture for computer character animation is relatively new, having begun in
the late 1970's, and only now beginning to become widespread.
Motion capture is the recording of human body movement (or other movement) for immediate or
delayed analysis and playback. The information captured can be as general as the simple position
of the body in space or as complex as the deformations of the face and muscle masses. Motion
capture for computer character animation involves the mapping of human motion onto the
motion of a computer character. The mapping can be direct, such as human arm motion
controlling a character's arm motion, or indirect, such as human hand and finger patterns
controlling a character's skin color or emotional state.
The idea of copying human motion for animated characters is, of course, not new. To get
convincing motion for the human characters in Snow White, Disney studios traced animation
over film footage of live actors playing out the scenes. This method, called rotoscoping, has
been successfully used for human characters ever since. In the late 1970's, when it began to be
feasible to animate characters by computer, animators adapted traditional techniques, including
rotoscoping. At the New York Institute of Technology Computer Graphics Lab, Rebecca Allen
used a half-silvered mirror to superimpose videotapes of real dancers onto the computer screen to
pose a computer generated dancer for Twyla Tharp's "The Catherine Wheel." The computer used
these poses as keys for generating a smooth animation. Rotoscoping is by no means an automatic
process, and the complexity of human motion required for "The Catherine Wheel," necessitated
the setting of keys every few frames. As such, rotoscoping can be thought of as a primitive form
of, or precursor to, motion capture, where the motion is "captured" painstakingly by hand.
1980-1983: Simon Fraser University Goniometers
Around this same time, biomechanics labs were beginning to use computers to analyze human
motion. Techniques and devices used in these studies began to make their way into the computer
graphics community. In the early 1980's, Tom Calvert, a professor of kinesiology and computer
science at Simon Fraser University, attached potentiometers to a body and used the output to
drive computer animated figures for choreographic studies and clinical assessment of movement
abnormalities. To track knee flexion, for instance, they strapped a sort of exoskeleton to each leg,
positioning a potentiometer alongside each knee so as to bend in concert with the knee. The
analog output was then converted to a digital form and fed to the computer animation system.
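The conversion from digitized potentiometer output to a joint angle can be sketched as a simple linear calibration. This is only an illustration under stated assumptions, not the Simon Fraser implementation: the function name adc_to_angle, the 10-bit reading range, and the two calibration readings are all hypothetical.

```python
def adc_to_angle(count, count_straight, count_bent, angle_bent_deg=90.0):
    """Linearly map a raw digitized reading to a joint angle in degrees.

    count_straight: reading taken with the leg fully extended (0 degrees)
    count_bent:     reading taken at a known flexion of angle_bent_deg
    All values are hypothetical calibration data, not from the source.
    """
    if count_bent == count_straight:
        raise ValueError("calibration readings must differ")
    fraction = (count - count_straight) / (count_bent - count_straight)
    return fraction * angle_bent_deg


# Assumed readings from a 10-bit converter (0..1023), one per frame.
samples = [200, 350, 500, 650, 800]
angles = [adc_to_angle(s, count_straight=200, count_bent=800) for s in samples]
print(angles)
```

A real potentiometer is rarely perfectly linear over its range, so a production system would probably calibrate at more than two poses.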

Their animation system used the motion capture apparatus together with Labanotation and
kinematic specifications to fully specify character motion.[1]
1982-1983: MIT Graphical Marionette
Soon after that, commercial optical tracking systems such as the Op-Eye and SelSpot systems
began to be used by the computer graphics community. In the early 1980's, both the MIT
Architecture Machine Group and the New York Institute of Technology Computer Graphics Lab
experimented with optical tracking of the human body.
Optical trackers typically use small markers attached to the body (either flashing LEDs or small
reflecting dots) and a series of two or more cameras focused on the performance space. A
combination of special hardware and software picks out the markers in each camera's visual field
and, by comparing the images, calculates the three-dimensional position of each marker through
time.
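One common way to recover a 3-D marker position from two camera views is to cast a ray from each camera through the marker's image and take the midpoint of the shortest segment joining the two rays. The sketch below assumes the camera positions and ray directions are already known from calibration. The function names and the example geometry are hypothetical, and the text does not say which method the early systems actually used.

```python
def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(origin1, dir1, origin2, dir2):
    """Return the midpoint of the shortest segment between two viewing rays.

    Each ray starts at a camera position (origin) and points toward the
    marker as seen by that camera (dir). Directions need not be unit length.
    """
    w = sub(origin1, origin2)
    a, b, c = dot(dir1, dir1), dot(dir1, dir2), dot(dir2, dir2)
    d, e = dot(dir1, w), dot(dir2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; marker depth is undetermined")
    t1 = (b * e - c * d) / denom  # distance parameter along ray 1
    t2 = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = add(origin1, scale(dir1, t1))
    p2 = add(origin2, scale(dir2, t2))
    return scale(add(p1, p2), 0.5)


# Assumed setup: two cameras one unit either side of the origin,
# both seeing a marker located at (0, 0, 2).
marker = triangulate_midpoint((-1.0, 0.0, 0.0), (1.0, 0.0, 2.0),
                              (1.0, 0.0, 0.0), (-1.0, 0.0, 2.0))
print(marker)
```

With noisy measurements the two rays do not intersect exactly, which is why the midpoint is used rather than a true intersection.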
The technology is limited by the speed at which the markers can be examined (thus affecting the
number of positions per second that can be captured), by occlusion of the markers by the body,
and by the resolution of the cameras, specifically their ability to differentiate markers close
together. Early systems could track only a dozen or so markers at a time. More recent systems
can track several dozen at once. Occlusion problems can be overcome by the use of more
cameras, but even so, most current optical systems require manual post-processing to recover
trajectories when a marker is lost from view. This will change as systems become more
sophisticated. The problem of resolution involves a trade-off of many variables, including
camera price, field of view, and space of movement. The more resolution you need, the more the
camera costs. The same camera can give you greater movement resolution if focused on a
smaller field of view, but this limits the size of motions that are possible. Because of these
limitations, almost all the uses of optical tracking systems today rely on post-processing
procedures to analyze, process, and clean up the data before they are applied to the computer character.
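As an illustration of the kind of clean-up involved, the sketch below fills short gaps in a marker trajectory by linear interpolation between the last and next visible samples. This is one simple approach chosen for illustration, not a description of any particular system's post-processing. The function name fill_gaps and the use of None for an occluded sample are assumptions.

```python
def fill_gaps(track):
    """Fill None entries in a list of (x, y, z) samples by linear interpolation.

    Only gaps bounded by visible samples on both sides are filled. Gaps at
    the start or end of the track are left as None.
    """
    filled = list(track)
    last_seen = None  # index of the most recent visible sample
    for i, sample in enumerate(track):
        if sample is None:
            continue
        if last_seen is not None and i - last_seen > 1:
            start, end = track[last_seen], sample
            span = i - last_seen
            for k in range(last_seen + 1, i):
                t = (k - last_seen) / span
                filled[k] = tuple(s + t * (e - s) for s, e in zip(start, end))
        last_seen = i
    return filled


# Assumed data: a marker hidden for three frames, then lost again at the end.
track = [(0.0, 0.0, 1.0), None, None, None, (4.0, 2.0, 1.0), None]
print(fill_gaps(track))
```

Linear interpolation is only plausible for short gaps. Over longer occlusions the true motion can curve, which is one reason manual correction was still needed.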
In 1983, Ginsberg and Maxwell at MIT presented the Graphical Marionette, a system for
"scripting-by-enactment": one scripts an animation by enacting the motions. The system used
an early optical motion capture system called Op-Eye that relied on sequenced LEDs. They
wired a body suit with the LEDs on the joints and other anatomical landmarks. Two cameras
with special photo detectors returned the 2-D position of each LED in their fields of view. The
computer then used the position information from the two cameras to obtain a 3-D world
coordinate for each LED. The system used this information to drive a stick figure for immediate
feedback, and stored the sequence of points for later rendering of a more detailed character. The
slow rate of rendering characters and the expense of the motion capture hardware were the largest
roadblocks to the widespread use of this technology for animation production. Since that time,
however, hardware rendering has sped up considerably, and the methods employed in the
Graphical Marionette project are becoming more commonly used for computer character animation.
1988: deGraf/Wahrman Mike the Talking Head

In 1988, deGraf/Wahrman developed "Mike the Talking Head" for Silicon Graphics to show off
the real-time capabilities of their new 4D machines. Mike was driven by a specially built
controller that allowed a single puppeteer to control many parameters of the character's face,
including mouth, eyes, expression, and head position. The Silicon Graphics hardware provided
real-time interpolation between facial expressions and head geometry as controlled by the
performer. Mike was performed live in that year's SIGGRAPH film and video show. The live
performance clearly demonstrated that the technology was ripe for exploitation in production.
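Interpolation between facial expressions can be sketched as a weighted blend of two stored poses, with the weight driven by a performer's control. The text does not describe how Mike's interpolation was implemented, so the vertex-list representation, the function name blend, and the sample poses below are all assumptions made for illustration.

```python
def blend(pose_a, pose_b, weight):
    """Linearly interpolate two poses given as equal-length lists of (x, y, z).

    weight = 0.0 returns pose_a, weight = 1.0 returns pose_b. Values outside
    that range are clamped, as a physical controller has a limited travel.
    """
    if len(pose_a) != len(pose_b):
        raise ValueError("poses must have the same number of vertices")
    w = min(1.0, max(0.0, weight))
    return [
        tuple(a + w * (b - a) for a, b in zip(va, vb))
        for va, vb in zip(pose_a, pose_b)
    ]


# Hypothetical two-vertex "mouth corner" poses.
neutral = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
smile = [(0.0, 0.5, 0.0), (1.0, 1.0, 0.0)]

half = blend(neutral, smile, 0.5)  # controller at mid-travel
print(half)
```

Because each frame needs only one multiply and add per coordinate, this kind of blend is cheap enough to evaluate at interactive rates.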
1988: Pacific Data Images Waldo C. Graphic
As early as 1985, Jim Henson Productions had been trying to create computer graphics versions
of their characters. They met with limited success, mainly due to the limited capabilities of the
technology at that time. Finally, in 1988, with availability of the Silicon Graphics 4D series
workstation, and with the expertise of Pacific Data Images, they found a viable solution. By
hooking a custom eight degree of freedom input device (a kind of mechanical arm with upper
and lower jaw attachments) through the standard SGI dial box, they were able to control the
position and mouth movements of a low resolution character in real-time. Thus was Waldo C.
Graphic born. Waldo's strength as a computer generated puppet was that he could be controlled
in real-time in concert with real puppets. The computer image was mixed with the video feed of
the camera focused on the real puppets so that everyone could perform together. Afterwards, in
post production, PDI re-rendered Waldo in full resolution, adding a few dynamic elements on top
of the performed motion.[4]
Subsequently PDI developed a light-weight plastic upper-body "exoskeleton" to track the
movements of the upper torso, head, and arms so that actors could control computer characters
by miming their motions. Potentiometers on the plastic frame measure body motion which is
picked up by the computer in real-time. They have used the suit in many projects, although they
have not found it to be the ideal body tracking device due to the noise in the electronics and the
encumbering nature of the exoskeleton.[5]
1989: Kleiser-Walczak Dozo
In 1989, Kleiser-Walczak produced Dozo, a (non-real-time) computer animation of a woman
dancing in front of a microphone while singing a song for a music video. To get realistic human
motion, they decided to use motion capture techniques. Based on experiments in motion capture
from Kleiser's work at Digital Productions and Omnibus (two now-defunct computer animation
production houses), they chose an optically-based solution from Motion Analysis that used
multiple cameras to triangulate the images of small pieces of reflective tape placed on the body.
The resulting output is the 3-D trajectory of each reflector in the space. As was described above,
one of the problems with this kind of system is tracking points as they are occluded from the
cameras. For Dozo, this had to be done as a very time-consuming post-process. Luckily, some
newer systems are beginning to do this in software, significantly speeding up the motion capture process.
1991: Videosystem Mat the Ghost

Having seen the possibility of animating characters by performance techniques in Waldo C.
Graphic, Videosystem, a French video and computer graphics producer, turned the attentions of
its newly formed computer animation division to the problem of computer puppets. The result
was a real-time character animation system whose first success was the daily production of a
character called Mat the Ghost. Mat was a friendly green ghost that interacted with live actors
and puppets on a daily children's show called Canaille Peluche. Using DataGloves, joysticks,
Polhemus trackers, and MIDI drum pedals, puppeteers interactively performed Mat, chroma-keyed
with the previously-shot video of the live actors. Since there was no post-rendering,
animation sequences were generated in the time it took the performers to achieve a good take.
Seven minutes of animation (one week's worth) were normally completed in a day and a half of
performance. Mat appeared on Canaille Peluche every day for over three and a half years.[7]
Videosystem, now known as Medialab, has continued to develop the performance system to the
point where it is a reliable production tool, having produced several hours of production
animation in total, for more than a dozen characters.
Typically, each character is controlled by several puppeteers or actors working in concert. Two
puppeteers control the facial expressions, lipsynch, and special effects such as shape
transformations for Mat the Ghost, or bubbles from the mouth of a fish, and an actor mimes the
upper body motions while wearing a suit with electromagnetic trackers (Polhemus) on the torso,
arms, and head. The finger motions, joystick movements, and so on, of the puppeteers are
transformed into facial expressions and effects of the character, while the motion of the actor is
directly mapped to the character's body.
1992: SimGraphics Mario
SimGraphics has long been in the VR business, having built systems around some of the first
VPL DataGloves in 1987. Around 1992 they developed a facial tracking system they called a
"face waldo." Using mechanical sensors attached to the chin, lips, cheeks, and eyebrows, and
electro-magnetic sensors on the supporting helmet structure, they could track the most important
motions of the face and map them in real-time onto computer puppets. The importance of this
system was that one actor could manipulate all the facial expressions of a character by just
miming the facial expression himself, a perfectly natural interface.
One of the first big successes with the face waldo, and its concomitant VActor animation system,
was the real-time performance of Mario from Nintendo's popular videogame for Nintendo
product announcements and trade shows. Driven by an actor behind the scenes wearing the face
waldo, Mario conversed and joked with audience members, responding to their questions and
comments. Since then, SimGraphics has concentrated on live performance animation, developing
characters for trade shows, television, and other live entertainment.
During the past few years, SimGraphics has been continually updating the technology of the face
waldo, improving reliability and comfort.
1992: Brad deGraf Alive!

After deGraf/Wahrman's Mike the Talking Head, Brad deGraf continued working on his own,
developing a real-time animation system which is now called Alive! For one character performed
with Alive!, deGraf developed a special hand device with five plungers actuated by the
puppeteer's fingers. The device was used to control the facial expressions of a computer-generated
friendly talking spaceship, which, much like Mario, promoted its "parent" company at
trade shows.[8]
DeGraf subsequently joined Colossal Pictures where he used Alive! to animate Moxy, a
computer generated dog who hosts a show for the Cartoon Network. Moxy is performed in real-time
for publicity, but post-rendered for the actual show. The actor's motions are captured by an
electromagnetic tracking system with sensors on the hands, feet, torso, and head of the actor.
1993: Acclaim
At SIGGRAPH '93 Acclaim amazed audiences with a realistic and complex two-character
animation done entirely with motion capture. For the previous several years, Acclaim had quietly
developed a high-performance optical motion tracking system, much like the ones used for the
Graphical Marionette and Dozo, but able to track up to 100 points simultaneously in real-time.
Acclaim mainly uses the system to generate character motion sequences for video games. Their
system is proprietary and they do not plan to market the technology except as a production
Today: Many players using commercial systems
In the past few years, Ascension, Polhemus, SuperFluo, and others have released commercial
motion tracking systems for computer animation. In addition, animation software vendors, such
as SoftImage, have integrated these systems into their products, creating "off-the-shelf"
performance animation systems. Although there are many problems yet to be solved in the field
of human motion capture, the practice is now well ensconced as a viable option for computer
animation production. As the technology develops, there is no doubt that motion capture will
become one of the basic tools of the animator's craft.
[1] T. W. Calvert, J. Chapman and A. Patla, "Aspects of the kinematic simulation of human
movement," IEEE Computer Graphics and Applications, Vol. 2, No. 9, November 1982, pp. 41-50.
[2] Carol M. Ginsberg and Delle Maxwell, "Graphical marionette," Proc. ACM
SIGGRAPH/SIGART Workshop on Motion, ACM Press, New York, April 1983, pp. 172-179.
[3] Barbara Robertson, "Mike, the talking head," Computer Graphics World, July 1988, pp. 15-17.
[4] Graham Walters, "The story of Waldo C. Graphic," Course Notes: 3D Character Animation
by Computer, ACM SIGGRAPH '89, Boston, July 1989, pp. 65-79.

[5] Graham Walters, "Performance animation at PDI," Course Notes: Character Motion Systems,
ACM SIGGRAPH 93, Anaheim, CA, August 1993, pp. 40-53.
[6] Jeff Kleiser, "Character motion systems," Course Notes: Character Motion Systems, ACM
SIGGRAPH 93, Anaheim, CA, August 1993, pp. 33-36.
[7] Herve Tardif, "Character animation in real time," Panel: Applications of Virtual Reality I:
Reports from the Field, ACM SIGGRAPH Panel Proceedings, 1991.
[8] Barbara Robertson, "Moving pictures," Computer Graphics World, Vol. 15, No. 10, October
1992, pp. 38-44.

