
Internet Systems

Chapter 21. Multimedia: Audio, Video, Speech Synthesis and Recognition

The multimedia revolution began on the desktop, with the
widespread availability of CD-ROMs.

Because multimedia depends heavily on bandwidth, we expect desktop
technology to lead Web technology.
Multimedia files can be big.
But streaming audio and video technologies allow clips to begin
playing while the files are still downloading.

Creating audio and video clips to incorporate into a Web page often
requires powerful software.
We focus on using existing audio and video clips.

The BGSOUND Element


The simplest way to add sound to a page is with the BGSOUND
element (in the header).
Key properties:

SRC      the URL of the audio clip to play
LOOP     -1 (default): the clip loops indefinitely
         > 0: the number of times to play the clip
         0 or < -1: play the clip exactly once
BALANCE  between -10000 (only left speaker) and 10000 (only right
         speaker). The default is 0 (balanced).
VOLUME   between -10000 (min) and 0 (max, the default)

These properties can be set by scripting.
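
The LOOP rules above can be sketched in script. This is only an illustration: describeLoop and bgsoundTag are hypothetical helper names, the file name jazz.wav is a placeholder, and the code builds and describes markup rather than playing anything.

```javascript
// Sketch only: interprets a BGSOUND LOOP value using the rules listed above.
// describeLoop is a hypothetical helper, not part of any browser API.
function describeLoop(loop) {
  if (loop === -1) return "loops indefinitely";
  if (loop > 0) return "plays " + loop + " time" + (loop === 1 ? "" : "s");
  return "plays once"; // 0 or any value below -1
}

// Builds the markup for a BGSOUND element as a string.
function bgsoundTag(src, loop) {
  return '<BGSOUND SRC="' + src + '" LOOP="' + loop + '">';
}

// "jazz.wav" is a placeholder file name.
console.log(bgsoundTag("jazz.wav", 2) + " -> " + describeLoop(2));
```

In a real page the tag would be written directly in the header; generating it from script is just a convenient way to show how the value is read.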


The DYNSRC Property of IMG


In an IMG tag, instead of the SRC property, use DYNSRC if the
value is the URL of a video clip.

Other properties to use with DYNSRC:
LOOP   as before
START  one of the events fileopen or mouseover

You should also use the ALT property, whose value is text
displayed if the browser can’t handle the clip.

The EMBED Element


The EMBED element embeds a media clip (audio or video) into a
page.
It lets us display a GUI that gives the user direct control over the
media clip.

Key properties:
SRC the URL of the media file
LOOP true to loop indefinitely; else false for just once
HIDDEN true to hide the GUI; default is false

When the browser encounters an EMBED tag, it plays the specified
clip with the player registered to handle the media type on the
client computer.

If the media clip is a .wav (Windows Wave) file, Internet
Explorer typically uses the Windows Media ActiveX control.


Windows Media Player ActiveX Control


Microsoft ActiveX controls are embedded in Web pages displayed
in Internet Explorer.

Embedding the Windows Media Player ActiveX control in a Web
page gives access to the media formats supported by the
Windows Media Player.

The GUI lets the user
• play, pause, and stop a media clip,
• move quickly forward or backward through the clip, and
• control the volume of audio.

Key parameters in the OBJECT element:

NAME          VALUE
FileName      the URL of the media clip
AutoStart     true if the clip plays when loaded
Loop          true if the clip plays indefinitely
ShowControls  true if the controls are displayed

The values of parameters can be set by scripting.
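
As a rough sketch of how these parameters fit into an OBJECT element, the helper below assembles the markup as a string. The name mediaPlayerObject, the ID player, and the file clip.avi are assumptions for illustration, and the class ID is deliberately left as a placeholder that must be replaced with the one given in Microsoft's documentation.

```javascript
// Sketch only: builds OBJECT markup with one PARAM element per parameter.
// mediaPlayerObject is a hypothetical helper, not a browser API.
function mediaPlayerObject(id, classId, params) {
  const lines = ['<OBJECT ID="' + id + '" CLASSID="' + classId + '">'];
  for (const name of Object.keys(params)) {
    lines.push('  <PARAM NAME="' + name + '" VALUE="' + params[name] + '">');
  }
  lines.push("</OBJECT>");
  return lines.join("\n");
}

// "CLSID:PLACEHOLDER" is not a real class ID; "clip.avi" is a placeholder.
const html = mediaPlayerObject("player", "CLSID:PLACEHOLDER", {
  FileName: "clip.avi",
  AutoStart: true,
  Loop: false,
  ShowControls: true,
});
console.log(html);
```

Keeping the parameters in one object makes it easy to see which of the four values a page sets at load time.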


Microsoft Agent
Microsoft Agent is a technology for interactive animated
characters in a Windows application or Web page.

The Microsoft Agent ActiveX control gives access to four
predefined characters:
Peedy the Parrot
Genie
Merlin
Robby the Robot

These characters allow users to interact with a page in natural ways
(including speech).

The control accepts both mouse and keyboard interaction.
It generates speech if a compatible text-to-speech engine is
installed.
It recognizes speech if a compatible speech recognition engine is
installed.

You can create your own characters with
• Microsoft Agent Character Editor and
• Microsoft Linguistic Sound Editing Tool
Both are downloadable from the Microsoft Agent Web site.

We’ll also look at the following ActiveX controls:
Lernout and Hauspie TruVoice text-to-speech (TTS) engine
Microsoft Speech Recognition Engine

See the references in the text to Microsoft’s downloads and
documentation.


The OBJECT elements for all three of these ActiveX controls are
in the page header.

The CODEBASE property of all three OBJECT elements
specifies the version of the control to download.

Typically, no parameters are given in the OBJECT element.

The Characters collection of an Agent object is accessed with
agent_name.Characters
where agent_name is the ID of the Microsoft Agent object.

To load the character information for one of the characters from
the Microsoft Web site, use the Load method for the collection:
agent_name.Characters.Load( character_name, url );
where
character_name is the name of a character (e.g., “peedy”) and
url is the URL for the character information.

The Character method of the Characters collection takes as its
argument the name of a character (e.g., “peedy”) and returns a
reference (a “character”) to the Agent object associated with
this character by the Load method.

For example, if the Agent ID is agent, then
parrot = agent.Characters.Character( "peedy" );
assigns to parrot a reference to the agent object that
represents the Peedy the Parrot character.


Where agent_ref is a reference to an agent object (a character),
agent_ref.Get( behavior_type, behavior_element );
downloads specific behavior information (behavior_element) for
a type of behavior (behavior_type).
For example, for type “state”, some of the elements are
“Showing”: the behavior when the character is first displayed
“Speaking”: the behavior when the character is speaking
“Hiding”: the behavior when the character disappears
For type “animation”, some of the elements are
“Greet”
“MoveUp”
“GetAttention”
Animation behavior elements are activated with the Play method.

Example:
parrot.Get( "state", "Showing" );
parrot.Get( "state", "Speaking" );
parrot.Get( "animation", "Greet" );
// Display the Showing behavior:
parrot.Show();
parrot.Play( "Greet" );
// Display the Speaking behavior:
parrot.Speak( "Hello!" );
parrot.Play( "GreetReturn" );
The GreetReturn behavior is downloaded with the Greet
behavior.
The Speak() method makes use of the TTS object.
There is also a MoveTo( x, y ) method for a character.
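
The Agent control itself runs only inside Internet Explorer, so the call sequence above cannot be tried elsewhere. The sketch below uses a stand-in object that simply records each request in order, which may help in following what the example asks the character to do. makeMockCharacter is a hypothetical name, and the mock does no animation or speech.

```javascript
// Sketch only: a stand-in for a character reference that records requests.
// It is NOT the Microsoft Agent control; it just logs what was asked.
function makeMockCharacter(name) {
  const requested = []; // behaviors asked for with Get
  const queue = [];     // actions in the order they were requested
  return {
    name: name,
    requested: requested,
    queue: queue,
    Get: function (type, element) { requested.push(type + "/" + element); },
    Show: function () { queue.push("Show"); },
    Play: function (animation) { queue.push("Play " + animation); },
    Speak: function (text) { queue.push('Speak "' + text + '"'); },
    MoveTo: function (x, y) { queue.push("MoveTo " + x + "," + y); },
  };
}

const parrot = makeMockCharacter("peedy");
parrot.Get("state", "Showing");
parrot.Get("state", "Speaking");
parrot.Get("animation", "Greet");
parrot.Show();
parrot.Play("Greet");
parrot.Speak("Hello!");
parrot.Play("GreetReturn");
console.log(parrot.queue.join(" -> "));
```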


Some of the tags that can be inserted into the text string that’s spoken:
\Pau=n\ Pause for n milliseconds.
\Pit=n\ Set the pitch to n hertz, 50 ≤ n ≤ 400.
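
A small sketch of composing a spoken string with these two tags follows. The helper names pause and pitch are hypothetical, the range check mirrors the 50 to 400 hertz limit stated above, and the tags are emitted without spaces around the equals sign, which is an assumption of this sketch.

```javascript
// Sketch only: helpers that produce the pause and pitch tags as text.
// Note that a backslash must be doubled inside a JavaScript string literal.
function pause(ms) {
  if (!Number.isInteger(ms) || ms < 0) {
    throw new RangeError("pause must be a non-negative whole number of ms");
  }
  return "\\Pau=" + ms + "\\";
}

function pitch(hz) {
  if (!Number.isInteger(hz) || hz < 50 || hz > 400) {
    throw new RangeError("pitch must be a whole number from 50 to 400 Hz");
  }
  return "\\Pit=" + hz + "\\";
}

// The resulting string is what would be passed to Speak().
const text = "Hello!" + pause(500) + pitch(200) + "How can I help?";
console.log(text);
```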

For speech recognition, the voice commands that can be used to
interact with the Agent object must be registered in the
character’s Commands collection:
agent_ref.Commands.Add( cmd_name, display_string,
                        recognition_string, enabled_flag, display_flag );
where
cmd_name is the name used in scripting for the command,
display_string is displayed in a pop-up menu when the
character or the Agent taskbar is right-clicked,
recognition_string is the string of words recognized as the
command,
enabled_flag is true if this string of words is currently a
candidate for recognition, and
display_flag is true if the command’s name is listed in the
character’s pop-up menu.
In the recognition string, optional words are placed in square
brackets ([ ]).

Example:
parrot.Commands.Add( "order",
                     "Order a widget",
                     "Order [a widget]", true, true );
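
To make the bracket notation concrete, the sketch below lists the phrases that a recognition string with optional words stands for. The real matching is done by the speech recognition engine; expandOptional is a hypothetical helper that only enumerates the alternatives and does not handle nested brackets.

```javascript
// Sketch only: lists the phrases implied by [optional] groups.
function expandOptional(pattern) {
  // Split into literal text and bracketed groups, keeping the groups.
  const parts = pattern.split(/(\[[^\]]*\])/).filter(function (p) {
    return p !== "";
  });
  let results = [""];
  for (const part of parts) {
    if (part.charAt(0) === "[") {
      const inner = part.slice(1, -1);
      // Each phrase so far continues both without and with the group.
      results = results.flatMap(function (r) { return [r, r + inner]; });
    } else {
      results = results.map(function (r) { return r + part; });
    }
  }
  // Collapse leftover spaces where an optional group was skipped.
  return results.map(function (r) { return r.replace(/\s+/g, " ").trim(); });
}

console.log(expandOptional("Order [a widget]"));
```

For the example above, this yields the two phrases "Order" and "Order a widget".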

When the Scroll Lock key is pressed, a small rectangular area
appears below the character, eventually announcing that it is
listening for a command.


Some properties of the Commands collection:
Caption  the text appearing below the character
Voice    the text appearing with the list of commands when the
         taskbar is right-clicked
Visible  if true, the commands appear in the pop-up menu


Example:
Suppose we have
parrot.Commands.Caption = "Ordering information";
When the Scroll Lock key is pressed, the area below Peedy contains
-- Peedy is preparing to listen --
Please wait to speak.
This changes to
-- Peedy is listening --
for “Ordering information” commands
After the user says (for example) “Order a widget”, the following
(hopefully) appears:
-- Peedy is not listening --
Heard “Ordering information”

When a voice command is received, the Agent control’s Command
event fires with the name of the command as a parameter.
In the example above, the event would fire as
Command( order ).
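
One common way to respond to such an event is a lookup table from command names to actions. The sketch below assumes the handler receives the command name directly, as described above. How the handler is attached to the Agent control is not shown here, and onCommand and the handlers table are hypothetical names.

```javascript
// Sketch only: dispatches on the command name registered with Commands.Add.
const handlers = {
  order: function () { return "Starting a widget order"; },
};

function onCommand(name) {
  // Only look at names defined in the table itself, not inherited ones.
  if (Object.prototype.hasOwnProperty.call(handlers, name)) {
    return handlers[name]();
  }
  return "Unrecognized command: " + name;
}

console.log(onCommand("order"));
```

New commands are then supported by adding one entry to the table alongside the matching Commands.Add call.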

Some other methods for a character (i.e., agent reference):
Activate: Make this the currently active character when
multiple characters appear.
Interrupt: Interrupt the current animation and display the
next animation in the queue of animations for this character.
StopAll: Stop all animations of a specified element for this
character.
Example:
parrot.Activate();


RealPlayer ActiveX Control


RealPlayer supports streaming audio (e.g., radio stations) and
video.

To embed a RealPlayer object in a Web page, use an EMBED
element with attributes
ID
SRC            the URL of the source
WIDTH, HEIGHT  the dimensions of the control
AUTOSTART      true or false, as before
CONTROLS       which controls are available;
               Default gives the standard set
TYPE           the MIME type of the embedded file; for audio, this is
               audio/x-pn-realaudio-plugin

Some of these parameters can be set by scripting.

Where rp_object is a RealPlayer object and url is an appropriate URL:
rp_object.SetSource( url ) sets the source URL of the
audio or video stream.
rp_object.DoPlayPause() toggles between pausing and
playing the stream.
Call it to start playback after the source has been set by scripting.
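
The set-source-then-toggle pattern can be sketched with a stand-in object, since the real control is available only when RealPlayer is embedded in a page. makeMockRealPlayer and tune are hypothetical names, the URL is a placeholder, and the assumption that the mock is paused after SetSource belongs to this sketch rather than to the documented control.

```javascript
// Sketch only: a stand-in with the two methods named above.
function makeMockRealPlayer() {
  return {
    source: null,
    playing: false,
    // Mock assumption: a newly set source starts out paused.
    SetSource: function (url) { this.source = url; this.playing = false; },
    DoPlayPause: function () {
      if (this.source === null) return; // nothing to play yet
      this.playing = !this.playing;
    },
  };
}

// Switch to a new stream and start it playing.
function tune(player, url) {
  player.SetSource(url);
  player.DoPlayPause();
}

const rp = makeMockRealPlayer();
tune(rp, "http://example.com/station.ram"); // placeholder URL
console.log(rp.source, rp.playing);
```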


Embedding VRML in a Web Page


VRML (Virtual Reality Modeling Language) is a markup language
for specifying objects and scenes.
It’s purely text and (like HTML) can be created with a text editor
(e.g., Notepad).
Many 3D modeling programs can save 3D designs in VRML format.

A world is a VRML file (extension .wrl).

Both Netscape and Internet Explorer have free, downloadable
plug-ins for viewing worlds.

In a Web page, use an EMBED element with attributes SRC,
WIDTH, HEIGHT.

The object rendering has controls that allow you (using the mouse)
to change your perspective and to walk around a scene.
