
Conceptual, Communicative and Pragmatic Aspects of Interaction Forms - Rich Interaction Model for Collaborative Virtual Environments

Tony Manninen
Department of Information Processing Science, University of Oulu, FINLAND
tony.manninen@oulu.fi

Abstract
This paper provides a form-oriented description and applicable model of interaction in the context of Collaborative Virtual Environments (CVE). The construction and the main categories of the conceptual interaction form model are described through examples and previous work. The evaluation and validation of the model is illustrated by delineating the research process conducted by the author. Furthermore, the benefits and limitations of the model are discussed in the light of CVE analysis and design.

1. Introduction

The lack of intuitive and non-intrusive non-verbal cues is one of the distinctive features that separates computer-mediated communication settings from face-to-face encounters. One solution to the interaction problems is a CVE design approach that draws on theories and conceptual models of the available interaction forms. The interaction form model, which has been constructed during this research, directs the design to start from the natural areas of interaction. By taking a holistic view of the concept of interaction forms, the model enables designers to take into account all the necessary manifestations and representations of interaction.

The aim of this research is to conceptualise and delineate the mutually perceivable interaction forms available for avatar-based CVEs. Interaction forms are actions that can be perceived as manifestations of the user-user and user-environment interaction. These forms are used to convey the actions of the user to the user and to others. The forms enable awareness of actions by offering mutually perceivable visualisations and auralisations within the virtual environment.

The scope of this work covers the manifestations of interaction. The emphasis is on every action and interaction that can be perceived in CVEs. The mutually perceivable actions and behaviours have been described without tackling the social or cultural aspects of communication, co-ordination and collaboration. The point of interest, thus, is to gain a better understanding of the possibilities and effects of rich interaction in the CVE context.

2. Theoretical background and related work

The forms of interaction have been described and discussed in the existing communication literature, for example, under topics such as non-verbal communication channels and communication codes [2, 8, 10]. The focus and starting point of these models is the face-to-face interaction between humans who share the same physical place. Within the context of computer-supported co-operative work (CSCW), the focus of research is not limited to specific communication channels. One area of interaction research relates to embodied actions, which cover the movements and actions of participants who interact with each other and with their environment [16]. However, the interaction forms described in this paper exist only in virtual environments, and thus actions occurring in the physical world are outside the scope of this paper.

The concept of rich interaction is not merely a quantitative measure describing the number of available interaction forms [8]. Nevertheless, a basic set of interaction form categories helps CVE designers to consider all the necessary areas of action representation. Rich interaction can be enabled by a set of interaction forms that is large, flexible and focused on the content. Contextual and communicative support for interaction is essential in providing users with meaningful ways to express themselves and their actions. The richness itself is, in the end, achieved by users who are able to exploit the available interaction forms in an intuitive and non-deterministic style.

Related research on interaction forms covers a wide area, ranging from CVE design to communication and collaboration support. This section outlines some of the previous research that is closely related to the line of work described in this paper.


Benford et al. [4] have created a spatial model of interaction which provides a basic set of abstractions for managing interactions in a wide range of spatial systems. The authors argue that the spatial model provides a novel and powerful set of abstractions for managing interactions in a variety of large-scale virtual spaces. However, the model does not imply any specific interaction forms that could be used, for example, to represent and execute the participants' aura, focus and nimbus.

Robertson [16] has constructed a taxonomy of embodied actions, which consists of individual actions (in relation to physical objects, other bodies, and the physical workspace) and group activities constituted by individual embodied actions. While Robertson's taxonomy does not contradict the model presented in this paper, the level of abstraction is different: Robertson delineates actions that describe the incentive of the participant, and such actions can, in turn, be presented with various interaction forms. Additional descriptions of modes and types of interaction have been presented, for example, in the area of autonomous agents [9]. The non-verbal communication aspects of CVEs have been studied, for example, in the context of user embodiment [3], communicative behaviour [18], conversational interface agents [6], and realistically expressive avatars [17]. These approaches, however, tend to concentrate on highly specific and limited interaction support.

The approach selected by this author follows, to a certain degree, the lines of the related research. The interaction form model has been constructed in order to obtain a clear overall picture of the concepts related to interaction. The basic theories have been selected from areas analysing physical-world communication in order to preserve continuity at the level of communication. In addition, the relatively mechanistic perspective on interaction (i.e., the focus is on forms instead of functions) is believed to enable user-driven communication, control and collaboration.
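To make the difference in abstraction levels concrete, the following sketch encodes the aura, focus and nimbus abstractions of the spatial model as simple radii around a participant. The class layout, the radii and the min-based awareness rule are illustrative assumptions made for this paper, not part of Benford et al.'s model.

```python
import math
from dataclasses import dataclass

@dataclass
class Participant:
    """Illustrative participant with spatial-model attributes (assumed names)."""
    name: str
    x: float
    y: float
    aura: float    # radius within which interaction becomes possible at all
    focus: float   # radius describing how far this participant attends
    nimbus: float  # radius describing how far this participant projects presence

def distance(a: Participant, b: Participant) -> float:
    return math.hypot(a.x - b.x, a.y - b.y)

def awareness(observer: Participant, observed: Participant) -> float:
    """Toy awareness measure: non-zero only when auras overlap, scaled by how
    deeply 'observed' sits inside the observer's focus and its own nimbus.
    The spatial model leaves this function open; the formula is an assumption."""
    d = distance(observer, observed)
    if d > observer.aura + observed.aura:
        return 0.0
    in_focus = max(0.0, 1.0 - d / observer.focus)
    in_nimbus = max(0.0, 1.0 - d / observed.nimbus)
    return min(in_focus, in_nimbus)

if __name__ == "__main__":
    alice = Participant("alice", 0.0, 0.0, aura=10.0, focus=8.0, nimbus=6.0)
    bob = Participant("bob", 5.0, 0.0, aura=10.0, focus=8.0, nimbus=6.0)
    print(f"awareness(alice -> bob) = {awareness(alice, bob):.2f}")
```

Note that such a sketch yields only an awareness value; it says nothing about what the participants actually see or hear, which is precisely the gap that a form-oriented model aims to address.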

3. Construction of the model


The interaction form model has been constructed by collecting theoretical knowledge (e.g., communication literature) and empirical material (video recordings, interviews, walkthroughs, observations, and heuristic evaluations) from networked games, game events and from self-organised gaming sessions [12]. The main aim of the model is to provide as exhaustive a set of interaction forms as possible. However, due to the high number of available interaction forms, the model is structured into several main categories, which, in turn, contain a number of sub-concepts.

The starting point for the conceptual modelling was the interaction form support offered by existing CVEs. In order to acquire a basic set of currently available interaction forms, a number of contemporary multi-player games were studied, and the material was expanded with heuristic evaluations [18, 19]. The games that were observed include, for example, the 3D multi-player action game Counter-Strike. Additional games that have been studied include action games (e.g., Action Quake, Team Fortress) and role-playing games (e.g., EverQuest, Ultima Online). Some material was also obtained from text-based games and flight simulators.

The next step in constructing the concept model was the categorisation of the different interaction forms. The basic criteria for categorisation were the closeness and relations of the different sub-concepts. For example, all the interaction forms that relate to the appearance of the avatar were grouped together. The construction was based on conceptual analysis and thus offered a reasonably coherent illustration of the corresponding interaction forms. The first versions of the model were constantly refined and revised according to additional evaluations of games and other CVEs.

The theoretical framework that would support the aforementioned preliminary model was applied at a later stage. Before the integration, the preliminary model was used as a basic design guideline in constructing an empirical experiment called Tuppi3D. The experiment was then evaluated using the synthesised model of non-verbal communication channels (hereinafter referred to as NVC) described in the communication literature (cf. [2, 8, 10]). The perceivable interaction forms encountered in the Tuppi3D experiment were analysed on the basis of the NVC model. Manninen and Kujanpää [14] have elaborated on the experiment and the outcome of the analysis.

After the analysis of perceivable non-verbal communication forms in the experiment, the concept model was modified so that the structure of non-verbal communication forms created the backbone of the model. The categories from the preliminary concept model were then added to the base model. The combined model thus covers a wider range of interaction forms. Furthermore, it takes into account the aspects of virtual environments by also describing forms that are not necessarily applicable in physical-world interactions.

4. Description of the concepts in the model


Figure 1 represents the first layers of the decomposition that forms the proposed concept model of interaction forms. The map illustrates the main interaction types that can be found within current multimedia games. The forms have been categorised into 12 classes, each consisting of a number of sub-concepts. The basis for this taxonomy is the categorisation of the various interaction forms in terms of communication channel, context, and acting entities (e.g., body parts, environment, fellow team members, etc.).

[Figure 1. Concept model of interaction forms: a concept map centred on "INTERACTION FORMS IN CVEs", with twelve main categories - Autonomous AI, Avatar Appearance, Chronemics, Environmental Details, Facial Expressions, Kinesics, Language-based Communication, Non-verbal Audio, Occulesics, Olfactics, Physical Contact and Spatial Behaviour - each branching into sub-concepts such as Hair, Clothes, Adornment, Speech, Chat, Paralanguage, Sound Effects, Gestures, Postures, Eye Contact, Proximity & Distance, Emotional Signals, Artefacts and Object Exchange.]
The naming of the sub-concepts has been kept, as far as possible, at the level of interaction forms, without describing the functions that can be executed by using the forms. However, some sub-concepts can be considered styles of interaction. For example, emotional physical contact does not imply any specific interaction forms; it merely directs the thinking towards instances of emotional forms, such as hugging, petting, holding hands, etc. In this respect, the consistency of the model is somewhat compromised. Still, the naming convention aims at providing as clear a tool for analysis and design as possible. The remainder of this section illustrates the forms, or manifestations, of interaction, with a brief description of each of the main categories. The model does not try to replicate physical-world interaction forms: the relation to the physical world is taken into account when applicable, but the potential of virtuality is also harnessed.

The Autonomous AI category includes a set of pre-programmable actions and reactive behaviours that resemble subconscious and intuitive actions in the physical world. Some of these actions can be regulated by the system while others are modifiable by the participants. This group of interaction forms overlaps with most of the other categories. However, the autonomous actions are considered a separate group because of the specific nature of most of the actions and their importance in terms of design.

Avatar appearance defines the attributes of image and presentation of self [8, 10]. Appearance contains the visual aspects of one's presentation. Argyle [2] divides this into two: those aspects under voluntary control - clothes, bodily paint and adornment - and those less controllable - hair, skin, height, weight, etc. The aspects of appearance can thus be thought of as static or dynamic communicational messages, depending on the attribute. The CVE context enables novel ways of utilising appearance as a form of interaction because the physical constraints are not necessarily replicated from the real world. For instance, avatar size and shape can be dynamically altered in order to convey particular messages. The Tuppi3D experiment included only static representations of avatars that conveyed the identities of the players. However, the participants felt that they would have liked more freedom in customising their visual images. A dynamic way to change one's appearance was considered an effective channel for communication.

Chronemics involves the use and perception of time [5]. Masterson [15] describes being punctual versus being late as one illustration of this group. However, there are several other possibilities for using time as a communication tool. For example, pauses can be used to increase anticipation and to make others pay closer attention to one's actions. The interaction analysis of Tuppi3D [14] suggested that chronemical forms emerge when there is a flexible set of interaction forms available to the participants.

Facial expressions may be broken down into the sub-codes of eyebrow position, eye and mouth shape, and nostril size. These, in various combinations, determine the expression of the face, and it is possible to write a 'grammar' of their combinations and meanings [7]. Furthermore, Argyle [2] classifies blushing and perspiration as facial expressions.


Although the face is the most significant channel of non-verbal communication in the physical world, virtual environments tend to diminish the role of the face due to graphics resolution limitations. However, in close encounters and discussion-oriented situations, facial expressions are fully perceivable even in games and CVEs.

Environmental details define the appearance of the surroundings, providing contextual cues [15]. These include artefacts that can be used and manipulated within the environment [5]. Argyle [2] states that moving objects and furniture, leaving markers, and architectural design can be used to communicate through space and place. Mutual affect involves the effect the environment has on the user and vice versa. For example, physical boundaries (e.g., walls), lighting (e.g., shadows), and the matter filling the virtual space (e.g., water) can change the performance of the participants. The dynamics of the setting vary according to the implementation; however, at least restricted destruction and modification of the environment is usually allowed.

Kinesics includes all bodily movement, commonly referred to as body language [5]. Head movements are involved mainly in interaction management, particularly in turn-taking in speech, and can consist of one or several sequential (rapid) nods at various speeds [7]. Posture defines the way of sitting, standing and lying [2]. Gestures involve the hand and arm as the main transmitters, but gestures of the feet and head are also important. They are usually closely co-ordinated with speech and supplement verbal communication. Computer games support kinesics relatively well: there are numerous examples of crawling, walking, running, jumping and waving animations that portray the action of the player.

Language-based communication is the major channel for interpersonal information sharing in most current CVEs. The use of language, or symbols, can be modelled and conveyed textually, aurally, and in the form of images. Text chat and voice-over-IP speech support are examples of channels that support this group of interaction forms. Although language itself is not within the focus of this work, the level of language abstraction and automation is highly relevant. Speech audio with spatial auralisation represents the highest level of 'manual' control, whereas pre-programmed phrases and textual parser support represent a more abstracted and automated utilisation of language.

Non-verbal audio includes the use of the voice in communication, which is often referred to as paralanguage [5]. The non-verbal aspects of speech contain prosodic and paradigmatic codes [7]: the former are linked to speech (e.g., timing, pitch, and loudness), while the latter are independent of it (e.g., personal voice quality and accent, emotion, disturbances). Non-verbal vocalisations [2] are an essential part of communication, as they can significantly change the meaning of a message. In addition to these, CVE systems can contain various sound effects and background music that can be used as manifestations of interaction. Furthermore, purposeful silence is a strong form of interaction, although in networked settings it is easily confused with silence caused by lag (i.e., a communication partner does not mean to be silent, but network lag makes it appear so).

Occulesics are movements of the eyes, e.g., gaze [15]. Eye movement and eye contact depict the focus, direction and duration of the gaze in relation to other participants [7]. Allbeck and Badler [1] use the term visual orientation to differentiate occulesics from spatial behaviour. Argyle [2] describes two groups of variables associated with gaze: amount of gaze (e.g., how long people maintain eye contact) and quality of gaze (e.g., pupil dilation, blink rate, opening of the eyes, etc.). In CVEs, effective use of this category requires modelling of eye movements and sufficiently detailed visuals.

Olfactics refers to the non-verbal communicative effect of one's scents and odours [15]. Perhaps the most common example of this category is the use of perfumes. This group is not widely supported in CVEs because of user-interface limitations. However, there are examples, especially in games, of the successful application of simulated olfactics as an interaction form.

Physical contact reflects the use of touch in communication situations [5]. This category consists of actions such as handshakes and patting [1]. Furthermore, bodily contact stimulates several receptors that are responsive to touch, pressure, warmth or cold, and pain [2]. In the context of CVEs, this group consists of virtual interaction between the avatars of the participants.

Spatial behaviour consists of proximity, orientation, territorial behaviour and movement in a physical setting [2]. Burgoon and Ruffner [5] use the term proxemics for actions relating to the use of personal space. Proximity covers the various actions corresponding to the use of personal space, i.e., how closely we approach someone can give a particular message about our relationship, and different distances usually convey different meanings. Orientation defines the direction in which a person has turned; this code conveys information about our point of interest, or focus [7].
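As a compact illustration of how the taxonomy might be put to use at design time, the sketch below encodes the twelve main categories as a plain dictionary and checks which of them a planned CVE leaves unsupported. The sub-concept lists are abbreviated, and the dictionary layout, function name and example coverage are assumptions made for this sketch rather than an implementation described in the paper.

```python
# Illustrative encoding of the interaction form taxonomy (category names taken
# from Figure 1; the sub-concept lists are abbreviated and the layout is assumed).
INTERACTION_FORMS: dict[str, list[str]] = {
    "Avatar Appearance": ["hair", "physique", "clothes", "adornment", "face & skin", "equipment"],
    "Language-based Communication": ["speech", "chat", "phrases", "sign language"],
    "Non-verbal Audio": ["paralanguage", "sound effects", "music", "silence"],
    "Kinesics": ["postures", "gestures", "body movement", "head movement"],
    "Occulesics": ["eye movement", "eye contact", "visual orientation"],
    "Spatial Behaviour": ["proximity & distance", "orientation", "positions & locations"],
    "Physical Contact": ["tactile", "aggressive", "defensive", "emotional signals"],
    "Facial Expressions": ["eyebrow position", "eye and mouth shape"],
    "Chronemics": ["pauses", "timing", "punctuality"],
    "Olfactics": ["scents", "odours"],
    "Environmental Details": ["artefacts", "setting", "construction", "destruction", "modification"],
    "Autonomous AI": ["auto-reflection", "reactive behaviour", "follow-me actions"],
}

def unsupported_categories(supported: set[str]) -> list[str]:
    """Return main categories for which a CVE design offers no interaction form.
    Intended as a design-time checklist rather than a runtime component."""
    return [c for c in INTERACTION_FORMS if c not in supported]

if __name__ == "__main__":
    # e.g., a typical text-chat CVE prototype (assumed coverage):
    prototype = {"Language-based Communication", "Avatar Appearance", "Kinesics"}
    print("Categories without support:", unsupported_categories(prototype))
```

A designer could extend the sub-concept lists for a particular application domain; the point is merely that a flat, named structure already works as a coverage checklist.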

5. Evaluation of the model


The usefulness of the interaction form model has been evaluated in two ways. First, the model has been used as a framework for analysing existing CVEs. Second, the model has been used as a design guideline in constructing new CVE experiments. The evaluation has been conducted in an iterative manner: the results of each evaluation have been used to refine the model, which, in turn, has then been used as a framework for further analyses.

The preliminary version of the model was used as a framework in the analysis of the interaction forms perceivable in multi-player game sessions [11]. The results of the study indicate the successful application of the model as a tool for structuring the data into coherent and descriptive categories. Furthermore, the model was helpful in pointing out several areas of interaction forms that were not adequately supported by the systems. The model was also analysed and compared against a social theoretical framework [10]. The two approaches to interaction (i.e., the interaction form approach and the higher-level social action view) supported each other well, and the model of interaction forms was successfully mapped as a set of executing instances for the higher-level social activities.

The next phase in the evaluation was the design and development of the Tuppi3D experiment. The interaction form model was used as a design guideline in constructing a CVE that would support rich interaction. A qualitative video analysis was then conducted using the NVC model as the framework. The results indicated that the purely communicational NVC model does not represent a complete set of interaction forms [14]. Based on this, the interaction form model and the NVC model were combined in order to obtain a holistic view of the concept of interaction.

The earlier interaction form analysis of multi-player games was re-evaluated using the final version of the model. The results indicated that the findings support the new model relatively well: there were no interaction forms that did not fit into the framework, and the model additionally illustrated several concepts that were not evident in the current data [13]. The final evaluation of the model was conducted from a process point of view in a research project that focused on value-added service production for mobile platforms. The importance of non-verbal communication in multi-player game environments was successfully brought forward and demonstrated both in theory (i.e., the concept model) and in practice (i.e., the experiment). Furthermore, the rich interaction model and the corresponding design philosophy led to a solution that was, in many ways, superior to a solution designed and developed with a purely technological focus.

The holistic approach to a complex phenomenon such as interaction forms makes the modelling difficult, because the models may grow to be too large. There are hundreds of specific interaction forms that can be applied to the CVE context, and successful illustration of all these concepts may prove to be a relatively difficult task. Still, categorisation of the concepts enables a coherent breakdown of the model into a set of sub-models that can be considered individually. The nature of multi-player games keeps the level of abstraction relatively high, so, for example, the detailed direct manipulation and complex artefact sharing that are common in numerous groupware applications may not be fully supported. Nevertheless, this type of interaction can be illustrated with a set of corresponding interaction forms.

Although the model is useful in analysing and understanding the concept of interaction forms, it is difficult to design and implement CVEs based on the model alone. Implementation generally requires abstraction, which, in turn, forces the designers to consider higher levels of interaction. This may lead to a traditional top-down design approach, in which the available interaction forms are suffocated by the higher-level activities [12]. The usefulness of the model lies in providing a conceptual framework that can be applied as a basis for the analysis, evaluation and design of CVE applications. The model can aid designers in supporting meaningful and useful interaction by providing a holistic view of the applicable interaction forms. Furthermore, the conceptual understanding of the phenomenon helps researchers and practitioners to tackle the issues of interaction design collaboratively, because it provides a common language and terms for the various interaction form categories.

Due to the somewhat mechanistic approach, the model serves well as a basis for lower-level interaction support. Because the focus is on forms rather than functions, it is possible to design a number of interaction forms that support numerous functions. The aim is not to dictate how the participants should exploit the available forms of interaction; instead, the goals and incentives of the users can be well supported by the lower-level manifestations of interaction described by the model. The model is beneficial for CVE designers, as it illustrates the available interaction forms. Thus, it is possible to reduce the limitations and restrictions of computer mediation by supporting more flexible and natural interaction. Although the naturalness and intuitiveness of face-to-face communication is hard to achieve, the rich interaction model can direct designers to additional and novel ways that could enhance the weak areas of interaction.
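To indicate how the model can serve as an analysis framework of the kind described above, the sketch below tallies coded observations from a recorded game session against a subset of the model's categories. The event log, field names and category subset are hypothetical examples, not data from the studies reported here.

```python
from collections import Counter

# Hypothetical excerpt of coded observations from a recorded multi-player
# session; each event has already been assigned a category from the model.
OBSERVED_EVENTS = [
    {"time": "00:12", "actor": "player1", "category": "Kinesics", "form": "crouch"},
    {"time": "00:15", "actor": "player2", "category": "Language-based Communication", "form": "text chat"},
    {"time": "00:31", "actor": "player1", "category": "Spatial Behaviour", "form": "approach teammate"},
    {"time": "00:40", "actor": "player3", "category": "Non-verbal Audio", "form": "radio phrase"},
]

def coverage(events: list[dict], all_categories: list[str]) -> dict[str, int]:
    """Count how often each category of the model occurs in the coded material,
    including categories that never occur (candidate gaps in the CVE's support)."""
    counts = Counter(e["category"] for e in events)
    return {c: counts.get(c, 0) for c in all_categories}

if __name__ == "__main__":
    categories = ["Avatar Appearance", "Kinesics", "Spatial Behaviour",
                  "Non-verbal Audio", "Language-based Communication", "Chronemics"]
    for category, n in coverage(OBSERVED_EVENTS, categories).items():
        print(f"{category:32s} {n}")
```

Categories with a count of zero are candidates for interaction forms that the analysed CVE does not support, which mirrors how the preliminary model was used to point out inadequately supported areas.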

6. Concluding remarks
The concept model of rich interaction forms illustrates the various forms of communication, co-ordination and collaboration in CVEs. A creative combination of interaction forms makes it possible to enhance the overall interaction and further increases the communicative, collaborative and constructive uses of virtual environments. Although the mechanistic approach to interaction modelling may be criticised, interaction-form-oriented analysis and design could be a solution that supports the construction of more communicative and collaborative systems.

Top-down approaches to system design have, to date, been unable to solve the problems of computer-mediated group activities. The proposed bottom-up approach, however, could be used as a design guideline that helps the implementation of sufficiently flexible interactions. The interaction form model is significant for CVE designers, as it illustrates the possible representations of, for example, non-verbal communication in networked settings. Thus, it is possible to reduce the limitations and restrictions of computer mediation by enabling more flexible and natural interaction. When designers know the possible means of interaction, they can direct their effort to the ones that best suit the application domain. Furthermore, the basic set of interaction forms given by the interaction model enables designers to apply other design approaches, such as artistic selectivity, to enrich the interaction.

7. References
[1] Allbeck, J. M., and Badler, N. I. (2001). Consistent Communication with Control. Workshop on Non-Verbal and Verbal Communicative Acts to Achieve Contextual Embodied Agents, Autonomous Agents 2001.
[2] Argyle, M. (1975). Bodily Communication. International Universities Press, New York.
[3] Benford, S., Bowers, J., Fahlén, L. E., Greenhalgh, C., and Snowdon, D. (1997). Embodiments, Avatars, Clones and Agents for Multi-User, Multi-sensory Virtual Worlds. Multimedia Systems, 5(2), 93-104.
[4] Benford, S., and Mariani, J. (1993). Requirements and Metaphors of Interaction - COMIC Deliverable 4.1. Lancaster University, Lancaster.
[5] Burgoon, M., and Ruffner, M. (1978). Human Communication. Holt, Rinehart and Winston.
[6] Cassell, J. (2000). Embodied Conversational Interface Agents. Communications of the ACM, 43(4), 70-78.
[7] Fiske, J. (1982). Introduction to Communication Studies. Routledge, London.
[8] Laurel, B. (1993). Computers as Theatre. Addison-Wesley.
[9] Maes, P., Darrell, T., Blumberg, B., and Pentland, A. (1997). The ALIVE System: Wireless, Full-body Interaction with Autonomous Agents. Multimedia Systems, 5(2), 105-112.
[10] Manninen, T. (2000). Interaction in Networked Virtual Environments as Communicative Action - Social Theory and Multi-player Games. In Proceedings of the CRIWG Conference, Madeira, Portugal, IEEE Press, 154-157.
[11] Manninen, T. (2001a). Virtual Team Interactions in Networked Multimedia Games - Case: "Counter-Strike" - Multi-player 3D Action Game. Workshop on Presence, Philadelphia, USA.
[12] Manninen, T. (2001b). Rich Interaction in the Context of Networked Virtual Environments - Experiences Gained from the Multi-player Games Domain. In Proceedings of IHM-HCI, Lille, France, Springer-Verlag, 383-398.
[13] Manninen, T. (2002). Interaction Forms in Multiplayer Desktop Virtual Reality Games. In Proceedings of the Virtual Reality International Conference, June 19-21, Laval, France. In press.
[14] Manninen, T., and Kujanpää, T. (2002). Non-Verbal Communication Forms in Multi-player Game Session. In Proceedings of HCI, London, UK. In press.
[15] Masterson, J. (1996). Nonverbal Communication in Text Based Virtual Realities. Master of Arts Thesis, University of Montana.
[16] Robertson, T. (1997). Cooperative Work and Lived Cognition: A Taxonomy of Embodied Actions. European Conference on Computer Supported Cooperative Work, 205-220.
[17] Thalmann, D. (2001). The Role of Virtual Humans in Virtual Environment Technology and Interfaces. In Earnshaw, R., Guedj, R., van Dam, A., and Vince, J. (eds.), Frontiers of Human-Centred Computing, Online Communities and Virtual Environments, Springer-Verlag, London, 27-38.
[18] Vilhjálmsson, H. H., and Cassell, J. (1998). BodyChat: Autonomous Communicative Behaviors in Avatars. Second International Conference on Autonomous Agents, Minneapolis, MN, USA, 269-276.

