Erin A. Carroll
HCILab
University of North
Carolina at Charlotte
Charlotte, NC, USA
e.carroll@uncc.edu
Danielle Lottridge
Department of
Communication
Stanford University
Stanford, CA, USA
lottridg@stanford.edu
ABSTRACT
This enables us to generate graphs that provide insights into the experiences of dancers and audience members. The initial question that arises is how choreographers and theater directors in the performing arts would use quantitative audience engagement information if it were available to them. This is a complex question because there are many facets of engagement, many possible ways to measure engagement, and many possible ways to interpret and use that information once collected. The deeper cultural and moral questions are: should we collect quantitative audience engagement data? What would it mean to performing arts practitioners if they could see exactly how their audiences responded?
ACM Classification Keywords
H.5.2 Information Interfaces and Presentation: User Interfaces - User-centered Design; J.5 Computer Applications: Arts and Humanities - Performing Arts
General Terms
Human Factors
INTRODUCTION
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, or
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
CHI 2011, May 7-12, 2011, Vancouver, BC, Canada.
Copyright 2011 ACM 978-1-4503-0267-8/11/05...$10.00.
biometric response as measured through skin conductance. Our results show that skin conductance is a reliable indicator of
emotional response to a performance, which is what our performing arts experts are most interested in.
The first part of this paper is a conceptual exploration of audience engagement: what does it mean? Should technology mediate audience engagement, and if so, how should that work? We draw on literature from the performing arts, humanities, market research, psychology, psychophysiology,
HCI and design. In the second part of this paper, we describe our exploratory study and the insights we gained from
having performing arts experts work with sample biometric
audience engagement data. In the third part of this paper, we
detail an empirical study that explores the meaning of skin
conductance data collected during a performance. Finally,
we discuss how these different theoretical, exploratory and
empirical results combine to form a clear picture of the possibilities in using biometric audience response.
AUDIENCE ENGAGEMENT
Radbourne et al. have considered measures of audience experience in performing arts and also emphasize the importance of engagement [27]. As they point out, there are many
possible ways to consider audience engagement in the performing arts. Audience engagement can act as an evaluation tool to help us understand how a performance is received. We can investigate various ways to increase audience engagement by involving the audience in the production. While there is a range of audience engagement possibilities, in this work we focus on measuring audience engagement as a process tool for performing arts experts.
Defining Engagement
Temporal Art
Before continuing, it is worth noting the scope of our explorations. There are many temporal arts that could be investigated, including music, cinema, theater, dance, comedy,
television, and others that are less neatly categorized. Our
current work focuses primarily on dance, and secondarily on
theater, and so in this discussion they will be the domains
we refer to. We believe our ongoing explorations of audience engagement with dance and theater may apply to other
live performances such as music, comedy and even cinema.
We can also look to the neurophysiology literature to better understand affect and the distinctions between valence
and arousal. William James began the early accounts of how
bodily reactions created emotional experience [14]. Today,
neurophysiologists agree that emotions have a neurophysiological basis, precede conscious explanation and can exist
outside of consciousness [19]. Zajonc describes how affective reactions are difficult to verbalize [35]. Lang describes
emotions as dispositions, or states of readiness, that function to prepare and facilitate interaction with the environment [18]. When participants rated pictures, Lang found that
pleasantness ratings, heart rate, and facial muscles tended to
load onto one factor (valence) and interest ratings and skin
conductance tended to load on another (arousal).
during a performance, but we need to be careful about confounding two factors: how the performance makes a person
feel versus how much they like the performance. Given this
confound, measuring valence is considerably more complicated than measuring arousal.
Given the previous work in affect, we can state that emotional engagement is a complex phenomenon that involves both valence and arousal. For the purposes of dance and theater performance, we focus on arousal, which we consider to be the most relevant component of engagement.
Audience members already give explicit feedback to performers, and performance artists talk about how they feed
off the energy of their audiences. However, the bandwidth
of this communication is limited and noisy. The most common measure of audience engagement and appreciation, final applause, is temporally offset from the performance itself, and prohibits that information from feeding back into
the performance. More explicit methods for measuring audience engagement include post-performance surveys, focus groups and audience interviews. A complicating factor in post-performance questionnaires is the peak-end effect [15], which shows that a measure of emotional experience taken immediately after an experience is strongly influenced by the peak emotion and by the emotion experienced
at the tail end of the experience.
Figure 1. Screenshot during a session with C2. The lines topped with
blue circles are segmenting lines C2 added to define semantic chunks.
Between these lines the pink fill indicates the aggregate response over
that segment. C2 preferred to see individualized response lines, rather
than a single response line.
as measured through GSR. In other words, while the participant watched the performance, they were able to view
engagement levels that corresponded to every second of the
performance. Engagement levels were displayed below the
video in a line graph (See Figure 1). The experts were able to
interact with the data by clicking on the line graph or by sliding the seeker bar. The video player also featured a built-in
ambient display in the form of a video border which changed
color saturation in response to the current aggregate arousal
level.
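The ambient border can be thought of as a direct mapping from the current aggregate arousal level to the color saturation of the video frame. A minimal sketch of one such mapping, assuming a linear relation between arousal in [0, 1] and HSV saturation at a fixed hue (both are illustrative assumptions, not details of our player):

```python
import colorsys

def border_color(arousal, hue=0.6):
    """Map an aggregate arousal level in [0, 1] to an RGB border color.

    Hypothetical mapping: the player changed the border's color saturation
    with aggregate arousal, so we assume arousal drives HSV saturation
    linearly at a fixed hue, with full brightness.
    """
    arousal = max(0.0, min(1.0, arousal))            # clamp out-of-range readings
    r, g, b = colorsys.hsv_to_rgb(hue, arousal, 1.0)  # saturation tracks arousal
    return (round(r * 255), round(g * 255), round(b * 255))
```

At zero arousal the border washes out to white; at full arousal it reaches the fully saturated hue, so shifts in the audience signal are visible in peripheral vision without reading the graph.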
EXPLORATORY STUDY
Many of the technologies that could be used to measure audience engagement data are expensive, complicated, or invasive. Before anyone goes to the expense and effort
of instrumenting an audience, it seems reasonable to explore
how performing arts experts would construe audience engagement data. To this end, we developed an exploratory
study using a small amount of audience engagement data and
two different performance videos. The study is not meant to
perfectly replicate live performance measurement or usage,
but rather to present reasonable sample data to these experts and to gauge their reactions to it. The study used
individual arousal data collected through GSR sensors.
Procedure
The data doesn't necessarily provide clear direction to performing arts experts in all cases. T4 clearly understood the dangers of a stimulus-response approach to using the data and was adamantly against using it to make second-by-second adjustments to increase arousal level, stating:
I could literally make a play where people are talking
in whispers and screaming every other sentence - technically, vocally manipulate them... to keep the response
constantly on an up level.
A theme that strongly emerged during the think aloud sessions was related to the issue of arousal versus valence. The
data that our experts were shown was arousal, but they were
asked if they would also be interested in valence data. Our
participants were either not interested in valence at all because they considered it purely subjective, or they were only
interested in getting valence if they could also get causal explanations. C2 explained:
All of our experts felt that valence is limited because reactions to dance and theater are subjective. T3 commented:
...so I'm saying that as designers and directors are trying to explore different things, it [audience engagement data] becomes very useful. It doesn't become useful for anyone in terms of "Did they enjoy my work?" because that's so subjective and limiting.
When asked specifically if he would be interested in seeing
like/dislike data, T3 replied:
Application to Practice
To study the meaning of biometric audience response, we recruited 49 participants (18 male, 31 female, all students) to
watch a video of an 11-minute dance performance. Ten participants were from fine arts, the rest were from other varied
disciplines. Since watching a video of a dance performance is different from the experience of attending a live dance concert, we made environmental decisions to increase the participants' sense of immersion: projecting the dance onto a 60-inch projector screen, having participants wear headsets to listen to the soundtrack, and showing the video in a dimly lit, temperature-controlled room.
Finally, four of our experts commented on the value of gathering audience engagement data for educational and training
purposes. T2 was interested in using the data to train actors:
But with him [referring to a point of high audience arousal while an actor is on stage], I could be like, "That is your moment - now what are you doing there?" Like, really working on that specific moment where the audience was a little more interested in him...
Each participant wore Thought Technology GSR fingerwraps on two fingers of their non-dominant hand, leaving their dominant hand available to rate their engagement
with the performance using a physical slider. The biggest
methodological challenge was determining the best way to
have participants report their explicit, conscious responses
to the dance performance. We wanted to give users a simple
physical slider that could be used mostly eyes-free. We experimented with a number of different scales and labels to try
to capture the concept of engagement with words that participants would easily relate to. During pilot studies, we found
that simply labeling the slider with "No Engagement" and "High Engagement" was confusing. Participants could not detach valence from the word, and tended to rate themselves as engaged only when they liked what they saw. Others just didn't really seem to know what we meant by engagement, and still others didn't seem to know how engaged they were.
Given this, we had to investigate alternative vocabularies to
help users report their engagement levels.
The experiment ran as a between-subjects study, with participants randomly assigned to one of the two groups. Each participant was run individually and thus we did not try to capture any of the social effects that would occur in a true performance setting. We instructed the participants that while
watching the dance video, they should use their dominant
hand to rate the performance using the slider, since their non-dominant hand was attached to the GSR sensor. Prior to the
start of the dance performance, we captured three minutes of
GSR baseline while the participants sat alone in a dimly lit
room. After the baseline period, our software launched the
dance performance. Participants were told:
Please change the location of the slider as often as
necessary to reflect changes in your feelings toward the
video. You should move the slider as often as your feelings change: this could be every few seconds, or if your
feelings remain constant, there may be periods where
the slider stays at the same location, to reflect that.
GSR samples and slider ratings were taken every 500ms, and
our study concluded with 49 participants' files consisting of
11 minutes worth of GSR data and slider ratings. After collecting the data, we applied a three-second moving average
filter to both the GSR data and the self-report slider data.
We smoothed the GSR data using this filter because GSR responses lag stimulus by 1-5 seconds [6]. We also smoothed
the self-report responses to account for different slider reaction times between participants.
Results
Hypotheses
Engagement is not particularly well-defined, and as mentioned previously, determining the right questions to ask of
our participants in order to elicit their engagement level was
difficult. We chose the emotional reaction labels because the
literature suggests that the arousal level measured by GSR
is most related to the intensity of emotional reactions. Thus,
we anticipated a positive correlation between GSR readings
and the ER slider values:
Participants indicated a low reaction when they were confused: "I think the reason I went down was because of the talking. I was confused why there was talking during the performance... I couldn't understand where it came into the performance." A few participants seemed to have a difficult time using the emotional reaction scale. For example, one person said, "I think that I stayed at the halfway point because I didn't know what to feel." In the Emotional Reaction group, we also had one person move their slider only twice (this did not happen in the LH group): "I didn't experience any emotions... I wasn't really bored but at the same time, I wasn't really interested." Comprehension is not well captured by the emotional reaction slider. Participants in the LH group could lower their love-hate rating if something confused them, but the mapping to the ER scale is less clear.
Qualitative Results
In addition to correlating explicit self-report ratings to autonomic biometric readings, we had a third probe into the participants' response to the performance, which came from the
post-hoc interviews in which we showed participants their
self-report data while watching the video and asked them to
explain their ratings. Themes emerged that elucidate the aspects of the performance that generated conscious responses.
Love/Hate Group
Participants that used the LH scale were able to list very specific aspects of the dance performance that they liked or did
not like based on the dance movements and the sound score.
Almost all participants agreed that they liked the faster, bigger movements, as opposed to the slower movements: "These movements were just not very exciting. It was kind of slow." In addition to tempo, they also commented on specific movements: "I liked the diamond shape of the legs. The legs in the air... they were moving, kind of like swimming. I liked that." "Right here, I think it's pretty cool what they're doing..." While there was a lot of agreement between participants on these movements, there was a certain dancer in the piece whose movements were very different from those of the other dancers. Some people loved her movements, while others really disliked them. For example, "I liked watching her since she was doing something completely different..." versus "I didn't like that person crawling on the ground..."
DISCUSSION
People also commented on how they liked when the music became more uplifting or happy, which coincided with the more "dancy" movements that people also described liking. We also had one person who indicated that he liked a certain part of the dance because, he said, "The music reminded me of some songs that I know." In this particular dance, there
was a segment of the score that consisted of a voice speak-
The experts were reflective in their interpretations and understood that the data was a noisy signal and that different
factors could be affecting each individuals readings.
The positive response from performing arts experts and their
Further, we plan to study audience biometric responses during live performances to examine patterns in live performance settings. Finally, our work represents a fundamental
step in understanding biometrics as related to audience engagement. Our future work will continue to build on that understanding through several different facets to create a richer
picture of audience experience and engagement.
The fact that the absolute value of participants' LH slider ratings correlated so strongly with their GSR readings is important for three reasons. First, if other researchers wish to have
participants self-report on engagement, this scale was easier
for participants to use, and once transformed, is a valid measure of engagement. Second, if there was interest in building
audience response devices to gather both implicit biometric
responses and explicit self-report responses, the LH scale
would be better to use because participants have an easier
time using it. Third, despite our earlier finding that performing arts experts are not interested in valence, C2 confided:
ACKNOWLEDGEMENTS
This work was funded by an NSF CreativeIT grant (#IIS-0855882). We thank our study participants and members
of the HCILab for their feedback. We would also like to
acknowledge DREU students Charlotte Smail and Millicent
Walsh for their assistance during the summer of 2010.
This suggests that despite our experts' stated lack of interest in valence, there may be curiosity. Since the LH slider allows us to collect valence and a representation of engagement, it gives us more information with seemingly less cognitive effort expended by users. A system that allowed collection of both autonomic and explicit self-report data could
combine the two signals to cancel noise.
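The transform behind this result, folding love-hate ratings about the scale midpoint to recover an intensity signal, and the subsequent correlation against GSR can be sketched as follows. The [0, 1] rating scale, 0.5 midpoint, and aligned sampling of the two series are assumptions for illustration, not details of our apparatus:

```python
from statistics import mean

def lh_to_arousal(lh_ratings, midpoint=0.5):
    """Fold love-hate ratings about the scale midpoint.

    Strong love and strong hate both map to high values, giving an
    intensity (arousal-like) signal; the 0-1 scale and 0.5 midpoint
    are assumed encodings for this sketch.
    """
    return [abs(r - midpoint) for r in lh_ratings]

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length, time-aligned series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)
```

Correlating `lh_to_arousal(slider_series)` against the smoothed GSR series for each participant yields the per-participant coefficients summarized in our results.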
We have presented a set of theoretical, exploratory and empirical results on the collection and use of temporal biometric audience response data in an effort to further understand
audience engagement in the performing arts. We conclude
that performing arts experts are interested in the autonomic
reactions of their audience members collected throughout
performances and that they are carefully reflective in interpreting such data. We showed a strong correlation between
participants' explicit ratings of level of emotional reaction
and their autonomic GSR responses. We also showed a
strong correlation between the absolute value of love-hate ratings of a performance and participants' GSR responses.
Our results support the validation of temporal GSR data as a
reflection of audience engagement.