Renata Solum
Jayanthi Sasisekaran
SLHS 3305W
March 3, 2009
Introduction
facial expression and body language. This project aims to identify some of the effects of
findings by Banse and Scherer (1996) for the emotions hot anger, happiness, boredom,
and contempt. This research will also expand upon Banse and Scherer to include
Research Questions
dynamic range, sentence duration, LTAS, and jitter absolute reveal about
Predictions
After Banse and Scherer (1996), this experiment made use of a standardized
and thus attempt to isolate the effects of emotion. Their phrase, “Hat sundig pron you
venzy,” resembles normal speech. In the interest of further standardizing the four
emotional exemplars, I placed stress uniformly on the words pron and venzy during
recording. All recordings were of my own voice. I recorded three potential exemplars of
each emotion in the Praat software using a sampling rate of 44,100 Hz. I then randomized
the list of 12 recordings and played them to two listeners, who identified each as one of
the four emotions and rated it on a 1-5 scale in terms of how well it exemplified the
identified emotion. I averaged the ratings and chose the highest-rated exemplar for each
of the four emotions, throwing out one exemplar that the two listeners identified
differently. Having identified the single best exemplar of each emotion (hot anger,
happiness, boredom, and contempt), I used Praat to obtain pitch and intensity contours
and measurements of mean F0, range F0, mean intensity, dynamic range, sentence
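As an illustration only, the exemplar selection described above (average the two listeners' ratings, discard any recording the listeners identified differently, keep the highest-rated recording per emotion) can be sketched in Python. The recording IDs, labels, and ratings below are invented for the example and are not data from this experiment; the function name `select_exemplars` is likewise hypothetical.

```python
from statistics import mean

# Hypothetical listener responses: (recording id, [(label, 1-5 rating), ...]).
# All values are invented for illustration; they are not the study's data.
responses = [
    ("rec01", [("hot anger", 4), ("hot anger", 5)]),
    ("rec02", [("hot anger", 3), ("contempt", 4)]),  # listeners disagree
    ("rec03", [("hot anger", 3), ("hot anger", 4)]),
    ("rec04", [("boredom", 5), ("boredom", 4)]),
    ("rec05", [("boredom", 3), ("boredom", 3)]),
]


def select_exemplars(responses):
    """Return {emotion: (recording id, mean rating)} for the best recording."""
    best = {}
    for rec_id, judgments in responses:
        labels = {label for label, _ in judgments}
        if len(labels) != 1:
            # Discard recordings that the listeners identified differently.
            continue
        emotion = labels.pop()
        avg = mean(rating for _, rating in judgments)
        # Keep the recording with the highest average rating per emotion.
        if emotion not in best or avg > best[emotion][1]:
            best[emotion] = (rec_id, avg)
    return best


selected = select_exemplars(responses)
for emotion, (rec_id, avg) in selected.items():
    print(f"{emotion}: {rec_id} (mean rating {avg})")
```

Ties in the average rating are resolved here in favor of the recording listed first, which is an arbitrary choice made for the sketch.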
Results
Pitch and intensity contours are shown below in Figure 2, with pitch and intensity
contours for each utterance overlaid in their respective pairs for examination of co-
Figure 2: Pitch and Intensity Contours (Pitch in red). Surviving panel labels: Boredom, Contempt
To start with hot anger, it is immediately evident that this exemplar has the highest F0
variability, as I predicted. It also, incidentally, has the highest mean F0; this seems to
reflect the high physiological excitement associated with hot anger, as confirmed by
Banse and Scherer, who found that mean F0 is highest for what they call the most
“intense” emotions. Boredom carries the flattest affect, as predicted, and the little change
there is in the pitch contour does reveal a slight downward slope. Banse and Scherer
found that boredom was associated with lower mean F0, which I did not anticipate, and
which was not evident in my measurements—in fact, I found that my contempt exemplar
had a slightly lower mean F0 than did boredom (see Figure 3, below). Incidentally,
unavailable until about 1.2 sec into the utterance, which might reflect the aperiodic,
Pitch and intensity appear to co-vary for the emotions hot anger, happiness, and
contempt especially, with contempt showing much greater peaks for intensity than for
pitch; this seems intuitive considering the way in which I associate contempt with a
“spitting” of words.
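The co-variation discussed here was judged by eye from the overlaid contours. One possible way to put a number on it, offered only as a sketch, is a Pearson correlation between the pitch and intensity contours over voiced frames. The sketch assumes both contours are sampled at the same time points and that frames with no pitch estimate are marked `None`; the function names and sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def contour_correlation(pitch_hz, intensity_db):
    """Correlate pitch and intensity, skipping frames with no pitch estimate."""
    voiced = [(p, i) for p, i in zip(pitch_hz, intensity_db) if p is not None]
    pitches, intensities = zip(*voiced)
    return pearson(pitches, intensities)


# Invented contours: None marks frames where no pitch could be measured.
pitch = [None, 200.0, 220.0, 240.0, None, 260.0]
intensity = [55.0, 60.0, 62.0, 64.0, 50.0, 66.0]
r = contour_correlation(pitch, intensity)
print(f"pitch-intensity correlation over voiced frames: {r:.3f}")
```

A value near 1 would correspond to contours that rise and fall together, while a value near 0 would correspond to intensity peaks with little accompanying pitch movement.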
Figure 3 shows results for measurements of other acoustic parameters: mean F0,
the largest dynamic ranges, respectively; in fact, I found that the dynamic range for
boredom was smallest second only to hot anger. I am curious as to whether the
finding for hot anger that recalls Banse and Scherer is its having the highest mean F0 of
the four exemplars. Hot anger had the slowest speech rate as measured by sentence
duration, which is at odds with my prediction that it would have the fastest speech rate. I
Jitter values may be the most informative and expected data to come from this set
of parameters. The highest jitter values were associated with hot anger and contempt,
which perhaps reveals the physiological stress injected into speech during the
circumstances that warrant these emotions. Specifically, pharyngeal constriction like that
theorized by Banse and Scherer for production of disgust speech may affect the mass or
tension of the vocal folds in ways that interfere with normal periodicity.
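For readers unfamiliar with the measure, absolute jitter is commonly defined as the average absolute difference between consecutive glottal periods, in seconds. The Python sketch below computes that quantity from a list of period durations. It is a simplified illustration, not Praat's implementation, which applies further checks on which periods to include; the function name and the period values are invented for the example.

```python
def jitter_absolute(periods_s):
    """Mean absolute difference between consecutive periods, in seconds."""
    if len(periods_s) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(later - earlier)
             for earlier, later in zip(periods_s, periods_s[1:])]
    return sum(diffs) / len(diffs)


# Invented period durations (seconds) for a voice near 200 Hz.
irregular = [0.0050, 0.0052, 0.0049, 0.0051]
steady = [0.0050, 0.0050, 0.0050, 0.0050]

jitter_irregular = jitter_absolute(irregular)
jitter_steady = jitter_absolute(steady)
print(f"irregular voice: {jitter_irregular * 1e6:.1f} microseconds")
print(f"steady voice: {jitter_steady * 1e6:.1f} microseconds")
```

A perfectly periodic voice gives a jitter of zero, so larger values indicate the kind of cycle-to-cycle irregularity discussed above.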
comparison. It is hard to draw solid conclusions from the LTAS curves themselves—
perhaps one thing that can be said is that contempt appears to be associated with the most
dramatic decrease in energy from the low to the high frequencies, contrary to my
prediction that pharyngeal constriction (again, something I associate with both contempt
and Banse and Scherer’s disgust) would lead to more high-frequency energy.
Conclusions
varying degrees of success. Pitch and intensity contours certainly gave insight into the
production and perception of the four emotions: for hot anger and happiness, pitch and
intensity co-vary closely, whereas boredom and contempt had similarly dramatic
variations in intensity while maintaining relatively flat pitch. It is interesting that pitch
contours for hot anger and contempt are so distinct, given that these were the two
emotions most confused by listeners in the initial selection of exemplars. Perhaps the
confusion reflects the difficulty in defining hot anger and contempt informatively, and
reveals that perceiving them distinctly from a set containing both may be contingent on
another factor, such as context, body language, or linguistic content. My one expansion
on Banse and Scherer, that is, jitter values for each emotion, seemed to reveal the most
Limitations
I am aware of the limited pool of exemplars (12—three for each of four emotions)
from which judges chose the best four, in comparison with 1,344 voice samples in the
Banse and Scherer study. Furthermore, the fact that all 12 were from my own voice
means that there was less variation within each emotion set, and so judges were forced to rate
near 5 what they may have thought were poor exemplars, in the interest of identifying a
“best” for each emotion. These issues may be the underlying causes for the scarcity of
identifiable correlation with findings by Banse and Scherer, and the apparent lack of