Available online at www.sciencedirect.com

Journal of Communication Disorders 42 (2009) 124–135

Articulatory changes in muscle tension dysphonia: Evidence of vowel space expansion following manual circumlaryngeal therapy
Nelson Roy a,*, Shawn L. Nissen b, Christopher Dromey b, Shimon Sapir c
a Department of Communication Sciences & Disorders, The University of Utah, 390 South 1530 East, Room 1219, Salt Lake City, UT 84112-0252, USA
b Department of Communication Disorders, Brigham Young University, Provo, UT, USA
c Department of Communication Sciences and Disorders, University of Haifa, Haifa, Israel
Received 18 April 2008; received in revised form 1 October 2008; accepted 8 October 2008

Abstract
In a preliminary study, we documented significant changes in formant transitions associated with successful manual circumlaryngeal
treatment (MCT) of muscle tension dysphonia (MTD), suggesting improvement in speech articulation. The present study explores
further the effects of MTD on vowel articulation by means of additional vowel acoustic measures. Pre- and post-treatment audio
recordings of 111 women with MTD were analyzed acoustically using two measures: vowel space area (VSA) and vowel articulation
index (VAI), constructed using the first (F1) and second (F2) formants of four point vowels /ɑ, i, æ, u/, extracted from eight words within a
standard reading passage. Pairwise t-tests revealed significant increases in both VSA and VAI, confirming that successful treatment of
MTD is associated with vowel space expansion. Although MTD is considered a voice disorder, its treatment with MCT appears to
positively affect vocal tract dynamics. While the precise mechanism underlying vowel space expansion remains unknown, improvements may be related to lowering of the larynx, expanding oropharyngeal space, and improving articulatory movements.
Learning outcomes: The reader will be able to: (1) describe possible articulatory changes associated with successful treatment
of muscle tension dysphonia; (2) describe two acoustic methods to assess vowel centralization and decentralization, and; (3)
understand the basis for viewing muscle tension dysphonia as a disorder not solely confined to the larynx.
© 2008 Elsevier Inc. All rights reserved.

1. Introduction
Although muscle tension dysphonia (MTD) is properly regarded as a voice disorder associated with excessive
muscle tension in the laryngeal and perilaryngeal muscles (Aronson, 1990; Boone & McFarlane, 2000; Colton & Casper,
2006; Hillman, Holmberg, Perkell, Walsh, & Vaughan, 1989; Morrison & Rammage, 1994; Stemple, 2000), tension in
these muscles could also constrain articulatory movements and vocal tract dynamics, by virtue of the mechanical linkage
of the articulators to the hyolaryngeal complex, central nervous system influences (e.g., heightened muscle tension in the
jaw muscles), orolaryngeal sensorimotor interactions, or a combination of these (Dromey, Nissen, Roy, & Merrill, 2008;
Higgins, Netsell, & Schulte, 1998; Higgins & Hodge, 2002; McClean & Tasko, 2002; Sapir, 1989).
* Corresponding author. Tel.: +1 801 585 0428; fax: +1 801 581 7955.
E-mail addresses: nelson.roy@health.utah.edu (N. Roy), shawn_nissen@byu.edu (S.L. Nissen), christopher_dromey@byu.edu (C. Dromey),
sapir@research.haifa.ac.il (S. Sapir).
0021-9924/$ – see front matter © 2008 Elsevier Inc. All rights reserved.
doi:10.1016/j.jcomdis.2008.10.001

N. Roy et al. / Journal of Communication Disorders 42 (2009) 124–135

Despite some ambiguity surrounding its causal mechanisms, the clinical voice literature is replete with evidence
that symptomatic voice therapy for primary MTD can often result in rapid and dramatic voice improvement
(Andersson & Schalen, 1998; Aronson, 1990; Carding & Horsley, 1992; Carding, Horsley, & Docherty, 1999;
Koufman & Blalock, 1988; Pannbacker, 1998; Ramig & Verdolini, 1998; Roy & Bless, 1998; Roy, Bless, Heisey, &
Ford, 1997; Roy & Leeper, 1993). Because prolonged hypercontraction of laryngeal muscles is reportedly associated
with elevation of the larynx and hyoid bone, with pain and discomfort when the circumlaryngeal region is palpated
(Lieberman, 1998; Roy & Bless, 1998; Roy et al., 1997; Rubin, Lieberman, & Harris, 2000), several voice clinicians
have described manual/digital techniques to determine the presence and degree of laryngeal musculoskeletal tension,
as well as methods to relieve such tension during the diagnostic assessment and management session (Aronson, 1990;
Lieberman, 1998; Peifang, 1991; Roy & Bless, 1998). These manual circumlaryngeal techniques, including laryngeal
massage, have been shown to be particularly effective, and purportedly result in rapid and sustained improvements in
the quality of phonation (Roy et al., 1997; Roy & Leeper, 1993).
Clinical reports confirm that successful treatment of MTD can often be achieved by releasing tension in the
extrinsic laryngeal muscles and reposturing the hyolaryngeal complex such that it presumably assumes a more caudal
position and increases oropharyngolaryngeal space (Morrison & Rammage, 1994; Roy & Bless, 1998; Roy &
Ferguson, 2001; Rubin et al., 2000). The release of tension and repositioning of the hyolaryngeal complex may also
improve phonation and articulation by reducing overall tension in the speech subsystems and by improving
sensorimotor processes that affect respiration, phonation, articulation, and orolaryngeal coordination (Dromey et al.,
2008; Sapir, 1989; Sapir, Spielman, Ramig, Story, & Fox, 2007).
Dromey et al. (2008) recently examined F2 transitions in diphthongs to determine if there was acoustic evidence of
articulatory changes in a group of patients who had undergone treatment for MTD using manual circumlaryngeal
techniques. The investigators reported preliminary evidence of a post-treatment increase in F2 slope for diphthongs,
manifested largely by an increase in the extent of the F2 frequency transition. Thus the slope increase could not be
attributed to adjustments in rate alone. Instead, these preliminary results provided the first evidence of increased vocal
tract dynamics in MTD following a treatment which ostensibly targets the voice only. Based on the accepted use of F2
slopes to index articulatory movement (Ferrand, 2007; Kent & Read, 2002; Rosner & Pickering, 1994), Dromey and
colleagues interpreted the results as evidence to support articulatory changes following treatment.
The results reported by Dromey et al. (2008) are compatible with those reported in an earlier case study of an
individual with Parkinson's disease (PD), who showed acoustic evidence of enlarged articulatory movements
following intensive voice treatment (Dromey, Ramig, & Johnson, 1995). Increases in this speaker's F2 transition
extent were attributed to improved vocal tract dynamics that occurred without any clinical efforts to treat articulation,
but which arose instead from a system-wide increase in activity. Sapir et al. (2007) have also documented, indirectly,
articulatory changes in the speech of dysarthric individuals with PD associated with a loud phonation training known
as the Lee Silverman Voice Treatment (LSVT®). Specifically, using acoustic analysis and perceptual rating
techniques, they found significant increases in sound pressure level (SPL), the second formant of the vowel /u/,
expressed as F2u, and the ratio of second formant of the vowels /i/ and /u/, expressed as F2i/F2u. These acoustic
changes were accompanied by improved perceptual ratings of vowel goodness. In another study by Sapir, Ramig,
Spielman, and Fox (in review-a), the effects of LSVT® on speech articulation in individuals with PD were assessed
with two acoustic metrics: vowel space area (VSA) and vowel articulation index (VAI), as detailed below. Sapir
and colleagues found that LSVT® significantly increased both VSA and VAI, suggesting vowel space expansion and
improved vowel articulation. They interpreted these findings to suggest that loud phonation may increase coactivation
of the respiratory, laryngeal and articulatory muscles through corticofugal mechanisms. They also suggested that the
articulatory system might aid in the production of loud phonation by serving as an effective resonator, and improve
laryngeal function through biomechanical and sensorimotor linkages to the phonatory system.
The aforementioned treatment effects strongly suggest that targeting treatment of phonation (i.e., respiratory and
laryngeal functions) may result in changes (improvement) in articulation. Indeed, the interconnectedness of the
articulatory–laryngeal–respiratory subsystems has been confirmed in kinematic investigations which have revealed
parallel increases in lip and jaw movements when dysarthric and normal speakers increase their vocal effort alone
(Dromey, 2000; Dromey & Ramig, 1998; Kleinow, Smith, & Ramig, 2001; Schulman, 1989). These changes
apparently occurred without conscious increases in articulatory effort by the speaker, and have led some researchers to
describe certain speech modifications as "global", in that their effects are measurable across the whole mechanism
(Dromey & Ramig, 1998). Although MTD is appropriately regarded as a voice disorder, and not associated with frank
neuropathology as in dysarthria, the improvements in dysphonia severity following treatment of MTD as reported by
Dromey et al. (2008) were also accompanied by formant transition changes. These formant transition changes
similarly occurred without any of the speakers' effort apparently being directed toward articulation. The data indicated
that the effect of the voice disorder seemed to extend beyond the larynx in MTD, and suggested that MTD may be more
of a speech production issue than solely a problem with the larynx.
The present study explores further the effects of MTD on vowel articulation by means of additional vowel acoustic
measures, sensitive to changes in vowel space. Given that (1) our previous study showed no significant differences in
voice and speech in a control group of individuals without MTD, and (2) our primary interest lies in assessing the effects of
MTD treatment on vowel articulation, this study focuses on changes in vowel space that occur from pre- to post-treatment in those individuals with MTD who responded to manual circumlaryngeal treatment.
2. Experimental/materials and methods
2.1. Speakers
The recordings in this study originated from an archival database of disordered voice samples collected by the first
author (NR) during routine clinical practice. As we were interested primarily in changes that occur in vowel
articulation with treatment of MTD, we included in this study only individuals with MTD that showed voice
improvement (based upon the judgment of the clinician and patient) following a single voice therapy session using
manual circumlaryngeal techniques. As detailed in our previous study (Dromey et al., 2008), inclusion in the study was
not based on any ratings of articulatory precision, accuracy, or normality. Individuals with MTD were reviewed
sequentially beginning with the first participant, and going forward. Each subject who had a complete and analyzable
dataset was selected for inclusion. Severity of dysphonia was not an inclusion or exclusion criterion. The University of
Utah Institutional Review Board approved the use of these voice samples and waived the requirement to obtain new
consent from the participants.
The present investigation focused on 111 women (mean age 46.12 years, S.D. 13.7) who were diagnosed with MTD
by a speech-language pathologist (NR) and an otolaryngologist, following extensive endoscopic and perceptual
evaluation. Treatment was completed in a single extended session with each individual. A detailed description of the
assessment and treatment procedures is outlined in Roy and Bless (1998). Briefly, each subject underwent a case
history, a traditional voice evaluation, and an assessment of musculoskeletal tension. Manual laryngeal reposturing
maneuvers and/or circumlaryngeal massage were implemented to stimulate improved voice (Roy, 2008; Roy & Bless,
1998; Roy et al., 1997). Laryngeal reposturing or repositioning maneuvers, through brief displacement or by resisting
laryngeal elevation, momentarily interfere with habituated patterns of muscle misuse, and elicit brief moments of
improved voice (Aronson, 1990; Roy, 2008). These moments were immediately identified for the patient and
reinforced. They were shaped using digital cueing or combined with tension-reduction techniques. Digital cues were
faded and the patient was taught to rely on sensory feedback (auditory, kinesthetic, and vibrotactile) to maintain
improved laryngeal posturing and muscle balance. In addition, circumlaryngeal massage was often applied. The hyoid
bone was encircled with the thumb and index finger, and then worked posteriorly into the tips of the major horns of the
hyoid bone. Pressure was applied in a circular motion over the tips of the hyoid bone. The procedure was repeated
within the thyrohyoid space, beginning from the thyroid notch and working posteriorly as the patient vocalized. None
of these manual circumlaryngeal treatments deliberately attempt to target articulatory production or postures.
2.2. Speech task
Before and after voice treatment, the speakers read the second and third sentences from The Rainbow Passage
(Fairbanks, 1960). The sentences were tape recorded and digitized off-line at 25 kHz using Kay Elemetrics
Computerized Speech Lab (CSL) system (Kay Elemetrics Corp., Lincoln Park, NJ). These recordings were made over
a period of several years and included subjects from Canada, Wisconsin, North Dakota, Texas, and Utah, with various
regional dialects. Although recording devices varied over the years, each pre- and post-treatment recording for an
individual subject was made using the identical recording equipment within the same session (i.e., same day), and
within the same location (physical setting/environment). When cassette recorders were employed, they were of the
dual-drive type which maintained consistency of speed across the duration of the audiotape. Thus, each subject served
as their own control, and any differences within subjects could be confidently attributed to treatment-related changes
observed between recordings, and not merely related to recording artifact, i.e., different recording instruments with
different frequency response characteristics. The recordings were acquired as part of these individuals' clinical
evaluation, and articulatory acoustic analysis was not an original goal of this process; therefore, the number of
repetitions of the target sounds was limited by the speaking task. However, a benefit lies in the relative naturalness of
this task when compared with multiple citation form repetitions of utterances designed to allow more straightforward
acoustic segmentation and measurement. Two tokens of each point vowel were analyzed acoustically. The individual
vowel targets occurring in the passage are identified in Appendix A.
2.3. Acoustic analysis
From the digital recordings, a number of acoustic measures were made from only the four point vowels (/i, æ, ɑ, u/)
from the reading passage. Frequency tracks for the first and second formants were extracted from the vowel targets
using Praat acoustic analysis software, version 5.0.04 (Boersma & Weenink, 2004). Specifically, a linear predictive
coding (LPC) based tracking algorithm (Burg method, 11 coefficients) was used to determine formant values for the
vocalic segments at approximately 5 ms intervals. The LPC analysis employed a 25 ms Hamming window with 50%
overlap and 98% pre-emphasis. Each token was checked to ensure that surrounding speech sounds were not audible in
the analyzed segment, as well as visually inspected for accuracy, and where necessary, hand corrected prior to
statistical analysis. The extracted formant values and associated time points were then saved to a text file for further
analysis. To improve the accuracy of the extracted formant measures, the data were also inspected with software
designed to detect formant values outside the expected range of frequencies; any such occurrences were then visually
inspected to determine if the extracted values needed manual adjustment or were a natural part of the spectral energy
distribution.
Using values from the extracted formant tracks, average F1 and F2 frequencies were calculated at eight different
equidistant measurement points throughout each vowel's overall duration (t1–t8). Thus, t1 was an average of the
formant values in the initial 12.5% of the vowel's duration and t8 over the final 12.5%. For the point vowels, values of
F1 and F2 were extracted from the middle 25% of each vowel's duration by averaging the formant values from analysis
windows t4 and t5. It was reasoned that the middle portion of the vowel would exhibit a relatively steady state and be
less influenced by the surrounding consonantal context. Because the passage contained two words containing each
point vowel, the mean F1 and F2 were computed and used in all later calculations.
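The windowing and midpoint-averaging steps described above can be sketched as follows. This is a minimal illustration in Python and not the authors' analysis software: the function names (bin_means, midpoint_formant, token_average) and the representation of a formant track as a list of (time, frequency) samples are assumptions made for the example.

```python
from statistics import mean

def bin_means(track, n_bins=8):
    """Average a formant track within n_bins equal-duration windows (t1..t8).

    track: list of (time_s, freq_hz) samples covering one vowel token
    (hypothetical input format). Returns n_bins means; an empty window
    yields None.
    """
    times = [t for t, _ in track]
    start, end = min(times), max(times)
    width = (end - start) / n_bins
    bins = [[] for _ in range(n_bins)]
    for t, f in track:
        # Clamp the index so the final sample falls in the last window.
        idx = min(int((t - start) / width), n_bins - 1) if width > 0 else 0
        bins[idx].append(f)
    return [mean(b) if b else None for b in bins]

def midpoint_formant(track):
    """Mean of windows t4 and t5, i.e. the middle 25% of the vowel."""
    means = bin_means(track)
    t4, t5 = means[3], means[4]
    if t4 is None or t5 is None:
        raise ValueError("too few samples in the middle of the vowel")
    return (t4 + t5) / 2

def token_average(values):
    """Average the two tokens of a point vowel into a single value."""
    return mean(values)
```

With a track sampled at roughly 5 ms intervals, midpoint_formant would be called once per formant and token, and token_average would then combine the two tokens of each point vowel.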
2.4. Vowel space area (VSA)
The VSA is an acoustic index commonly used in clinical research to indirectly assess the normalcy of vowel
articulation (Kent & Kim, 2003; Kuhl et al., 1997; Vorperian & Kent, 2007). VSA is typically constructed by the
Euclidean distances between the first formant (F1) and second formant (F2) coordinates of the corner vowels /i/, /u/,
and /ɑ/ (triangular VSA), or the corner vowels /i/, /u/, /ɑ/, and /æ/ (quadrilateral VSA, or QVSA) in the F1–F2 plane
(e.g., Blomgren, Robb, & Chen, 1998; Kent & Kim, 2003; Liu, Tsao, & Kuhl, 2005; Vorperian & Kent, 2007). Speech
associated with dysarthria is often characterized by centralization of vowels due to undershooting of articulatory
targets, such that formants that normally have a high center frequency tend to be lower, and
formants that normally have a low center frequency tend to be higher (Kent & Kim, 2003; Sapir
et al., 2003, 2007, in review-a; Sapir, Ramig, Fox, & Spielman, in review-b; Ziegler & von Cramon, 1983). The VSA is
expected to be compressed as a result of vowel centralization (Sapir et al., in review-a, in review-b); conversely, with
clear speech and hyperarticulation of vowels the VSA is expected to be expanded (Ferguson & Kewley-Port, 2007;
Smiljanic & Bradlow, 2005; Tjaden & Wilding, 2004).
In this regard, centralization of formants and/or compression of VSA in dysarthric speakers have been documented
in numerous studies (e.g., Higgins & Hodge, 2002; Liu et al., 2005; Roy, Leeper, Blomgren, & Cameron, 2001; Sapir
et al., 2003; Turner, Tjaden, & Weismer, 1995; Weismer, Jeng, Laures, Kent, & Kent, 2001; Weismer, Laures, Jeng,
Kent, & Kent, 2000; Ziegler & von Cramon, 1983). In several of these studies, a significant positive correlation
between VSA and speech intelligibility was reported (e.g., Higgins & Hodge, 2002; Liu et al., 2005; Weismer et al.,
2001). Furthermore, natural recovery and effective treatment have been shown to be associated with decentralization
of formants and decompression of VSA (e.g., Roy et al., 2001; Sapir et al., 2003, 2007; Ziegler & von Cramon, 1983).

Thus, VSA seems to be sensitive to changes in articulatory function associated with specific speech disorders. In the
present study we opted to use a quadrilateral VSA metric (henceforth QVSA) which, theoretically, is more likely to
capture articulatory abnormality than a triangular VSA, simply because abnormality might be in only one vowel,
rather than in all vowels. The QVSA was of the form 0.5 × [(F2i × F1æ + F2æ × F1ɑ + F2ɑ × F1u + F2u × F1i) − (F1i × F2æ + F1æ × F2ɑ + F1ɑ × F2u + F1u × F2i)], where the vowel-formant elements are
measured in Hertz (Vorperian & Kent, 2007).
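As an illustration of the QVSA expression, the following Python sketch computes the area for one speaker. The function name qvsa, the dictionary layout, and the ASCII vowel keys ('ae' for /æ/, 'a' for /ɑ/) are assumptions made for the example. The sketch follows the expression as written, without taking an absolute value, so the sign depends on the order in which the vowels are traversed.

```python
def qvsa(f1, f2):
    """Quadrilateral vowel space area in Hz^2 for one speaker.

    f1, f2: dicts mapping the vowel keys 'i', 'ae', 'a', 'u'
    (hypothetical labels) to mean F1 and F2 in Hz.
    """
    first = (f2['i'] * f1['ae'] + f2['ae'] * f1['a']
             + f2['a'] * f1['u'] + f2['u'] * f1['i'])
    second = (f1['i'] * f2['ae'] + f1['ae'] * f2['a']
              + f1['a'] * f2['u'] + f1['u'] * f2['i'])
    return 0.5 * (first - second)

# Illustrative input: the pre-treatment group means reported in Table 1.
# The area of the mean quadrilateral is not the same quantity as the
# mean of the individual speakers' areas reported in the table.
pre_f1 = {'i': 426.8, 'ae': 756.9, 'a': 739.7, 'u': 432.7}
pre_f2 = {'i': 2573.5, 'ae': 1966.7, 'a': 1358.5, 'u': 1762.9}
area_of_means = qvsa(pre_f1, pre_f2)
```

For these group means the expression gives roughly 2.3 × 10^5 Hz², which is of the same order as the mean of the per-speaker QVSA values in Table 1.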
2.5. Vowel articulation index (VAI)
Although the VSA appears sensitive to changes in articulatory function, there have been several studies in which
the VSA has failed to differentiate between speakers with and without dysarthria, even though these individuals were
judged perceptually to have abnormal articulation or poor speech intelligibility (Ansel & Kent, 1992; Bunton &
Weismer, 2001; Sapir et al., 2007; Weismer et al., 2001) and despite other acoustic measures that clearly indicated
articulatory abnormalities (Sapir et al., 2007, in review-a, in review-b). Although Neel (2008) has suggested that the
VSA is not sensitive to mild or moderate severity levels of dysarthria, the reasons why the VSA fails to consistently
differentiate normal from abnormal vowel articulation are not clear. One likely explanation, beyond the issue of
severity, is large inter-speaker variability associated with vowel formant measurements in general, and VSA in
particular (Sapir et al., 2007, in review-a, in review-b). These sources of inter-speaker variability can statistically
wash out important differences between abnormal and normal speakers. The VAI is a new acoustic metric of vowel
formant production (see below), designed to minimize the effects of inter-speaker variability and maximize
sensitivity to formant centralization (Sapir et al., in review-a, in review-b). It has been shown that inter-speaker
variability can be considerably reduced by using speaker normalization procedures such as vowel extrinsic
information, formant intrinsic information, and formant ratios (Adank, Smits, & van Hout, 2004; Strange, 1989;
Syrdal & Gopal, 1986). These normalization procedures include intrinsic methods, which are based on
relationships among all steady-state properties (F0, F1, F2, F3) of individual vowel tokens, and extrinsic methods,
which involve the relationships among the formant frequencies of the entire vowel system of a speaker (Ainsworth,
1975). Sapir et al. (in review-a) have argued that a sensitive acoustic index of normal and abnormal vowel articulation
should probably include these features: vowel extrinsic, formant intrinsic, a ratio, and an arrangement of the vowel-formant elements in such a way that the ratio is maximally sensitive to vowel centralization and decentralization.
With the above considerations in mind, Sapir et al. (in review-a) introduced the VAI. The VAI is of the form
(F2i + F1ɑ)/(F2u + F2ɑ + F1i + F1u). F2i refers to the second formant of the vowel /i/, F1ɑ refers to the first formant
of the vowel /ɑ/, and so on. The arrangement of the formants was designed ostensibly to maximize the sensitivity of
the VAI to vowel centralization. Specifically, the vowel-formant elements in the VAI ratio are arranged such that the
elements in the numerator (F2i, F1ɑ) will decrease and elements in the denominator (F2u, F2ɑ, F1i, F1u) will increase
with vowel centralization. Thus, the overall effect is that the VAI decreases with vowel formant centralization and
increases with decentralization. Indeed, the VAI has recently been shown to be more effective than the VSA in
differentiating normal from dysarthric vowel articulation (in individuals with PD) and in monitoring treatment
effects (Sapir et al., in review-a). In the present study, we used the VAI, but with four vowels (/i/, /u/, /ɑ/, and /æ/),
which we labeled VAI(4) to distinguish it from a three-vowel formula or VAI(3). The VAI(4) is expressed as
(F2i + F2æ + F1æ + F1ɑ)/(F1i + F1u + F2u + F2ɑ). Note that the F1 and F2 of the vowel /æ/ in the VAI(4) are both in
the numerator. This is because both F1 and F2 are likely to decrease with vowel centralization and increase with
vowel expansion.
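The behavior of the two ratios under centralization can be illustrated with a short Python sketch. The function names (vai3, vai4, centralize), the ASCII vowel keys ('ae' for /æ/, 'a' for /ɑ/), and the formant values are hypothetical and chosen only for demonstration; the centralize helper, which moves every formant a fixed fraction toward the speaker's mean, is a simplifying assumption rather than a model of dysarthric or dysphonic speech.

```python
def vai3(f1, f2):
    """Three-vowel VAI: (F2i + F1a) / (F2u + F2a + F1i + F1u)."""
    return (f2['i'] + f1['a']) / (f2['u'] + f2['a'] + f1['i'] + f1['u'])

def vai4(f1, f2):
    """Four-vowel VAI(4): both formants of /ae/ sit in the numerator."""
    numerator = f2['i'] + f2['ae'] + f1['ae'] + f1['a']
    denominator = f1['i'] + f1['u'] + f2['u'] + f2['a']
    return numerator / denominator

def centralize(formants, k):
    """Move each value a fraction k (0..1) toward the mean of all values."""
    center = sum(formants.values()) / len(formants)
    return {v: f + k * (center - f) for v, f in formants.items()}

# Hypothetical formant values in Hz, for illustration only.
f1 = {'i': 300.0, 'ae': 750.0, 'a': 750.0, 'u': 350.0}
f2 = {'i': 2500.0, 'ae': 1900.0, 'a': 1300.0, 'u': 1100.0}

original = vai4(f1, f2)
shrunk = vai4(centralize(f1, 0.5), centralize(f2, 0.5))
# shrunk is smaller than original: the ratio falls as vowels centralize.
```

Because the ratio is dimensionless, a uniform scaling of all formants (for example, between speakers with different vocal tract lengths) leaves it unchanged, which is one way to see why it is less affected by inter-speaker variability than an area in Hz².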
It is clear from the preceding discussion that VSA and VAI are two different, and perhaps complementary,
approaches to assessing vowel centralization. To improve convergent validity (i.e., the extent to which the same
findings are obtained using different measures of the construct), we elected to employ both approaches to assess any
difference in vowel articulation before and after treatment of MTD. We chose to use the VAI(4) over a three vowel
metric for the same reasons we elected to use the QVSA over the triangular VSA, as argued previously.
2.6. Reliability of extracted formant data
To evaluate the intra- and inter-rater reliability of the extracted acoustic measures (single vowel-formant elements),
speech samples from 10 speakers were randomly selected and reanalyzed by the original judge and another
individual trained in formant extraction. These additional sets of duration and formant frequency measurements were
extracted, recorded, and checked in the same manner as the original measures. Comparisons of the duration measures
produced correlations of 0.98, with an average absolute intra-rater difference of 6.5 ms and an inter-rater difference of
9.2 ms. The formant measures were correlated at 0.99. The average absolute intra-rater difference was 6.2 Hz for F1
measures and 15.9 Hz for F2 measures, while the average absolute inter-rater differences were found to be 7.5 Hz for
F1 and 16.7 Hz for F2.
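The two reliability summaries reported here, a correlation and an average absolute difference between repeated measurements, can be sketched in Python as below. The function names and the sample values are hypothetical, and the use of a Pearson correlation for the formant and duration measures is an assumption, since the type of correlation is not stated for these measures.

```python
from math import sqrt

def mean_abs_diff(first, second):
    """Average absolute difference between paired measurements."""
    return sum(abs(x - y) for x, y in zip(first, second)) / len(first)

def pearson_r(first, second):
    """Pearson product-moment correlation between paired measurements."""
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(first, second))
    var_a = sum((x - mean_a) ** 2 for x in first)
    var_b = sum((y - mean_b) ** 2 for y in second)
    return cov / sqrt(var_a * var_b)

# Hypothetical F1 values (Hz) from an original and a repeated analysis.
original_pass = [500.0, 600.0, 700.0, 800.0]
repeat_pass = [505.0, 595.0, 710.0, 790.0]
difference_hz = mean_abs_diff(original_pass, repeat_pass)
agreement = pearson_r(original_pass, repeat_pass)
```

Reporting both values is useful because a high correlation alone does not rule out a systematic offset between raters, whereas the absolute difference expresses disagreement directly in Hz or ms.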
2.7. Perceptual ratings
To confirm a positive response to treatment, five young adults, who were students in communication disorders and
blinded to whether the sample was pre- or post-treatment, served as listeners in a perceptual rating task. They listened
to the same pre- and post-treatment recordings that were analyzed for the formant measures. In order to determine
intra-judge reliability, 28 of the samples were randomly selected, and played a second time. All 250 samples (111 pre,
111 post, 28 repeats) were fully randomized, and each listener heard them in the same sequence. Using a custom
MATLAB routine, the listeners used a computer mouse to move a slider on a screen to rate the voice quality. One end
of the visual analog scale was labeled "normal" and the other "profoundly abnormal". The position of the slider was
stored by the software as a number that ranged from 0 to 100, with higher numbers reflecting greater disorder severity.
Intra-rater reliability was assessed by calculating a Pearson correlation between the original and second ratings for the
28 repeated samples. Correlations ranged from 0.89 to 0.96 (mean r = 0.92) for the individual raters, indicating
adequate intra-rater reliability. An intraclass correlation coefficient (ICC) was calculated to evaluate inter-rater
reliability. An average measure ICC of 0.97, and a single measure ICC of 0.85 (F = 29.397, p < 0.001 for both),
indicated acceptable inter-rater reliability.
3. Results
Table 1 summarizes the measurements of the vowel-formant elements and the QVSA and VAI(4) at pre- and post-treatment. In nine speakers, one or more vowel-formant elements were missing because the extraction software
(algorithm) was unable to track the formant of a speech token in an accurate and reliable manner. Thus, the data from
that token were not considered analyzable and were not included in subsequent stages of the analysis. Therefore, the
QVSA and VAI(4) of these speakers were excluded, and all statistical analyses are based on the remaining 102
speakers. Paired t-test results and their significance values are also shown in Table 1. The results of the QVSA and
VAI(4) are also shown graphically in Figs. 1 and 2, respectively. As can be seen, severity of the MTD, as judged
perceptually by the 5 listeners, decreased significantly from pre- to post-treatment. The QVSA and VAI(4) increased
significantly following treatment. QVSA increased by 13%. Only 3 single vowel-formant elements changed
significantly from pre- to post-treatment. These were F2ɑ (decreased by 83 Hz, or 6%), F2u (decreased by 69 Hz, or
4%), and F2æ (increased by 25 Hz, or 1%).
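The pre/post comparisons can be sketched in Python as follows. The function names and sample data are hypothetical. The effect-size formula is an assumption: the text does not state how effect sizes were computed, and the mean difference divided by the pooled standard deviation is used here only because it appears broadly consistent with several of the values in Table 1.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pre, post):
    """Return (t, df) for a paired t-test on the post - pre differences."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1

def pooled_effect_size(pre, post):
    """Mean difference over pooled SD (assumed formula, see lead-in)."""
    pooled = sqrt((stdev(pre) ** 2 + stdev(post) ** 2) / 2)
    return (mean(post) - mean(pre)) / pooled

def percent_change(pre_mean, post_mean):
    """Change from pre to post as a percentage of the pre value."""
    return 100 * (post_mean - pre_mean) / pre_mean

# Percent change in mean QVSA, using the group means from Table 1.
qvsa_change = percent_change(237869, 268395)
```

Applied to the mean QVSA values in Table 1, percent_change gives about 13%, in agreement with the increase reported in the text.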

Table 1
Means and standard deviations for all measures before and after treatment, with t-test results for pre-/post-changes.

Measure              Pre-mean  Pre-S.D.  Post-mean  Post-S.D.  t-Ratio  d.f.  p-Value   Effect size
F1/u/ (Hz)           432.7     115.2     418.5      50.7       1.390    100   0.168
F2/u/ (Hz)           1762.9    248.5     1694.2     242.8      3.137    100   0.002*    0.28
F1/i/ (Hz)           426.8     145.9     403.2      59.6       1.711    101   0.090
F2/i/ (Hz)           2573.5    188.1     2561.6     178.9      0.715    102   0.476
F1/ɑ/ (Hz)           739.7     218.2     735.7      128.7      0.180    101   0.857
F2/ɑ/ (Hz)           1358.5    187.2     1275.5     123.5      3.374    101   <0.000*   0.50
F1/æ/ (Hz)           756.9     158.7     747.3      107.2      0.643    101   0.318
F2/æ/ (Hz)           1966.7    161.8     1992.3     144.6      2.344    101   0.021*    0.16
QVSA (Hz²)           237,869   120,232   268,395    95,440     2.662    101   0.009*    0.28
VAI(4)               1.54      0.187     1.60       0.136      3.603    101   <0.000*   0.41
Perceptual severity  47.2      24.9      15.9       12.0       12.855   110   <0.001*   1.60

* Significant at p ≤ .05. Effect size guidelines: 0.20 = minimal, 0.50 = medium, 0.80 = large effect.


Fig. 1. Vowel space quadrilateral for monophthongs /ɑ, i, u, æ/ before and after treatment.

Fig. 2. Mean VAI(4) changes before and after treatment.

4. Discussion
The present findings indicate statistically significant increases in both QVSA and VAI(4) from pre- to post-MTD treatment. These findings are consistent with the previous findings by Dromey et al. (2008) demonstrating
an increase in F2 slope for diphthongs following manual circumlaryngeal therapy for MTD. Although manual
circumlaryngeal therapy does not focus on articulation, and at no time was the patient asked to allocate any
cognitive resources to articulatory production, improvements in articulatory acoustics were observed. This is in
parallel with several acoustic and perceptual studies of dysarthric speech (Dromey et al., 1995; Sapir et al., 2003,
2007), wherein therapy targeting loud phonation resulted in improvement in speech articulation, even though
no deliberate effort was made to improve articulation. Admittedly, given the complex relationships between
vocal tract activity and the way this activity is reflected in the acoustic signal, it is difficult to determine
without physiologic/kinematic measurements which factors contributed to the acoustic findings in the present
study.
The increase in acoustic metrics may reflect improved vocal tract dynamics and articulatory movements during
vowel production, but what aspect of vowel articulation improved seems unclear. There were only three vowel-formant elements that showed statistically significant changes from pre- to post-treatment: F2u, F2ɑ, and F2æ. The
change in F2æ was within the mean error of measurements, was only 1% from the pretreatment measure, and had a
minimal effect size; and as such it might reflect a spurious effect. Although the other two changes were statistically significant, the percent change and
the effect size measures indicate a moderate change in F2ɑ and a lesser change in F2u. However, given the nonlinear
relationships between vowel articulation, acoustics, and perception of back vowels (Ferguson & Kewley-Port, 2007;
Gay, Boe, & Perrier, 1992; Perkell & Nelson, 1985), the fact that the acoustic changes in the back vowels in the present
study were moderate or small should not necessarily imply that they were perceptually or physiologically negligible.
Small changes in vowel articulation may bring about perceptual changes, which, in the present study, may have been a
change from abnormal to more normal or optimal vowel quality or goodness. On the other hand, these acoustic
changes may have been negligible perceptually. Obviously perceptual studies that target vowel quality are needed to
clarify this important issue.
The decreases in F2u and F2ɑ frequencies following treatment most likely reflect changes in the posterior area
in the direction of normalcy. Interestingly, both /u/ and /ɑ/ tend to assume a laryngeal position that is normally
lower than the /i/ (see review in Sapir, 1989). Thus, the F2 changes in /ɑ/ and /u/ might have been related to the
movement of the base of the tongue, laryngeal height, anterior/posterior hyoid position, the lower pharyngeal
muscles, or any combination of these (Roy & Ferguson, 2001). In this regard, it is possible that the change in
vowel space following treatment was due to biomechanical linkages between mandibular, lingual, laryngeal, and
pharyngeal muscles. Pre-treatment tightness in the larynx may contribute to foreshortening of perilaryngeal
muscles and reduced flexibility in the movement of the hyoid, which in turn reduces the freedom with which the tongue
and jaw move. This view would be consistent with the mechanical explanations of articulator-larynx interactions
suggested by previous authors (Higgins et al., 1998; McClean & Tasko, 2002; Sapir, 1989). A related explanation
would be that individuals with MTD simultaneously experience excessive laryngeal and articulatory muscle
tension, both of which may respond to manual circumlaryngeal techniques. When skillfully applied, systematic
kneading and reposturing of the extralaryngeal region ostensibly stretches muscle tissue and fascia, promotes
local circulation with removal of metabolic wastes, and can relax tense muscles (Beck, 1994). In theory, this
reduction in tension could spread regionally, including to the suprahyoid musculature. Finally, if excessive neural
drive to both laryngeal and articulatory muscles is responsible for tension in the articulators in MTD, then
treatment would appear to have impacted activity in both subsystems. Based on their study of unimpaired
speakers, Cookman and Verdolini (1999) suggested that brainstem reflexes or cortical coactivation may be
responsible for parallel increases in articulatory and laryngeal muscle contraction. These types of reflexogenic
and transcortical sensorimotor interactions have also been postulated by Sapir (1989), Higgins et al. (1998), and
McClean and Tasko (2002).
Another possible explanation for some of the changes observed relates to the relationship of the source and filter,
with several investigators challenging the assumption of relative independence of the source and filter, especially in
cases of vocal pathology. It has been suggested that changes at the laryngeal source can and do influence filter
characteristics, and consequently the formant frequency patterns in the vocal tract. Maeda (1982) used an acoustic
model of the vocal tract to demonstrate that F1 and F2 values increase with both supraglottic constriction and
tracheal coupling. Tracheal coupling, which occurs when there is insufficient closure of the glottis during
phonation, introduces a zero into the all-pole model of vowel production, which appears to elevate both F1 and F2
(Klatt & Klatt, 1990). Supraglottic compression in the model produces a similar effect. Interestingly, a variety of
glottic and supraglottic contraction patterns have been associated with MTD and several classification systems have
been offered to describe these laryngoscopic features. Several commonly observed manifestations of abnormal
laryngeal muscle tension include: tight mediolateral glottic and/or supraglottic contraction, anteroposterior glottic
and/or supraglottic compression, incomplete glottic closure, posterior glottic chink, and bowing (Koufman &
Blalock, 1988; Lawrence, 1987; Leonard & Kendall, 1999; Morrison, 1997; Morrison & Rammage, 1994).
Therefore, it could be speculated that some of the lowering of F1 and F2 observed following treatment could be
related to a more normal pattern of vocal fold closure with reduced tracheal coupling and reduced supraglottic
compression.
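To make the phrase "introduces a zero into the all-pole model" concrete, a generic textbook form of the two transfer functions can be written as follows. This is only a notational sketch: the symbols G, a_k, b_l, N, and M are assumptions introduced here for illustration and are not taken from Maeda (1982) or Klatt and Klatt (1990).

```latex
% All-pole vowel model: gain G and N poles (formant resonances)
H_{\text{all-pole}}(z) = \frac{G}{1 - \sum_{k=1}^{N} a_k z^{-k}}

% Sketch of subglottal (tracheal) coupling: M zeros appear in the numerator
H_{\text{coupled}}(z) = \frac{G\left(1 - \sum_{l=1}^{M} b_l z^{-l}\right)}{1 - \sum_{k=1}^{N} a_k z^{-k}}
```

In the all-pole case, spectral peaks are determined by the denominator alone; once numerator terms are present, the measured peak locations can shift even when the supraglottal articulation is unchanged, which is the sense in which a change at the source could alter estimated formant values.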
Clearly, there is always a degree of uncertainty in the interpretation of articulatory acoustic data because of the
complex relationships between vocal tract activity and the way this activity is reflected in the acoustic signal. A
combined physiologic, acoustic, and perceptual study would be more desirable to delineate the effects of MTD
treatment on vowel articulation and vocal tract dynamics. Physiological studies specifically addressing jaw movement
in MTD could be particularly revealing. Also, the present findings were based on a segment of a paragraph read aloud.
It has been shown that different speech tasks and different phonetic environments can significantly affect acoustic
measures of vowels and consonants (see Baken & Orlikoff, 2000 for a review). Thus, future studies should include a
variety of speech tasks and phonetic environments to better delineate the effects of MTD treatment on articulation. In
this study, we included only women. Given the influence of gender on speech acoustics (Kent & Read, 2002), the
inclusion of both sexes would be desirable.

5. Conclusions
The above-stated limitations notwithstanding, the evidence from this study combined with our previous
investigation (Dromey et al., 2008) suggests that, for this population at least, MTD appears not to be a disorder solely
restricted to the larynx. The results also indicate that manual circumlaryngeal treatment may have desirable effects
on both the phonatory and articulatory systems. Clearly, additional acoustic, perceptual, and physiologic studies are
needed to assess the nature of MTD and the orolaryngeal mechanisms underlying the changes associated with its
successful management.
Acknowledgements
Shimon Sapir, Ph.D., senior investigator, served as co-first author on this manuscript. We express our appreciation to
Kurtt Boucher and Kristen Gilbert for their assistance with data analysis. The authors wish to acknowledge the
important contribution of an anonymous reviewer regarding the elements of the VAI(4) formula.
Appendix A
Sample from The Rainbow Passage (Fairbanks, 1960), showing the vowel targets used in the articulatory
acoustic analysis.
The rainbow is a division of white light into many beautiful colors. These take the shape of a long round arch,
with its path high above, and its two ends apparently beyond the horizon.

Appendix B. Continuing education


CEU Questions:
(1) Vowel centralization can be assessed indirectly by the following measurement techniques:
(a) Vowel space area.
(b) Vowel Articulation Index.
(c) Laryngeal palpation.
(d) All of the above.
(e) a and b only.
(2) The results of this research strongly suggest that muscle tension dysphonia (MTD) should be regarded exclusively
as a voice disorder.
(a) True.
(b) False.
(3) Increases in Vowel Space Area and Vowel Articulation Index following treatment of MTD are explained solely by
lowering of the larynx:
(a) True.
(b) False.
(4) Which of the following statements regarding manual circumlaryngeal therapy (MCT) is/are true?
(a) MCT can produce significant improvement in voice following a single treatment session.
(b) MCT directly targets articulation.
(c) MCT involves laryngeal reposturing and massage.
(d) All of the above are true.
(e) a and c only.
(5) Which of the statements regarding the vowel articulation index (VAI) is/are true?
(a) The VAI is essentially a speaker normalization procedure.
(b) The VAI potentially reduces inter-speaker variability.
(c) The VAI may be more sensitive to articulatory changes related to dysarthria and other speech disorders, as
compared to the vowel space area.
(d) All of the above are true.
(e) a and c only.

References
Adank, P., Smits, R., & van Hout, R. (2004). A comparison of vowel normalization procedures for language variation research. Journal of the
Acoustical Society of America, 116, 3099–3107.
Ainsworth, W. A. (1975). The perception of speech signals. Science Progress, 62(245), 33–57.
Andersson, K., & Schalen, L. (1998). Etiology and treatment of psychogenic voice disorder: Results of a follow-up study of thirty patients. Journal of
Voice, 12, 69–106.
Ansel, B., & Kent, R. (1992). Acoustic-phonetic contrasts and intelligibility in the dysarthria associated with mixed cerebral palsy. Journal of Speech
and Hearing Research, 35, 296–308.
Aronson, A. E. (1990). Clinical voice disorders: An interdisciplinary approach (3rd ed.). New York, NY: Thieme.
Baken, R., & Orlikoff, R. (2000). Clinical measurement of speech and voice (2nd ed.). San Diego, CA: Singular Publishing Group.
Beck, M. F. (1994). Theory and practice of therapeutic massage (2nd ed.). Albany, NY: Milady Publishing Company.
Boersma, P., & Weenink, D. (2004). Praat signal processing [Computer Software]. Retrieved 01.02.08, from http://www.fon.hum.uva.nl/praat/.
Bunton, K., & Weismer, G. (2001). The relationship between perception and acoustics for a high-low vowel contrast produced by speakers with
dysarthria. Journal of Speech, Language & Hearing Research, 44, 1215–1228.
Boone, D. R., & McFarlane, S. C. (2000). The voice and voice therapy (6th ed.). Boston: Allyn and Bacon.
Blomgren, M., Robb, M., & Chen, Y. (1998). A note on vowel centralization in stuttering and nonstuttering individuals. Journal of Speech, Language
& Hearing Research, 41, 1042–1051.
Colton, R. H., & Casper, J. K. (2006). Understanding voice problems: A physiological perspective for diagnosis and treatment (3rd ed.). Baltimore:
Lippincott Williams & Wilkins.
Carding, P. N., & Horsley, I. (1992). An evaluation study of voice therapy in non-organic dysphonia. European Journal of Disorders of
Communication, 27, 137–158.
Carding, P. N., Horsley, I., & Docherty, G. (1999). A study of the effectiveness of voice therapy in the treatment of 45 patients with nonorganic
dysphonia. Journal of Voice, 13, 72–104.
Cookman, S., & Verdolini, K. (1999). Interrelation of mandibular laryngeal functions. Journal of Voice, 13, 11–24.
Dromey, C. (2000). Articulatory kinematics in patients with Parkinson disease using different speech treatment approaches. Journal of Medical
Speech-Language Pathology, 8, 155–161.
Dromey, C., & Ramig, L. (1998). Intentional changes in sound pressure level and rate: Their impact on measures of respiration, phonation and
articulation. Journal of Speech-Language and Hearing Research, 41, 1003–1018.
Dromey, C., Ramig, L. O., & Johnson, A. B. (1995). Phonatory and articulatory changes associated with increased vocal intensity in Parkinson
disease: A case study. Journal of Speech and Hearing Research, 38, 751–764.
Dromey, C., Nissen, S., Roy, N., & Merrill, R. (2008). Articulatory changes following treatment of muscle tension dysphonia: Preliminary acoustic
evidence. Journal of Speech Language and Hearing Research, 51, 196–208.
Fairbanks, G. (1960). Voice and articulation drillbook (2nd ed.). New York: Harper & Row.
Ferguson, S. H., & Kewley-Port, D. (2007). Talker differences in clear and conversational speech: Acoustic characteristics of vowels. Journal of
Speech Language and Hearing Research, 50, 1241–1255.
Ferrand, C. T. (2007). Speech Science: An integrated approach to theory and clinical practice (2nd ed.). Boston: Allyn and Bacon.
Gay, T., Boe, L. J., & Perrier, P. (1992). Acoustic and perceptual effects of changes in vocal tract constrictions for vowels. Journal of the Acoustical
Society of America, 92, 1301–1309.
Higgins, C., & Hodge, M. (2002). Vowel area and intelligibility in children with and without dysarthria. Journal of Medical Speech-Language
Pathology, 10, 271–277.
Higgins, M. B., Netsell, R., & Schulte, L. (1998). Vowel-related differences in laryngeal articulatory and phonatory function. Journal of Speech
Language and Hearing Research, 41(4), 712–724.
Hillman, R. E., Holmberg, E. B., Perkell, J. S., Walsh, M., & Vaughan, C. (1989). Objective assessment of vocal hyperfunction: An experimental
framework and initial results. Journal of Speech and Hearing Research, 32, 373–392.
Kent, R., & Kim, Y. (2003). Toward an acoustic typology of motor speech disorders. Clinical Linguistics and Phonetics, 17, 427–445.
Kent, R. D., & Read, C. (2002). The acoustic analysis of speech. Albany, NY: Delmar.
Klatt, D. H., & Klatt, L. C. (1990). Analysis, synthesis, and perception of voice quality variations among female and male talkers. Journal of the
Acoustical Society of America, 87, 820–854.
Kleinow, J., Smith, A., & Ramig, L. O. (2001). Speech motor stability in IPD: Effects of rate and loudness manipulations. Journal of Speech
Language and Hearing Research, 44, 1041–1051.
Koufman, J. A., & Blalock, P. D. (1988). Vocal fatigue and dysphonia in the professional voice user: Bogart-Bacall syndrome. Laryngoscope, 98,
493–499.
Kuhl, P. K., Andruski, J. E., Chistovich, I. A., Chistovich, L. A., Kozhevnikova, E. V., Ryskina, V. L., et al. (1997). Cross-language analysis of
phonetic units in language addressed to infants. Science, 277, 684–686.
Lawrence, V. L. (1987). Suggested criteria for fibre-optic diagnosis of vocal hyperfunction. Care of the professional voice symposium. London: The
British Voice Association.
Leonard, R., & Kendall, R. (1999). Differentiation of spasmodic and psychogenic dysphonias with phonoscopic evaluation. Laryngoscope, 109,
295–300.
Lieberman, J. (1998). Principles and techniques of manual therapy: Application in the management of dysphonia. In T. Harris, S. Harris, J. S. Rubin,
& D. M. Howard (Eds.), The voice clinical handbook (pp. 91–138). London: Whurr Publishers.
Liu, H., Tsao, F., & Kuhl, P. (2005). The effect of reduced vowel working space on speech intelligibility in Mandarin-speaking young adults with
cerebral palsy. Journal of the Acoustical Society of America, 117, 3879–3889.
Maeda, S. (1982). A digital simulation method of the vocal tract system. Speech Communication, 1, 985–988.
McClean, M., & Tasko, S. M. (2002). Association of orofacial with laryngeal and respiratory motor output during speech. Experimental Brain
Research, 146(4), 481–489.
Morrison, M. D. (1997). Pattern recognition in muscle misuse voice disorders: How I do it. Journal of Voice, 11, 108–114.
Morrison, M. D., & Rammage, L. A. (1994). The management of voice disorders. San Diego: Singular Publishing Group.
Neel, A. (2008). Vowel space characteristics and vowel identification accuracy. Journal of Speech-Language and Hearing Research, 51, 574–585.
Pannbacker, M. (1998). Voice treatment techniques: A review and recommendations for outcome studies. American Journal of Speech-Language
Pathology, 7, 49–64.
Peifang, C. (1991). Massage for the treatment of voice ailments. Journal of Traditional Chinese Medicine, 11, 209–215.
Perkell, J. S., & Nelson, W. L. (1985). Variability in production of the vowels /i/ and /a/. Journal of the Acoustical Society of America, 77, 1889–1895.
Ramig, L., & Verdolini, K. (1998). Treatment efficacy: Voice disorders. Journal of Speech, Language and Hearing Research, 41, S101–S116.
Rosner, B. S., & Pickering, J. B. (1994). Vowel perception and production. New York: Oxford University Press.
Roy, N. (2008). Assessment and treatment of musculoskeletal tension in hyperfunctional voice disorders. International Journal of Speech-Language
Pathology, 10(4), 195–209.
Roy, N., & Bless, D. M. (1998). Manual circumlaryngeal techniques in the assessment and treatment of voice disorders. Current Opinion in
Otolaryngology Head and Neck Surgery, 6, 151–155.
Roy, N., Bless, D. M., Heisey, D., & Ford, C. N. (1997). Manual circumlaryngeal therapy for functional dysphonia: An evaluation of short- and
long-term treatment outcomes. Journal of Voice, 11, 321–331.
Roy, N., Leeper, H. A., Blomgren, M., & Cameron, R. M. (2001). A description of phonetic, acoustic, and physiological changes associated with
improved intelligibility in a speaker with spastic dysarthria. American Journal of Speech-Language Pathology: A Journal of Clinical Practice,
10, 274–288.
Roy, N., & Ferguson, N. A. (2001). Formant frequency changes following manual circumlaryngeal therapy for functional dysphonia: Evidence of
laryngeal lowering? Journal of Medical Speech-Language Pathology, 9, 169–175.
Roy, N., & Leeper, H. A. (1993). Effects of the manual laryngeal musculoskeletal tension reduction technique as a treatment for functional voice
disorders: Perceptual and acoustic measures. Journal of Voice, 7, 242–249.
Rubin, J. S., Lieberman, J., & Harris, T. (2000). Laryngeal manipulation. Otolaryngologic Clinics of North America, 33, 1017–1034.
Sapir, S. (1989). The intrinsic pitch of vowels: Theoretical, physiological, and clinical considerations. Journal of Voice, 3, 44–51.
Sapir, S., Ramig, L., Spielman, J., & Fox, C. (in review-a). Vowel Articulation Index (VAI) as an acoustic metric of dysarthric vowel articulation:
Comparison with vowel space area in Parkinson disease and healthy aging. Journal of Speech-Language and Hearing Research.
Sapir, S., Ramig, L., Fox, C., & Spielman, J. (in review-b). Abnormal vowel articulation in early Parkinson's disease: Acoustic and self-rating
findings. Movement Disorders.
Sapir, S., Spielman, J., Ramig, L., Story, B., & Fox, C. (2007). Effects of intensive voice treatment (the Lee Silverman Voice Treatment [LSVT]) on
vowel articulation in dysarthric individuals with idiopathic Parkinson disease: Acoustic and perceptual findings. Journal of Speech Language
and Hearing Research, 50, 899–912.
Sapir, S., Spielman, J., Ramig, L., Hinds, S., Countryman, S., Fox, C., et al. (2003). Effects of intensive voice treatment (the Lee Silverman Voice
Treatment [LSVT]) on ataxic dysarthria: A case study. American Journal of Speech Language Pathology, 12, 387–399.
Schulman, R. (1989). Articulatory dynamics of loud and normal speech. Journal of the Acoustical Society of America, 85, 295–312.
Smiljanic, R., & Bradlow, A. (2005). Production and perception of clear speech in Croatian and English. Journal of the Acoustical Society of
America, 118, 1677–1688.
Stemple, J. C. (Ed.). (2000). Voice therapy: Clinical studies (2nd ed.). San Diego, CA: Singular.
Strange, W. (1989). Evolving theories of vowel perception. Journal of the Acoustical Society of America, 85, 2081–2087.
Syrdal, A., & Gopal, H. (1986). A perceptual model of vowel recognition based on the auditory representation of American English vowels. Journal
of the Acoustical Society of America, 79, 1086–1100.
Turner, G., Tjaden, K., & Weismer, G. (1995). The influence of speaking rate on vowel space and speech intelligibility for individuals with
amyotrophic lateral sclerosis. Journal of Speech and Hearing Research, 38, 1001–1013.
Tjaden, K., & Wilding, G. E. (2004). Rate and loudness manipulations in dysarthria: Acoustic and perceptual findings. Journal of Speech, Language
and Hearing Research, 47, 766–783.
Vorperian, H. K., & Kent, R. D. (2007). Vowel acoustic space development in children: A synthesis of acoustic and anatomic data. Journal of Speech
Language Hearing Research, 50, 1510–1545.
Weismer, G., Jeng, J.-Y., Laures, J., Kent, R., & Kent, J. (2001). Acoustic and intelligibility characteristics of sentence production in neurogenic
speech disorders. Folia Phoniatrica et Logopaedica, 53, 1–18.
Weismer, G., Laures, J., Jeng, J., Kent, R., & Kent, J. (2000). Effect of speaking rate manipulations on acoustic and perceptual aspects of the
dysarthria in amyotrophic lateral sclerosis. Folia Phoniatrica et Logopaedica, 52, 201–219.
Ziegler, W., & von Cramon, D. (1983). Vowel distortion in traumatic dysarthria: A formant study. Phonetica, 40, 63–78.
