
Pitch Perception Of Complex Tones

Mechanisms of Complex Pitch Perception: The Early Years

Temporal theory (Schouten, 1940): Based on neural phase locking to the stimulus waveform. Limited by jitter in synaptic transmission (~5 kHz upper limit in cats). Temporal theories assume that pitch is derived from the temporal pattern of neural spikes arising from one or more points on the basilar membrane where harmonics are interfering. When the harmonics of a given fundamental are combined, even when the fundamental itself is absent, the combined time waveform repeats at the rate of the fundamental frequency. Auditory nerve fibres phase lock to the high positive peaks in the time waveform, which occur at the period of the fundamental, and temporal theories state that this is the information used to assign a pitch to the complex. For the resolved harmonics, the neurons are phase locked to the fine structure of each harmonic; for the unresolved harmonics, the neurons are phase locked to the envelope of the sound (its periodicity), which repeats at the fundamental frequency.

Evidence against a pure temporal model: the pitch sensation is strongest for low-order (resolved) harmonics (Plomp, 1967; Ritsma, 1967).
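To make the temporal-theory idea concrete, here is a minimal Python sketch (not from the source; the sampling rate, fundamental and choice of harmonics are illustrative) showing that a complex containing only harmonics 3-5 of 200 Hz, with no energy at 200 Hz itself, still repeats at the fundamental period:

    import numpy as np

    fs = 48000                      # sampling rate (Hz)
    f0 = 200.0                      # "missing" fundamental (Hz)
    t = np.arange(0, 0.1, 1 / fs)   # 100 ms of signal

    # Harmonics 3, 4 and 5 only -- no energy at 200 Hz itself.
    x = sum(np.cos(2 * np.pi * k * f0 * t) for k in (3, 4, 5))

    # Estimate the repetition period from the autocorrelation peak,
    # skipping very short lags to avoid the zero-lag maximum.
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    lag = np.argmax(ac[int(fs / 1000):]) + int(fs / 1000)
    print(f"Estimated repetition rate: {fs / lag:.1f} Hz")   # ~200 Hz

The autocorrelation peak falls at a lag of 5 ms, i.e. the summed waveform repeats at about 200 Hz even though the fundamental component is physically absent.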

Advantages: Accounts for the phenomenon of the missing fundamental. Incorporates the concept of phase locking.

Spectral theory: Based on mechanical frequency analysis and frequency-to-place mapping in the cochlea. Limited by cochlear frequency resolution (only about the first 6-10 harmonics are resolved in humans). According to this theory, temporal cues such as the envelope and the temporal fine structure play no role, so the pitch of unresolved harmonics should not be perceived well. However, the pitch of unresolved harmonics can in fact be perceived, because of temporal cues; this is evidence against a purely spectral theory.

Spectro-temporal theory (Moore, 2003)

Effect of broader auditory filters: The excitation pattern evoked by a sinusoid in an ear with cochlear hearing loss is broader than normal. Hence, according to the place theory, this should lead to impaired frequency discrimination of sinusoids. Reduced frequency selectivity presumably leads to a reduced ability to resolve the partials in complex tones, and this might affect the perception of the pitch of complex tones. In the normal auditory system, complex tones with low, resolvable harmonics give rise to clear pitches, while tones containing only high, unresolvable harmonics give rise to less clear pitches. Since cochlear HL is associated with poorer resolution of harmonics, it would be expected to lead to less clear pitches and poorer pitch discrimination. Evidence for a role of the time pattern of the waveform evoked on the basilar membrane by the higher harmonics comes from studies of the effect of changing the relative phase of the components in a complex tone. Changes in phase can markedly alter both the peak factor of the waveform on the basilar membrane (Moore, 1977; Patterson, 1987b) and the number of major waveform peaks per period (Moore, 1977; Patterson, 1987a; Moore and Glasberg, 1988a).

If a complex tone contains only the 2nd, 3rd and 4th harmonics, they will be resolved on the BM. In this case, the relative phase of the harmonics is of little importance. If the tone contains only high harmonics (above about the 8th), then changes in the relative phase can affect both the pitch value and the clarity of the pitch (Moore, 1977). Pitches based on high, unresolved harmonics will be clearest when the waveforms evoked at different points on the BM each have a single major peak per period of the sound. Because hearing-impaired subjects have broader-than-normal filters, their perception of pitch and their ability to discriminate repetition rate will be more affected by the phase of the components than is the case for normally hearing listeners. Even the lower harmonics would interact at the outputs of the auditory filters, giving a potential for strong phase effects. Changes in phase locking and in the phase of the cochlear travelling wave could also lead to less clear pitches.

Unresolved harmonics: Unresolved harmonics pass through the same auditory filter. Different sets of harmonics could create the same pattern of activity across auditory filters, so if we hear different virtual pitches when the harmonics are unresolved, we cannot be using the excitation pattern to do so, because the pattern is the same. We could be using temporal information, because the combined waveform of the harmonics repeats at the rate of the fundamental frequency. Auditory filters are wider (in terms of linear Hz) at high frequencies, so unresolved harmonics will generally occur at high frequencies. This would therefore be a case where we use phase locking to low-frequency modulations of a high-frequency carrier to identify the sound. Studies of the perception of complex tones have typically required subjects to identify which of two successive harmonic complex tones had the higher F0 (corresponding to a higher pitch). Cochlear HL makes it more difficult to resolve the harmonics of a complex tone, especially when the harmonics are of moderate harmonic number. For example, for an F0 of 200 Hz, the 4th and 5th harmonics would be quite well resolved in a normal auditory system but would be poorly resolved in an ear where the auditory filters were, say, three times broader than normal, as the sketch below illustrates.
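A rough way to see why the 4th and 5th harmonics of a 200-Hz F0 switch from resolved to unresolved when the filters broaden is to compare the harmonic spacing with the filter bandwidth. The minimal Python sketch below assumes the Glasberg and Moore (1990) ERB formula and a simple "resolved if adjacent harmonics are more than about 1.25 ERBs apart" rule of thumb; the criterion value is illustrative, not taken from the text, while the threefold broadening factor follows the example above.

    import numpy as np

    def erb(f_hz):
        """Equivalent rectangular bandwidth (Hz) of the normal auditory filter."""
        return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

    f0 = 200.0
    broadening = 3.0   # impaired ear assumed to have filters 3x broader than normal

    for n in range(1, 13):
        fc = n * f0
        sep_normal = f0 / erb(fc)                  # harmonic spacing in ERBs, normal ear
        sep_impaired = f0 / (broadening * erb(fc)) # harmonic spacing in ERBs, impaired ear
        print(f"harmonic {n:2d} ({fc:5.0f} Hz): "
              f"normal {sep_normal:4.2f} ERBs "
              f"({'resolved' if sep_normal > 1.25 else 'unresolved'}), "
              f"impaired {sep_impaired:4.2f} ERBs "
              f"({'resolved' if sep_impaired > 1.25 else 'unresolved'})")

Under these assumptions the 4th and 5th harmonics come out resolved for the normal ear but unresolved when the filters are three times broader.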

Figure 1: Psychoacoustical excitation patterns for a harmonic complex tone containing harmonics 4-10 of a 200-Hz fundamental. Each harmonic has a level of 80 dB SPL. Excitation patterns are shown for a normal ear (solid curve) and for an impaired ear (dashed curve) in which the auditory filters are three times as broad as normal. The former shows distinct ripples corresponding to the lower harmonics, whereas the latter does not.

In the normal auditory system, complex tones with low harmonics give rise to clear pitches, while tones containing only high harmonics give less clear pitches. The difference between the two types of tones probably arises because the temporal information conveyed by the low harmonics is less ambiguous than the temporal information conveyed by the high harmonics. Since cochlear HL is associated with poorer resolution of harmonics, it could be expected that this would lead to less clear pitches and poorer discrimination of pitch than normal. However, for complex tones with many harmonics, this effect should be small; spectro-temporal theories assume that information can be combined across different frequency regions to resolve ambiguities. Spectro-temporal theories also lead to the prediction that the perception of pitch and the discrimination of repetition rate by subjects with broader-than-normal auditory filters might be more affected by the relative phases of the components than is the case for normally hearing subjects. For subjects with broad auditory filters, even the lower harmonics would interact at the outputs of the auditory filters, and higher harmonics would interact more than normal, giving a potential for strong phase effects. Abnormalities in phase locking (neural synchrony), or in the ability to make use of phase-locking information, could also lead to less clear pitches and poorer discrimination of pitch than normal.

Effect of changing the relative phase of one component / effect of temporal fine structure: The temporal fine structure of a waveform refers to the rapid variations in pressure that carry the acoustic information. The temporal envelope of a waveform refers to the slower, overall changes in the amplitude of these fluctuations. For unresolved harmonics (and certain stochastic stimuli), information about periodicity is present in both the fine structure and the envelope. Evidence that the time pattern of the waveform evoked on the BM by the higher harmonics plays a role comes from studies of the effect of changing the relative phase of the components in a complex tone. Changes in phase can markedly alter the waveform of a tone. Figure 2 shows the effect of changing the relative phase of one component in a complex tone containing just three harmonics, the ninth, tenth and eleventh. The left panels show waveforms of the individual sinusoidal components, and the right panels show the waveforms of the complex tones produced by adding together the sinusoidal components. In the top half of the figure, all three components start in cosine phase; this means that the components have their maximum amplitudes at the start of the waveform. Correspondingly, a peak in the envelope of the complex tone occurs at the start of the waveform, and a new peak in the envelope occurs for every ten oscillations in the temporal fine structure. At the point in time marked by the vertical dashed line, a peak in the waveform of the centre component (the tenth harmonic) coincides with minima in the waveforms of the two other harmonics. This gives a minimum in the envelope of the complex tone. Thus, the waveform of the complex tone has an envelope with distinct peaks and dips and is described as amplitude modulated. In the bottom half of the figure, the phase of the highest component is shifted by 180°; that component starts with a minimum rather than a maximum in its waveform. As a result, the amplitude of the complex tone at the start of the waveform is not as high as when all components started in cosine phase. At the point in time marked by the vertical dashed line, a peak in the waveform of the centre component (the tenth harmonic) coincides with a minimum in the waveform of the ninth harmonic and a peak in the waveform of the eleventh harmonic. The minimum in the envelope of the complex tone is less deep than when all components started in cosine phase. Thus, the envelope is much flatter, and it actually shows two maxima for each period of the waveform. This waveform is sometimes called quasi-frequency modulated, as the time between peaks in the temporal fine structure fluctuates slightly.
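The amplitude-modulated versus quasi-frequency-modulated contrast described above (and shown in Figure 2 below) can be reproduced numerically. This is a minimal Python sketch with illustrative parameter values (F0 = 200 Hz, harmonics 9-11), using the Hilbert envelope as a stand-in for the envelope on the basilar membrane:

    import numpy as np
    from scipy.signal import hilbert

    fs, f0, dur = 48000, 200.0, 0.05
    t = np.arange(0, dur, 1 / fs)

    def three_component_tone(phase_shift_11th=0.0):
        """Harmonics 9-11 of f0, optionally with the 11th shifted in phase."""
        phases = {9: 0.0, 10: 0.0, 11: phase_shift_11th}
        return sum(np.cos(2 * np.pi * k * f0 * t + p) for k, p in phases.items())

    am_like = three_component_tone(0.0)       # all cosine phase: peaky envelope
    qfm_like = three_component_tone(np.pi)    # 11th shifted 180 deg: flatter envelope

    for name, x in [("cosine phase", am_like), ("11th shifted 180 deg", qfm_like)]:
        env = np.abs(hilbert(x))
        print(f"{name}: envelope min/max = {env.min():.2f} / {env.max():.2f}")

With all components in cosine phase the envelope swings between roughly 0 and 3 (peaky), whereas shifting the 11th component by 180° restricts it to roughly 1 to 2.2 (much flatter), matching the description above.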

Figure 2: The effect of changing the relative phase of one component in a complex tone containing just three harmonics, the 9th, 10th and 11th. The left panels show waveforms of the individual sinusoidal components, and the right panels show the waveforms of the complex tones produced by adding together the sinusoidal components. For one phase (upper panels), the waveform has an envelope with distinct peaks and dips; this waveform is sometimes called amplitude modulated. For the other phase (lower panels), the envelope is much flatter; this waveform is sometimes called quasi-frequency modulated.

If the harmonics have low harmonic numbers (2nd, 3rd and 4th), they will be resolved on the BM. In this case, the relative phase of the harmonics is of little importance, as the envelope on the BM does not change when the relative phases of the components are altered. However, if the harmonics have high harmonic numbers (as in the figure above), then changes in the relative phase of the harmonics can result in changes in the envelope of the waveform on the BM. If this waveform has a peaky envelope, the repetition period will be clearly represented in the intervals between neural impulses; if the waveform has a flatter envelope, the repetition period will be less clearly represented. For tones containing only high harmonics, phase can affect both the pitch value and the clarity of pitch (Moore, 1977). Moore et al. (2006) measured F0DLs using stimuli similar to those illustrated in the figure above. To prevent subjects basing their judgements on the frequencies of individual harmonics, or on overall shifts in the excitation pattern, the number of the lowest component, N, was randomly varied by ±1 from one stimulus to the next. Their mean results for normally hearing subjects are shown in Figure 3. The F0DLs were not affected by the phase of the components when N was 7 or less. However, when N was 8 or more, the waveform with the peaky envelope led to smaller F0DLs than the waveform with the flatter envelope. F0DLs worsened progressively as N was increased from 8 to 12, and then flattened off.

Figure 3: Mean results of Moore et al. (2006) for normally hearing subjects, showing F0DLs plotted as a function of the number, N, of the lowest harmonic in three-component complex tones. Open circles show F0DLs for tones with components added in cosine phase (giving a peaky waveform) and filled circles show F0DLs for tones with components added in alternating phase (giving a flatter waveform). The dashed line indicates the smallest F0DLs that could be achieved by making use of shifts in the excitation pattern of the stimulus.

As an example, consider the results of Moore and Peters (1992) ("Pitch discrimination and phase sensitivity in young and elderly subjects and its relationship to frequency selectivity", JASA, 91(5)).
From this study we can examine the effects of fundamental frequency, phase, hearing loss, age, and the number of resolved and unresolved components on the pitch perception of complex tones, and assess the support for the spectro-temporal theory as opposed to the spectral theory.

Subjects: Four groups of subjects were tested: young subjects with normal hearing; young subjects with impaired hearing; elderly subjects with normal or near-normal hearing; and elderly subjects with impaired hearing.

Stimuli (complex tones): The complex tones were composed of equal-amplitude harmonics with F0s of 50, 100, 200 and 400 Hz. Each component had a level of 75 dB SPL, chosen to be above threshold for all subjects. The tones contained harmonics 1 to 12, 6 to 12, 4 to 12 or 1 to 5.

Phase: The components of the harmonic complexes were added in one of two phase relationships, all cosine phase or alternating cosine and sine phase. The former results in a waveform with prominent peaks and low amplitudes between the peaks; the latter results in a waveform with a much flatter envelope and with two major waveform peaks per period (the sketch below illustrates the two phase conditions).

F0DLs: The geometric mean values of the F0DLs are plotted separately for each group. F0DLs are expressed as a percentage of F0 and plotted on a logarithmic scale. Each symbol represents results for a particular harmonic complex, as indicated by the key in the upper right panel. The results have been averaged across the two phase conditions.

Results: Performance was clearly worse for the two hearing-impaired groups than for the young normal-hearing group. F0DLs for the elderly normal-hearing group were also higher than for the young normal-hearing group, especially at low F0s. Indeed, at F0 = 50 Hz, F0DLs for the elderly normal-hearing group were similar to those for the two impaired groups. For all four groups, F0DLs for F0 = 50 Hz were higher for the complex containing harmonics 1 to 5 than for any of the other complexes. For the two elderly groups and for the lower F0s, performance was worse for complex 1 to 12 than for complexes 4 to 12 or 6 to 12, indicating that adding lower harmonics to a complex tone can actually impair performance. This may happen because, when auditory filters are broader than normal, adding lower harmonics can create more complex waveforms at the outputs of the auditory filters, making temporal analysis more difficult (Rosen and Fourcin, 1986). Overall, these results suggest that, for low F0s, pitch is extracted primarily from harmonics above the 5th. This is consistent with results presented by Moore and Glasberg (1988a, 1990). In contrast, for F0 = 400 Hz, the complexes containing only high harmonics (6 to 12 and 4 to 12) tended to be the most poorly discriminated, especially by the two impaired groups. These results indicate that the dominant region for pitch is not fixed in harmonic number, but shifts upward in harmonic number as F0 decreases, as suggested by earlier work (Plomp, 1967; Patterson and Wightman, 1976; Moore et al., 1985a).
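As a minimal Python sketch of the two phase conditions (illustrative, not the authors' code), the crest factor of the summed waveform gives a simple numerical index of "peaky" versus "flat":

    import numpy as np

    fs, f0 = 48000, 100.0
    t = np.arange(0, 0.1, 1 / fs)
    harmonics = range(1, 13)                     # harmonics 1-12, as in one of the complexes

    cosine_phase = sum(np.cos(2 * np.pi * k * f0 * t) for k in harmonics)
    alt_phase = sum(np.cos(2 * np.pi * k * f0 * t) if k % 2 else
                    np.sin(2 * np.pi * k * f0 * t) for k in harmonics)

    for name, x in [("cosine phase", cosine_phase), ("alternating phase", alt_phase)]:
        crest = np.max(np.abs(x)) / np.sqrt(np.mean(x ** 2))
        print(f"{name}: crest factor = {crest:.2f}")   # higher = peakier waveform

The all-cosine complex gives a much higher crest factor than the alternating-phase complex, consistent with the peaky versus flat-envelope description above.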

Hearing loss: Performance was worse for the two hearing-impaired groups than for the young normal group. The DLCs showed moderate positive correlations with the absolute thresholds (audiogram values) over the frequency range covered by the harmonics. For example, the DLCs for the complex tone with harmonics 1-12 with F0 = 50 Hz were positively correlated with the absolute thresholds up to 1 kHz (r = 0.59, 0.44 and 0.40 at 250, 500 and 1000 Hz, respectively), but showed essentially no correlation for frequencies above 1 kHz. For F0 = 400 Hz, the correlations were about 0.5 for frequencies from 250 to 2000 Hz, but were close to 0 at 4 and 8 kHz. These correlations probably reflect the fact that the stimuli were presented at a fixed SPL, so sensation levels were lower for subjects with higher absolute thresholds.

Age: DLCs for the elderly normal group were also higher than for the young normal group, especially at low F0s (50 Hz). Indeed, at F0 = 50 Hz, DLCs for the elderly normal group were similar to those for the two impaired groups.

Fundamental frequency: For all four groups, DLCs for F0 = 50 Hz were higher for the complex containing harmonics 1-5 than for any of the other complexes. This suggests that, at very low F0s, pitch is extracted primarily from harmonics above the fifth. This is consistent with results presented by Moore and Glasberg (1988, 1990a). In contrast, for F0 = 400 Hz, the complexes containing only high harmonics (6-12 and 4-12) tended to be the most poorly discriminated, especially by the two impaired groups. These results indicate that the dominant region for pitch is not fixed in harmonic number, but shifts upward in harmonic number as F0 decreases, as suggested by earlier work (Plomp, 1967; Patterson and Wightman, 1976). For the young normal group, the effects of harmonic content and phase were significant. For the young impaired group, the main effect of harmonic content was not significant, but the effects of phase and F0 were. The DLCs for alternating phase were larger than those for cosine phase. There was a significant interaction of harmonic content and F0: at F0 = 50 Hz, DLCs were higher for complex 1-5 than for the other complexes, while at F0 = 400 Hz, DLCs were higher for complexes 4-12 and 6-12 than for the other complexes.

Effect of the relative phases of the components on the F0DLs: For all four subject groups, F0DLs were, on average, larger for components added in alternating phase than for components added in cosine phase.

Figure 4: The mean DLCs for each group tested by Moore and Peters (1992). Results are shown for each harmonic complex and phase, but are averaged across F0.

Effect of phase: In every case shown, F0DLs were larger for alternating phase than for cosine phase, but the effects were, overall, rather small. However, the mean values are somewhat misleading as an indication of the influence of phase, since the direction of the effect (whether the change from cosine to alternating phase made performance worse or better) varied in an idiosyncratic way across subjects, F0s and harmonic contents. Phase effects for individual subjects were often considerably larger.

Figure 5: Mean results of Moore and Peters (1992). The geometric mean values of the DLCs, expressed as a percentage of F0, are plotted separately for each group. Each symbol represents results for a particular harmonic complex, as indicated by the key in the upper right panel. The results have been averaged across the two phase conditions, with components added in cosine phase or alternating phase.

Which theory do the findings support? These results suggest that, relative to people with normal hearing, people with cochlear damage depend relatively more on temporal information from unresolved harmonics and less on spectral/temporal information from resolved harmonics. The results lend support to spectro-temporal theories of pitch perception. The variability in the results across people, even in cases where the audiometric thresholds are similar, may occur partly because of individual differences in the auditory filters and partly because loss of neural synchrony is greater in some people than in others. People in whom neural synchrony is well preserved may have good pitch discrimination despite having broader-than-normal auditory filters; people in whom neural synchrony is adversely affected may have poor pitch discrimination regardless of the degree of broadening of their auditory filters. As noted earlier, it may be the case that the clarity of pitch is greatest, and the F0DL is smallest, when the waveforms evoked at different points on the basilar membrane each contain a single major peak per period. The basilar membrane waveforms are determined by the magnitude and phase responses of the auditory filters, and these may vary markedly across subjects and centre frequencies depending on the specific pattern of cochlear damage. The variability in the phase effect may arise from variability in the properties of the auditory filters across subjects and across centre frequencies.

The following findings support spectro-temporal theories of pitch perception, as opposed to purely spectral theories. First, at low F0s, discrimination was better for complexes containing high harmonics. This suggests that higher, unresolved harmonics are dominant in determining pitch at low F0s for both normal and hearing-impaired subjects. Second, DLCs for the elderly impaired and elderly normal subjects were sometimes larger for the complex with harmonics 1-12 than for the complex with harmonics 6-12 or 4-12; thus, adding lower harmonics can impair performance. Results similar to this have been reported by Moore and Glasberg (1988, 1990a). This is difficult to explain in terms of spectral theories. It can be explained by spectro-temporal theories on the assumption that, for subjects with broad auditory filters, adding low harmonics may produce more complex waveforms at the outputs of auditory filters responding to higher harmonics, which may make subsequent temporal analysis more difficult. A third aspect of the results supporting spectro-temporal theories is the finding of significant effects of the phase of the components. The effects were present for all groups, but were somewhat larger for the impaired groups. Buunen et al. (1974) suggested that the effects of relative phase on pitch can be explained in terms of combination tones, particularly the cubic difference tone, 2f1-f2.

Changing the phases of the components in a complex tone may affect the relative levels of the combination tones, which in turn might affect pitch. However, this explanation seems unlikely to apply to the present results, since the levels of combination tones are lower in hearing-impaired than in normal subjects (Leshowitz and Lindstrom, 1977), whereas the phase effects are generally larger for impaired than for normal subjects. The effects of phase on the DLCs most likely reflect a sensitivity to the time structure of the waveforms at the outputs of the auditory filters, as proposed by spectro-temporal theories of pitch. In conclusion, the frequency discrimination of complex tones is worse than normal in young and elderly hearing-impaired subjects. In addition, some elderly subjects with normal absolute thresholds and normal auditory filters show impaired frequency discrimination. This may reduce the ability to take advantage of prosodic cues during audiovisual speech perception.

Experimental studies: The pitch discrimination of complex tones by hearing-impaired people has been the subject of several studies (Hoekstra and Ritsma, 1977; Rosen, 1987; Moore and Glasberg, 1988c, 1990; Moore and Peters, 1992; Arehart, 1994; Moore and Moore, 2003; Bernstein and Oxenham, 2006; Moore, Glasberg and Hopkins, 2006). Most studies have measured F0DLs. These studies have revealed the following:

1. There was considerable individual variability, both in overall performance and in the effects of harmonic content.

2. For some subjects, when F0 was low, F0DLs for complex tones containing only low harmonics (e.g. 1-5) were markedly higher than for complex tones containing only higher harmonics (e.g. 6-12), suggesting that pitch was conveyed largely by the higher, unresolved harmonics.

3. For some subjects, F0DLs were larger for complex tones with lower harmonics (1-12) than for tones without lower harmonics (4-12 and 6-12) for F0s up to 200 Hz. In other words, adding lower harmonics made performance worse. This may happen because, when auditory filters are broader than normal, adding lower harmonics can create more complex waveforms at the outputs of the auditory filters. For example, there may be more than one peak in the envelope of the sound during each period, and this can make temporal analysis more difficult (Rosen, 1986; Rosen and Fourcin, 1986).

4. The F0DLs were mostly only weakly correlated with measures of frequency selectivity. There was a slight trend for large F0DLs to be associated with poor frequency selectivity, but the relationship was not a close one. Some subjects with very poor frequency selectivity had reasonably small F0DLs.

5. There can be significant effects of component phase. In several studies, F0DLs have been measured with the components of the harmonic complexes added in one of two phase relationships, all cosine phase or alternating cosine and sine phase. The former results in a waveform with prominent peaks and low amplitudes between the peaks (as in the upper-right panel of Figure 2); the latter results in a waveform with a much flatter envelope (as in the lower-right panel of Figure 2). The F0DLs tended to be larger for complexes with components added in alternating sine/cosine phase than for complexes with components added in cosine phase.

However, the opposite effect was sometimes found. The direction of the phase effect varied in an unpredictable way across subjects and across type of harmonic complex. Phase effects can sometimes be stronger for hearing-impaired than for normally hearing subjects, although this is not always the case.

Correlation with frequency selectivity: The DLCs were mostly only weakly correlated with the measures of frequency selectivity. Considering the results for the two hearing-impaired groups together, the correlation of the DLCs with the ERBs was typically about 0.4. However, the DLC for F0 = 200 Hz was correlated with the ERB for fc = 200 Hz (r = 0.88, p < 0.01) and fc = 400 Hz (r = 0.79, p < 0.01), and these correlations remained high (r = 0.90 and 0.70) after partialing out the effect of the absolute threshold at 200 and 400 Hz, respectively. Also, the DLC for F0 = 100 Hz was correlated with the ERB for fc = 400 Hz (r = 0.77, p < 0.01), and the correlation remained reasonably high (r = 0.61) after partialing out the effect of the absolute threshold at 400 Hz. The DLCs for F0 = 50 and 400 Hz were not significantly correlated with any of the measures of frequency selectivity. Thus, while there is a trend for large DLCs to be associated with poor frequency selectivity, the relationship does not seem to be a close one.

Hopkins and Moore (2007) studied how cochlear hearing loss affects sensitivity to changes in temporal fine structure in complex tones. To do this, they measured the ability to discriminate a harmonic complex tone, with F0 = 100, 200 or 400 Hz, from a similar tone in which all components had been shifted up by the same amount in Hertz, ΔF. For example, for an F0 of 100 Hz, the harmonic tone might contain components at 900, 1000, 1100, 1200 and 1300 Hz, while the shifted tone might contain components at 925, 1025, 1125, 1225 and 1325 Hz; the value of ΔF in this example is 25 Hz. People with normal hearing perceive the shifted tone as having a higher pitch than the harmonic tone (de Boer, 1956; Moore and Moore, 2003). The envelope repetition rate of the two sounds is the same (100 Hz), so the difference in pitch is assumed to occur because of a difference in the temporal fine structure of the two sounds (Schouten, Ritsma and Cardozo, 1962; Moore and Moore, 2003). To reduce cues relating to differences in the excitation patterns of the two tones, Hopkins and Moore (2007) used tones containing many components, and the tones were passed through a fixed bandpass filter. For one of the conditions, the filter was centred on the eleventh harmonic. To prevent components outside the passband of the filter from being audible, a background noise was added. In the presence of this noise, the differences in excitation patterns between the harmonic and frequency-shifted tones were very small.
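A minimal Python sketch of the harmonic versus frequency-shifted stimuli, using the example values quoted above (components at 900-1300 Hz versus 925-1325 Hz), confirms that the envelope repetition rate is the same for both tones, so only the temporal fine structure distinguishes them. The analysis details (Hilbert envelope, FFT peak picking) are my own, not those of Hopkins and Moore (2007):

    import numpy as np
    from scipy.signal import hilbert

    fs = 48000
    t = np.arange(0, 0.1, 1 / fs)
    f0, dF = 100.0, 25.0
    components = np.arange(900.0, 1301.0, f0)          # 900 ... 1300 Hz

    harmonic = sum(np.cos(2 * np.pi * f * t) for f in components)
    shifted = sum(np.cos(2 * np.pi * (f + dF) * t) for f in components)

    for name, x in [("harmonic", harmonic), ("shifted", shifted)]:
        env = np.abs(hilbert(x))
        # Envelope repetition rate from the strongest component of its spectrum.
        spec = np.abs(np.fft.rfft(env - env.mean()))
        freqs = np.fft.rfftfreq(len(env), 1 / fs)
        print(f"{name}: envelope rate ~ {freqs[np.argmax(spec)]:.0f} Hz")  # ~100 Hz for both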

The normally hearing subjects tested by Hopkins and Moore (2007) were able to perform this task well, presumably reflecting the ability to discriminate changes in the temporal fine structure of the harmonic and frequency-shifted tones. However, subjects with moderate cochlear hearing loss generally performed very poorly. The results suggest that moderate cochlear hearing loss results in a reduced ability, or no ability, to discriminate harmonic from frequency-shifted tones based on temporal fine structure. Overall, these results suggest that people with cochlear HL have reduced abilities to perceive the pitch, and to discriminate the F0, of complex tones. This appears to happen for at least two reasons: (1) lower harmonics are less well resolved than normal, which makes it harder to determine the frequencies of individual low harmonics; (2) the ability to make use of temporal fine structure information (based on phase locking) is reduced.

Perceptual Consequences Of Altered Frequency Discrimination And Pitch Perception

Effects on music perception: The existence of pitch anomalies (diplacusis and exaggerated pitch-intensity effects) may affect the enjoyment of music. Changes in pitch with intensity would obviously be very disturbing, especially when listening to a live performance, where the range of sound levels can be very large. There have been few, if any, studies of diplacusis for complex sounds, but it is likely to occur to some extent.

Perception of musical intervals: Arehart and Burns (1999) studied the ability of subjects with high-frequency cochlear HL to identify musical intervals between complex tones containing just two harmonics. All subjects were musically trained. Harmonics were presented at a low sensation level (14 dB SL), either monaurally or dichotically (one harmonic to each ear). To prevent subjects basing their judgments on the pitches of individual harmonics, the rank (harmonic number) of the lowest harmonic in each complex was randomly varied from one stimulus to the next. When the F0 was low and the (mean) harmonic number was low, subjects showed excellent performance. However, for high F0s and high (mean) harmonic numbers, performance worsened markedly and was much poorer than reported by Houtsma and Goldstein (1972) for normally hearing subjects. The highest frequency of the harmonics for which the task was possible was similar for monaural and dichotic presentation. Since resolution of the harmonics should not have been a problem for the dichotic presentation, this finding suggests that some factor other than reduced frequency selectivity limited the ability of the subjects to extract residue pitch from tones with high F0s and high harmonic numbers. Arehart and Burns (1999) suggested that the poor performance of their subjects when the harmonics fell in the region of the hearing loss may have been due to degraded temporal information from that region.

A comparison of monotic and dichotic complex-tone pitch perception in listeners with hearing loss: Arehart and Burns (1999), J. Acoust. Soc. Am., 106(2).

From this study we can learn about the effects of dichotic versus monotic presentation, sensation level and hearing loss, and the role of frequency selectivity, in the perception of musical intervals. Listeners with normal hearing can derive a pitch corresponding to the missing fundamental frequency (F0) of complex tones containing only two successive upper harmonics, regardless of whether the harmonics are presented to the same ear (monotic presentation) or to separate ears (dichotic presentation) (Houtsma and Goldstein, 1972). Performance in a musical interval identification task was perfect for harmonics presented monotically or dichotically when the harmonic number was below four, but deteriorated for higher harmonics; for example, for an F0 of 200 Hz, performance dropped to chance at about harmonic number 10. The upper harmonic number at which performance reached chance for various fundamental frequencies was consistent with the upper frequency limit found in other studies of the existence region of complex-tone fundamental pitch (e.g., Ritsma, 1962, 1963, 1967). This upper frequency limit also corresponds approximately to frequencies where individual harmonics of a complex tone are not resolvable (Plomp, 1964) because of interaction at the level of the basilar membrane. The fact that listeners' performance was virtually identical for dichotic and monotic presentation of harmonics suggests that peripheral frequency resolution per se is not entirely responsible for limiting complex-tone fundamental pitch perception to lower harmonics. Studies of complex-tone frequency discrimination provide only indirect evidence that complex-tone pitch perception is degraded. In addition, since those experiments used either monaural or diotic signal presentation, degradation of pitch processing and degradation of frequency resolution were confounded. The present study examines directly the extent to which complex-tone pitch perception is degraded in listeners with hearing loss, and the role that decreased frequency selectivity plays in that degradation. The study applies the monotic/dichotic musical interval identification paradigm of Houtsma and Goldstein (1972) to musically trained listeners with high-frequency hearing loss. If pitch perception is degraded in these listeners, they would be expected to perform poorly in the musical interval identification task. If this degraded pitch perception is in some measure due to degraded frequency resolution, performance for conditions in which harmonics are in a region of hearing loss would be expected to be worse for monotic presentation of harmonics than for dichotic presentation.

Method

Subjects: Three listeners with high-frequency cochlear-based hearing loss: L1, L2 and L3.

Using an adaptive, 3-AFC procedure, detection thresholds in quiet were obtained for each listener for frequencies ranging from 100 Hz to 5000 Hz. The listeners had musical training, and could identify musical intervals conveyed by low-frequency pure and complex tones with 100% accuracy at the sensation levels used in this study.

Figure 6: Thresholds in dB SPL for listeners with HL (L1, L2, and L3). Also shown are reference thresholds for normal hearing listeners (ANSI, 1989)

Stimuli: Musical intervals were conveyed by two sequential complex tones (CTs). Each of the CTs contained two successive upper harmonics. For a given block of trials, the missing F0 of the first complex tone was 100, 200, 300, 400 or 500 Hz. The missing F0 of the second CT was such that the frequency ratio between the F0s of the second and first tones defined one of seven ascending musical intervals, ranging from a minor second (frequency ratio 1.059) through a fifth (frequency ratio 1.498). The set of musical intervals is not, in and of itself, a significant issue in the study of the pitch of complex tones containing two harmonics; rather, the intervals provide a means by which musically trained listeners can describe the musical pitches that they hear.
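A minimal Python sketch of how such two-harmonic interval stimuli could be constructed (the function name and the randomisation range for the lowest harmonic rank are my own; only the endpoint interval ratios 1.059 and 1.498 are taken from the text):

    import numpy as np

    fs = 48000

    def two_harmonic_tone(f0, lowest_rank, dur=0.5):
        """Complex tone containing two successive upper harmonics (ranks N and N+1) of f0."""
        t = np.arange(0, dur, 1 / fs)
        return sum(np.sin(2 * np.pi * k * f0 * t) for k in (lowest_rank, lowest_rank + 1))

    rng = np.random.default_rng(0)
    f0_first = 200.0
    interval_ratio = 1.498                        # e.g. a fifth; 1.059 would be a minor second
    rank = int(rng.integers(3, 9))                # randomise the rank of the lowest harmonic
    tone1 = two_harmonic_tone(f0_first, rank)                   # first tone of the pair
    tone2 = two_harmonic_tone(f0_first * interval_ratio, rank)  # second tone, a fifth higher in F0
    print(f"lowest rank N = {rank}; harmonics of tone 1 at "
          f"{rank * f0_first:.0f} and {(rank + 1) * f0_first:.0f} Hz")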

The two successive harmonics were presented either monotically or dichotically at 14 dB sensation level (SL). This level was chosen because the goal of the study was to measure synthetic pitch, and synthetic pitch is most easily measured at low sensation levels (Houtsma, 1979). The audibility of the pure-tone components was verified for each listener. Finally, to determine whether performance on the task changed at a higher sensation level, one listener (L3) was also tested at 24 dB SL. To eliminate the possible confounding effects of combination tones, a low-pass masking noise (spectrum level 10 dB/Hz; low-pass cutoff frequency half an octave below the lowest component in the stimulus) was presented to the listener's test ear in the monotic condition.

Procedure: Percent correct performance on the musical interval identification task was measured as a function of the average rank (N) of the lowest harmonic for both monotic and dichotic conditions. On each trial, listeners were presented with two CTs, separated by 500 ms, and were asked to identify which of the seven musical intervals the pair of tones comprised.

Results and discussion: The results from the three listeners with cochlear-based hearing loss are similar in form to those of the normal-hearing listeners tested by Houtsma and Goldstein (1972) on the same task. First, performance is very similar in the monotic and dichotic conditions. Second, the listeners with hearing loss show excellent musical interval identification at low harmonic numbers, but fall to chance performance at higher harmonic numbers. However, the decrease in performance with increasing N occurs at much lower harmonic numbers in the listeners with hearing loss than in listeners with normal hearing. The sharp fall-off in the data of the listeners with hearing loss is most pronounced at higher F0s. This indicates that the existence region for the fundamental pitch of two-harmonic complex tones in listeners with high-frequency hearing loss is confined to a lower range of frequencies than in normal-hearing listeners.

Figure 7: Percent correct musical interval identification as a function of average lowest harmonic for monotic (unfilled circles) and dichotic presentation (filled circles) of harmonics for F0 ranging from 100 Hz to 500 Hz for listeners L1, L2, and L3. Data shown are for 14 dB SL. Data for listener L3 at 24 dB SL are shown for monotic presentation (unfilled squares) and for dichotic presentation (filled squares) of harmonics. Also shown are the monotic (solid line) and dichotic (dashed line) results, averaged from the three normal-hearing musicians from Houtsma and Goldstein (1972).

Effect of the intensity of the stimulus: Figure 7 also shows the results for listener L3 for stimuli presented monotically (unfilled squares) and dichotically (filled squares) at 24 dB SL. Increasing the sensation level of the stimuli by 10 dB did not affect his performance on the musical interval identification task. Houtsma and Goldstein (1972) included eight musical intervals, four below the F0 and four above the F0, such that, on average, the F0 of the second tone equalled the F0 of the first (unison). As such, the upper frequency limit of performance was simply the harmonic number corresponding to 40% performance multiplied by F0. As shown in Figure 8, the upper frequency limit of performance is similar in the monotic and dichotic conditions for all listeners. The upper frequency limit of performance in the normal-hearing listeners from Houtsma and Goldstein (1972) approaches 5000 Hz. The upper frequency limit of performance in the three listeners with hearing loss is lower and is related to the degree of high-frequency hearing loss. Listener L1 shows an upper frequency limit between 2800 and 3300 Hz, which is consistent with his thresholds worsening at 3000 Hz and above. Listener L3 shows an upper frequency limit at about 1500 Hz, which is consistent with his thresholds worsening between 1000 and 2000 Hz.

The relationship between degree of loss and decreased ability to perceive a complex-tone pitch cannot be attributed to reduced audibility in the region of hearing loss, since all of the harmonics were presented at equal sensation level.

Figure 8: The upper frequency limit of performance for the monotic and dichotic conditions for the same listeners. The upper frequency limit of performance is defined here as the average frequency of the second (higher) tone of the musical intervals at the average harmonic number for which performance falls to 40% correct. The average frequency ratio of the second to the first tone was taken as the geometric mean of the range of frequency ratios: 350 cents, a ratio of 1.224:1. Thus, by way of example, the upper frequency limit of performance for listener L2 at 200 Hz was calculated as follows: performance reached 40% at N = 5, so the upper frequency limit of performance was 200 × 5 × 1.224 = 1224 Hz.

To further investigate the ability of listeners to process musical pitch information in regions of normal hearing and in regions of hearing loss, the ability of two of the three listeners to perform pure-tone octave judgments was tested at reference frequencies of 125, 250, 500, 1000 and 2000 Hz. Both reference and comparison signals were 500-ms pure tones presented monotically at 14 dB SL. As shown in Table I, the results for L1 and L3 are based on five judgments by each listener at each reference frequency and are expressed in terms of cents above the reference frequency (1200 cents would correspond to a perfect octave match). Normal-hearing musicians can make consistent octave matches for (upper) frequencies up to at least 5000 Hz, with standard deviations on the order of 5 to 20 cents (Ward, 1954). There is also a significant tendency to stretch the octave (i.e., to match to a value greater than 1200 cents), by up to a semitone (100 cents), at very low and very high frequencies. In contrast to normal-hearing musicians, octave judgments by listeners L1 and L3 showed increased variability in the regions of hearing loss: above 2000 Hz for listener L1 and above 1000 Hz for listener L3 (although the variability of L3 was also high for matches relative to 250 Hz).

These results further support the idea that something other than audibility decreases a listener's ability to process musical pitch information in regions of hearing loss.
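Returning to the calculation described in the Figure 8 caption, a minimal Python sketch of the upper-frequency-limit estimate (values taken from the worked example for listener L2; the cents-to-ratio conversion is the standard one):

    f0 = 200.0                        # Hz, F0 of the first tone
    n_at_40_percent = 5               # average lowest harmonic number where performance falls to 40%
    mean_ratio = 2 ** (350 / 1200)    # 350 cents expressed as a frequency ratio (~1.224)
    upper_limit = f0 * n_at_40_percent * mean_ratio
    print(f"upper frequency limit ~ {upper_limit:.0f} Hz")   # ~1224 Hz, as in the caption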

If performance by listeners with hearing loss were at all limited by peripheral frequency resolution, then a difference in performance between monotic and dichotic presentation of harmonics in the musical interval identification task would be expected. The similarity of dichotic and monotic performance indicates that peripheral frequency resolution per se is not limiting performance by listeners with hearing loss. This is consistent with the overall low correlation between complex-tone frequency DLs and psychophysical measures of frequency resolution in impaired listeners in other studies (Moore, 1995). The fact that the upper frequency limit for dichotic CT pitch in listeners with normal hearing corresponds roughly to frequencies where temporal synchrony in the auditory nerve is highly degraded (e.g., Palmer and Russell, 1986; Johnson, 1980) suggests that the upper frequency limit may be a consequence of a lack of temporal information. This, in turn, suggests that the poor performance of listeners with HL when the harmonics are in a region of loss may be due to abnormally degraded temporal information from that region. Although the literature on neural synchrony in impaired ears is sparse, there is some evidence of reduced synchrony in neurons innervating regions of impairment (Woolf et al., 1981). Some psychophysical studies also support the idea that reduced neural synchrony affects performance by listeners with hearing loss (e.g., Moore, 1995).

Temporal fine structure

Hopkins, K. and Moore, B. C. J. (2007). Moderate cochlear hearing loss leads to a reduced ability to use temporal fine structure information. J. Acoust. Soc. Am., 122(2), 1055-1068.

From this study we can learn about the effects of the number of the centre component (N), hearing loss, fundamental frequency (100, 200 and 400 Hz), stimulus type (shaped and non-shaped), and TFS and envelope cues.

Hopkins and Moore (2007) tested the ability of normal-hearing and hearing-impaired subjects to discriminate a harmonic complex tone from a frequency-shifted tone, in which all components were shifted up by the same amount in Hz (de Boer, 1956). The frequency-shifted tone had very similar temporal envelope and spectral envelope characteristics to the harmonic tone, but a different TFS. All tones were passed through a fixed bandpass filter to reduce excitation-pattern cues. When the filter was centred on the 11th component, so that the components within the passband were unresolved, subjects with moderate cochlear hearing loss performed poorly, while normal-hearing subjects could do the task well.

They concluded that moderate cochlear HL leads to a reduced ability to use TFS information. The reason for this is not clear. One possibility is that the precision of phase locking is reduced by cochlear HL. One study found that phase locking was reduced in animals with induced hearing loss (Woolf et al., 1981), but another study found normal phase locking in such animals (Harrison and Evans, 1979). It is unclear whether the types of pathologies that cause cochlear HL in humans lead to reduced phase locking. Another possibility is that TFS information is decoded by cross-correlation of the outputs of two points on the basilar membrane (Loeb et al., 1983; Shamma, 1985); a deficit in this process, produced by a change in the travelling wave on the BM, would impair the ability to use TFS information even if phase locking were normal. The broader auditory filters typically associated with cochlear hearing loss (Liberman and Kiang, 1978; Glasberg and Moore, 1986) could also lead to a reduced ability to use TFS information. The TFS at the output of these broader filters in response to a complex sound will have more rapid fluctuations and be more complex than normal, and such outputs may be uninterpretable by the central auditory system (Sek and Moore, 1995; Moore and Sek, 1996). A reduced ability to use TFS information could explain some of the perceptual problems of hearing-impaired subjects (Lorenzi et al., 2006a). TFS information may be important when listening in background noise, especially when the background is temporally modulated, as is often the case in real life, for example when more than one person is speaking.

Normal-hearing subjects show better speech intelligibility (or lower speech reception thresholds, SRTs) when listening in a fluctuating background than when listening in a steady background (Festen and Plomp, 1990; Baer and Moore, 1994; Peters et al., 1998), an effect which is sometimes called masking release. Hearing-impaired subjects show a much smaller masking release, and it has been suggested that this may be because they are poorer at listening in the dips of a fluctuating masker than normal-hearing subjects (Lorenzi et al., 2006b). Reduced audibility may account for some of the reduction in masking release measured for hearing-impaired subjects (Bacon et al., 1998), although the effect persists even when audibility is restored (Peters et al., 1998; Lorenzi et al., 2006a). TFS information may be important in dip-listening tasks, as it could be used to identify points in the stimulus when the level of the target is high relative to the level of the masker; if the target and masker do not differ in their TFS, or no TFS information is available, dip listening may be ineffective.

Figure 9: Excitation patterns (Moore et al., 1997) for shaped stimuli with F0=400 Hz and N=7, 11, and 18, presented in pink noise with a spectrum level of 18 dB at 1000 Hz. This noise was designed to give roughly the same excitation pattern as the TEN noise used in the experiment. The patterns are plotted only over the frequency range where the shaped stimuli produced excitation comparable to or above that produced by the noise. Patterns for harmonic and frequency-shifted stimuli are plotted as solid and dotted lines, respectively. The frequency shift was 0.5F0 Hz (the maximum shift).

Figure 10: Mean d′ values for normal-hearing subjects for discrimination of harmonic and frequency-shifted complex tones, plotted as a function of N. Each panel shows results for one F0. Open and filled circles show d′ values for shaped and non-shaped stimuli, respectively. Error bars show 1 standard deviation of the mean. The square symbol indicates a d′ value that is not significantly different from zero; this point is plotted at zero.

The figure shows that performance worsened as N increased. This was true for all F0s tested and for both stimulus types. Performance was better for the non-shaped stimuli than for the shaped stimuli.

Effect of hearing loss: For the hearing-impaired subjects, discrimination of shaped stimuli with N = 11 or 18 was very poor whenever the stimuli fell in a frequency region where the hearing loss was 30 dB or more, suggesting that they could make very little use of temporal fine structure information to discriminate harmonic and frequency-shifted complexes.

Air conduction thresholds for the test ears of the hearing-impaired subjects. The age of each subject is also shown.

This is an important result, as other studies have presented data that indirectly imply an inability of subjects with cochlear HL to make use of TFS information, but this had not previously been shown directly (Moore et al., 2006a). Note that the poor performance of the hearing-impaired subjects for shaped stimuli with N = 11 or 18 confirms that these subjects were not able to use excitation-pattern cues to perform the task in these conditions. Some subjects showed above-chance performance for shaped stimuli when N = 7. Although this may reflect a limited ability to use temporal fine structure information at the frequencies in question, discrimination could also have been based on comparison of the frequencies of individual components, since, despite the poor frequency selectivity seen in hearing-impaired listeners, components with low harmonic numbers may have been resolved for some subjects (Moore et al., 2006a). Subjects HI 6 and HI 7 performed better than the other hearing-impaired subjects with shaped stimuli for some conditions. This can be attributed to the lesser severity of their hearing impairments: their audiometric thresholds were within the normal range at high frequencies, and the varying deficits seen across frequency reflect this. For both subjects, performance was worse when F0 = 100 Hz, in which case the components fell into the frequency region where the hearing loss was greatest. Performance in conditions where F0 = 400 Hz was normal for both subjects, as would be predicted by their near-normal audiometric thresholds at high frequencies. The complex tones could have been discriminated using envelope, temporal fine structure or spectral information, or any combination of these. Poor performance could, in part, be due to poor frequency selectivity, which would reduce the number of resolved harmonics that could be directly compared. However, the results are also consistent with a loss of the ability to use TFS information. F0DLs for complexes with N = 11 and 18 were similar, suggesting that envelope cues alone may have been used for both values of N, as proposed by Moore and Moore (2003a).

Figure 11: d′ values for F0 discrimination by normal-hearing and hearing-impaired subjects. Individual data for three hearing-impaired subjects are plotted (HI 1, HI 2 and HI 5; filled symbols), and average data for four normal-hearing subjects are shown in each frame for comparison (open symbols). Error bars indicate 1 standard deviation of the mean.

Envelope cues: Normally hearing subjects were able to use envelope-shape cues to discriminate the shaped stimuli when N = 18 and F0 = 100 or 200 Hz, while the hearing-impaired subjects apparently were not able to use these cues, since they mostly performed close to chance in these conditions. This suggests that the hearing-impaired subjects had some deficit in the ability to process envelope cues, perhaps because the cues were subtle. These results also suggest that, for the normal-hearing subjects, temporal fine structure cues allowed better performance than envelope-shape cues, so the latter were redundant. Similarly, the absence of a significant difference between the performance of normal-hearing subjects for shaped and non-shaped stimuli when N = 7 and 11 indicates that subjects did not use the extra excitation-pattern cue that was available for the non-shaped stimuli. In contrast, the better performance for non-shaped than for shaped stimuli when N = 18 suggests that the additional spectral cue allowed better performance in those conditions.

References

Hopkins, K., Moore, B. C. J., and Stone, M. A. (2008). Effects of moderate cochlear hearing loss on the ability to benefit from temporal fine structure information in speech. J. Acoust. Soc. Am., 123(2), 1140-1153.

Moore, B. C. J., and Peters, R. (1992). Pitch discrimination and phase sensitivity in young and elderly subjects and its relationship to frequency selectivity. J. Acoust. Soc. Am., 91(5).

Plack, C. J., and Oxenham, A. J. (2005). Pitch perception. In Pitch: Neural Coding and Perception, edited by C. J. Plack, A. J. Oxenham, A. N. Popper and R. Fay (Springer, New York).

Moore, B. C. J. (2008). The role of temporal fine structure processing in pitch perception, masking, and speech perception for normal-hearing and hearing-impaired people. JARO, 9, 399-406.

Please refer to the PowerPoint for a summary of the articles.
