Auditory-nerve responses predict pitch attributes related to musical consonance-dissonance for normal and impaired hearing
Audiograms for normal-hearing and hearing-impaired conditions. The impaired audiogram was modeled after data reported by Tufts et al. (2005), who measured musical interval consonance rankings in subjects with a flat, moderate sensorineural hearing loss (SNHL). Pure-tone averages (PTAs) for the normal and impaired models are 0 and 45 dB HL, respectively.
Procedure for computing neural pitch salience from AN responses to musical intervals. Single-fiber operations and population-level analyses are separated by the vertical dotted line. Stimulus time waveforms [x(t) = two-note pitch interval] were presented to a computational model of the AN (Zilany et al., 2009) containing 70 model fibers (CFs: 80–16 000 Hz). From the PSTH, the time-weighted autocorrelation function (ACF) was constructed for each fiber. Individual fiber ACFs were then summed to create a pooled, population-level ACF (ACFpop). The ACFpop was then passed through a series of periodic sieve templates. Each sieve template represents a single pitch (f0), and the magnitude of its output represents a measure of neural pitch salience at that f0. Analyzing the outputs across all possible pitch sieve templates (f0 = 25–1000 Hz) results in a running salience curve for a particular stimulus (“Pitch salience”). The peak magnitude of this function was taken as an estimate of neural pitch salience for a given interval [PS(st), where st represents the separation of the two notes in semitones]. Inset figure showing AN model architecture adapted from Zilany and Bruce (2006), with permission from The Acoustical Society of America.
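The pooling and sieve steps described in this caption can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `pooled_acf` and `sieve_salience`, the 22 kHz sampling rate, the 50 ms maximum lag, the 5 Hz f0 grid, and the toy rectified-sinusoid "PSTHs" are all hypothetical, the AN model itself is not reproduced, and the time weighting of the ACF mentioned above is omitted.

```python
import numpy as np

def pooled_acf(psths, max_lag):
    """Sum per-fiber autocorrelations (lags 0..max_lag) into one pooled ACF."""
    pooled = np.zeros(max_lag + 1)
    for psth in psths:
        x = np.asarray(psth, dtype=float)
        x = x - x.mean()                       # remove DC so ACF troughs go negative
        full = np.correlate(x, x, mode="full")
        zero = len(x) - 1                      # index of lag 0 in the full output
        pooled += full[zero:zero + max_lag + 1]
    return pooled

def sieve_salience(acf, fs, f0_grid):
    """For each candidate f0, sum the ACF at integer multiples of its period."""
    max_lag = len(acf) - 1
    salience = np.empty(len(f0_grid))
    for i, f0 in enumerate(f0_grid):
        period = fs / f0                       # period in samples (may be fractional)
        n_teeth = int(max_lag // period)       # sieve "teeth" that fit in the ACF
        lags = np.rint(period * np.arange(1, n_teeth + 1)).astype(int)
        salience[i] = acf[lags].sum()
    return salience

# Toy input (assumption): three "fibers" with half-wave rectified 220 Hz activity.
fs = 22000
t = np.arange(round(0.2 * fs)) / fs
psths = [np.maximum(0.0, np.sin(2 * np.pi * 220 * t + ph)) for ph in (0.0, 1.0, 2.0)]

acf_pop = pooled_acf(psths, max_lag=round(0.05 * fs))
f0_grid = np.arange(25, 1001, 5)               # candidate pitches, 25-1000 Hz
salience = sieve_salience(acf_pop, fs, f0_grid)
best_f0 = f0_grid[np.argmax(salience)]         # pitch with the largest salience
peak_salience = salience.max()                 # analogous to PS(st) for one stimulus
```

In this sketch the unnormalized sum over sieve teeth penalizes subharmonics (fewer teeth) and higher multiples (teeth that land in ACF troughs); a real sieve analyzer may weight or normalize its templates differently.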
Pooled autocorrelation functions (cf. ISIH) (left columns) and running pitch salience (i.e., output of the periodic sieve analyzer) (right columns) computed for three musical intervals for normal (A) and impaired (B) hearing. Pooled ACFs (see Fig. 2, “ACFpop”) quantify periodic activity within AN responses and show clearer, more periodic energy at the fundamental pitch period and its integer-related multiples for consonant (e.g., unison, perfect 5th) than for dissonant (e.g., minor 2nd) pitch intervals. Running pitch salience curves computed from each ACFpop quantify the salience of all possible pitches contained in AN responses. Their peak magnitude (arrows) represents a singular measure of salience for the eliciting musical interval and consequently represents a single point in Figs. 4 and 5.
AN responses correctly predict perceptual attributes of consonance, dissonance, and the hierarchical ordering of musical pitch for normal hearing. Neural pitch salience is shown as a function of the number of semitones separating the interval’s lower and higher pitch over the span of an octave (i.e., 12 semitones). The pitch classes recognized by the equal-tempered Western music system (i.e., the 12 semitones of the chromatic scale) are demarcated by the dotted lines and labeled along the curve. Consonant musical intervals (black) tend to fall on or near peaks in neural pitch salience, whereas dissonant intervals (gray) tend to fall within trough regions, indicating more robust encoding for the former. However, even among intervals common to a single class (e.g., all consonant intervals), AN responses show differential encoding, resulting in the hierarchical arrangement of pitch typically described by Western music theory (i.e., Un > Oct > P5 > P4, etc.). All values are normalized to the maximum of the curve, which was the unison.
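For readers who want to map the semitone axis onto frequencies, the equal-tempered relation can be sketched as below. The frequency ratio of 2^(st/12) per interval is standard; the helper name `upper_note_hz`, the 220 Hz lower note, and the interval abbreviations beyond those used in the caption (Un, Oct, P5, P4) are assumptions for illustration and are not taken from the source.

```python
# Conventional abbreviations for the 13 chromatic dyads from unison to octave.
INTERVAL_NAMES = ["Un", "m2", "M2", "m3", "M3", "P4", "TT",
                  "P5", "m6", "M6", "m7", "M7", "Oct"]

def upper_note_hz(lower_hz, semitones):
    """Frequency of the upper note of an equal-tempered dyad."""
    return lower_hz * 2.0 ** (semitones / 12.0)

lower = 220.0  # hypothetical lower note (Hz); the source does not state one here
dyads = {name: (lower, upper_note_hz(lower, st))
         for st, name in enumerate(INTERVAL_NAMES)}

for name, (f_low, f_high) in dyads.items():
    print(f"{name:>3}: {f_low:.1f} Hz + {f_high:.1f} Hz (ratio {f_high / f_low:.4f})")
```

Because `semitones` can be fractional, the same helper also covers the continuous interval axis between the labeled chromatic steps.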
(Color online) Normal-hearing (A) and hearing-impaired (B) estimates of neural pitch salience as a function of level. Little change is seen in the NH “consonance curve” with decreasing stimulus presentation level. Level effects are more pronounced in the case of HI, where consonant peaks diminish with decreasing intensity. Even after equating sensation levels, NH responses still show a greater contrast between consonant peaks and dissonant troughs than HI responses, indicating that the reduced contrast seen with HI cannot simply be explained in terms of elevated hearing thresholds. For ease of SL comparison, NH at 25 dB SPL (dotted line) is plotted along with the HI curves in B (i.e., NH at 25 dB SPL and HI at 70 dB SPL are each ∼25 dB SL). All values have been normalized to the maximum of the NH 70 dB SPL curve, the unison.
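The sensation-level matching in this caption is simple arithmetic, sketched here as presentation level minus threshold. Using the models' pure-tone averages (0 and 45 dB HL) as the thresholds is a simplifying assumption that ignores the difference between dB HL and dB SPL reference levels, which is consistent with the caption's "∼25 dB SL"; the function name is hypothetical.

```python
def sensation_level(level_db_spl, threshold_db):
    """Approximate sensation level: presentation level minus hearing threshold."""
    return level_db_spl - threshold_db

PTA_NH = 0    # dB HL, normal-hearing model (from the audiogram caption)
PTA_HI = 45   # dB HL, hearing-impaired model (from the audiogram caption)

nh_sl = sensation_level(25, PTA_NH)   # NH presented at 25 dB SPL
hi_sl = sensation_level(70, PTA_HI)   # HI presented at 70 dB SPL
print(nh_sl, hi_sl)                   # both come out to 25
```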
(Color online) Continuous plots of acoustic and neural correlates of musical interval perception. All panels reflect a presentation level of 70 dB SPL. Ticks along the abscissa demarcate intervals of the equal-tempered chromatic scale. Neural pitch salience (A) measures the neural harmonicity/periodicity of dyads as represented in AN responses (same as Fig. 5), and is shown for both normal-hearing (NH) and hearing-impaired (HI) conditions. Similarly, periodic sieve analysis applied to the acoustic stimuli quantifies the degree of periodicity contained in the raw input waveforms (B). Consonant intervals generally evoke more salient, harmonic neural representations and contain higher degrees of acoustic periodicity than adjacent dissonant intervals. Neural (C) and acoustic (D) roughness quantify the degree of amplitude fluctuation/beating produced by partials measured from the pooled PSTH and the acoustic waveform, respectively. Dissonant intervals contain a greater number of closely spaced partials, which produce more roughness/beating than consonant intervals in both the neural and acoustic domains. See Fig. 4 for interval labels.
(Color online) Correlations between neural/acoustic correlates and behavioral consonance scores of equal-tempered chromatic intervals for normal and impaired hearing. Both AN pitch salience (A) and acoustic waveform periodicity (B) show positive correlations with behavioral consonance judgments. That is, consonant intervals, judged more pleasant by listeners, are both more periodic and elicit larger neural pitch salience than dissonant intervals. Neural and acoustic roughness (C and D) are negatively correlated with perceptual data (note reversed abscissa), indicating that intervals deemed dissonant contain a larger degree of roughness/beating than consonant intervals. The explanatory power (R²) of each correlate reveals its strength in predicting the perceptual data: AN neural roughness < (acoustic periodicity ≈ acoustic roughness) < AN neural pitch salience (i.e., harmonicity). Of the neural measures, only AN pitch salience produces the correct ordering and systematic clustering of consonant and dissonant intervals, e.g., maximal separation of the unison (most consonant interval) from the minor 2nd (most dissonant interval). Perceptual data reproduced from Tufts et al. (2005).
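As a reminder of how an explanatory power of this kind is typically obtained, the sketch below squares the Pearson correlation between a correlate and behavioral scores. It assumes a simple linear fit, which the caption does not confirm, and the arrays are placeholder numbers chosen only to exercise the function; they are not values from the paper or from Tufts et al. (2005).

```python
import numpy as np

def explanatory_power(predictor, behavior):
    """R^2 of a simple linear fit: the squared Pearson correlation."""
    r = np.corrcoef(predictor, behavior)[0, 1]
    return r ** 2

# Placeholder values for illustration only (not data from the study).
toy_salience = np.array([0.2, 0.4, 0.6, 0.8])
toy_rating = 2.0 * toy_salience + 1.0          # perfectly linear in salience
toy_roughness = 1.0 - toy_salience             # moves opposite to the ratings

r2_salience = explanatory_power(toy_salience, toy_rating)
r_roughness = np.corrcoef(toy_roughness, toy_rating)[0, 1]   # negative sign
```

Note that R² discards the sign of the relationship, so a negatively correlated measure such as roughness can still have high explanatory power.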
(Color online) Acoustic and neural correlates of behavioral chordal sonority ratings. Presentation level was 70 dB SPL. Neural pitch salience (A) derived from NH AN responses (squares) shows close correspondence to perceptual ratings of chords reported for nonmusician listeners (Cook and Fujisawa, 2006; black circles). Salience values have been normalized with respect to the NH unison presented at 70 dB SPL. Similar to the dyad results, HI estimates for chords (triangles) indicate that the overall differences between triad qualities are muted with hearing loss. Roughness computed from AN responses (C) shows that only HI responses contain meaningful correlates of harmony perception; NH neural roughness does not predict the ordering of behavioral chordal ratings. In contrast, both acoustic periodicity (B) and roughness (D) provide correlates of chord perception and are inversely related; consonant triads contain larger degrees of periodicity and relatively less roughness than dissonant triads.