Effect of spectral normalization on different talker speech recognition by cochlear implant users
10.1121/1.2897047
http://aip.metastore.ingenta.com/content/asa/journal/jasa/123/5/10.1121/1.2897047

Figures

FIG. 1.

Implementation framework of the GMM-based spectral normalization algorithm.

FIG. 2.

Normalized talker distortion as a function of number of channels. Solid line: Without spectral normalization. Dashed line: With spectral normalization. Note that the talker distortion between talkers F1 and M1 (unprocessed speech) was used as the reference.

FIG. 3.

Individual and mean sentence recognition performance for talkers M1 and F1. For subjects S1–S3, performance with F1 was better than that with M1; for subjects S4–S9, performance was better with M1 than with F1. The error bars show , and the asterisks show significantly different performance between the two talkers .

FIG. 4.

Waveforms for the sentence “Glue the sheet to the dark blue background.” Top panel: Pitch-shift transformation T0.6 (upward pitch shift). Middle panel: Reference talker T1.0 (unprocessed speech from talker F1). Bottom panel: Pitch-shift transformation T1.6 (downward pitch shift).

FIG. 5.

Spectral envelopes for different processing conditions in Experiment 2. Top panel: Spectral envelopes for reference talker T1.0 and pitch-shift transformations T0.6 and T1.6. Bottom panel: Spectral envelopes for T1.0 and spectral transformations T0.6-to-T1.0 and T1.6-to-T1.0.

FIG. 6.

NH subjects’ overall speech quality ratings for the pitch-shift transformations, with (open symbols) and without (closed symbols) spectral normalization. The error bars show , and the asterisks indicate significantly different ratings with spectral normalization . Note that source talker T1.0 (unprocessed speech from talker F1) was used to anchor the subjective quality ratings.

FIG. 7.

Sentence recognition performance for NH and CI subjects, with (open symbols) and without (closed symbols) spectral transformation, as a function of pitch-shift transformations. The error bars show , and the asterisks indicate significantly different performance after spectral transformation .

Tables

TABLE I.

Subject demographics for the cochlear implant patients who participated in the present study.

TABLE II.

Performance difference between unprocessed source talkers (i.e., M1 vs F1), and between spectrally normalized and unprocessed talkers. Note that because the performance with talkers M1 and F1 differed among individual subjects, comparisons are made in terms of the “Better” and “Worse” talker. Bold numbers indicate significant differences in performance across different sentence lists .

TABLE III.

Pitch and formant analysis for the pitch-shift and spectral transformations in Experiment 2. The target F0 for the pitch-shift transformations was scaled according to the pitch-stretching ratio used for processing; the target F0 for the spectral transformation refers to the measured F0 values after pitch-stretching. The F0s and formant frequencies were measured with the WAVESURFER 1.8.5 software. The F0s were averaged across all IEEE sentences. The formant frequencies were estimated for the vowel /ɪʏ/ from the sentence “Glue the sheet to the dark blue background.” Note that reference talker T1.0 (in bold) was F1 from Experiment 1.

TABLE IV.

significance values for linear regressions performed between the unprocessed talkers from Experiment 1 (M1 and F1) and the pitch-shift transformations from Experiment 2 (T0.6, T0.8, T1.2, T1.4, T1.6).

2008-05-01
2014-04-21