Time-frequency integration characteristics of hearing are optimized for perception of speech-like acoustic patterns a)
a) Parts of this work were presented in “Average spectrotemporal structure of continuous speech matches with the frequency resolution of human hearing,” Proceedings of Interspeech 2012, Portland, Oregon, September 2012.

FIG. 1.

Examples of STDFs estimated from speech for frequencies of 250, 1000, 3000, and 6000 Hz. Red denotes high dependency, yellow medium, and dark blue low dependency between the given frequencies at the given lag.

FIG. 2.

SDFs computed from speech using Eq. (8). Each peaked curve represents the overall statistical dependency between a fixed center frequency and every other frequency, with the maximum dependency occurring where the two frequencies coincide. For a given center frequency, the SDF can be interpreted as a bandpass filter with maximum gain at that center frequency. Only the SDFs for every fifth center frequency are shown for visual clarity. Note that the dependencies at coinciding frequencies are not fixed (deterministic), since the SDF is not a measure of the instantaneous correlation between frequencies but is computed across all temporal delays at which cross-channel dependencies exist [see Eq. (5)].

FIG. 3.

Interpolated SDFs for center frequencies 750, 2833, and 6167 Hz. The upper horizontal lines denote the threshold δ = −0.12 that provides the best fit to ERBs and lower vertical dashed lines denote the optimal δ = −0.17 that provides the best fit to Bark bandwidths.
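
The bandwidth reading described in this caption can be illustrated with a minimal sketch. It assumes that δ is an offset measured downward from the peak of a single-peaked curve, in the same units as the curve, and that the bandwidth is the distance between the two points where the curve crosses that level. The function name `bandwidth_at_threshold` and the triangular test curve are hypothetical and are not taken from the article.

```python
import numpy as np

def bandwidth_at_threshold(freqs, values, delta):
    """Width (in Hz) of a single-peaked curve at `delta` below its maximum.

    Assumption: `delta` is negative and relative to the peak value;
    the article's exact definition is not shown in this excerpt.
    """
    freqs = np.asarray(freqs, dtype=float)
    values = np.asarray(values, dtype=float)
    peak = int(np.argmax(values))
    level = values[peak] + delta

    def crossing(step):
        # Walk away from the peak while the curve stays above the level.
        i = peak
        while 0 <= i + step < len(values) and values[i + step] > level:
            i += step
        j = i + step
        if not 0 <= j < len(values):
            return freqs[i]  # curve never reaches the level on this side
        # Linear interpolation between sample i (above) and j (at or below).
        t = (values[i] - level) / (values[i] - values[j])
        return freqs[i] + t * (freqs[j] - freqs[i])

    return crossing(+1) - crossing(-1)

# Hypothetical triangular curve peaking at 1000 Hz, sampled every 100 Hz.
demo_freqs = np.arange(0, 2001, 100)
demo_values = -np.abs(demo_freqs - 1000) / 1000.0
demo_bw = bandwidth_at_threshold(demo_freqs, demo_values, delta=-0.12)
print(demo_bw)
```

Linear interpolation between samples is one simple choice here; a smoother interpolant would shift the crossings slightly on coarsely sampled curves.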

FIG. 4.

Bandwidths of SDFs as a function of center frequency with the attenuation parameter separately fitted to ERB and Bark data. Bandwidths of ERB (straight solid line) and Bark critical bands (dashed curved line) are shown as a reference.
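
For orientation, the two reference curves can be sketched with widely used closed-form approximations. This assumes the ERB formula of Glasberg and Moore (1990) and the Bark critical-bandwidth formula of Zwicker and Terhardt (1980); the excerpt does not state which versions the article used, so the function names and formulas below are an assumption rather than the authors' implementation.

```python
def erb_bandwidth(f_hz):
    """ERB in Hz (Glasberg and Moore, 1990); linear in frequency."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)

def bark_critical_bandwidth(f_hz):
    """Critical bandwidth in Hz (Zwicker and Terhardt, 1980)."""
    return 25.0 + 75.0 * (1.0 + 1.4 * (f_hz / 1000.0) ** 2) ** 0.69

# Center frequencies taken from the FIG. 3 caption.
for f in (750.0, 2833.0, 6167.0):
    print(f, round(erb_bandwidth(f), 1), round(bark_critical_bandwidth(f), 1))
```

The linear form of the ERB formula is consistent with the straight line mentioned in the caption, while the Bark formula grows nonlinearly with frequency.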

FIG. 5.

Relative error (in hertz) between SDF bandwidths and the bandwidths of the ERB (solid line) and Bark critical bands (dashed-dotted line).

FIG. 6.

Examples of TDFs measured for frequencies of 250, 1000, 2000, and 6000 Hz up to a maximum delay of 600 ms. The TDF is shown without the logarithm in Eq. (6) on the left in order to visualize the decay of structure to zero at long delays and with the logarithm on the right.

FIG. 7.

Left: RMSE between the 3166-Hz integrated TDF and the power-law function for different exponents. Right: Integrated TDF as a function of integral length and the corresponding threshold according to the power law with the best-fitting exponent of 0.77.

FIG. 8.

Left: Weighting functions of the 1-kHz TDF (solid line), weights of Eq. (12) from fitted to the data of ; dashed line), and the weights of Eq. (12) with parameters fitted to the detection thresholds of TDF in Eq. (10) (dashed-dotted line). Right: Detection thresholds for the TDF (solid line), weights of ; dashed line), and weights with parameters fitted to the TDF (dashed-dotted line) as a function of stimulus duration.



Time constants of the TDFs as a function of frequency and length of the TDF interval included in the analysis. Unexplained variances (1 − R²) of the exponential fits to the TDF data are also shown.

