Volume 133, Issue 5, May 2013
- jasa express letters
- letters to the editor
- general linear acoustics 
- nonlinear acoustics 
- aeroacoustics, atmospheric sound 
- underwater sound 
- ultrasonics, quantum acoustics, and physical effects of sound 
- transduction 
- structural acoustics and vibration 
- noise: its effects and control 
- architectural acoustics 
- acoustical measurements and instrumentation 
- acoustic signal processing 
- physiological acoustics 
- psychological acoustics 
- speech production 
- speech perception 
- speech processing and communication systems 
- music and musical instruments 
- bioacoustics 
- acoustical news
- acoustical standards news
- reviews of acoustical patents
- ica 2013 montréal
- award encomiums
Index of content:
- JASA EXPRESS LETTERS
133(2013); http://dx.doi.org/10.1121/1.4795851
This paper introduces an approach for online speech source clustering and separation, which is based on the utilization of the multichannel location information in a recursive expectation maximization (EM) algorithm. Specifically, the normalized multichannel speech-recording vector is employed as a feature vector and is modeled using a Watson mixture model. The model parameters are determined by maximizing the data likelihood at every time-frequency slot in an online processing manner. Consequently, the proposed approach can continuously adjust the speech clusters. Promising results showing the advantage of the proposed approach over the batch EM algorithm in the case of two speakers with speaker movement are obtained.
133(2013); http://dx.doi.org/10.1121/1.4798378
Listeners are unable to report the physical order of particular sequences of brief tones. This phenomenon of temporal dislocation depends on tone durations and frequencies. The current study empirically shows that it also depends on the spatial location of the tones. Dichotically testing a three-tone sequence showed that the central tone tends to be reported as the first or the last element when it is perceived as part of a left-to-right motion. Since the central-tone dislocation does not occur for right-to-left sequences of the same tones, this indicates that there is a spatial bias in the perception of sequences.
133(2013); http://dx.doi.org/10.1121/1.4798268
Time reversal (TR) utilizes an array of transducers, a time reversal mirror (TRM), to locate sources. Here TR is applied to simple sources using steady-state waveforms in a numerical, point source model in a half-space environment. It is found that TR can effectively localize a simple source broadcasting a continuous wave, depending on the angular spacing. Furthermore, the angular spacing and the aperture of the TRM are the most important parameters when creating a setup of receivers for imaging a source. This work optimizes a TRM when the source's location is known within a region of certainty.
Experimental evaluation of inverse filtering using physical systems with known glottal flow and tract characteristics
133(2013); http://dx.doi.org/10.1121/1.4798619
The technique presented here uses an impedance head to measure the input impedance spectrum of a physical model of a vocal tract, and then to inject a known glottal flow waveform into the tract. The sound measured outside the mouth is used to evaluate inverse filtering techniques by comparison with the known glottal flow and measured acoustical properties of the tract. The normalized least square errors in the glottal flow were typically a percent or less in the time domain and several percent in the frequency domain. Accurate determination of resonance frequencies and bandwidths required a suitable order of inverse filter.
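A minimal sketch of how a normalized least-squares error between an estimated and a known waveform might be computed. The normalization used here (squared error divided by the energy of the reference) is an assumption, since the abstract does not state the exact definition; the function name and sample values are hypothetical.

```python
def normalized_ls_error(estimate, reference):
    """Squared error between two equal-length sequences, normalized by
    the energy of the reference (assumed definition, not the paper's)."""
    if len(estimate) != len(reference):
        raise ValueError("sequences must have the same length")
    num = sum((e - r) ** 2 for e, r in zip(estimate, reference))
    den = sum(r ** 2 for r in reference)
    if den == 0:
        raise ValueError("reference has zero energy")
    return num / den

# Hypothetical glottal-flow samples: known waveform and inverse-filtered estimate
known_flow = [0.0, 1.0, 2.0, 1.0, 0.0]
estimated_flow = [0.0, 1.1, 1.9, 1.0, 0.0]
err = normalized_ls_error(estimated_flow, known_flow)
```

With these made-up samples the error is well under one percent, the order of magnitude the abstract reports for the time domain.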
133(2013); http://dx.doi.org/10.1121/1.4798620
The current study examined Vowel Inherent Spectral Change (VISC) of English vowels spoken by English-, Chinese-, and Korean-native speakers. Two metrics, spectral distance (amount of spectral shift) and spectral angle (direction of spectral shift) of formant movement from the onset to the offset, were measured for 12 English monophthongs produced in a /hvd/ context. While Chinese speakers showed significantly greater spectral distances of vowels than English and Korean speakers, there was no significant effect of speakers' native language on spectral angles. Comparisons to their native vowels for Chinese and Korean speakers suggest that VISC might be affected by language-specific phonological structure.
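As a rough illustration of the two metrics, the sketch below computes a spectral distance and a spectral angle from onset and offset formant values. It assumes a plain F1-F2 plane in Hz, Euclidean distance, and an angle measured from the F1 axis; the study's exact definitions (frequency scale, axis convention) are not given in the abstract, and the function name and sample values are hypothetical.

```python
import math

def visc_metrics(onset, offset):
    """Return (spectral_distance, spectral_angle_deg) for one vowel token.

    onset, offset: (F1, F2) pairs in Hz. Assumed definitions, not the
    study's: Euclidean distance in the F1-F2 plane, and the angle of the
    movement vector measured counterclockwise from the +F1 axis.
    """
    d_f1 = offset[0] - onset[0]
    d_f2 = offset[1] - onset[1]
    distance = math.hypot(d_f1, d_f2)
    angle = math.degrees(math.atan2(d_f2, d_f1))
    return distance, angle

# Hypothetical token: F1 rises by 30 Hz and F2 rises by 40 Hz
dist, ang = visc_metrics((500.0, 1500.0), (530.0, 1540.0))
```

Keeping amount and direction as separate numbers mirrors the abstract's finding that the speaker groups differed in one metric but not the other.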
133(2013); http://dx.doi.org/10.1121/1.4798648
This letter presents results from a study on diffusive architectural surfaces and auditory perception. Spatial discrimination of multiple sources is investigated in a simulated performance venue with various diffusive surface treatments. Simulations were generated with closely spaced sound sources on the stage of a concert hall and a listener in the audience area. Subjects were asked to distinguish signals in which pairs of simultaneous talkers were presented at various lateral separations, in halls with flat or diffusive surfaces. The experiments reveal that discriminating differences in the lateral arrangement of sources is possible at narrower separation angles when reflections come from flat rather than diffusive surfaces.
133(2013); http://dx.doi.org/10.1121/1.4798269
Previous work by Polka, Rvachew, and Molnar [Infancy 13(5), 421-439 (2008)] has reported that infants are poor at focusing their attention on a particular frequency range, and, as a result, are distracted by maskers that are outside of the target frequency range. The current study explores this effect of irrelevant distractors further and finds that 8-month-old infants are significantly less affected by maskers outside the frequency range (off-channel maskers) than by on-channel maskers. Thus while infants may display difficulty ignoring irrelevant distractors, they are able to do so to at least some degree, suggesting some ability to perceive speech from spectrally remote maskers, despite the demonstrated presence of greater informational masking at this age.
Preboundary lengthening and preaccentual shortening across syllables in a trisyllabic word in English
133(2013); http://dx.doi.org/10.1121/1.4800179
This study demonstrates some new aspects of preboundary lengthening and preaccentual shortening on a test word banana in American English. Preboundary lengthening was found to be extended to the initial unstressed syllable beyond the main-stressed syllable, presenting more complexity than has previously been assumed. Preaccentual shortening was observed regardless of boundary strength or the stress pattern (trochaic vs iambic) of the following context word, suggesting that it operates globally at an utterance level. The locus of preaccentual shortening, however, was modulated by prosodic boundary: It is realized on the final vowel IP-finally but on the non-final stressed vowel IP-medially.
English vowel identification in long-term speech-shaped noise and multi-talker babble for English and Chinese listeners
133(2013); http://dx.doi.org/10.1121/1.4800191
The identification of 12 English vowels was measured in quiet and in long-term speech-shaped noise (LTSSN) and multi-talker babble for English-native (EN) listeners and Chinese-native listeners in the U.S. (CNU) and China (CNC). The signal-to-noise ratio was manipulated from −15 to 0 dB. As expected, EN listeners performed significantly better in quiet and noisy conditions than CNU and CNC listeners. Vowel identification in LTSSN was similar between CNU and CNC listeners; however, performance in babble was significantly better for CNU listeners than for CNC listeners, indicating that exposing non-native listeners to native English may reduce informational masking of multi-talker babble.
133(2013); http://dx.doi.org/10.1121/1.4798618
In previous research on distributional training of non-native speech sounds, distributions were always discontinuous: typically, each of only eight different stimuli was repeated multiple times. The current study examines distributional training with continuous distributions, in which all presented tokens are acoustically different. Adult Spanish learners of Dutch were trained on either a discontinuous or a continuous bimodal distribution of the Dutch vowel contrast /ɑ/–/aː/. Both groups improved their perception of the contrast; this shows that continuous training works as well as discontinuous training. Using the more natural continuous distributions is therefore recommended for future distributional learning experiments.
Modifying the normalized covariance metric measure to account for nonlinear distortions introduced by noise-reduction algorithms
133(2013); http://dx.doi.org/10.1121/1.4800189
In this study, two methods are proposed to modify the normalized covariance metric (NCM) measure to reduce the effects of gain-induced nonlinear distortions introduced by most noise-suppression algorithms. Considering that the gain-induced distortions behave differently depending on the signal-to-noise ratio between the noise-reduced speech and the noise, the first approach introduces a penalty factor involving this ratio in the modified NCM measure. The second approach deemphasizes segments marked with amplification distortions that contribute less to intelligibility via adaptive thresholding. Significantly higher correlations with intelligibility scores were obtained from the modified NCM measures compared with the original NCM measures.
133(2013); http://dx.doi.org/10.1121/1.4802186
A reference-free speech quality measure is proposed and assessed for hearing aid applications. The proposed speech quality metric is validated with subjective ratings obtained from hearing impaired listeners under a number of noisy and reverberant conditions. In addition, a comparison is drawn between the proposed measure and a state-of-the-art electroacoustic measure that relies on a clean reference signal. The results showed that the reference-free measure had a lower correlation with the subjective ratings of hearing aid speech quality in comparison to the correlations achieved by the measure utilizing a reference signal. Nevertheless, advantages of the reference-free approach are discussed.
Wide-area assessment of topographical and meteorological effects on sound propagation by time-domain modeling
133(2013); http://dx.doi.org/10.1121/1.4802185
Noise mapping with a three-dimensional finite-difference time-domain (FDTD) model over larger areas suffers from its high computational demand. This study shows that an FDTD model in combination with a meteorological model can be used for at least qualitative assessments of topographical and meteorological effects on sound propagation in domains extending over several kilometers. This is achieved by restricting the acoustical simulations to low frequencies, which allow the use of a rather large numerical grid spacing.
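The trade-off between frequency range and grid size can be sketched with a back-of-the-envelope estimate. The code assumes a uniform grid with a fixed number of points per wavelength (10 here, a common rule of thumb and not a value from the paper) and a sound speed of 340 m/s; the domain size and function name are hypothetical.

```python
def fdtd_cell_count(f_max_hz, domain_m, points_per_wavelength=10, c=340.0):
    """Estimate grid spacing and cell count for a uniform 3-D FDTD grid.

    f_max_hz: highest frequency to resolve (Hz)
    domain_m: (x, y, z) extent of the domain in metres
    points_per_wavelength, c: assumed values, not taken from the paper
    """
    dx = c / (f_max_hz * points_per_wavelength)  # grid spacing in metres
    cells = 1
    for extent in domain_m:
        cells *= max(1, round(extent / dx))
    return dx, cells

# Hypothetical 2 km x 2 km x 500 m domain at two upper frequency limits
dx_low, n_low = fdtd_cell_count(34.0, (2000.0, 2000.0, 500.0))
dx_high, n_high = fdtd_cell_count(340.0, (2000.0, 2000.0, 500.0))
```

Under these assumptions, lowering the upper frequency limit by a factor of ten reduces the cell count by a factor of 1000, before counting the larger time step that a coarser grid permits.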
133(2013); http://dx.doi.org/10.1121/1.4802184
The effects of relative alignment of two different types of anisotropic open cell porous materials are investigated in terms of the acoustic response of a multi-layered configuration. Numerical experiments, where gradient based optimization techniques were used, are conducted to find possible extremal values. It is shown that, depending on the degree of anisotropy of the porous material properties, their angular orientations have a significant and frequency dependent influence on the measured response. The results highlight the importance of further advancing the knowledge of anisotropic porous material behavior.
133(2013); http://dx.doi.org/10.1121/1.4799761
Previously, an effective density fluid model (EDFM) was developed by the author [J. Acoust. Soc. Am. 110, 2276–2281 (2001)] for unconsolidated granular sediments and applied to sand. The model is a simplification of the full Biot porous media model. Here two additional effects are added to the EDFM model: heat transfer between the liquid and solid at low frequencies and the granularity of the medium at high frequencies. The frequency range studied is 100 Hz–1 MHz. The analytical sound speed and attenuation expressions obtained have no free parameters. The resulting model is compared to ocean data.
- LETTERS TO THE EDITOR
133(2013); http://dx.doi.org/10.1121/1.4796124
The present work characterizes the acoustic emissions resulting from the collision of a particle driven under gravity with a captive bubble. Conventional methods to investigate the bubble particle collision interaction model measure a descriptive parameter known as the collision time. During such a collision, particle impact may cause a strong deformation, and the subsequent oscillation of the bubble–particle interface generates detectable passive acoustic emissions (AE). Experiments and models presented show that the AE frequency monotonically decreases with the particle radius and is independent of the impact velocity, whereas the AE amplitude has a more complicated relationship with impact parameters.
- GENERAL LINEAR ACOUSTICS 
Coarse-grid computation of the one-way propagation of coupled modes in a varying cross-section waveguide
133(2013); http://dx.doi.org/10.1121/1.4799021
A one-way approximation is investigated for the computation of wave propagation in varying cross-section waveguides. The proposed method is derived as a basic approximation of the extensively studied multimodal admittance method. When integrated with a Magnus scheme, this matrix one-way equation exhibits an unexpected behavior, as the deviation from the exact solution is minimal when only two discretization points per wavelength are taken. This peculiar property makes this method efficient for computing wave propagation in a large variety of geometries, beyond the initially stated framework of weakly non-uniform waveguides.
- NONLINEAR ACOUSTICS 
133(2013); http://dx.doi.org/10.1121/1.4796120
This paper presents an experimental study on nonlinear transient acoustical holography. The validity and effectiveness of a recently proposed nonlinear transient acoustical holography algorithm are evaluated in the presence of noise. The acoustic field measured on a post-focal plane of a high-intensity focused transducer is backward projected to reconstruct the pressure distributions on the focal and a pre-focal plane, which are shown to be in good agreement with the measurement. In contrast, the conventional linear holography produces erroneous results in this case where the nonlinearity involved is strong. Forward acoustic field projection was also carried out to further verify the algorithm.
133(2013); http://dx.doi.org/10.1121/1.4795806
The nonlinear forcing terms for the wave equation in general curvilinear coordinates are derived based on an isotropic homogeneous weakly nonlinear elastic material. The expressions for the nonlinear part of the first Piola-Kirchhoff stress are specialized for axisymmetric torsional and longitudinal fundamental waves in a circular cylinder. The matrix characteristics of the nonlinear forcing terms and secondary mode wave structures are manipulated to analyze the higher harmonic generation due to the guided wave mode self-interactions and mutual interactions. It is proved that both torsional and longitudinal secondary wave fields can be cumulative by a specific type of guided wave mode interactions. A method for the selection of preferred fundamental excitations that generate strong cumulative higher harmonics is formulated, and described in detail for second harmonic generation. Nonlinear finite element simulations demonstrate second harmonic generation by T(0,3) and L(0,4) modes at the internal resonance points. A linear increase of the normalized modal amplitude ratio over the propagation distance is observed for both cases, which indicates that mode L(0,5) is effectively generated as a cumulative second harmonic. Counter numerical examples demonstrate that synchronism and sufficient power flux from the fundamental mode to the secondary mode must occur for the secondary wave field to be strongly cumulative.
- AEROACOUSTICS, ATMOSPHERIC SOUND 
133(2013); http://dx.doi.org/10.1121/1.4798671
The plane wave normal incidence acoustic absorption coefficient of five types of low growing plants is measured in the presence and absence of soil. These plants are generally used in green living walls and flower beds. Two types of soil are considered in this work: a light-density, man-made soil and a heavy-density natural clay base soil. The absorption coefficient data are obtained in the frequency range of 50–1600 Hz using a standard impedance tube of diameter 100 mm. The equivalent fluid model for sound propagation in rigid frame porous media proposed by Miki [J. Acoust. Soc. Jpn. (E) 11, 25–28 (1990)] is used to predict the experimentally observed behavior of the absorption coefficient spectra of soils, plants, and their combinations. Optimization analysis is employed to deduce the effective flow resistivity and tortuosity of plants which are assumed to behave acoustically as an equivalent fluid in a rigid frame porous medium. It is shown that the leaf area density and dominant angle of leaf orientation are two key morphological characteristics which can be used to predict accurately the effective flow resistivity and tortuosity of plants.
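For orientation, the sketch below shows the standard relation between a normal-incidence surface impedance and the plane-wave absorption coefficient, alpha = 1 - |R|^2 with R = (Z - rho0*c0) / (Z + rho0*c0). It does not implement the Miki model or the paper's optimization; the air properties are assumed nominal values and the impedance inputs are hypothetical.

```python
def absorption_coefficient(z_surface, rho0=1.21, c0=343.0):
    """Normal-incidence absorption coefficient from a surface impedance.

    z_surface: complex surface impedance in Pa*s/m
    rho0, c0: assumed air density (kg/m^3) and sound speed (m/s)
    """
    z0 = rho0 * c0  # characteristic impedance of air
    reflection = (z_surface - z0) / (z_surface + z0)
    return 1.0 - abs(reflection) ** 2

# Hypothetical cases: a perfectly matched surface and a purely reactive one
alpha_matched = absorption_coefficient(complex(1.21 * 343.0, 0.0))
alpha_reactive = absorption_coefficient(complex(0.0, 800.0))
```

A surface matched to air absorbs everything, while a purely reactive surface absorbs nothing; measured spectra for soils and plants fall between these limits.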