Finding good acoustic features for parrot vocalizations: The feature generation approach
10.1121/1.3531953
/content/asa/journal/jasa/129/2/10.1121/1.3531953
http://aip.metastore.ingenta.com/content/asa/journal/jasa/129/2/10.1121/1.3531953

Figures

FIG. 1.

Number of call types in each F-measure bin for the preliminary experiment. Overall, the F-measures were low, indicating that the MFCC-based classifier performed poorly on the whole dataset.

FIG. 2.

Mean power spectra are shown for (a) C1, (b) C57, (c) C77, (d) C106, and (e) C113.

FIG. 3.

Oscillograms (top) and spectrographic representations (bottom) for C1, C57, C77, C106, and C113. Spectrograms were calculated using 512-point Hamming windows, a 22 kHz sampling frequency, and 16-bit amplitude sampling.
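To make the spectrogram settings in this caption concrete, the following minimal sketch computes one spectrogram column with a 512-point Hamming window. It is illustrative only and is not the authors' analysis code: the sampling rate is taken as exactly 22 000 Hz (the caption says only "22 kHz"), the test tone is hypothetical, and a direct DFT stands in for whatever FFT routine was actually used.

```python
import cmath
import math

SAMPLE_RATE_HZ = 22_000  # "22 kHz" from the caption; the exact rate is assumed
WINDOW_LENGTH = 512      # 512-point Hamming window from the caption


def hamming(n_points):
    """Standard Hamming window: 0.54 - 0.46*cos(2*pi*n/(N-1))."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]


def frame_magnitudes(frame, window):
    """Magnitude spectrum of one windowed frame via a direct DFT."""
    n_points = len(frame)
    tapered = [s * w for s, w in zip(frame, window)]
    magnitudes = []
    for k in range(n_points // 2 + 1):
        acc = sum(tapered[n] * cmath.exp(-2j * math.pi * k * n / n_points)
                  for n in range(n_points))
        magnitudes.append(abs(acc))
    return magnitudes


# Frequency spacing between bins (about 43 Hz) and window duration (about 23 ms).
bin_spacing_hz = SAMPLE_RATE_HZ / WINDOW_LENGTH
window_duration_ms = 1000 * WINDOW_LENGTH / SAMPLE_RATE_HZ

# Hypothetical test tone centred on bin 32 (32 * bin spacing = 1375 Hz).
tone_hz = 32 * bin_spacing_hz
frame = [math.sin(2 * math.pi * tone_hz * n / SAMPLE_RATE_HZ)
         for n in range(WINDOW_LENGTH)]
spectrum = frame_magnitudes(frame, hamming(WINDOW_LENGTH))
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
```

Under these assumptions each spectrogram column has 257 frequency bins spaced roughly 43 Hz apart, and each window covers roughly 23 ms of signal.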

FIG. 4.

Mean percentage of correctly classified calls for the five selected call types, shown for each feature set. Just four EDS features gave better results than 10 or even 20 MFCC values, and results improved further with more EDS features.

Tables

TABLE I.

Features composing each EDS feature set. In the feature expressions, “x” represents the input acoustic signal. The mathematical composition operation is implicitly represented by the parentheses. Each label denotes an operator. Most of them are explained in detail in Pachet and Roy (2009). See Appendix for descriptions of operators not included in Pachet and Roy (2009).

TABLE II.

Mean percentage F-measures obtained for each feature set. The training-testing process was repeated several times in order to reduce the variability in classifier performance: 50 repetitions (or more) ensure almost constant mean F-measures.

TABLE III.

Mean percentage of correctly classified calls (±SE) for each feature set and call type.
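As a reminder of what a "mean (±SE)" entry summarizes, here is a minimal sketch that computes a mean and its standard error from repeated scores. The scores are hypothetical, and the caption does not say whether the SE was taken over repetitions or over calls, so treating each value as one repetition is an assumption.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, SE) where SE = sample standard deviation / sqrt(n)."""
    n = len(values)
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se


# Hypothetical percent-correct scores from repeated train/test runs
# (illustrative numbers only, not taken from the article).
scores = [80.0, 82.0, 84.0, 86.0, 88.0]
mean, se = mean_and_standard_error(scores)
```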

TABLE IV.

Confusion matrices showing how calls of each original call type were distributed after classification with CL2-EDS and CL2-MFCC. More calls were classified correctly with CL2-EDS than with CL2-MFCC. In both cases, calls of type B were misclassified (as C1, C57, C77, C106, or C113).

TABLE V.

Confusion matrices showing how calls of each original call type were distributed, together with the mean F-measures obtained for each call type, after classification with CL1 followed by either CL2-EDS or CL2-MFCC. More calls were classified correctly with CL2-EDS (diagonal of the matrix: 3930 calls) than with CL2-MFCC (diagonal of the matrix: 3689 calls). Overall, F-measures were greater with CL2-EDS than with CL2-MFCC.
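The relationship between a confusion matrix, its diagonal, and per-class F-measures can be sketched as follows. This is a minimal illustration, assuming the usual definition of the F-measure as the harmonic mean of precision and recall and assuming rows hold the original call type and columns the assigned type; the caption does not spell out either convention. The counts are hypothetical toy values, not the article's data.

```python
def per_class_f_measures(confusion, labels):
    """Compute precision, recall and F-measure per class.

    confusion[i][j] = number of calls of true type labels[i]
    that the classifier assigned to type labels[j].
    """
    results = {}
    for i, label in enumerate(labels):
        true_positive = confusion[i][i]
        actual_total = sum(confusion[i])                    # row sum
        predicted_total = sum(row[i] for row in confusion)  # column sum
        recall = true_positive / actual_total if actual_total else 0.0
        precision = true_positive / predicted_total if predicted_total else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        results[label] = (precision, recall, f_measure)
    return results


# Toy 3-class matrix with hypothetical counts (not the article's data).
labels = ["C1", "C57", "B"]
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 3, 7],
]
scores = per_class_f_measures(confusion, labels)
correct = sum(confusion[i][i] for i in range(len(labels)))  # diagonal total
```

The diagonal total counts correctly classified calls, which is the quantity the caption reports for each classifier, while the F-measure also penalizes calls of other types that were wrongly assigned to a class.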

2011-02-11
2014-04-16