Feature extraction in each frame, including front-end signal enhancement, cepstral coefficient and log energy calculation, mean normalization, and appended velocity and acceleration coefficients.
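The caption above lists the standard front-end stages (cepstra plus log energy, mean normalization, appended velocity and acceleration coefficients). A minimal numpy sketch of that per-frame pipeline follows; the function names, frame length, hop size, and use of a plain real cepstrum (rather than a mel filterbank) are illustrative assumptions, not the paper's exact front end.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    # Slice the waveform into overlapping frames (25 ms / 10 ms at 16 kHz,
    # an assumed common choice, not taken from the paper).
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def cepstra(frames, n_ceps=12):
    # Real cepstrum: inverse FFT of the log magnitude spectrum. A full
    # front end would insert a mel filterbank before the log; omitted here.
    spec = np.abs(np.fft.rfft(frames * np.hamming(frames.shape[1]), axis=1))
    ceps = np.fft.irfft(np.log(spec + 1e-10), axis=1)
    return ceps[:, :n_ceps]

def deltas(feat, width=2):
    # Regression-based velocity coefficients over +/- `width` frames.
    pad = np.pad(feat, ((width, width), (0, 0)), mode="edge")
    num = sum(k * (pad[width + k : len(feat) + width + k]
                   - pad[width - k : len(feat) + width - k])
              for k in range(1, width + 1))
    return num / (2 * sum(k * k for k in range(1, width + 1)))

def extract_features(x):
    frames = frame_signal(x)
    log_e = np.log(np.sum(frames ** 2, axis=1) + 1e-10)[:, None]
    static = np.hstack([cepstra(frames), log_e])   # 12 cepstra + log energy
    static -= static.mean(axis=0)                  # mean normalization
    vel = deltas(static)                           # velocity coefficients
    acc = deltas(vel)                              # acceleration coefficients
    return np.hstack([static, vel, acc])           # 13 + 13 + 13 = 39 dims
```

For a one-second 16 kHz signal this yields a (frames x 39) feature matrix, the conventional static-plus-dynamic layout the caption describes.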
Illustration of an HMM applied to vocalization modeling. HMM states are aligned to the observed frame-based features using a maximum likelihood criterion, based on statistical transition and observation models.
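The maximum-likelihood alignment of frames to states that this caption describes can be sketched with the Viterbi algorithm. The toy left-to-right topology, Gaussian observation model, and all parameter values below are illustrative assumptions, not the system's actual models.

```python
import numpy as np

def log_gauss(x, mean, var):
    # Diagonal-covariance Gaussian log-likelihood, one value per frame.
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var,
                         axis=-1)

def viterbi(obs_loglik, log_trans, log_init):
    # Maximum-likelihood state sequence for a (T x S) frame/state
    # log-likelihood table under the given transition model.
    T, S = obs_loglik.shape
    delta = log_init + obs_loglik[0]
    psi = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans        # (from-state, to-state)
        psi[t] = np.argmax(scores, axis=0)
        delta = scores[psi[t], np.arange(S)] + obs_loglik[t]
    path = np.empty(T, dtype=int)
    path[-1] = int(np.argmax(delta))
    for t in range(T - 2, -1, -1):                 # backtrace
        path[t] = psi[t + 1][path[t + 1]]
    return path

# Toy example: 3-state left-to-right HMM over 1-D features (hypothetical).
rng = np.random.default_rng(0)
means = np.array([[0.0], [3.0], [6.0]])
var = np.ones((3, 1))
feats = np.concatenate([rng.normal(m, 1.0, (5, 1)) for m in means.ravel()])
obs = np.stack([log_gauss(feats, means[s], var[s]) for s in range(3)], axis=1)
log_trans = np.array([[np.log(0.8), np.log(0.2), -np.inf],
                      [-np.inf, np.log(0.8), np.log(0.2)],
                      [-np.inf, -np.inf, 0.0]])
log_init = np.array([0.0, -np.inf, -np.inf])
path = viterbi(obs, log_trans, log_init)
```

Because the transition matrix is strictly left-to-right, the recovered path is a non-decreasing sequence of state indices, i.e. a segmentation of the frames into state occupancies, as depicted in the figure.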
(Color online) Comparative examples of LDR waveforms and zoomed narrow-band spectrograms for each individual in the study.
(Color online) Boxplots for each of the four whole vocalization measures showing median, upper and lower quartile, and dynamic range.
Profile of vocalization data set.
Whole vocalization measures across all six individuals.
ANOVA F statistics and p values showing discriminability for the duration, maximum f0, minimum f0, and average f0 measures.
Test set accuracy versus number of states and number of mixtures.
Confusion matrix for the final system with 10 states and 10 mixtures. IDs are re-ordered to illustrate confusability between individuals. Confusions between Tigers 2 and 6 and Tigers 4 and 5 are highlighted, accounting for about 2/3 of all errors.