Automatic Classification of Bird Species From Their Sounds Using Two-Dimensional Cepstral Coefficients
- 21 October 2008
- journal article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Audio, Speech, and Language Processing
- Vol. 16 (8), 1541-1550
- https://doi.org/10.1109/tasl.2008.2005345
Abstract
This paper presents a method for automatic classification of birds into different species based on audio recordings of their sounds. Each individual syllable segmented from continuous recordings is regarded as the basic recognition unit. To represent the temporal variations as well as sharp transitions within a syllable, a feature set derived from static and dynamic two-dimensional Mel-frequency cepstral coefficients is calculated for the classification of each syllable. Since a bird might generate several types of sounds with varying characteristics, a number of representative prototype vectors are used to model the different syllables of the same bird species. For each bird species, a model selection method is developed to determine the optimal mode between Gaussian mixture models (GMM) and vector quantization (VQ), since the amount of training data differs from species to species. In addition, a component number selection algorithm is employed to find the most appropriate number of GMM components or VQ clusters for each species. The mean vectors of the GMM or the cluster centroids of the VQ form the prototype vectors of a given bird species. In the experiments, the best classification accuracy is 84.06% for the classification of 28 bird species.
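The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that a static two-dimensional cepstrum is obtained by applying a discrete cosine transform first across the Mel bands of each frame and then across the frames of a syllable, and that a syllable is assigned to the species owning the closest prototype vector in Euclidean distance. The abstract does not state the matching rule, and the function names `dct_ii`, `two_d_cepstrum`, and `nearest_prototype`, as well as the parameters `n_quef` and `n_mod`, are hypothetical.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II of a sequence of numbers."""
    n = len(x)
    return [
        sum(x[t] * math.cos(math.pi * k * (t + 0.5) / n) for t in range(n))
        for k in range(n)
    ]


def two_d_cepstrum(log_mel, n_quef=3, n_mod=2):
    """Low-order 2D cepstral features for one syllable (illustrative only).

    log_mel: list of frames, each a list of log Mel-band energies.
    n_quef:  number of coefficients kept along the frequency axis (assumed).
    n_mod:   number of coefficients kept along the time axis (assumed).
    """
    # First transform: across Mel bands, independently for every frame.
    per_frame = [dct_ii(frame) for frame in log_mel]
    features = []
    for q in range(n_quef):
        # Second transform: across frames, for each retained coefficient.
        track = [row[q] for row in per_frame]
        features.extend(dct_ii(track)[:n_mod])
    return features


def nearest_prototype(feature, prototypes):
    """Return the species whose closest prototype vector is nearest.

    prototypes maps a species name to a list of prototype vectors, which
    stand in for GMM mean vectors or VQ cluster centroids.
    """
    best_species, best_dist = None, math.inf
    for species, vectors in prototypes.items():
        for vector in vectors:
            dist = sum((a - b) ** 2 for a, b in zip(feature, vector))
            if dist < best_dist:
                best_species, best_dist = species, dist
    return best_species
```

Because each species may own several prototypes, a species with varied syllable types can still be matched by whichever prototype lies closest to the syllable's feature vector. The dynamic features, the choice between GMM and VQ, and the component number selection mentioned in the abstract are not covered by this sketch.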