Shallow Convolutional Neural Networks for Acoustic Scene Classification
- 19 March 2018
- journal article
- computer science
- Published by EDP Sciences in Wuhan University Journal of Natural Sciences
- Vol. 23 (2), 178-184
- https://doi.org/10.1007/s11859-018-1308-z
Abstract
Deep neural networks, including convolutional neural networks (CNNs), have recently been widely applied to acoustic scene classification (ASC). Motivated by reports that some simplified CNNs outperform deep CNNs such as the Visual Geometry Group network (VGG-Net), we simplify the VGG-Net style architecture into a shallow CNN with improved performance. Max pooling and batch normalization are also applied for better accuracy. In a series of controlled tests on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 data sets, our shallow CNN achieves a 6.7% improvement and reduces time complexity to 5% of that of the VGG-Net style CNN.
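The time-complexity comparison in the abstract can be illustrated with a rough count of multiply-accumulate operations in the convolutional layers. The sketch below is not the authors' method or configuration: the layer lists `deep` and `shallow`, the 64 x 64 input size, the single input channel, and the helper names `conv_macs` and `network_macs` are all hypothetical, chosen only to show how removing layers and pooling earlier shrinks the convolutional cost.

```python
def conv_macs(height, width, kernel, c_in, c_out):
    """Multiply-accumulates for one stride-1, 'same'-padded conv layer."""
    return height * width * kernel * kernel * c_in * c_out


def network_macs(input_hw, layers, pool=2):
    """Sum conv-layer cost; each layer is (kernel, c_out, pool_after)."""
    height, width = input_hw
    c_in = 1  # assumption: single-channel spectrogram input
    total = 0
    for kernel, c_out, pool_after in layers:
        total += conv_macs(height, width, kernel, c_in, c_out)
        if pool_after:
            # Max pooling shrinks the feature map seen by later layers.
            height, width = height // pool, width // pool
        c_in = c_out
    return total


# Hypothetical layer lists, NOT taken from the paper.
deep = [(3, 32, False), (3, 32, True),
        (3, 64, False), (3, 64, True),
        (3, 128, False), (3, 128, True)]
shallow = [(5, 16, True), (3, 32, True)]

ratio = network_macs((64, 64), shallow) / network_macs((64, 64), deep)
print(f"shallow / deep conv cost: {ratio:.3f}")
```

The printed ratio depends entirely on the assumed layer sizes, so it should not be read as a reproduction of the reported figure. The count also ignores fully connected layers, batch normalization, and the pooling operations themselves.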