Shallow Convolutional Neural Networks for Acoustic Scene Classification

19 March 2018

journal article
computer science
Published by EDP Sciences in Wuhan University Journal of Natural Sciences

Vol. 23 (2), 178-184
https://doi.org/10.1007/s11859-018-1308-z

Abstract

Recently, deep neural networks, including convolutional neural networks (CNNs), have been widely applied to acoustic scene classification (ASC). Motivated by the observation that some simplified CNNs have shown improvements over deep CNNs such as the Visual Geometry Group Net (VGG-Net), we simplify a VGG-Net style architecture into a shallow CNN with improved performance. Max pooling and batch normalization are also applied to improve accuracy. In a series of controlled tests on the Detection and Classification of Acoustic Scenes and Events (DCASE) 2016 data sets, our shallow CNN achieves a 6.7% improvement over the VGG-Net style CNN and reduces time complexity to 5% of that network's.
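
The time-complexity comparison in the abstract can be illustrated with a rough count of multiply-accumulate operations (MACs) in a stack of convolutional layers. The sketch below is only an illustration under stated assumptions: the `ConvLayer` helper, the layer widths, kernel sizes, pooling factors, and the input shape are all hypothetical and are not taken from the paper, so the resulting ratio does not reproduce the reported 5% figure. It counts convolution MACs only and ignores batch normalization, pooling, and any fully connected layers.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ConvLayer:
    """One convolutional layer (hypothetical helper, not from the paper)."""
    out_channels: int
    kernel: int = 3  # square kernel; stride 1 and "same" padding are assumed
    pool: int = 1    # max-pooling factor applied after the layer (1 = no pooling)


def conv_stack_macs(layers: List[ConvLayer], in_shape: Tuple[int, int, int]) -> int:
    """Return the total convolution MACs for a stack of layers.

    in_shape is (channels, height, width) of the input feature map.
    """
    channels, height, width = in_shape
    total = 0
    for layer in layers:
        # Each output position needs kernel*kernel*in_channels MACs per output channel.
        total += height * width * channels * layer.out_channels * layer.kernel ** 2
        channels = layer.out_channels
        # Max pooling shrinks the map that the next layer has to process.
        height, width = height // layer.pool, width // layer.pool
    return total


# Assumed input: one channel, 64 frequency bands, 128 time frames (illustrative only).
INPUT_SHAPE = (1, 64, 128)

# Assumed VGG-Net style stack: pairs of 3x3 convolutions, pooling after each pair.
VGG_STYLE = [
    ConvLayer(32), ConvLayer(32, pool=2),
    ConvLayer(64), ConvLayer(64, pool=2),
    ConvLayer(128), ConvLayer(128, pool=2),
]

# Assumed shallow stack: two convolutions, each followed by pooling.
SHALLOW = [ConvLayer(32, pool=2), ConvLayer(64, pool=2)]

if __name__ == "__main__":
    deep = conv_stack_macs(VGG_STYLE, INPUT_SHAPE)
    shallow = conv_stack_macs(SHALLOW, INPUT_SHAPE)
    print(f"VGG-style MACs: {deep}")
    print(f"Shallow MACs:   {shallow}")
    print(f"Shallow / VGG-style ratio: {shallow / deep:.3f}")
```

Removing layers, and pooling earlier so that later layers see smaller feature maps, are the two ways this count drops; which of these the authors relied on is not stated in the abstract.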
