Deep Learning and Music Adversaries
- 10 September 2015
- journal article
- research article
- Published by Institute of Electrical and Electronics Engineers (IEEE) in IEEE Transactions on Multimedia
- Vol. 17 (11), 2059-2071
- https://doi.org/10.1109/tmm.2015.2478068
Abstract
An adversary is an agent designed to make a classification system perform in some particular way, e.g., increase the probability of a false negative. Recent work builds adversaries for deep learning systems applied to image object recognition, exploiting the parameters of the system to find the minimal perturbation of the input image such that the system misclassifies it with high confidence. We adapt this approach to construct and deploy an adversary of deep learning systems applied to music content analysis. In our case, however, the system inputs are magnitude spectral frames, which require special care in order to produce valid input audio signals from network-derived perturbations. For two different train-test partitionings of two benchmark datasets, and two different architectures, we find that this adversary is very effective. We find that convolutional architectures are more robust compared to systems based on a majority vote over individually classified audio frames. Furthermore, we experiment with a new system that integrates an adversary into the training loop, but do not find that this improves the resilience of the system to new adversaries.
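The idea of perturbing a magnitude spectral frame until a classifier changes its decision can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy linear softmax classifier in place of a deep network, and the names `adversarial_frame`, `W`, `b`, `step`, and `conf` are hypothetical. The only constraint enforced is that magnitudes stay non-negative, which is one part of the "special care" the abstract mentions.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D vector of logits."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def adversarial_frame(x, W, b, target, step=0.01, max_iter=2000, conf=0.9):
    """Perturb magnitude frame x until the toy classifier picks `target`.

    Uses projected gradient descent on the cross-entropy loss for the
    target class. After each step the frame is clipped at zero so that
    it remains a valid (non-negative) magnitude spectrum.
    """
    x_adv = x.copy()
    for _ in range(max_iter):
        p = softmax(W @ x_adv + b)
        if p[target] >= conf:
            break
        onehot = np.zeros_like(p)
        onehot[target] = 1.0
        # Gradient of -log p[target] with respect to the input frame.
        grad = W.T @ (p - onehot)
        # Descent step followed by projection onto non-negative magnitudes.
        x_adv = np.maximum(x_adv - step * grad, 0.0)
    return x_adv

# Hypothetical toy setup: 64 frequency bins, 4 classes, random weights.
rng = np.random.default_rng(0)
n_bins, n_classes = 64, 4
W = rng.normal(size=(n_classes, n_bins))
b = np.zeros(n_classes)
x = np.abs(rng.normal(size=n_bins))          # stand-in magnitude frame

original = int(np.argmax(W @ x + b))
target = (original + 1) % n_classes          # any class other than the original
x_adv = adversarial_frame(x, W, b, target)

print("original class:", original)
print("adversarial class:", int(np.argmax(W @ x_adv + b)))
print("relative perturbation:", np.linalg.norm(x_adv - x) / np.linalg.norm(x))
```

The sketch stops at the perturbed frame. Turning perturbed frames back into a valid audio signal would need further steps (for example, estimating a phase that is consistent across overlapping frames), which are not shown here.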
Funding Information
- Danish Council for Strategic Research of the Danish Agency for Science Technology and Innovation (11-115328)