Towards end-to-end polyphonic music transcription: Transforming music audio directly to a score
- 1 October 2017
- conference paper
- Published by the Institute of Electrical and Electronics Engineers (IEEE) in the 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)
Abstract
We present a neural network model that learns to produce music scores directly from audio signals. Instead of employing commonplace processing steps, such as frequency transform front-ends, harmonicity and scale priors, or temporal pitch smoothing, we show that a neural network can learn such steps on its own when presented with the appropriate training data. We show how such a network can perform monophonic transcription with very high accuracy, and how it also generalizes well to transcribing polyphonic music.
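To make the abstract's idea concrete: instead of a hand-designed frequency transform front-end, an end-to-end model applies filters directly to raw audio samples and would learn their coefficients from data. The toy sketch below is not the paper's model; it uses fixed quadrature sinusoids as stand-ins for learned time-domain filters, with hypothetical names (`filter_response`, `detect_pitch`) and arbitrary parameters (8 kHz rate, 512-sample frames), to show what such a front-end computes per frame.

```python
import math

SR = 8000    # sample rate in Hz (arbitrary choice for this toy example)
FRAME = 512  # analysis frame length in samples

def filter_response(frame, freq):
    """Energy of the frame under a quadrature pair of filters at `freq` Hz.

    In an end-to-end model these filter coefficients would be trained;
    here they are fixed sinusoids, so this reduces to a single DFT-like bin.
    """
    c = sum(x * math.cos(2 * math.pi * freq * n / SR) for n, x in enumerate(frame))
    s = sum(x * math.sin(2 * math.pi * freq * n / SR) for n, x in enumerate(frame))
    return c * c + s * s

def detect_pitch(frame, candidates):
    """Pick the candidate frequency whose filter responds most strongly."""
    return max(candidates, key=lambda f: filter_response(frame, f))

# A pure 440 Hz tone should activate the 440 Hz filter most strongly.
audio = [math.sin(2 * math.pi * 440 * n / SR) for n in range(FRAME)]
print(detect_pitch(audio, [261.63, 329.63, 440.0, 523.25]))  # → 440.0
```

A trained network would stack many such filters, apply them on every frame, and feed the responses to a temporal model that outputs score symbols, which is what lets it absorb the harmonicity priors and pitch smoothing that pipeline systems implement by hand.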