Evaluating Simultaneous Recognition and Encoding for Optical Music Recognition

Abstract
Most Optical Music Recognition (OMR) workflows retrieve the content of music score images in several steps, typically preprocessing, recognition, notation reconstruction, and encoding. State-of-the-art models can now perform graphic recognition in an almost end-to-end fashion, carrying out the steps from preprocessing to recognition simultaneously. However, the recognized graphics must still be processed further to obtain a standard digital music representation. In this paper, we study simultaneous recognition and encoding with a state-of-the-art, neural-network-based OMR approach that receives a single staff-region image as input and directly outputs a sequence of characters encoding its content in a standard music format. Our results confirm that performing OMR this way is feasible and brings additional benefits, such as directly obtaining a version of the score that can be readily processed or edited with standard tools.
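
Because the approach outputs a plain character sequence, one natural way to score a prediction is a character-level error rate against the reference encoding. The abstract does not state which metric the paper uses, so the sketch below is only an assumption: it computes the Levenshtein edit distance between two strings and normalizes it by the reference length. The function names and the sample strings are hypothetical and do not come from the paper.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between a reference and a hypothesis string."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference character
                curr[j - 1] + 1,     # insertion of a hypothesis character
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def char_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalized by the reference length (assumed metric)."""
    if not ref:
        raise ValueError("reference encoding must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    # Hypothetical encoded staves; the strings are illustrative only.
    reference = "4c 4d 2e"
    predicted = "4c 4d 2f"
    print(char_error_rate(reference, predicted))
```

A character-level metric of this kind penalizes every wrong, missing, or extra character equally; a symbol-level variant would instead split the encoding into tokens before computing the distance.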
