End-to-End Full-Page Optical Music Recognition for Mensural Notation

Ríos-Vila, Antonio*; Inesta, Jose M.; Calvo-Zaragoza, Jorge

P2-10: End-to-End Full-Page Optical Music Recognition for Mensural Notation

Subjects (starting with primary): Domain knowledge -> machine learning/artificial intelligence for music; MIR tasks -> optical music recognition; MIR tasks -> music transcription and annotation

Presented Virtually: 4-minute short-format presentation

Abstract:

Optical Music Recognition (OMR) systems typically rely on workflows made up of several steps, such as staff detection, symbol recognition, and semantic reconstruction. However, fine-tuning these systems is costly, because each step needs its own labeled data to train its model. In this paper, we present the first segmentation-free full-page OMR system, which receives a page image and directly outputs the transcription in a single step. The model requires only annotations of full score pages, which greatly reduces the manual labeling effort. It has been tested on early music written in mensural notation, for which the presented approach is especially beneficial. The results are promising and establish a new line of research for holistic transcription of music score pages.
