LV-38: Assistive alignment of in-the-wild sheet music and performances

Michael Feffer, Chris Donahue, Zachary Lipton

Abstract: Sheet music, which contains precise instructions for performers, remains a primary mechanism for communicating musical ideas. While digital scans of sheet music (represented as images) and recordings of performances (represented as audio) are both abundant sources of musical data, there remains a surprising paucity of aligned data: mappings between pixels in sheet music and the corresponding timestamps in associated performances. Although several existing MIR datasets contain alignments between performances and structured scores (in formats like MIDI and MusicXML), no current resources align performances with the more commonplace raw-image sheet music, possibly because obstacles like expressive timing and repeat signs make alignment challenging and time-consuming even for trained musicians. To overcome these obstacles, we developed an interactive system, MeSA, which leverages off-the-shelf measure and beat detection software to help musicians quickly produce measure-level alignments (ones that map bounding boxes of measures in the sheet music to timestamps in the performance audio). We verified MeSA’s functionality by using it to create a small proof-of-concept dataset, MeSA-13.
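
To make the notion of a measure-level alignment concrete, the following is a minimal Python sketch of one way such data could be represented and queried. It is an illustration only, not MeSA's actual format: the `MeasureAlignment` class, its field names, the `measure_at` helper, and all numeric values are hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MeasureAlignment:
    """One measure-level alignment entry (hypothetical schema, not MeSA's)."""
    page: int                        # page index in the scanned score
    bbox: Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
    start: float                     # start time in the audio, in seconds
    end: float                       # end time in the audio, in seconds


def measure_at(alignment: List[MeasureAlignment], t: float) -> Optional[MeasureAlignment]:
    """Return the entry whose [start, end) interval contains time t, if any.

    Assumes entries are sorted by start time and do not overlap.
    """
    starts = [m.start for m in alignment]
    i = bisect_right(starts, t) - 1
    if i >= 0 and alignment[i].start <= t < alignment[i].end:
        return alignment[i]
    return None


# Toy data with made-up values. The first measure is played twice (a repeat),
# so the same bounding box appears at two different times in the audio.
alignment = [
    MeasureAlignment(page=0, bbox=(100, 200, 400, 350), start=0.0, end=2.1),
    MeasureAlignment(page=0, bbox=(400, 200, 700, 350), start=2.1, end=4.0),
    MeasureAlignment(page=0, bbox=(100, 200, 400, 350), start=4.0, end=6.2),
]
```

Storing the alignment as a time-ordered list, rather than as one entry per printed measure, is one simple way to accommodate repeat signs: a measure that is played more than once just appears more than once.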