P6-09: Emotion-Driven Harmonisation and Tempo Arrangement of Melodies Using Transfer Learning
Takahashi, Takuya*, Barthet, Mathieu
Subjects (starting with primary): MIR tasks -> music generation ; Applications -> gaming, augmented/virtual reality ; Musical features and properties -> rhythm, beat, tempo ; Musical features and properties -> harmony, chords and tonality ; Musical features and properties -> musical affect, emotion and mood ; Applications -> music composition
Presented Virtually: 4-minute short-format presentation
We propose and assess deep learning models for generating harmonic and tempo arrangements given melodies and emotional constraints. A dataset of 4,000 symbolic scores with emotion labels was gathered by expanding the HTPD3 dataset with mood tags from last.fm and allmusic.com. We explore how bi-directional LSTM (BLSTM) and Transformer encoder architectures can learn relationships between symbolic melodies, chord progressions, tempo, and expressed emotions, with and without a transfer learning strategy that leverages symbolic music data lacking emotion labels. Three methods for summarising emotion annotations in the Arousal/Valence (AV) representation are compared: Emotion Average, Emotion Surface, and Emotion Category. Twenty participants (average age 30.2; 7 female and 13 male, from Japan) rated how well the generated accompaniments matched the melodies (musical coherence), as well as the emotions they perceived, for 75 arrangements corresponding to combinations of models and emotion summarisation methods. Musical coherence and the match between target and perceived emotions were highest when melodies were encoded with a BLSTM model using transfer learning. The proposed method generates emotion-driven harmonic/tempo arrangements quickly, a clear advantage over the state of the art. Applications of this work include AI-based composition assistants and live interactive music systems for entertainment such as video games.
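To make the described architecture concrete, the sketch below shows one plausible realisation of an emotion-conditioned BLSTM melody encoder with per-step chord prediction and a sequence-level tempo prediction. It is a minimal illustration, not the authors' implementation: all class names, vocabulary sizes, layer dimensions, and the way the AV target is injected are assumptions for illustration only.

```python
# Minimal sketch (assumed, not the authors' code): a bi-directional LSTM melody
# encoder conditioned on an Arousal/Valence (AV) target, predicting a chord
# label per melody step and a global tempo class. All vocabularies and
# dimensions below are illustrative.
import torch
import torch.nn as nn

class EmotionHarmoniser(nn.Module):
    def __init__(self, n_pitch=130, n_chord=48, n_tempo=7, emb=64, hidden=256):
        super().__init__()
        self.pitch_emb = nn.Embedding(n_pitch, emb)      # melody tokens
        self.av_proj = nn.Linear(2, emb)                 # (arousal, valence) -> embedding
        self.encoder = nn.LSTM(emb, hidden, num_layers=2,
                               batch_first=True, bidirectional=True)
        self.chord_head = nn.Linear(2 * hidden, n_chord)  # per-step chord logits
        self.tempo_head = nn.Linear(2 * hidden, n_tempo)  # sequence-level tempo logits

    def forward(self, melody_tokens, av):
        # melody_tokens: (batch, time) int64; av: (batch, 2) floats in [-1, 1]
        x = self.pitch_emb(melody_tokens) + self.av_proj(av).unsqueeze(1)
        h, _ = self.encoder(x)                           # (batch, time, 2*hidden)
        chord_logits = self.chord_head(h)                # (batch, time, n_chord)
        tempo_logits = self.tempo_head(h.mean(dim=1))    # (batch, n_tempo)
        return chord_logits, tempo_logits

# Transfer-learning idea in outline: pre-train the encoder and chord head on
# symbolic data without emotion labels (AV input zeroed or omitted), then
# fine-tune all weights on the emotion-labelled subset.
model = EmotionHarmoniser()
melody = torch.randint(0, 130, (4, 32))   # 4 melodies, 32 steps each
av = torch.rand(4, 2) * 2 - 1             # random AV targets
chords, tempo = model(melody, av)
```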