P5-17: Heterogeneous Graph Neural Network for Music Emotion Recognition
Mendes da Silva, Angelo Cesar*, Silva, Diego F, Marcacini, Ricardo Marcondes
Subjects (starting with primary): MIR tasks -> automatic classification ; Domain knowledge -> machine learning/artificial intelligence for music ; MIR fundamentals and methodology -> multimodality ; Musical features and properties -> musical affect, emotion and mood ; Musical features and properties -> representations of music
Presented Virtually: 4-minute short-format presentation
Music emotion recognition is a growing field of research, motivated by the wealth of information that emotion labels express. Recognizing emotions highlights music's social and psychological functions, extending traditional applications such as style recognition or content similarity. Since musical data are intrinsically multi-modal, exploiting this characteristic is usually beneficial. However, building a structure that incorporates different modalities into a single space to represent the songs is challenging. Integrating information from related instances by learning heterogeneous graph-based representations has achieved state-of-the-art results in multiple tasks. This paper proposes structuring musical features over a heterogeneous network and learning a multi-modal representation using Graph Convolutional Networks, with features extracted from audio and lyrics as inputs, to address music emotion recognition tasks. We show that the proposed learning approach results in a representation with greater power to discriminate emotion labels. Moreover, our heterogeneous graph neural network classifier outperforms related works for music emotion recognition.
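To make the idea of propagating audio and lyrics features to song nodes concrete, the following is a minimal sketch of one graph convolution step over a toy heterogeneous graph. It is not the authors' architecture, which the abstract does not specify. The node layout, the edge list, the two-dimensional feature vectors, and the identity weight matrix are all hypothetical, and the propagation rule used is the common symmetric-normalized form ReLU(D^{-1/2}(A + I)D^{-1/2} H W).

```python
import math

def normalized_adjacency(num_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} as a dense list of lists."""
    a = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i in range(num_nodes):
        a[i][i] = 1.0  # self-loop so each node keeps its own features
    for u, v in edges:
        a[u][v] = 1.0  # undirected edge
        a[v][u] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]

def matmul(x, y):
    """Plain dense matrix product for lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def gcn_layer(a_hat, h, w):
    """One graph convolution: ReLU(a_hat @ h @ w)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]

# Hypothetical toy graph: nodes 0-1 are songs, 2-3 audio nodes, 4-5 lyrics nodes.
edges = [(0, 2), (0, 4), (1, 3), (1, 5)]

# Hypothetical features in a shared 2-d space: [audio value, lyrics value].
# Song nodes start empty and receive information from their neighbors.
features = [
    [0.0, 0.0], [0.0, 0.0],  # songs
    [1.0, 0.0], [0.5, 0.0],  # audio nodes
    [0.0, 1.0], [0.0, 0.2],  # lyrics nodes
]
weights = [[1.0, 0.0], [0.0, 1.0]]  # identity, standing in for learned weights

a_hat = normalized_adjacency(len(features), edges)
song_embeddings = gcn_layer(a_hat, features, weights)[:2]
```

After this single step, each song node holds a mix of its audio and lyrics neighbors' features, which is the sense in which the graph places both modalities in one representation space. In a trained model the weight matrix would be learned and a classifier over the song embeddings would predict the emotion label.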