P7-10: Verse versus Chorus: Structure-aware Feature Extraction for Lyrics-based Genre Recognition
Mayerl, Maximilian*, Brandl, Stefan, Specht, Günther, Schedl, Markus, Zangerle, Eva
Subjects (starting with primary): Musical features and properties -> musical style and genre ; Musical features and properties -> structure, segmentation, and form ; Domain knowledge -> machine learning/artificial intelligence for music ; MIR tasks -> automatic classification ; MIR fundamentals and methodology -> lyrics and other textual data
Presented Virtually: 4-minute short-format presentation
The aim of lyrics-based genre recognition is to automatically determine the genre of a given song based on its lyrics. Previous approaches to this task have commonly used textual features extracted from the entirety of a song's lyrics, neglecting the inherent structure of lyrics, which consist of, for instance, verses and choruses. We therefore hypothesize that features extracted from different parts of the lyrics can have significantly different predictive power. To test this hypothesis, we perform a series of experiments to determine whether models trained on features taken from verses and models trained on features taken from choruses perform differently for genre recognition. Our experiments confirm the hypothesis, showing that, in general, features extracted from verses lead to higher performance than features extracted from choruses. Looking at individual genres, we find that this is especially true for pop and rap songs. Rock songs show the opposite effect, with features extracted from choruses performing better than those taken from verses.
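As a rough illustration of what structure-aware feature extraction could look like, the sketch below splits lyrics into verse and chorus lines and builds a separate bag-of-words for each part. It is not the authors' pipeline: the bracketed section headers (such as "[Verse 1]" and "[Chorus]"), the function names split_by_section and bag_of_words, and the choice of plain word counts as features are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

# Assumption: sections are marked by bracketed headers such as "[Verse 1]"
# or "[Chorus]". The abstract does not say how segments are identified.
SECTION_HEADER = re.compile(r"^\[([^\]]+)\]$")


def split_by_section(lyrics):
    """Group non-empty lyric lines under 'verse', 'chorus', or 'other'."""
    sections = defaultdict(list)
    current = "other"  # lines that appear before any header
    for raw_line in lyrics.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        header = SECTION_HEADER.match(line)
        if header:
            label = header.group(1).lower()
            if label.startswith("verse"):
                current = "verse"
            elif label.startswith("chorus"):
                current = "chorus"
            else:
                current = "other"
            continue
        sections[current].append(line)
    return dict(sections)


def bag_of_words(lines):
    """Count lowercase word tokens; a stand-in for richer textual features."""
    counts = Counter()
    for line in lines:
        counts.update(re.findall(r"[a-z']+", line.lower()))
    return counts


# Hypothetical toy lyrics, invented for this example.
toy_lyrics = """[Verse 1]
Walking down the road

[Chorus]
Sing it loud
Sing it loud

[Verse 2]
Back on the road
"""

parts = split_by_section(toy_lyrics)
verse_features = bag_of_words(parts.get("verse", []))
chorus_features = bag_of_words(parts.get("chorus", []))
```

In a setup like the one the abstract describes, one classifier would be trained on the verse features and another on the chorus features, and their per-genre performance compared.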