Computational models of rhythm similarity: a signal processing perspective
Quantifying the similarity between music recordings is a fundamental problem in music information retrieval, with existing solutions underpinning the development of recommendation, playlisting, auto-tagging, cover-song identification, and general music classification systems. Most prior work has focused on modeling timbre- or tonality-based similarity. However, a number of approaches specifically target the modeling of rhythm similarity and, in some cases, apply those models to the analysis and classification of music from non-Western traditions, where rhythm is often the defining feature. In this talk, I will discuss the problem of rhythm similarity from an audio signal processing perspective, identify the strengths and weaknesses of previous approaches, and present a preliminary study that seeks to capitalize on those insights to improve style identification in Latin American music. In particular, I suggest that existing approaches can be conceptualized as deep, multi-layer systems combining a very limited set of basic signal processing operations. I conjecture that expanding this set of basic operations to include novel feature design and learning strategies holds significant promise for the robust modeling of rhythm similarity.
Presentation slides available here.