Computational models of tempo perception, swing and rhythm pattern
Aspects related to rhythm can be subdivided into tempo, grouping (including meter and patterns) and deviations. In the absence of a music score, tempo can only be defined in terms of its perception. [Moelants and McKinney, 2004] highlighted the fact that people can perceive different tempi for a single track. We argue that this agreement and disagreement in the perception of tempo also arises from metrical-level ambiguity in the audio content, and we propose a computational model that predicts this agreement or disagreement.

A specific case of deviation is the systematic deviation (delay) of the second eighth note within a beat, known as swing. We show that swing is not only found in jazz but also frequently in blues (where it is called shuffle) or reggae. We study the evolution of this factor as a function of tempo and revise the assumption of [Friberg et al., 2002]; we also study whether this factor is specific to the performer and propose a computational model to predict it.

A rhythm pattern should represent the timing information of the musical events (percussive or not). We review the various computational models that have been proposed to represent it from an audio signal. We show that jointly representing this timing information with timbre information, through tempo-invariant time-frequency representations, yields a more accurate representation of the rhythm pattern. This representation can then be used for recognition or for computing the similarity between two audio rhythms.
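As an illustration of the metrical-level ambiguity discussed above, the following sketch enumerates the tempi a listener may perceive at metrically related levels and summarises agreement as the fraction of listener pairs reporting close tempi. The ratio set, the 4% tolerance and the function names are hypothetical and only serve to make the notion concrete; they are not the proposed computational model.

```python
import numpy as np

# Tempi perceived at other metrical levels relate to the dominant periodicity
# by small integer ratios (half-time, double-time, ternary levels).
METRICAL_RATIOS = np.array([1 / 3, 1 / 2, 2 / 3, 1.0, 3 / 2, 2.0, 3.0])

def metrical_candidates(dominant_tempo_bpm):
    """Tempi a listener could plausibly perceive for a given dominant tempo."""
    return dominant_tempo_bpm * METRICAL_RATIOS

def agreement_score(tapped_tempi_bpm, tolerance=0.04):
    """Fraction of listener pairs whose tapped tempi differ by less than
    `tolerance` (relative difference): a rough proxy for tempo agreement."""
    t = np.asarray(tapped_tempi_bpm, dtype=float)
    pairs = [(i, j) for i in range(len(t)) for j in range(i + 1, len(t))]
    agree = [abs(t[i] - t[j]) / max(t[i], t[j]) <= tolerance for i, j in pairs]
    return float(np.mean(agree)) if pairs else 1.0

print(metrical_candidates(120.0))            # 40, 60, 80, 120, 180, 240, 360 BPM
print(agreement_score([120, 122, 60, 118]))  # listeners split over two levels -> 0.5
```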
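The swing factor itself can be made concrete with a small sketch that, assuming beat times and the onsets of the delayed second eighth notes are already available, measures the ratio between the durations of the first and second eighth notes within each beat. The function and its inputs are illustrative; they do not correspond to the prediction model described above.

```python
import numpy as np

def swing_ratio(beat_times, offbeat_onsets):
    """Estimate the swing ratio as the mean duration of the first 'eighth'
    over the second one within each beat, given detected beat times and the
    onset time of the delayed second eighth note inside each beat."""
    beat_times = np.asarray(beat_times, dtype=float)
    ratios = []
    for start, end, onset in zip(beat_times[:-1], beat_times[1:], offbeat_onsets):
        if start < onset < end:
            first = onset - start    # long first eighth
            second = end - onset     # short, delayed second eighth
            ratios.append(first / second)
    return float(np.mean(ratios))

# Straight eighths give a ratio of 1; triplet swing gives roughly 2 (2:1).
beats = [0.0, 0.5, 1.0, 1.5, 2.0]     # 120 BPM
onsets = [0.33, 0.83, 1.33, 1.83]     # second eighth delayed within each beat
print(swing_ratio(beats, onsets))     # ~1.94, close to a triplet feel
```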
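Finally, a minimal sketch of a tempo-invariant time-frequency representation is given below: band-wise envelopes are resampled onto a fixed number of positions per bar, so that patterns extracted from tracks at different tempi become directly comparable. The choice of front-end envelopes, the 48-step grid and the averaging over bars are assumptions made for the illustration, not the specific representation discussed in the text.

```python
import numpy as np

def bar_synchronous_pattern(envelopes, frame_times, downbeats, steps_per_bar=48):
    """Resample band-wise envelopes onto `steps_per_bar` positions per bar,
    yielding a (n_bands, steps_per_bar) tempo-invariant rhythm pattern.

    envelopes:   (n_bands, n_frames) onset-strength or energy envelopes
    frame_times: (n_frames,) frame times in seconds
    downbeats:   bar start times in seconds
    """
    patterns = []
    for start, end in zip(downbeats[:-1], downbeats[1:]):
        grid = np.linspace(start, end, steps_per_bar, endpoint=False)
        bar = np.stack([np.interp(grid, frame_times, band) for band in envelopes])
        patterns.append(bar)
    return np.mean(patterns, axis=0)   # average pattern over all bars

# Toy usage: one band with a 2 Hz periodicity, bars of 2 s (120 BPM, 4/4).
times = np.arange(0.0, 8.0, 0.01)
env = np.maximum(0.0, np.sin(2 * np.pi * 2 * times))[None, :]
pattern = bar_synchronous_pattern(env, times, downbeats=np.arange(0.0, 8.1, 2.0))
print(pattern.shape)   # (1, 48), independent of the original tempo
```

Two such patterns can then be compared with a standard distance (e.g. cosine or Euclidean) to obtain a rhythm similarity that does not depend on the original tempi.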