On October 15th, NYU student Sripathi Sridhar will present a poster by the BirdVox team at the Late-Breaking / Demo (LBD) session of the International Society for Music Information Retrieval (ISMIR) conference. We reproduce the abstract of the paper below.

 

Helicality: An Isomap-based Measure of Octave Equivalence in Audio Data

Sripathi Sridhar, Vincent Lostanlen

Octave equivalence serves as domain knowledge in MIR systems, including the chromagram, spiral convolutional networks, and the harmonic CQT. Prior work has applied the Isomap manifold learning algorithm to unlabeled audio data to embed frequency sub-bands in 3-D space where the Euclidean distances are inversely proportional to the strength of their Pearson correlations. However, discovering octave equivalence via Isomap requires visual inspection and is not scalable. To address this problem, we define “helicality” as the goodness of fit of the 3-D Isomap embedding to a Shepard-Risset helix. Our method is unsupervised and uses a custom Frank-Wolfe algorithm to minimize a least-squares objective inside a convex hull. Numerical experiments indicate that isolated musical notes have a higher helicality than speech, followed by drum hits.
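
For readers unfamiliar with the helix geometry that the abstract refers to, here is a minimal Python sketch. It is not the code from the paper: the function `helix_point` and its parameters (`radius`, `height_per_octave`, `steps_per_octave`) are hypothetical names chosen for illustration. It only shows why a helix with one turn per octave places octave-related pitches close together in 3-D space.

```python
import math


def helix_point(pitch, radius=1.0, height_per_octave=1.0, steps_per_octave=12):
    """Map a pitch (in semitones) to a 3-D point on a helix.

    The helix makes one full turn per octave, so two pitches an octave
    apart share the same angle and differ only in height.
    All parameter values here are illustrative assumptions.
    """
    angle = 2.0 * math.pi * pitch / steps_per_octave
    return (
        radius * math.cos(angle),
        radius * math.sin(angle),
        height_per_octave * pitch / steps_per_octave,
    )


# Three pitches, in semitones relative to an arbitrary reference note.
reference = helix_point(0)
tritone_above = helix_point(6)
octave_above = helix_point(12)

octave_distance = math.dist(reference, octave_above)
tritone_distance = math.dist(reference, tritone_above)

# With these parameters, the octave is closer than the tritone.
print(octave_distance < tritone_distance)  # prints True
```

With a unit radius and a unit height per octave, the octave sits at distance 1 from the reference, whereas the tritone, although nearer in pitch, lies on the opposite side of the helix.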

 

We have uploaded the video of Sripathi’s presentation to YouTube:

 

Link to the preprint of the ISMIR LBD paper:
https://arxiv.org/abs/2010.00673

Link to the TinySOL dataset of isolated musical notes:
https://zenodo.org/record/3685367

Link to the source code to reproduce the figures of the paper:
https://github.com/sripathisridhar/sridhar2020ismir