We are happy to announce that our article “Matching human vocal imitations to birdsong: An exploratory analysis” is featured in the proceedings of the 2nd International Workshop on Vocal Interactivity in-and-between Humans, Animals, and Robots (VIHAR).
This paper was written by two MSc students: Kendra Oudyk (now at McGill University) and Yun-Han Wu (now at Fraunhofer IIS). They were supervised by the BirdVox team: Vincent Lostanlen, Justin Salamon, Andrew Farnsworth, and Juan Pablo Bello.
The abstract of the paper is reproduced below.
We explore computational strategies for matching human vocal imitations of birdsong to actual birdsong recordings. We recorded human vocal imitations of birdsong and subsequently analysed these data using three categories of audio features for matching imitations to original birdsong: spectral, temporal, and spectrotemporal. These exploratory analyses suggest that spectral features can help distinguish imitation strategies (e.g., whistling vs. singing) but are insufficient for distinguishing species. Similarly, whereas temporal features are correlated between human imitations and natural birdsong, they, too, are insufficient for this task. Spectrotemporal features showed the greatest promise, in particular when used to extract a representation of the pitch contour of birdsong and human imitations. This finding suggests a link between the task of matching human imitations to birdsong and retrieval tasks in the music domain such as query-by-humming and cover song retrieval; we borrow from such existing methodologies to outline directions for future research.
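To make the analogy with query-by-humming concrete, here is a minimal sketch of pitch-contour matching in Python. It illustrates the general technique rather than the pipeline used in the paper: the pYIN pitch tracker and dynamic time warping come from librosa, and the frequency range, the contour normalization, and the file names are all assumptions for the sake of the example.

```python
import numpy as np
import librosa


def pitch_contour(path, fmin=500.0, fmax=8000.0):
    """Estimate a pitch contour (in cents) from an audio file with pYIN.

    The frequency range is a hypothetical choice meant to cover both
    birdsong and whistled or sung imitations; the paper does not
    specify these parameters.
    """
    y, sr = librosa.load(path, sr=22050)
    f0, voiced, _ = librosa.pyin(y, fmin=fmin, fmax=fmax, sr=sr)
    f0 = f0[voiced]                       # keep voiced frames only
    cents = 1200.0 * np.log2(f0 / fmin)   # log-frequency scale
    return cents - np.median(cents)       # remove overall transposition


def contour_distance(query_path, reference_path):
    """Score how well two recordings match by aligning their pitch
    contours with dynamic time warping, as in query-by-humming."""
    q = pitch_contour(query_path)[np.newaxis, :]
    r = pitch_contour(reference_path)[np.newaxis, :]
    D, wp = librosa.sequence.dtw(q, r, metric="euclidean")
    return D[-1, -1] / len(wp)            # path-length-normalized cost


# Example: rank candidate birdsong recordings for one imitation
# (file names here are hypothetical).
# scores = {path: contour_distance("imitation.wav", path)
#           for path in candidate_recordings}
```

A lower normalized cost indicates a closer contour match; normalizing by the warping-path length keeps scores comparable across recordings of different durations, so candidates can be ranked directly.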
Kendra Oudyk presented the paper on August 30th, 2019, in London, UK. The VIHAR workshop website is: http://vihar-2019.vihar.org/