Crowdsourced ratings for speech data

Researchers who study interventions for speech disorders need to obtain blinded listeners’ ratings of speech production accuracy before and after treatment. However, conventional methods for obtaining these ratings can be time-consuming and frustrating. Crowdsourcing platforms like Amazon Mechanical Turk provide immediate access to a huge pool of potential raters, and our results suggest that by aggregating responses across a large number of nonexpert listeners, we can obtain speech ratings that are comparable in quality to trained listeners’ judgments.
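As a rough illustration of the aggregation idea described above, the sketch below pools many nonexpert binary judgments of the same production into a single proportion-correct score per item. The listener and item identifiers, data layout, and variable names are made up for the example; this is not our actual analysis pipeline.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical crowdsourced ratings: (listener_id, item_id, rating),
# where rating = 1 means the listener judged the production "correct"
# and 0 means "incorrect". Real data would come from the platform's export.
ratings = [
    ("w01", "item_03", 1), ("w02", "item_03", 1), ("w03", "item_03", 0),
    ("w04", "item_03", 1), ("w01", "item_07", 0), ("w02", "item_07", 0),
    ("w03", "item_07", 1), ("w04", "item_07", 0),
]

# Pool judgments by item: the proportion of listeners who rated an item
# "correct" serves as a gradient (0-1) accuracy score for that production.
by_item = defaultdict(list)
for listener_id, item_id, rating in ratings:
    by_item[item_id].append(rating)

for item_id, votes in sorted(by_item.items()):
    print(f"{item_id}: p(correct) = {mean(votes):.2f} (n = {len(votes)} listeners)")
```

In practice, individual raters' accuracy and reliability can also be screened before pooling, an issue explored in the publications listed below.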

Overview of crowdsourcing on ASHA Leader

Crowdsourcing interview video: ASHA CrEd Library

McAllister, T., Nightingale, C., Moya-Galé, G., Kawamura, A., & Ramig, L. O. (2023). Crowdsourced perceptual ratings of voice quality in people with Parkinson's disease before and after intensive voice and articulation therapies: Secondary outcome of a randomized controlled trial. Journal of Speech, Language, and Hearing Research, 66(5), 1541-1562. Link to manuscript preprint and associated text/code

Nightingale, C., Swartz, M. T., Ramig, L. O., & McAllister, T. (2020). Using crowdsourced listeners’ ratings to measure speech changes in hypokinetic dysarthria: A proof-of-concept study. American Journal of Speech-Language Pathology. Link to manuscript preprint and associated text/code 

McAllister Byun, T., Halpin, P. F., & Szeredi, D. (2015). Online crowdsourcing for efficient rating of speech: A validation study. Journal of Communication Disorders, 53, 70-83.

McAllister Byun, T., Harel, D., Halpin, P. F., & Szeredi, D. (2016). Deriving gradient measures of child speech from crowdsourced ratings. Journal of Communication Disorders, 64, 91-102.

Harel, D., Hitchcock, E. R., Szeredi, D., Ortiz, J., & McAllister Byun, T. (2016). Finding the experts in the crowd: Accuracy and reliability in crowdsourced measures of children's covert contrasts. Clinical Linguistics & Phonetics. https://doi.org/10.3109/02699206.2016.1174306