Research Interests

Gross characterization: laboratory phonology, speech production & perception, language acquisition

Finer detail: Laboratory/experimental phonology, crosslinguistic production and perception, phonotactics, second language acquisition of phonetics & phonology, ultrasound imaging of speech, voice quality, loanword phonology & phonetics


Current Project

Pitch differences and the perception of creaky phonation, 2017-2018 NYU University Research Challenge Fund (URCF) award

This project investigates the perception of creaky phonation (“vocal fry”) in the speech of American English speakers. Creaky phonation refers to a type of vocal fold vibration in which the folds vibrate at a lower rate and have a longer closed phase than in baseline, or modal, phonation. The use of creak, primarily in the speech of young people in the United States, has recently attracted the attention of linguists, clinicians, and even the general public. However, little is understood about how listeners perceive the acoustic properties that have been identified as indicating the presence of creaky phonation. One hypothesis is that creaky phonation is more readily recognized in female speech because it is much lower in pitch than the speaker’s modal phonation, and the difference is therefore very salient. In contrast, men generally have a lower modal pitch, so the difference between their creaky and modal pitch may not be very perceptible if pitch is a primary correlate that listeners attend to.

This hypothesis is investigated by presenting listeners with stimuli containing creaky, modal, and combined phonation types for both male and female speakers. Speakers with either relatively high or low modal pitch are used; in particular, the higher-pitched male and lower-pitched female speakers have similar modal pitch. If results show that listeners correctly identify more creaky tokens for the higher-pitched speaker of each within-gender pair, this would support the hypothesis that listeners are sensitive to differences between modal and creaky pitch. However, if listeners report more creaky tokens for the female speaker than for the male speaker matched for modal pitch, this would indicate a bias toward attributing creak to female voices. The participant groups will vary in age, gender, and expert status (whether they are linguistically naïve or are speech-language pathology students) to investigate how these variables interact with the manipulations of the speech signal itself.


Previous Funding

Collaborative Research: A Bayesian model of phonetic and phonotactic effects in cross-language speech production. NSF BCS-1052855, in collaboration with Colin Wilson, Johns Hopkins University.

A fundamental aspect of human cognition is the capacity to perceive and produce language. Studies of non-native speech processing provide some of the most striking evidence bearing on this capacity: when humans attempt to perceive or produce words containing foreign sounds or sound sequences, they show systematic patterns of correct and incorrect performance. Prior research has established that different non-native structures elicit different rates and types of error; it has been hypothesized that these differences can be explained by a combination of grammatical, perceptual, and articulatory factors. The main goals of this project are to provide carefully controlled experimental evaluations of these factors, and to develop an explicit, probabilistic model of how they interact in human performance. Particular experimental issues to be investigated are: (1) what phonetic characteristics humans are most sensitive to when processing non-native sounds; (2) how the quality of the input and ambient acoustics affect non-native perception and production; and (3) whether learning word meaning can modulate sensitivity to detailed properties of non-native sounds. The computational model builds on a growing body of work suggesting that human perception and action reflect optimal Bayesian inference conditioned on prior expectations and noisy sensory measurements. The relevant prior reflects knowledge of the native language; the model predicts that non-native structures that are more similar to those in the native language should be processed with greater accuracy. The model also predicts that non-native sounds with robust perceptual properties should be processed more accurately, even if their prior probabilities are low. The development of this model, which will be made available to other researchers, will promote the role of phonology and phonetics within the broader context of cognitive science research.
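The two predictions above can be illustrated with a minimal sketch of Bayesian inference over candidate sound structures. This is an illustration only and not the project's actual model: the one-dimensional acoustic cue, the Gaussian noise assumption, the category names, and all numeric values are hypothetical choices made for the example.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def posterior(x, categories, noise_sd):
    """Posterior over candidate structures given one noisy cue value x.

    categories maps a name to (prior, cue_mean). The prior stands in for
    native-language knowledge; noise_sd stands in for how noisy the
    sensory measurement is. All of these are illustrative assumptions.
    """
    unnormalized = {
        name: prior * gaussian_pdf(x, cue_mean, noise_sd)
        for name, (prior, cue_mean) in categories.items()
    }
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}


# Hypothetical candidates: a structure that is common in the native
# language (high prior) and a non-native structure (low prior).
categories = {
    "native-like": (0.95, 0.0),
    "non-native": (0.05, 1.0),
}

# The speaker actually produced the non-native structure (cue value 1.0).
# With a robust, low-noise cue, the evidence outweighs the low prior.
clear = posterior(1.0, categories, noise_sd=0.2)

# With a noisy cue, the native-language prior dominates and the listener
# is likely to misperceive the input as the native-like structure.
noisy = posterior(1.0, categories, noise_sd=1.0)

print("robust cue:", clear)
print("noisy cue: ", noisy)
```

In this toy setting, the same low-prior structure is recovered almost perfectly when its cue is robust and is mostly misperceived when the cue is noisy, which mirrors the qualitative predictions described above.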


Phonotactics and Articulatory Coordination in Foreign Language Acquisition and Loan Phonology. NSF CAREER Grant (BCS-0449560)

This research has been featured in the NSF Special Report on Language and Linguistics.

Why are foreign language learners typically unable to eradicate an accent, despite years of practice? When English speakers say the name of the pickle brand “Vlasic”, why do they end up pronouncing it “Velasic”? It is evident that speakers attempting to pronounce non-native words, either in foreign language acquisition or when borrowing words from other languages, face serious difficulties. While researchers of language acquisition and loanword borrowing have long attempted to understand the factors that contribute to non-native pronunciation, one little-studied issue that has a substantial impact on the production of unfamiliar sound sequences (or phonotactics) is the coordination of adjacent sounds. Coordination refers to timing patterns in speech, such as the duration of a sound (like the /r/ in “rat”), or how much overlap exists among the articulations of adjacent sounds (does the tongue start moving to make the /a/ before it finishes producing the /r/?). Previous research has shown that coordination patterns are language-specific, which presents a challenge for speakers producing non-native sound sequences like the /vl/ of “Vlasic”: not only must speakers learn which sounds can be combined into sequences, but they must also learn how such sequences are temporally coordinated.

With support from the National Science Foundation, Dr. Lisa Davidson will investigate how foreign language learners learn to produce non-native sound sequences and how language borrowers incorporate these sequences into their native language. Acoustic recordings will be combined with ultrasound imaging of tongue motion during speech to understand how the foreign language learner’s speech differs from the intended goal in the target language. Another issue important both to language acquisition and to borrowing is how speakers initially perceive the sound sequences that they are trying to produce. The combination of information from perceptual data and from the way speakers manipulate coordination in the articulation of non-native sequences will provide critical insight into how speakers learn to produce foreign languages or adopt new words into their native language.