Under review. Inducing nonlocal constraints from baseline phonotactics. (Maria Gouskova & Gillian Gallagher)
download pdf

We present a model that induces nonlocal representations, needed for learning nonlocal phonological generalizations, based on phonotactic properties of a language that are observable in a model without any nonlocal representations. We use the UCLA Phonotactic Learner (Hayes & Wilson 2008) to learn a baseline grammar. Our model then searches the baseline grammar for constraints that suggest nonlocal interactions, specifically trigram constraints where the middle gram is “any segment”, e.g., *[-continuant, -sonorant][][+cg] in Quechua. Based on these trigram constraints, our model defines a search space for nonlocal projections and selects the projection on which the strongest nonlocal generalizations are stated. We show that our model arrives at the desired solution for Quechua, a language with categorical restrictions on stops, and Shona, a language with gradient restrictions on vowels.
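As a rough illustration of the idea in this abstract (not the authors' implementation), the Python sketch below finds trigram constraints whose middle matrix is empty and then checks a word on a projection where the flanking segments become adjacent. The toy feature table, the choice of the stop class as the projection, and the function names `placeholder_trigrams`, `project`, and `violates_bigram` are all assumptions made for this example.

```python
from typing import Dict, FrozenSet, List, Tuple

FeatureMatrix = FrozenSet[str]           # e.g. frozenset({"-continuant", "-sonorant"})
Constraint = Tuple[FeatureMatrix, ...]   # a sequence of feature matrices

def placeholder_trigrams(constraints: List[Constraint]) -> List[Constraint]:
    """Keep trigrams whose middle matrix is empty, i.e. matches any segment."""
    return [c for c in constraints
            if len(c) == 3 and c[0] and not c[1] and c[2]]

def project(word: List[str], feats: Dict[str, FeatureMatrix],
            tier_class: FeatureMatrix) -> List[str]:
    """Keep only the segments that belong to the class defining the projection."""
    return [seg for seg in word if tier_class <= feats[seg]]

def violates_bigram(tier: List[str], feats: Dict[str, FeatureMatrix],
                    first: FeatureMatrix, second: FeatureMatrix) -> bool:
    """True if two adjacent segments on the projection match first, then second."""
    return any(first <= feats[a] and second <= feats[b]
               for a, b in zip(tier, tier[1:]))

# Toy feature table (hypothetical, far smaller than a real inventory).
FEATS: Dict[str, FeatureMatrix] = {
    "k":  frozenset({"-continuant", "-sonorant", "-cg"}),
    "p'": frozenset({"-continuant", "-sonorant", "+cg"}),
    "a":  frozenset({"+continuant", "+sonorant", "-cg"}),
    "i":  frozenset({"+continuant", "+sonorant", "-cg"}),
}

STOP = frozenset({"-continuant", "-sonorant"})
CG = frozenset({"+cg"})

# A baseline grammar with one placeholder trigram and one ordinary bigram.
baseline: List[Constraint] = [(STOP, frozenset(), CG), (CG, CG)]

candidates = placeholder_trigrams(baseline)
stop_tier = project(["k", "a", "p'", "i"], FEATS, STOP)
illegal = violates_bigram(stop_tier, FEATS, STOP, CG)
```

Here the trigram *[-continuant, -sonorant][][+cg] is the only candidate, and on the assumed stop projection the word [kap'i] reduces to [k p'], which violates the bigram *[-continuant, -sonorant][+cg].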

Under review. Accidental gaps and surface-based phonotactic learning: a case study of South Bolivian Quechua. (Colin Wilson & Gillian Gallagher)
download pdf

This squib investigates the role of statistical computations in distinguishing systematic and accidental gaps by comparing two phonotactic learning models: a version of the maximum entropy (maxent) model of Hayes & Wilson (2008) and a version of the tier-based strictly local (TSL) model of Heinz et al. (2011). The models are tested on their success at capturing the vowel height allophony pattern in South Bolivian Quechua, which exhibits the ‘conditioned allophone vs. elsewhere allophone’ schema. We show that both phonotactic models require increased complexity over traditional analyses, but that the maxent model’s reliance on statistical computations allows it to cope with this added complexity while the TSL model does not.
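For readers unfamiliar with TSL grammars, the Python sketch below shows the general shape of a categorical tier-based strictly 2-local check: project the word onto a tier, add word boundaries, and reject it if any adjacent pair on the tier is forbidden. The tier, the forbidden pairs, and the name `tsl2_accepts` are toy assumptions chosen for illustration, not the grammar evaluated in the squib.

```python
from typing import List, Set, Tuple

def tsl2_accepts(word: List[str], tier: Set[str],
                 forbidden: Set[Tuple[str, str]]) -> bool:
    """Accept a word iff no forbidden pair is adjacent on its tier projection."""
    # Segments outside the tier are ignored; "#" marks the word edges.
    projected = ["#"] + [seg for seg in word if seg in tier] + ["#"]
    return all(pair not in forbidden
               for pair in zip(projected, projected[1:]))

# Toy grammar (hypothetical): uvular [q] and the vowels are on the tier,
# and a high vowel may not directly follow [q] on that tier.
TIER = {"q", "i", "u", "e", "o", "a"}
FORBIDDEN = {("q", "i"), ("q", "u")}

ok_mid = tsl2_accepts(list("qepa"), TIER, FORBIDDEN)     # mid vowel after [q]
bad_high = tsl2_accepts(list("qipa"), TIER, FORBIDDEN)   # high vowel after [q]
ok_velar = tsl2_accepts(list("kipa"), TIER, FORBIDDEN)   # [k] is not on the tier
```

A grammar of this kind is categorical: a pair is either banned or allowed, with no weights, which is the property the abstract contrasts with the statistical maxent model.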

To Appear. Rapid generalization in phonotactic learning. (Tal Linzen & Gillian Gallagher). LabPhon.
download pdf
The phonotactics of a language concerns the well-formedness of strings of sounds as potential words (e.g., paim is a better potential word of English than mlpemr). Speakers’ phonotactic judgments are informed not only by the distribution of particular sounds ([b] or [g]) but also by the distribution of classes of sounds (e.g., voiced stops). In a series of artificial language experiments, we investigate how such generalizations over classes of sounds are acquired, focusing on evaluating the proposal that generalizations must be acquired in a specific-to-general sequence – i.e., that learners must first learn the statistics of multiple individual sounds that belong to a class before they can generalize to the class. Contrary to this proposal, learners acquired knowledge over classes earlier than sound-specific knowledge, and showed an ability to generalize to a class based on a single example of the class. We discuss the implications of our findings for computational models of phonotactic learning.

2016. Asymmetries in the representation of categorical phonotactics. Language 92(3):557-590.
download pdf
materials
follow-up studies
An inductive learning bias in favor of constraints with the structural form of an Obligatory Contour Principle (OCP) restriction, *[αF][αF], is supported by a repetition task and a discrimination task diagnosing the representation of two categorical phonotactic restrictions by speakers of Cochabamba Quechua: the cooccurrence restriction on roots with pairs of ejectives *[k’ap’u], and the ordering restriction on roots with a plain stop followed by an ejective *[kap’u]. The results are consistent with a strong phonotactic restriction against cooccurring ejectives, above and beyond perception and production difficulties, while the ordering restriction seems to be represented as a weaker phonotactic restriction with speakers’ behavior primarily reflecting phonetic difficulties. As both restrictions are categorical, the results support an inductive learning bias that favors constraints like *[+cg][+cg], which penalize sequences of feature matrices with the same value for some feature, over constraints like *[-cont, -son][+cg], which penalize sequences of unrelated feature matrices.

2016. Vowel height allophony and dorsal place contrasts in Cochabamba Quechua. Phonetica 73:101-119.
download pdf
materials
This paper reports on the results of two studies investigating vowel height allophony triggered by uvular stops in Cochabamba Quechua. An acoustic study documents the lowering effect of a preceding tautomorphemic or a following heteromorphemic uvular on the high vowels /i u/. A discrimination study finds that vowel height is a significant cue to the velar-uvular place contrast.

2015. Natural classes in cooccurrence constraints. Lingua 166 A: 80-98.
download pdf
materials
Natural classes are typically defined by some shared phonetic property, though the segments within such a class may differ substantially along other dimensions. This paper explores two such classes in Quechua: the class of [spread glottis] segments, aspirates and [h], and the class of [constricted glottis] segments, ejectives and [ʔ]. While aspirates and ejectives pattern with their glottal counterparts in the cooccurrence phonotactics of the language, nonce word tasks only find weak evidence of these natural classes. Instead, there is evidence for strong phonotactic restrictions on aspirates and ejectives to the exclusion of their glottal counterparts. It is proposed that the preference for classes of laryngeally marked stops is phonetically based, deriving from the salience of the phonetic properties unique to stops.

2014. Evidence for an identity bias in phonotactics. Laboratory Phonology 5(3): 337-378
download pdf
Speakers of Cochabamba Quechua (CQ) participated in two tasks involving phonotactically illegal nonce forms with pairs of identical (e.g., [p’ap’u]) and non-identical ejectives (e.g., [k’ap’u]). In a repetition task, speakers were more accurate on identical than non-identical ejective pairs, though no asymmetry was found in analysis of low-level acoustic detail or in a perception study. The latent preference for identical ejectives is unexpected given the phonotactics of CQ, which categorically disallows both identical and non-identical ejective pairs. The asymmetry is in accord with the typology, however. Many languages systematically exempt identical segments from a phonotactic restriction that applies to non-identical segments. It is argued that this cross-linguistic identity preference has its roots in a synchronic bias in favor of identical segments.

2014. An acoustic study of trans-vocalic ejective pairs in Cochabamba Quechua. Journal of the International Phonetic Association 44(2): 133-154 (with James Whang)
download pdf
Cochabamba Quechua disallows pairs of ejectives within roots (*[k’it’a]), but this structure may arise across word boundaries, e.g., [misk’i t’anta] ‘good bread’. This paper presents an acoustic study of these phonotactically legal, trans-vocalic ejective pairs that occur at word boundaries. It is found that Cochabamba Quechua speakers de-ejectivize one of the two ejectives in such phrases a significant portion of the time, and that, in correct productions with two ejectives, the period between the two ejectives is lengthened by increasing the duration of the vowel and the closure duration of the second ejective.

2013. Speaker awareness of non-local ejective phonotactics in Cochabamba Quechua. Natural Language and Linguistic Theory 31: 1067-1099
download pdf
Native Quechua speakers were asked to repeat a mixture of real and nonsense words with medial ejectives, where the nonsense words were either phonotactically legal but unattested roots or phonotactically illegal roots that violated either the cooccurrence restriction (e.g., *k’ap’i) or the ordering restriction (e.g., *kap’i). Medial ejectives are accurately repeated significantly more often in nonce roots where the medial ejective is phonotactically legal than when it is illegal. Additionally, there is variation in how roots that violate the ordering restriction are repaired: both deletion of medial ejection (e.g., target [kap’i] produced as [kapi]) and movement of ejection (e.g., target [kap’i] produced as [k’api]) are common.

2013. Learning the identity effect as an artificial language. Phonology 30: 1-43
download pdf (copyright held by Phonology http://journals.cambridge.org/action/displayJournal?jid=pho)
The results of two artificial grammar experiments show that individuals learn a distinction between identical and non-identical consonant pairs better than an arbitrary distinction, and that they generalise the distinction to novel segmental pairs. These results have implications for inductive models of learning, because they necessitate an explicit representation of identity. While identity has previously been represented as root-node sharing in autosegmental representations (Goldsmith 1976, McCarthy 1986), or implicitly assumed to be a property that constraints can reference (MacEachern 1999, Coetzee & Pater 2008), the model of inductive learning proposed by Hayes & Wilson (2008) assumes strictly feature-based representations, and is unable to reference identity directly. This paper explores the predictions of the Hayes & Wilson model and compares it to a modification of the model where identity is represented (Colavin et al. 2010). The results of both experiments support a model incorporating direct reference to identity.

2012. Perceptual similarity in non-local laryngeal restrictions. Lingua 122: 112-124
download pdf

2011. Acoustic and articulatory features in Phonology: the case for [long VOT]. The Linguistic Review 28:281-313
download pdf
This paper argues that phonological features must represent both the articulatory and acoustic properties of speech sounds. Evidence for this claim comes from the long-distance restrictions on ejectives and aspirates in Quechua (MacEachern 1999), which require both that ejectives and aspirates be referred to as a class and that they be distinguishable. While ejectives and aspirates are articulatorily disparate, they can be grouped in acoustic terms. Both types of segments are characterized by a long lag between the release of the oral constriction and the onset of voicing in a following sonorant, referred to by the proposed feature [long VOT]. Introducing acoustic features allows for a simple and restrictive account of the phonological behavior of laryngeally marked segments, both in Quechua and cross-linguistically.

2010. The perceptual basis of long-distance laryngeal restrictions. MIT Dissertation
download pdf
The two main arguments in this dissertation are (1) that laryngeal cooccurrence restrictions are restrictions on the perceptual strength of contrasts between roots, as opposed to restrictions on laryngeal configurations in isolated roots, and (2) that laryngeal cooccurrence restrictions are restrictions on auditory, as opposed to articulatory, features.

2010. Perceptual distinctness and long-distance laryngeal restrictions. Phonology 27: 435-480.
download pdf
In this paper, I present an analysis of the typology of laryngeal co-occurrence restrictions based on contrast markedness. The key ingredient of the analysis is that laryngeal co-occurrence phenomena reflect a preference for maximising the perceptual distinctness of contrasts between words (Flemming 1995, 2004). An AX discrimination task finds that the contrast between an ejective and a plain stop is less accurately perceived in the context of another ejective in the word than in the context of another plain stop in the word. Pairs of words like [k’ap’i] and [k’api], which contrast 2 vs. 1 ejectives, are less reliably distinguished than pairs of words like [kap’i] and [kapi], which contrast 1 vs. 0 ejectives. The unifying factor of all laryngeal co-occurrence patterns is the neutralization of the contrast between words with one and two laryngeally marked segments, exactly the contrast that is shown to be relatively perceptually weak.

2009. Distinguishing total and partial identity: Evidence from Chol. Natural Language and Linguistic Theory 27: 545-582. (with Jessica Coon)
download pdf
This paper argues that long-distance assimilations between consonants come in two varieties: total identity, which arises via a non-local relation between the interacting segments; and partial identity, which results from local articulatory spreading through intervening segments (Flemming 1995; Gafos 1999). Our proposal differs from previous analyses (Hansson 2001; Rose and Walker 2004) in that only total identity is a non-local phenomenon. While non-adjacent consonants may interact via a relation we call linking, the only requirement which may be placed on linked consonants is total identity. All single feature identities are the result of local spreading. The interaction of a total identity requirement on ejectives and stridents with anteriority harmony in Chol (Mayan) highlights the distinction between these two types of long-distance phenomena. We show that theories that allow non-local, single-feature agreement make undesirable predictions, and that the more restrictive typology predicted by our framework is supported by the vast majority of long-distance assimilation cases.