Author: Alec Marantz

On Features

In the Jakobson/Halle tradition, morphological features were treated on a par with phonological features. Binary features cross-classified a set of entities, phonemes in the case of Phonology and perhaps morphemes in the case of Morphology. Jakobson was clear that binary features project a multidimensional space for phonemes or morphemes. An alternative to cross-classificatory binary features would be a unidimensional linear hierarchy. Applied to the geometry of case, and to the issue of expected syncretism across cases in a language, the linear hierarchy predicts syncretism across continuous stretches of the hierarchy, while the binary feature approach predicts syncretism across neighbors in multidimensional space. Three binary features project a cube, with each element (say, a case) at a vertex and syncretism predicted between elements connected by an edge.
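
To make the cube geometry concrete, here is a minimal Python sketch (the feature labels are invented for illustration, not Jakobson's actual features) that enumerates the case space projected by three binary features and counts the syncretisms the feature model makes available, namely pairs of cases differing in exactly one feature value:

```python
from itertools import product

# Three binary case features; the labels are invented for illustration.
FEATURES = ["oblique", "marginal", "quantified"]

# The 2^3 = 8 feature-value combinations are the vertices of a cube.
vertices = list(product([False, True], repeat=len(FEATURES)))

def hamming(u, v):
    """Number of feature values on which two cases differ."""
    return sum(a != b for a, b in zip(u, v))

# Syncretism is predicted to be natural between cases that share an
# edge of the cube, i.e., that differ in exactly one feature value.
edges = [(u, v) for i, u in enumerate(vertices)
         for v in vertices[i + 1:] if hamming(u, v) == 1]

print(f"{len(vertices)} cases, {len(edges)} cube edges")  # 8 cases, 12 edges
# A linear hierarchy over the same cases would instead predict
# syncretism only across contiguous stretches of a single ordering.
```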

Catherine Chvany describes Jakobson’s experiments with features for Slavic case in her paper “Jakobson’s fourth and fifth dimensions: On reconciling the cube model of case meanings with the two-dimensional matrices for case forms” (Case in Slavic (1986): 107–129), which we’ll read for my fall morphology course. Apparently, Jakobson explored a linear hierarchy of cases to account for case syncretism but moved to binary features, and a multi-dimensional case space, because observed syncretisms involved non-adjacent cases on the linear hierarchy. Morris Halle and I reached a similar conclusion from a paradigm of Polish cases in our “No Blur” paper.

Generative phonology has continually questioned whether shared behavior among phonological segments is best captured via cross-classifying binary features of the traditional sort or via some other representational system. Particle and Government Phonologies exploit privative unary features, and linear and more complicated hierarchies of such features have been explored in the “feature geometries” of standard theories.

For morphology, linear hierarchies of monovalent features of the sort Jakobson abandoned have re-emerged most notably in Nanosyntax for the analysis of case, of person, gender and number, and of tense/aspect. I will blog about Nanosyntax later in the fall; here I will simply note that, as far as I can tell, Nanosyntacticians have not sufficiently tackled the sorts of generalizations that led Jakobson away from linear case hierarchies or that motivated Halle & Marantz’s analysis of Polish. Here I would like to highlight a couple of issues concerning the distribution of morphological features in words and phrases.

Distributed Morphology (DM) claims that some sets of features are not formed via syntactic merge. In Halle & Marantz 1993, these sets include the person/number/gender values of agreement morphemes and the features defining cases like nominative or dative.

From the point of view of canonical DM, the features of, say, person/number/gender (PNG) and their organization could be investigated apart from the “merge and move” principles of syntactic structure building. The peculiarities of features in PNG bundles or case bundles might relate to the role of the features in semantic interpretation. Maybe some relevant features would be monovalent and organized in a linear hierarchy, while others might be binary and cross-classificatory. The internal structure of such bundles might involve a theory like feature geometry in phonology — a fixed structure in which the individual features would find their unique positions. In phonology, it would seem strange to build a phoneme by free merge of phonetic features, checking the result of merge against some template — although perhaps this might be explored as an option.

If one assumes a fixed template of PNG features, or a strict linear hierarchy of monovalent case features, one needs to ask why syntactic merge should build this structure. In any case, the leading idea in DM would be that fixed hierarchies of features are internal to morphemes, while the hierarchies of syntactic merge would be constrained by syntactic selection and by interpretation at the interfaces. I hope to explore later in this Blog the question of whether the mini-tree structures implied by selectional features are really equivalent to what’s encoded in a templatic hierarchy. In the recent history of DM, though, the working distinction between morpheme-internal templatic structure and syntactic hierarchies of morphemes has played a role in research.

Teaching Halle & Marantz (1993)

I can’t remember a phone number for even a second, and when I’m introduced to people, I lose the beginning of their names by the time they reach the end (even for one-syllable names, it seems). So any recounting of the origins of Halle & Marantz will necessarily involve rational reconstruction of what must have been going on in the early 1990s. That being said, attention to the text reveals the many forces that led to the structure and content of the paper, and thus to the structure of canonical Distributed Morphology. Here I want to concentrate on the relationship between the goals of the paper and the various technical pieces of early DM — why there’s morphological merger, vocabulary insertion, impoverishment, fission and fusion.

It should be clear from the number of pages H&M devote to Georgian and, in particular, to Potawatomi, that a main thrust of the paper is a response to Steve Anderson’s A-Morphous Morphology. Following the lead of Robert Beard’s insights into “Separationist” morphology, we wanted to show that item and arrangement morphology could have its realizationism (separation of the syntactic and semantic features of morphemes from their phonological realization) and eat it, too. So, as we stated more directly in “Key Features of Distributed Morphology,” the aim was a marriage of Robert Beard on separation and Shelly Lieber on syntactic word formation — Late Insertion, and Syntax All The Way Down.

However, one shouldn’t forget that I wrote a very long dissertation-turned-book in the early 1980s that concerned the relationship between word formation and syntax. The work is titled “On the Nature of Grammatical Relations” because it is, in a sense, a paean to Relational Grammar. It reconsiders some of the bread-and-butter issues in RG (causative clause union, applicatives (advancement to 2), ascensions) within a more standard generative framework, with a particular emphasis on the connection between word formation and syntax. So, morphological merger between a higher causative head and the embedded verb might cause both word formation (a verb with a causative suffix) and the structure reduction associated with causative clause union (or “restructuring”). Within this framework, morphological merger is distinct from traditional affix-hopping and from head raising, which don’t by themselves cause structure reduction. Baker’s subsequent work on “Incorporation” tried — and, in my opinion, failed — to unify head raising with the structure reduction associated with morphological merger. These issues are still quite live — see, e.g., Matushansky’s work on head movement.

These days, it might be useful to review the work from the early 1980s on word formation and syntax. Richard Sproat’s papers are exemplary here. Those of us thinking hard about the issues explicitly connected the “bracketing paradoxes” of inflection (affix hopping creates a local relation between a head and inflection when the syntactic and semantic scope of the inflection is phrasal, not head to head) to similar mismatches between morphological and syntactic/semantic scope exemplified by clitics in particular (so “played” parallels “the Queen of England’s hat,” where the possessive clitic attaches to “England” but takes scope over the whole phrase). While it’s possible to think of all these bracketing mismatches as arising post-syntactically from a PF-side morphological merger operation, my book explored the possibility that the same word formation operation of merger could feed the syntax, yielding syntactic restructuring in the case of causative constructions, for example. This may or may not be on the right track, but, as Matushansky makes clear, any phase-based Minimalist Program type syntax adopts an approach to cyclicity that would allow PF-directed morphological merger to feed back into the syntax.
It’s quite remarkable that, given my own preoccupation with morphological merger and its potential interaction with the syntax, H&M write as if the field had coalesced around the conclusion that syntactic word formation was largely the result of head movement (raising) and adjunction. I’ll blog about this later, but pragmatically, adoption of this assumption about word formation allowed for a straightforward comparison between DM and Chomsky’s lexicalist syntactic theory of the time in the last section of the paper. Nevertheless, H&M assume that something like morphological merger/affix-hopping/lowering is necessary to create phonological words. So, for the “syntactic structure all the way down” key feature of DM, H&M are promoting head movement and adjunction as well as morphological merger. H&M leave aside any question about whether morphological merger might feed syntax.

For the “late insertion” key feature, H&M propose a particular technology for Vocabulary Insertion. The empirical target here is contextual allomorphy and (local) blocking relations. One could conclude, then, that the core of DM consists of the mechanisms of syntactic word formation and the mechanisms of PF realization — and the mechanisms proposed in H&M have been the topic of continuous research for the last 25 years.

What, then, about Fission, Fusion and Impoverishment? For these mechanisms, there were two driving forces at play: empirical domains of interest to morphologists and the particular research of Eulalia Bonet and Rolf Noyer, which convinced us. Fission is a particular approach to the appearance of multiple exponence, and was expertly employed by Noyer in his analysis of Semitic verbal agreement. Fusion tackles apparent portmanteau vocabulary items head-on. To derive our analysis of syncretism, we required a one-to-one connection between terminal nodes and vocabulary items, and Fusion was in essence a brute-force mechanism for covering situations in which arguably multiple terminal nodes feed the insertion of a single vocabulary item. Impoverishment accounts for two types of phenomena. The first is exemplified in Bonet’s work on Catalan clitics: the use of an unmarked Vocabulary Item in a marked environment. I still believe that the Impoverishment analysis is required to separate standard contextual allomorphy, where a marked VI appears in a marked environment (and a more general VI occurs elsewhere), from situations in which a more general VI occurs in a particular environment — the main argument is that the environment for VI, and thus contextual allomorphy, is local, while Impoverishment can occur at a distance. The other use of Impoverishment is for systematic paradigmatic gaps — where, for example, gender distinctions are lost in the plural. Here, the feature designations of VIs are sufficient to generate the forms without Impoverishment, but Impoverishment explicitly states the underlying generalization (e.g., no gender distinctions in the context of plural).
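
To make the logic of competition and Impoverishment concrete, here is a toy Python sketch; the feature labels, exponents and the particular Impoverishment rule are all invented for illustration, not an analysis of any language:

```python
# Toy sketch of Vocabulary Insertion with Impoverishment (all feature
# labels and exponents invented for illustration).

VOCABULARY = [
    # (features the item is specified for, exponent)
    (frozenset({"pl", "fem"}), "-es"),   # marked item
    (frozenset({"pl"}), "-s"),           # less marked item
    (frozenset(), "-0"),                 # elsewhere item
]

def impoverish(node, environment):
    """Delete gender in the plural (a hypothetical Impoverishment rule).
    Unlike contextual allomorphy, the trigger need not be local."""
    if "pl" in node and environment.get("trigger_present"):
        return node - {"fem", "masc"}
    return node

def insert(node):
    """Insert the most highly specified item whose features are a
    subset of the terminal node's features (the Subset Principle)."""
    candidates = [(feats, exp) for feats, exp in VOCABULARY
                  if feats <= node]
    return max(candidates, key=lambda c: len(c[0]))[1]

node = frozenset({"pl", "fem"})
print(insert(node))                                         # -> "-es"
print(insert(impoverish(node, {"trigger_present": True})))  # -> "-s"
```

The point of the sketch is the division of labor: competition picks the most specific matching item, and Impoverishment, by deleting features before insertion, forces the less marked item to surface in the marked environment.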

Jochen Trommer and others have shown that, if one plays with the mechanisms of Vocabulary Insertion and with the assumptions about syntactic structure, none of these mechanisms is required to cover the empirical domains for which they were exploited in H&M. That they’re not necessary does not entail that they’re not actually part of the grammar — maybe they were the right approach to the phenomena to which they were applied. Personally, I believe the evidence for Impoverishment is strong, but I no longer adopt Fission and Fusion in my own work (although I’ll happily endorse them in the work of others).

To summarize, H&M lays the foundation for the syntactic word-building and late insertion theory of DM by describing the mechanisms of head movement, adjunction and Morphological Merger for word formation and the mechanism of Vocabulary Insertion for late insertion. There’s way more of interest going on in the paper, which is, in bulk, a response to A-Morphous Morphology and to Chomsky’s then-current version of lexicalism for inflectional morphology. What’s unfortunately largely missing are the concerns of “On the Nature of Grammatical Relations” — the precise interaction of word formation and syntax.

As I work towards articulating an experimental research program…

At the nexus between computation and theory:


In his dissertation research, Yohei Oseki investigated how to model the recognition of visually presented morphologically complex words. In the most promising model that he confronted with behavioral and MEG data, words were parsed by a probabilistic context-free grammar of morphological structure, from left to right in the word. Cumulative surprisal over the word, given the correct parse, correlated with reaction time in lexical decision and with brain responses around 170 ms post-stimulus onset in the vicinity of the Visual Word Form Area of the left hemisphere.
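
Schematically, and as my own reconstruction rather than Oseki’s exact formulation, the quantity at issue is the summed negative log probability of the rules used in the parse, assuming the word has a single licensed parse whose probability is the product of its rule probabilities:

```latex
% Cumulative surprisal of a word w = m_1 ... m_n whose correct parse
% uses PCFG rules r_1 ... r_k (a schematic reconstruction):
S(w) = \sum_{i=1}^{n} -\log_2 P(m_i \mid m_1 \ldots m_{i-1})
     = \sum_{j=1}^{k} -\log_2 P(r_j)
```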


In his PCFG, category nodes dominating stems and derivational affixes were treated as non-terminals, with the orthographic forms of morphemes identified as the terminals. So in parsing, say, “parseability,” at the second morphological form in the parse, -abil, the relevant rules would be one that expands an adjective into a verb and an adjective head ([Adj [v parse] Adj]) and an “emission” rule that expands the Adj node into -abil. At -ity, the emission rule that takes the non-terminal N node to -ity would not be sensitive to the presence of -abil under the Adj node. From the perspective of a speaker’s knowledge of English, this is wrong – speakers know that “able/abil” potentiates nominalizing -ity.
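
A minimal PCFG sketch of this point, with invented rules and probabilities (not Oseki’s actual grammar), shows why: the probability of the emission rule rewriting the N head as -ity is fixed by that rule alone, so nothing in the model can register that -abil potentiates -ity. Cumulative surprisal is then just the summed surprisal of the rules in the correct parse:

```python
import math

# Toy PCFG fragment for "parse-abil-ity" (probabilities invented).
# Structural rules expand category nodes; "emission" rules rewrite
# a category node as an orthographic form.
RULES = {
    ("N",   ("Adj", "N_head")): 0.10,   # nominalization of an adjective
    ("Adj", ("V", "Adj_head")): 0.20,   # adjectivization of a verb
    ("V",        ("parse",)):   0.001,  # emission: V -> parse
    ("Adj_head", ("-abil",)):   0.05,   # emission: Adj head -> -abil
    ("N_head",   ("-ity",)):    0.08,   # emission: N head -> -ity
}
# P(-ity | N_head) is set by the last rule alone: it cannot be
# conditioned on the presence of -abil under the Adj node.

def cumulative_surprisal(parse_rules):
    """Summed surprisal (in bits) over the rules of the correct parse."""
    return sum(-math.log2(RULES[r]) for r in parse_rules)

parse = [("N", ("Adj", "N_head")), ("Adj", ("V", "Adj_head")),
         ("V", ("parse",)), ("Adj_head", ("-abil",)),
         ("N_head", ("-ity",))]
print(f"{cumulative_surprisal(parse):.2f} bits")
```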


If we were to improve the model, we would probably want to stick closely to the actual structure of Distributed Morphology, since the development of DM involves maximum consideration of all potentially relevant types of data – formal choices within the theory are motivated by empirical considerations. In DM, the phonological/orthographic form of a morpheme is a vocabulary item. Vocabulary insertion is part of the Phonology, separate from the Merger operation that builds structures from morphemes. So, we would like to model “emission” not as a context-free operation, as might be appropriate for structure building, but as a context-sensitive insertion rule (so, in early versions of transformational syntax, lexical insertion was a transformation rather than a PS rule). On this approach, the category nodes of the morphological tree are potentially terminal nodes – vocabulary insertion doesn’t create a (non-branching) tree structure but elaborates the terminal node into which it is inserted. This matters for locality: we want -ity to be inserted as a sister to -able, so -able must reside at the Adj node, not below it.
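
Here is a minimal sketch of what such a context-sensitive insertion rule might look like, with invented tree representations; the point is only that insertion checks the sister of the targeted node, which a context-free emission rule cannot do, and that it fills the N node rather than expanding it:

```python
# Toy sketch of context-sensitive Vocabulary Insertion (names and
# structures invented for illustration). A node is (label, children);
# insertion elaborates a terminal category node instead of creating
# a non-branching subtree above the exponent.

def insert_ity(tree):
    """Insert -ity at an N node only when its sister is the
    adjective head -abil (a context-sensitive rule)."""
    label, children = tree
    new_children = []
    for i, child in enumerate(children):
        if child == ("N", []):  # an as-yet-unfilled N head
            sisters = children[:i] + children[i + 1:]
            if ("Adj", ["-abil"]) in sisters:
                child = ("N", ["-ity"])  # insertion licensed by sister
        new_children.append(child)
    return (label, new_children)

# [N [Adj -abil] N] -> [N [Adj -abil] [N -ity]]
print(insert_ity(("N", [("Adj", ["-abil"]), ("N", [])])))
```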


A context-dependent vocabulary insertion rule seems formally equivalent to a treelet – N <-> -ity in the context of [Adj able] = [N [Adj able] ity]. Or, rather, the rule looks like a fragment tree in a Fragment Grammar, since -able can be the head of the adjective, sister to a verbal head – the relevant subtree is not a proper tree structure. This raises the question of the formal connection between PCFGs with contextual rules of vocabulary insertion and the TAG and Fragment Grammar formalisms.


For the moment, I’m thinking about a couple of quasi-empirical questions. First, if we’re considering continuations of, say, a verb “parse,” does the frequency with which adjectives are made from verbs overall in English really contribute to the processing of “parseable” over and above the transition probability from “parse” to “able”? One could ask similar questions about sentential parsing, and perhaps people have – is the parsing of a continuation from a transitive verb to its direct object modulated by the probability of a transitive verb phrase in general in English, independent of the identity of the head verb?
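
One way to sharpen the question is to contrast two estimates from corpus counts: the lexeme-specific transition probability and the pooled category-level statistic. The counts below are hypothetical, purely to fix the two quantities being compared:

```python
from collections import Counter

# Hypothetical (stem, continuation) counts; not real corpus data.
counts = Counter({
    ("parse", "-able"): 40, ("parse", "-er"): 10,
    ("read", "-able"): 120, ("read", "-er"): 300,
})

def p_transition(stem, affix):
    """Lexeme-specific transition probability P(affix | stem)."""
    total = sum(n for (s, _), n in counts.items() if s == stem)
    return counts[(stem, affix)] / total

def p_category(affixes=("-able",)):
    """Pooled probability that a verb continues as an adjective,
    estimated across all verbal stems."""
    hits = sum(n for (_, a), n in counts.items() if a in affixes)
    return hits / sum(counts.values())

print(p_transition("parse", "-able"))  # 0.8, conditioned on the stem
print(p_category())                    # ~0.34, the category-level rate
# The empirical question: does the second quantity modulate processing
# of "parseable" over and above the first?
```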


Second is a theoretical question about the representation of derivational morphemes, with possible computational consequences. It’s clear that many derivational affixes contribute meanings beyond those associated with bare category nodes. So “-able” has a meaning beyond that of a simple adjective head. A possibility being explored in the morpho-phonological literature is that (many) derivational heads include roots. However, this possibility comes with the additional possibility that there’s a split between root-full derivational heads and derivational morphemes that are the pure spell-out of category heads. So, for example, maybe the little v that verbalizes many Latinate stems in English and is spelled out as -ate is a pure category head, without a root. A further speculation is that such bare category heads might be classed with inflectional heads and exhibit contextual allosemy – meaning that they could have null interpretations. Jim Wood’s recent work on Icelandic (and English) complex event nominalizations suggests that nominalizers of verbs may have null interpretations – these nominalizers, then, would be candidates for the bare category head type of derivational suffix, while contentful nominalizers such as -er would always involve roots and would not show the null semantic contextual allosemy of the bare nominalizers.

