Author: Alec Marantz

The Revenge of Phrase-Structure Rules

Part One: No Escape from Late Insertion

In an interesting proposal about the connection between Morphology and Syntax, Collins and Kayne (“Towards a theory of morphology as syntax.” Ms., NYU (2020)) outline a sign-based theory of Morphology, one that lacks Late Insertion. That is, Collins and Kayne (for “Morphology as Syntax,” or MAS) propose that morphemes are signs: connections between phonology and formal features, where the latter serve as input to semantic interpretation. The formal features of a morpheme determine its behavior in the syntax. They further propose, along with NanoSyntax, that each morpheme carries a single formal feature. By denying Late Insertion, they are claiming that morphemes are not “inserted” into a node bearing their formal features, where this node has previously been merged into a hierarchical syntactic structure, but rather that morphemes carry their formal features into the syntax when they merge into a structure, providing them to the (phrasal) constituent that consists of the morpheme and whatever constituent the morpheme merges with.

From the moment that linguists started thinking about the connections between the features of constituents and their distributions, they found that the ordering and structuring of elements and constituents in the syntax depends on the categories of these elements, not on the specific items. Hence phrase structure rules, which traffic in category labels. For example, noun phrases (or DPs, or some such) appear in subject and object positions; in general, the distribution of nominal phrases like “John” or “the ball” is determined by their identification as noun phrases, not by their particular lexical content. Similarly, within a language, the organization of morphological heads is associated with what Collins and Kayne call their formal features (like “tense”), not with the lexical items themselves. In fact, Collins and Kayne assume that the hierarchical positioning of morphemes is governed by something like Cinque hierarchies, i.e., hierarchies of formal features that reflect cross-linguistic hierarchical ordering regularities. The literature has recently been calling such hierarchies f-seqs, for “sequence of functional categories” (in theories that adopt some version of Kayne’s Linear Correspondence Axiom, a linear sequence also completely determines a hierarchical structure, where left in the sequence = higher in the tree). Tense might be higher than aspect in such f-seqs/hierarchies, for example.
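
To make concrete how a linear f-seq fixes a hierarchy, here is a minimal sketch in Python – my own toy notation with an invented f-seq fragment, not C and K’s formalism: each feature merges with, and sits above, everything to its right.

```python
# A toy illustration (my notation, not Collins & Kayne's): an f-seq as
# a linear sequence of formal features, left = hierarchically higher.
# Merging bottom-up from the right end yields the unique tree the
# sequence encodes, as under an LCA-style mapping.

FSEQ = ["Tense", "Aspect", "V"]  # hypothetical f-seq fragment

def tree_from_fseq(fseq):
    """Each feature merges with, and dominates, everything to its right."""
    node = fseq[-1]                      # the lexical bottom of the sequence
    for feature in reversed(fseq[:-1]):  # merge the higher features in turn
        node = (feature, node)
    return node

print(tree_from_fseq(FSEQ))  # ('Tense', ('Aspect', 'V'))
```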

But if the hierarchical organization of morphemes is determined by their formal features in a theory, then that theory is endorsing “late insertion,” i.e., the independence of the syntactic organization of morphemes from anything but their formal features. Technically, let’s break this issue down into two possible theoretical claims; the examples in Collins and Kayne’s work suggest that they are endorsing the second claim, which is more obviously a late insertion approach, but perhaps they really endorse the first one. The first possible claim is that there is only one morpheme in each language for each formal feature; that is, there is no contextual allomorphy, no choice of morphemes for the expression of a formal feature that depends on the context of the morpheme (with respect to other morphemes). In their analysis of irregular plurals like “oxen,” C and K argue that -en and the regular plural -s actually express different formal features that occupy different positions in the f-seq (the universal hierarchy of formal features) of nominal features. This analysis predicts “oxens” as the plural of “ox,” since items in different positions can’t be in complementary distribution, and C and K propose a terribly unconvincing account of why we don’t say “oxens” in English.¹ But more crucially, they assume that the morpheme -en includes selectional features that limit its merger to certain roots/stems. This suggests that there are multiple morphemes in English for the inner plural formal feature, with contextual allomorphy; most stems “take” the zero allomorph of the inner plural (or the zero allomorph selects a set of stems that includes the majority of English nominal roots).

This brings us to the second possible claim about the way C and K’s type of morphemes might interact with the principles that determine the hierarchical structure of formal features: the features are ordered by the syntax, somehow invoking a Cinque hierarchy of f-features, but the particular morpheme that instantiates an f-feature is determined by selectional features. But now we’ve recreated the approach of Distributed Morphology, at least for the core property of Late Insertion. The syntax organizes the (abstract) morphemes by category, then the morphophonology inserts particular vocabulary items that instantiate the features of the categories, respecting selectional requirements. The main difference between DM and MAS on this view, then, would be the assumption of one feature per terminal node in MAS – DM allows a bundle of features at each terminal node in the syntax.

It is possible to organize morphology (and syntax) around sign-morphemes (connections between f-features and phonology) without Late Insertion. This describes the grammatical theory in Lieber’s Deconstructing Morphology (1992) (and elsewhere). I will expand in a later post on how Lieber’s system is templatic and inspired by (true) X-bar syntax, but for present purposes, it’s sufficient to point out the basics. Each morpheme has three essential syntactic features, in addition to its phonological form. First, it indicates what category of constituent/phrase it may merge with; this is a selectional feature (formally a subcategorization feature, since the selectional features form subcategories of morphemes “categorized” by the second type of feature). Second, it indicates what category of constituent/phrase it creates (what the label is of the phrase/word it heads). And, finally, it includes a set of features that it adds to the constituent that it’s merging with. The categories are familiar – e.g., N, V, A. And the categories include levels, as in X-bar theory, so a morpheme may attach to N level zero and create N level 1. Crucially, for the most part the categories are distinct from the features carried by the morphemes. For the N category, these features might include person, gender, number, case, definiteness, etc. The plural /z/ in English, then, might select for category N zero, create a category N-bar, and add a +plural feature to the “categorial signature” of the N zero to which it attaches.
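
As a rough illustration of how such a sign-morpheme might be encoded – a minimal sketch under my own toy assumptions, not Lieber’s actual formalism – consider the plural /z/ example just given:

```python
from dataclasses import dataclass

# A toy rendering of a Lieber-style sign-morpheme: phonology plus
# (i) the category/level it selects, (ii) the category/level it
# creates, and (iii) the features it adds to its host's signature.

@dataclass
class Morpheme:
    phon: str        # phonological form
    selects: tuple   # (category, bar level) it subcategorizes for
    creates: tuple   # (category, bar level) of the constituent it heads
    adds: dict       # features added to the host's categorial signature

# The plural /z/: attaches to N level zero, creates N level 1 (N-bar),
# and adds +plural to the host's categorial signature.
PLURAL_Z = Morpheme(phon="z", selects=("N", 0), creates=("N", 1),
                    adds={"plural": True})

def merge(m, host, signature):
    """Merge a morpheme with a host constituent, enforcing selection."""
    if host != m.selects:
        raise ValueError("subcategorization (selectional) feature not met")
    return m.creates, {**signature, **m.adds}

print(merge(PLURAL_Z, ("N", 0), {"count": True}))
# (('N', 1), {'count': True, 'plural': True})
```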

For the Lieber theory, and for templatic theories in general, there is no explanatory connection between the location of morphemes carrying f-features in the “extended projection” of a lexical category like N or V and the f-features themselves. Why a morpheme carrying a plural feature should attach to an N zero and create an N-bar, blocking out any other N zero-attaching morpheme, is a stipulation. The organization of the morphemes is not specified in the syntax by the f-features, since the syntax of the hierarchical structure of morphemes cares about the categories, not these features, and the morphemes are not, essentially, “of” a category – they produce a category via merger, but that’s independent, in principle, of the nature of the features they carry.

As soon as you have, like MAS, a system in which the syntax organizes morphemes via their f-features, constrained by selectional features, you have a system with Late Insertion in the Distributed Morphology sense. Again, as we will explore in future posts, the alternatives to Late Insertion are templatic theories of morphology (and syntax), but these deny the central insight behind Cinque hierarchies and generalizations about f-sequences in the “extended projection” of lexical items. A templatic system, like Lieber’s, does not claim that the distribution of constituents is determined by their categories/features.

The one-feature-per-morpheme assumption shared by C & K and by NanoSyntax runs into an explanatory problem that DM at least sidesteps by allowing bundles of features under a terminal node in the syntax. Consider the way in which unary (non-binary) features in a feature hierarchy both yield categories of, say, gender and number, and capture the markedness relationships among the genders and numbers. Suppose that we use the first gender feature for masculine nouns, which are the least marked (and perhaps default) gender in some language. An additional gender feature hierarchically above the masc feature might give us neuter gender, in a three-way gender system in which masculine and neuter share exponents of case and number (as in Slavic). Finally, a third gender feature on top of the other two would yield feminine gender. Within a subsequence of an f-seq for the “extended projection” of a noun, one wouldn’t need to label these gender features; their values come from the markedness hierarchy within the gender region. It’s the sub-f-seqs within the gender region that have values – a single feature is masculine, two features is neuter, and three features is feminine.

Similarly, suppose we have a number system in the same language with three values, singular, dual and plural. Singular would be least marked, with a single feature in the number sub-f-seq, plural might be more marked, with two features, and dual might be the most marked, with three features. Again, the features themselves would not have values; the values come from the number of features within the number sub-f-sequence.
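
Here is a toy rendering of this “values from feature counts” idea, with the gender and number assignments of the hypothetical language just described (my encoding, not NanoSyntax’s or C and K’s):

```python
# A sketch of the idea that the features themselves are unlabeled:
# a field's value is read off the *number* of features stacked in its
# sub-f-seq.  Assignments follow the hypothetical language above.

GENDER = {1: "masculine", 2: "neuter", 3: "feminine"}  # markedness order
NUMBER = {1: "singular", 2: "plural", 3: "dual"}

def interpret(gender_features, number_features):
    """Interpret the nominal region by counting features per field."""
    return GENDER[gender_features], NUMBER[number_features]

# A noun region with two gender features and three number features:
print(interpret(2, 3))  # ('neuter', 'dual')
```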

But now we can see clearly that the specific morphemes (for C & K) or features (for NanoSyntax) are not themselves ordered within an f-seq. Rather, it’s the fields of features, here gender and number, that are ordered. The features within each field are “ordered” in a sense, but really, for the syntax, a language would just specify how many features it allows within each field – it’s up to the phonology and the semantics to “interpret” the features in terms of number and gender, and to do so just based on the number of features in a field.

We’ve seen that “late insertion” doesn’t distinguish among DM, C & K, and NanoSyntax, and now we can see that ordering based on classes of features, rather than individual features, doesn’t distinguish among these theories either. All these theories require the equivalent of phrase structure rules to correctly distribute fields of features within hierarchical syntactic structures, followed by (or parallel with) principles of vocabulary insertion that realize the features phonologically. The adoption of a sign-based theory of morphemes, along with the assumption of a single formal feature per morpheme, seems to make the principle of Vocabulary Insertion very simple for C & K. However, complications arise immediately, essentially concerning the distribution of zero morphemes. Consider what they need to rule out oxens, for example. NanoSyntax explores a rather more complicated theory of Vocabulary Insertion, but it would be fair to say that, unlike DM and C & K, NanoSyntacticians spend little effort showing how NanoSyntax interacts with SyntaxSyntax (becoming at present more of a theory of NanoMorphology).

Missing from all three approaches is a theory of phrase-structure; that is, a theory of how to exploit the generalizations expressed in Cinque-style f-sequences to generate syntactic structures that conform to them. I’ll write more about this problem in a future post.
___________

¹ They appeal to a generalization about “blocking” of multiple exponence across positions that Halle & Marantz (1993, “Distributed morphology and the pieces of inflection”) debunked in their discussion of Anderson’s blocking principles in A-Morphous Morphology. In any case, to split plural into two plural heads across the f-seq is to claim that they don’t have the same interpretation and thus shouldn’t trigger any kind of blocking (and a single f-feature per morpheme makes it difficult to claim that the plurals are “similar,” since similarity here would imply feature decomposition to allow the two plurals to share a feature).

Grammar and Memorization: The Jabberwocky Argument

Memorized vs. Computed


I have previously written about why I believe that distinctions in the literature between words (or sentences) that are “stored” as units vs. words (or sentences) that are “computed” are not well articulated.  I claimed, instead, that, in a sense that is crucial for understanding language processing, all words (and all sentences) are both stored AND computed, even those words and sentences a speaker has never encountered before.  A speaker stores or memorizes the infinite set of words (and sentences) in his/her language by learning a grammar for the language.  Attempts in the literature to distinguish the stored words from the computed ones fail to be clear about what it means to store or compute a word – particularly about what it means to memorize a word.  I claimed that, as we become clear on how grammars can be used to recognize and produce words (and sentences), any strict separation between the stored and the computed disappears.


Despite my earlier efforts, however, I find that I have not convinced my audience.  So here I’ll follow a line of argument suggested to me by Dave Embick and try again.


Let’s start with Jabberwocky, by Lewis Carroll of course (1871, this text from Wikipedia):


’Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

“Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!”

He took his vorpal sword in hand:
Long time the manxome foe he sought—
So rested he by the Tumtum tree,
And stood awhile in thought.

And as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!

One, two! One, two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

“And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!”
He chortled in his joy.

‘Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.


The poem is full of words that the reader might not recognize and/or be able to define.  Quick quiz:  which words did Carroll make up?


Not so easy, at least for me.  Some words that Carroll was the first to use (as far as lexicographers know) have entered the language subsequently, e.g., vorpal (sword).  What about chortle?  Are you sure?  How about gyre?  Gimble?  Beamish?  Whiffling?


The fact is that when we encounter a word in context, we use our knowledge of grammar, including our knowledge of generalizations about sound/meaning connections, to assign a syntax and semantics to the word (Chuang, Yu-Ying, et al., “The processing of pseudoword form and meaning in production and comprehension: A computational modeling approach using linear discriminative learning,” Behavior Research Methods (2020): 1-32).  Suppose the word is one we have not previously encountered, but it is already in use in the language.  Can we tell it’s a “real” word as opposed to Jabberwocky?  That the word has found a place in the language probably means that it fits with generalizations in the language, including those about correlations between sound and meaning and between sound and syntactic category.  Children must be in this first-encounter position all the time when they’re listening – and I doubt that many of them are constantly asking, is that really a word of English?  Suppose, now, that the new-to-us word in question is actually not yet a word in use in English, as was the case for the first readers of Jabberwocky encountering chortle.  In the course of things, there’s no difference between encountering in context a word that’s in use but that you haven’t heard yet, one that fits the grammar of your language, and a word that the speaker made up but that fits the grammar of your language equally well.  Lewis Carroll made up great words, extremely consistent with English, and many of them stuck.


Speakers of English have internalized a phonological grammar (a phonology) that stores our knowledge of the well-formedness of a potentially infinite set of phoneme strings.  The phonotactics of a language comprise an inventory of sounds (say, an inventory of phonemes) and the principles governing their legal combinations.  The phonology – the phonotactic grammar – stores (and generates) all the potential words of the language, but doesn’t distinguish possible from “actual” words by itself.  Are the “actual” words distinguished as phoneme-strings carrying the extra feature [+Lexical Insertion], as Morris Halle once claimed for morphologically complex words that are in use, as opposed to potential but not “actual” words (Halle, M. (1973). Prolegomena to a theory of word formation. Linguistic Inquiry, 4(1), 3-16)?  It’s not particularly pertinent to the question at hand whether people can say, given a string of letters or phonemes in isolation, that this is a word of their language.  Experimental subjects are asked to do this all the time in lexical decision experiments, and some are surprisingly accurate, as measured against unabridged dictionaries or large corpora.  Most subjects are not so accurate, however, as one can see from examining the English Lexicon Project’s database of lexical decisions – 85% correct is fairly good for both the words (correct response is yes) and the pronounceable non-words (correct response is no) in that database.  Lexical Decision is a game probing recognition memory – can I recover enough of my experiences with a letter or phoneme string to say with some confidence that I encountered it before in a sentence?  A better probe of our knowledge of potential and actual words is placing the strings in sentential context – the Jabberwocky probe.  Do we think a Jabberwocky word is a word in our language?  Here we see that our judgments are graded, with no clear intuition corresponding to a binary word/non-word classification.
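
To illustrate what it means for a grammar to “store” all the potential words of a language, here is a deliberately crude sketch – a toy “phonotactics” stated over orthography, with an invented syllable inventory, nothing like a serious phonology of English.  The point is just that the grammar accepts familiar and unfamiliar well-formed words alike and rejects strings that violate its combinatorics.

```python
import re

# A toy phonotactic "grammar" (orthographic stand-in, purely
# illustrative): English-like syllables as (onset)(vowel)(coda),
# repeated.  A real phonology would state this over phonemes.

ONSET = r"(?:[bcdfghjklmnpqrstvwz]|[bcfgps][lr]|s[lmnptkw]|ch|sh|th|wh)?"
VOWEL = r"(?:[aeiou]+|y)"
CODA = r"(?:[bdfgklmnprstvz]+|ck|ng|sh|th)?"
WORD = re.compile(f"(?:{ONSET}{VOWEL}{CODA})+$")

# The grammar "stores" familiar and unfamiliar words alike:
for w in ["chatter", "chortle", "slithy", "toves", "bnick"]:
    print(w, bool(WORD.match(w)))
# chatter/chortle/slithy/toves: True; bnick: False (bad onset)
```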


For phonology, it’s somewhat clear what we mean when we say that the generative grammar “stores” the forms of potential and existing words in the language.  The consequence of this for the Chomskyan linguist (committed to the principle that there are not separate competence and performance grammars) is that the phonological grammar is used in recognizing and producing the words.  For committed Chomskyans, like me, at a first pass, we expect that phonotactic well-formedness will always play a role in word recognition and production – “knowing” a word doesn’t exempt it from obligatory “decomposition” via the grammar in use into, e.g., phonemes, and analysis via the phonotactic grammar.  “Retrieving” the phonological form of a word from memory and “generating” it from the grammar become the same process.


What, then, is the difference between familiar words like chatter and Jabberwocky words like chortle?  Although our grammar will assign a meaning to any well-formed possible word, without sentential or other context the meaning might be vague.  As we experience words in context, we can develop sharper accounts of their meaning, perhaps primarily via word co-occurrences.  The sharpness of a semantic representation isn’t a property of the phonological grammar, but it is a property of the grammar as a whole.  For linguists, a “whole” grammar includes, in addition to a syntax that organizes morphemes into hierarchical tree structures and a phonology that maps the syntactic structure onto a structured sequence of prosodic units like phonological words, also what Chomsky calls a language’s “externalization” in the conceptual system – in this case, the meanings of words.


In important ways, words are like human faces to human speakers.  We have internalized a grammar of faces that allows us to recognize actual and potential faces.  We store this grammar, at least partially, in what is called the Fusiform Face Area.  Recognizing faces (as faces) involves obligatory decomposition into the elements of a face (like the eyes, ears, and nose) whose grammatical combinations the face grammar describes.  For faces, we don’t call the faces of people that we haven’t seen “potential” or “pseudo” faces – they’re just faces, and the faces of people that we have encountered (and can recall as belonging to people we’ve seen or met) we call “familiar” faces.  For words, I propose we adopt the same nomenclature – words and potential words should just be “words,” while words for which we push the “yes” button in Lexical Decision experiments should be called “familiar words.”


Note that, for written words, there’s an even greater parallel between words and faces.  Our orthographic grammar, describing the orthotactics of the language, generates and thus stores all the orthographic forms of the words of the language.  From neuroscientific studies, we know that the orthographic grammar – and thus the orthographic forms of words – is (at least partially) stored in an area of the brain adjacent to the Fusiform Face Area (called the “Visual Word Form Area”), and the recognition of words follows a processing stream and time frame parallel to those for the recognition of faces.  One can speculate (as I will in a future post) that the phonological grammar, and thus the phonological forms of words (really morphemes, of course), lives in secondary auditory cortices on the superior temporal lobe, where auditory word recognition is parallel to the recognition of faces and visual word forms, with the interesting complication that the recognition process plays out over time, as the word is pronounced.


[To be continued…]


Distributed Morphology Basics: Part Two

Recall that Morris Halle had proposed to treat the phonological form of “abstract morphemes” (those with conditioned allomorphs) as (abstract) “Q.”  Q would be replaced in the phonology by the actual phonological realizations of the morpheme, in some context.  For DM, Morris and I assumed that, in place of Q, all morphemes had no phonological form at all in the syntax.  Nevertheless, the insertion of phonological forms via “Vocabulary Insertion” was seen as a phonological process, subject to whatever principles we (thought we) knew were applicable in the phonology.

We imagined that a language contained a set of Vocabulary Insertion rules (VIn rules) that were rather like phonological rules in the Sound Pattern of English sense.  When the grammar was ready to insert a Vocabulary Item (VIt) into a morpheme from the syntax, all the VIn rules would compete for use.  The VIn rule specifying the largest subset of the features on the morpheme would win the competition.  If two (or more) rules specified the same subset of morphosyntactic features, these rules would compete on the basis of any contextual features they had that restricted the environment of the morpheme into which the VIt could be inserted.  For example, in the competition for insertion into a number morpheme with the feature [+plural], the VIn rule [+plural] ↔ /-z/ would tie with the VIn rule [+plural] ↔ Ø in terms of the subset of features that the rules spell out.  The latter rule would have as its context the list of stems that take the zero plural (deer, fish, etc.) and thus would win the competition with the /-z/ rule when the [+plural] feature was on a number node sister to, say, deer.  The /-z/ rule, having no contextual features, would be the default plural rule, used for [+plural] when the stem appeared on no lists associated with the contexts for special VIn plural rules.
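
Here is a minimal sketch of that competition – a toy implementation of the logic just described, not Halle & Marantz’s actual formalism:

```python
# A sketch of the subset principle with contextual tie-breaking.
# Each VIn rule spells out a subset of the morpheme's features; ties
# between rules spelling out the same features are broken by
# contextual restrictions (here, a list of stems).

VIN_RULES = [
    # (features spelled out, phonology, stems the rule is restricted to)
    ({"plural"}, "/-z/", None),              # default plural
    ({"plural"}, "Ø", {"deer", "fish"}),     # zero plural, listed stems
]

def insert(features, stem):
    def applies(rule):
        spelled, _, context = rule
        return spelled <= features and (context is None or stem in context)
    candidates = [r for r in VIN_RULES if applies(r)]
    # Most features spelled out wins; a contextual rule beats a default.
    best = max(candidates, key=lambda r: (len(r[0]), r[2] is not None))
    return best[1]

print(insert({"plural"}, "deer"))  # Ø: the listed-stem rule wins the tie
print(insert({"plural"}, "cat"))   # /-z/: the default applies elsewhere
```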

The “subset principle” here was supposed to be a version of a Pāṇinian elsewhere condition of the sort that governed the application of phonological rules.  The nesting of features between VIn rules (say, a rule spelling out +plural and +masc nesting a rule spelling out just +plural) was parallel to the rule abbreviation conventions of phonology (in the 1960s at least) and captured the idea that the most specified VIt that fit into a morpheme would be inserted (blocking the insertion of any VIt whose features nested within the more specified VIt’s).  Anderson’s principles for the complementary distribution of the morphophonological rules realizing features in A-Morphous Morphology included a similar approach, again borrowed from phonology.

The overriding principle governing the complementary distribution of VIts was the one morpheme, one VIt principle.  That is, if a morpheme had features A and B, and there were separate VIn rules spelling out A and B, only one VIt could be inserted.  Such a situation, however, would not be covered by the subset principle, since A and B aren’t subsets of each other.  To mediate competition in such cases, additional principles were necessary.  Halle & Marantz propose stipulated ordering of VIn rules in such cases (so the one spelling out A might be ordered before the one spelling out B and thus bleed it – if the one spelling out A, however, had a contextual feature not met in a particular word, then B’s exponent would show up).  Other approaches to VIn rule ordering were explored, including a universal hierarchy of features (see Noyer (1992), e.g., for early work on this idea).

But if there is just one VIt per morpheme, the “bundling” of features in single morphemes becomes crucial for explaining the distribution of VIts in a word.  Halle & Marantz here supposed that “bundling” could be universal – e.g., if, say, Agreement in natural languages involved Agreement morphemes with a particular set of “phi-features” for person, number, and gender, then person, number, and gender would be bundled universally into a single morpheme.  Or bundling could be language specific, with each language determining how to package the universally available features into morphemes.  A “bundling” parameter, for example, was explored in Pylkkänen (2008), where languages were claimed to differ depending on whether or not they bundled the features of voice and those of v into a single morpheme, with consequences both for syntax and for morphophonology.

So, DM had bundles of features in terminal nodes of the syntax formed prior to the use of these bundles in the syntax, a hierarchical organization of these terminal nodes (= morphemes) produced by the syntax, and a principle of Vocabulary Insertion via VIn rules that provided, in the morphophonology, a single VIt for each morpheme from the syntax.  An additional set of assumptions was necessary to account for word formation – how syntactically distributed morphemes end up in a sequence of phonological words.  Marantz (1984) included an elaborate theory of “morphological merger” – the process by which morphemes were put together into words.  DM did not adopt the analyses of Marantz (1984), instead buying into what was turning into a consensus account of syntactic word formation:  a morpheme that headed a lower phrase raised and adjoined to the head of the phrase that took its maximal projection as its complement (if XP were the complement of Y, X could raise and adjoin to Y).  Halle & Marantz more or less presuppose the viability of this analysis, despite the fact that mainstream generative theory at the time, influenced by Chomsky, included a lexicalist assumption that words were the basic units of syntax, constructed in the lexicon (see the final section of Halle & Marantz (1993) for a comparison with Chomsky’s approach).  In addition to head movement (and adjunction), Halle & Marantz supposed that some morphemes could be inserted (and adjoined to morphemes already in the syntactic tree) after the syntax proper – e.g., agreement and perhaps case morphemes.  And cliticization (of the sort found in English possessive constructions like “the queen of England’s hat”) was assumed to involve yet another post-syntactic process, (morphological) merger under adjacency, adjoining two morphemes that were adjacent in the morphophonology.

Bundling, head movement, morphological merger, and the one morpheme, one VIt principle served as the scaffolding for a piece-based realizational approach to morphology and phonology, contrasting with, e.g., Lieber-style (1992) lexicalist theories, with their phonology-laden morphemes, and Anderson-style realizational theories, without Vocabulary Items as pieces.  However, empirical issues required additional mechanisms for early DM – one central to the theory and two place-holders for a better theory to come.  The subset principle of Vocabulary Insertion, along with assumptions about how context resolves ties between VIn rules that spell out the same features, leads to this generalization about contextual allomorphy:  the more specific VIts go in the more specified environments, while the more general VIts go in the less specified environments (are relative defaults).  However, against this generalization, there seemed to be situations in which a more general VIt is inserted in a more specific environment.  Bonet (1991) provided a set of cases of this sort from Romance pronominal clitics and suggested that a principle of Impoverishment could explain the facts.  For example, the Spanish dative clitic le (for third person masculine nouns) occurs in most environments with other clitics.  However, in some dialects, before the third person accusative lo, le surfaces as se, apparently the third person reflexive clitic.  Bonet argues that se is actually a default (third person) clitic, and its distribution motivates the deletion of features from the 3rd person dative clitic in the environment of the 3rd person accusative clitic before Vocabulary Insertion.  This Impoverishment of features causes the insertion of a more general clitic, se, in a more specified environment (before lo) over le, which is the default third person dative clitic.  An understudied claim of Marantz’s (e.g., 2010) is that the locality constraints on the relationship between the trigger of Impoverishment (in our example, the accusative clitic) and the target of Impoverishment (here the dative clitic) are looser than those between the locus of VIn and any environment that might trigger a contextual allomorph to be inserted.  In Halle & Marantz, the analysis of Potawatomi involved a longish-distance relationship between an Impoverishment trigger and its target, a relationship that was too distant to have triggered contextual allomorphy at the target.  If Marantz’s observation is correct, Impoverishment and contextual allomorphy would properly be separated in the theory, as in standard DM.
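
Schematically, Impoverishment works as in the following toy sketch of the le/se pattern (invented feature labels, not Bonet’s actual analysis): features are deleted from the target in the triggering environment, and the subsequent competition can then only select the default VIt.

```python
# A toy rendering of Bonet-style Impoverishment: in the environment of
# a 3rd-person accusative clitic, the dative clitic's features are
# deleted before Vocabulary Insertion, so only the default "se" fits.

CLITIC_VITS = [
    ({"3rd", "dat"}, "le"),   # 3rd-person dative clitic
    ({"3rd", "acc"}, "lo"),   # 3rd-person accusative clitic
    (set(),          "se"),   # default clitic: no features required
]

def impoverish(features, environment):
    if {"3rd", "acc"} <= environment:
        return set()          # strip the features of the dative target
    return features

def insert_clitic(features):
    applicable = [(f, p) for f, p in CLITIC_VITS if f <= features]
    return max(applicable, key=lambda fp: len(fp[0]))[1]  # subset principle

dat = {"3rd", "dat"}
print(insert_clitic(impoverish(dat, set())))           # le (no trigger)
print(insert_clitic(impoverish(dat, {"3rd", "acc"})))  # se (before lo)
```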

Given the one morpheme, one VIt principle, certain patterns of morpheme distribution become difficult to describe.  For example, data motivating portmanteau morphemes (where a single VIt looks as if it is spelling out two morphemes) and circumfixes (where two VIts look to be spelling out a single morpheme, on opposite sides of a stem) challenge straightforward accounts of Vocabulary Insertion under one morpheme, one VIt.  Halle & Marantz endorse two brute-force operations to provide rather standard accounts of these phenomena.  In Fusion, two morphemes join into one before Vocabulary Insertion, allowing the features of both to contribute to the choice of the VIt for this fused morpheme and predicting complementary distribution between a portmanteau VIt and any other VIts that spell out only the features of one or of the other of the pre-fused morphemes.  For Georgian, we supposed that the subject and object agreement morphemes fused prior to Vocabulary Insertion, explaining the fact that only a single prefix, reflecting the person and number of the subject or the object, occurs with any verb, even though prefixes exist in the language to separately spell out subject and object agreement.

In Fission, some of the features of a morpheme are split off from the morpheme into a separate terminal node, allowing for a VIt to be inserted into the original morpheme and an additional VIt to be inserted into the new terminal node.  For Georgian, we accounted for the appearance of number suffixes in the same verbs as agreement prefixes by Fissioning off the number feature when it occurred with a certain set of other features in the Fused subject and object agreement morpheme.  (See Blix (to appear) for a recent non-Fusion, non-Fission analysis of the Georgian data within a realizational morphology.)
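
Schematically, both operations are pre-insertion manipulations of feature bundles, as in this minimal sketch (invented features, not an analysis of Georgian):

```python
# A schematic sketch of Fusion and Fission as operations on terminal
# nodes, modeled as feature sets; the features are invented.

def fuse(node_a, node_b):
    """Fusion: two morphemes become one, pooling their features, so a
    single portmanteau VIt can spell out both."""
    return node_a | node_b

def fission(node, split_off):
    """Fission: split some features off into a new terminal node, so
    each resulting node receives its own VIt."""
    return node - split_off, node & split_off

subj_agr = {"1st", "subj"}
obj_agr = {"2nd", "obj"}
print(fuse(subj_agr, obj_agr))  # one node -> one agreement prefix
print(fission({"2nd", "obj", "plural"}, {"plural"}))
# ({'2nd', 'obj'}, {'plural'}): agreement prefix plus number suffix
```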

After the appearance of Halle & Marantz, Jochen Trommer (e.g., 1999) pointed out that if we abandon the one morpheme, one VIt principle, allowing multiple VIn into a single morpheme, we could derive (at least most of) the same forms as the H & M version of DM does without Impoverishment, Fusion, or Fission.  Impoverishment would be replaced by “consuming zeros”:  phonologically null VIts that eat up features in the environment of other morphemes, effectively Impoverishing them prior to VIn of a phonologically contentful VIt.  Fission would simply involve multiple VIn into a single morpheme, and Fusion could be replaced by contextual allomorphy at one of the apparently Fusing morphemes in the context of the other, followed by either non-insertion of a VIt into the other morpheme or insertion of a consuming zero VIt.
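
A toy version of Trommer’s alternative might look as follows (my sketch, with invented exponents, not Trommer’s formalism): VIts apply in turn to a single morpheme, each consuming the features it spells out, so two insertions do the work of Fission and a “consuming zero” does the work of Impoverishment.

```python
# Multiple insertion into one morpheme, with feature consumption.
# All exponents and features below are invented for illustration.

VITS = [
    ({"2nd", "obj"}, "g-"),   # hypothetical agreement prefix
    ({"plural"}, "-t"),       # hypothetical number suffix
    ({"dat"}, ""),            # a "consuming zero": eats [dat] silently
]

def realize(features):
    exponents = []
    for spelled, phon in VITS:
        if spelled <= features:
            features = features - spelled  # consumed features are gone
            if phon:
                exponents.append(phon)
    return exponents

print(realize({"2nd", "obj", "plural"}))  # ['g-', '-t']: "Fission" as two VIts
print(realize({"2nd", "obj", "dat"}))     # ['g-']: [dat] silently consumed
```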

The question of which of these mechanisms to retain in the theory is not simply a matter of parsimony or redundancy.  Before we abandon, say, Fusion, we need to ask whether or not there is a theory of Fusion that makes different predictions than DM with, say, Trommer’s changes.  Of particular interest are the locality domains for the operations, specifically the relation between targets and triggers, as well as any interactions among operations that result from their ordering.  For example, we have already seen that the trigger of Impoverishment may be at a longer distance from its target than the environment for Vocabulary Insertion may be from the morpheme at which we’re inserting the VIt.  If this is correct, it argues against Trommer’s collapsing of Impoverishment with VIn (of a consuming zero).  Similarly, Matthew Hewett has recently argued for the autonomy of Fission as an operation within DM, based on the interaction of Fission with other processes (Hewett (2020)).


Blix, H. (to appear). Spans in South Caucasian agreement. Natural Language & Linguistic Theory. https://doi.org/10.1007/s11049-020-09475-x

Bonet, E. (1991). Morphology after syntax: Pronominal clitics in Romance. Doctoral dissertation, MIT.

Halle, M., & Marantz, A. (1993). Distributed morphology and the pieces of inflection. In Hale, K., & Keyser, S. J. (eds.), The View from Building 20. Cambridge, MA: MIT Press.

Hewett, M.  (2020). On the autonomy of Fission: Evidence from discontinuous agreement in Semitic.  NYU MorphBeer handout.

Lieber, R. (1992). Deconstructing morphology: Word formation in syntactic theory. University of Chicago Press.

Marantz, A. (1984). On the Nature of Grammatical Relations. Linguistic Inquiry Monograph 10. Cambridge, MA: MIT Press.

Marantz, A. (2010). Locality domains for contextual allosemy in words. Handout of a talk given at the University of California, Santa Cruz.

Noyer, R. R. (1992). Features, positions and affixes in autonomous morphological structure. Doctoral dissertation, MIT.

Trommer, J. (1999). Morphology consuming syntax’ resources: Generation and parsing in a minimalist version of Distributed Morphology. In Proceedings of the ESSLLI Workshop on Resource Logics and Minimalist Grammars, 469-480.
