Part One: No Escape from Late Insertion

In an interesting proposal about the connection between Morphology and Syntax, Collins and Kayne (“Towards a theory of morphology as syntax.” Ms., NYU (2020)) outline a sign-based theory of Morphology, one that lacks Late Insertion. That is, Collins and Kayne (henceforth MAS, for “Morphology as Syntax”) propose that morphemes are signs: connections between phonology and formal features, where the latter would serve as input to semantic interpretation. The formal features of a morpheme determine its behavior in the syntax. They further propose, along with NanoSyntax, that each morpheme carries a single formal feature. By denying Late Insertion, they are claiming that morphemes are not “inserted” into a node that bears their formal features and that has previously been merged into a hierarchical syntactic structure. Rather, the morphemes carry their formal features into the syntax when they merge into a structure, providing them to the (phrasal) constituent that consists of the morpheme and whatever constituent the morpheme merges with.

From the moment that linguists started thinking about questions concerning the connections between features of constituents and their distributions, they found that the ordering and structuring of elements and constituents in the syntax depended on the categories of these elements, not the specific items. Thus, phrase structure rules traffic in category labels. For example, noun phrases (or DPs, or some such) appear in subject and object positions; in general, the distribution of nominal phrases like “John” or “the ball” is determined by their identification as noun phrases, not their particular lexical content. Similarly, within a language, the organization of morphological heads is associated with what Collins and Kayne call their formal features (like “tense”), not with the lexical items themselves. In fact, Collins and Kayne assume that the hierarchical positioning of morphemes is governed by something like Cinque hierarchies, i.e., hierarchies of formal features that reflect cross-linguistic hierarchical ordering regularities. The literature has recently been calling such hierarchies f-seqs, for a sequence of functional categories (in theories that adopt some version of Kayne’s Linear Correspondence Axiom, a linear sequence also completely determines a hierarchical structure, where left in the sequence = higher in a tree). Tense might be higher than aspect in such f-seqs/hierarchies, for example.
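
To make the parenthetical point about f-seqs concrete, here is a minimal Python sketch of how a linear sequence can fix a hierarchy when left in the sequence means higher in the tree. The function name fseq_to_tree, the nested-pair representation, and the toy feature labels are hypothetical choices for illustration only; nothing here is taken from Collins and Kayne or from Cinque.

```python
def fseq_to_tree(fseq, complement):
    """Build a nested (head, complement) structure from an f-seq.

    Assumption for illustration: the leftmost feature in the sequence
    ends up highest in the resulting right-branching structure.
    """
    tree = complement
    # Merge from the bottom up: the rightmost feature merges first.
    for feature in reversed(fseq):
        tree = (feature, tree)
    return tree


# Toy f-seq in which tense is higher than aspect.
tree = fseq_to_tree(["tense", "aspect"], "V")
print(tree)  # ('tense', ('aspect', 'V'))
```

The sequence alone determines the structure here: no information about particular morphemes is consulted, only the order of the feature labels.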

But if the hierarchical organization of morphemes is determined by their formal features in a theory, then that theory is endorsing “late insertion,” i.e., the independence of the syntactic organization of morphemes from anything but their formal features. Technically, let’s break down this issue into two possible theoretical claims; the examples in Collins and Kayne’s work suggest that they are endorsing the second claim, which is more obviously a late insertion approach, but perhaps they really endorse the first one. The first possible claim is that there is only one morpheme in each language for each formal feature; that is, there is no contextual allomorphy, no choice of morphemes for expression of a formal feature that depends on the context of the morpheme (with respect to other morphemes). In their analysis of irregular plurals like “oxen,” C and K argue that -en and the regular plural -s actually express different formal features that occupy different positions in the f-seq (the universal hierarchy of formal features) of nominal features. This analysis predicts “oxens” as the plural of “ox,” since items in different positions can’t be in complementary distribution, and C and K propose a terribly unconvincing account of why we don’t say “oxens” in English (see note 1). But more crucially, they assume that the morpheme -en includes selectional features that limit its merger to certain roots/stems. This suggests that there are multiple morphemes in English for the inner plural formal feature, with contextual allomorphy; most stems “take” the zero allomorph of the inner plural (or the zero allomorph selects a set of stems that includes the majority of English nominal roots).

This is the second possible claim about the way C and K’s type of morphemes might interact with the principles that determine the hierarchical structure of formal features: that the features are ordered by the syntax, somehow invoking a Cinque hierarchy of f-features, but that the particular morpheme that instantiates an f-feature is determined by selectional features. But now we’ve recreated the approach of Distributed Morphology, at least for the core property of Late Insertion. The syntax organizes the (abstract) morphemes by category, then the morphophonology inserts particular vocabulary items that instantiate the features of the categories, respecting selectional requirements. The main difference between DM and MAS on this view, then, would be the assumption of one feature per terminal node in MAS, whereas DM allows a bundle of features at each terminal node in the syntax.
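
The two-step logic just described can be sketched in a few lines of Python. This is a deliberately simplified toy, not C and K’s analysis or any published DM implementation: it assumes a single plural feature (rather than the two plural positions C and K propose), and the names VOCABULARY, insert_exponent, and realize, along with the tiny vocabulary list, are hypothetical.

```python
# Hypothetical vocabulary items: (feature, exponent, selected stems).
# None in the third slot marks the elsewhere (default) item.
VOCABULARY = [
    ("plural", "en", {"ox"}),
    ("plural", "s", None),
]


def insert_exponent(feature, stem):
    """Return the first exponent whose feature matches and whose
    selectional requirement (if any) is met by the stem."""
    for item_feature, exponent, selected in VOCABULARY:
        if item_feature == feature and (selected is None or stem in selected):
            return exponent
    raise LookupError(f"no vocabulary item for {feature!r}")


def realize(stem, features):
    """Step 1 (assumed already done by the syntax): `features` is ordered
    without reference to any exponent. Step 2: insert exponents."""
    return "-".join([stem] + [insert_exponent(f, stem) for f in features])


print(realize("ox", ["plural"]))   # ox-en
print(realize("cat", ["plural"]))  # cat-s
```

The point of the sketch is only that the ordering of features and the choice of exponent are computed separately, which is the core of Late Insertion.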

It is possible to organize morphology (and syntax) around sign-morphemes (connections between f-features and phonology) without Late Insertion. This describes the grammatical theory in Lieber’s Deconstructing Morphology (1992) (and elsewhere). I will expand a bit in a later post on how Lieber’s system is templatic and inspired by (true) X-bar syntax. But for present purposes, it’s sufficient to point out the basics. Each morpheme has three essential syntactic features, in addition to its phonological form. First, it indicates what category of constituent/phrase it may merge with; this is a selectional feature (formally a subcategorization feature, since the selectional features form subcategories of morphemes “categorized” by the second type of feature). Second, it indicates what category of constituent/phrase it creates (what the label is of the phrase/word it heads). And, finally, it includes a set of features that it adds to the constituent that it’s merging with. The categories are familiar: e.g., N, V, A. And the categories include levels, as in X-bar theory, so a morpheme may attach to N level zero and create N level 1. Crucially, for the most part the categories are distinct from the features carried by the morphemes. For the N category, these features might include person, gender, number, case, definiteness, etc. The plural /z/ in English, then, might select for category N zero, create a category N-bar, and add a +plural feature to the “categorial signature” of the N zero to which it attaches.
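
The three-part morpheme entry just described can be rendered as a small data structure. The following Python sketch is my own rough illustration, not Lieber’s formalism: the class names Constituent and Morpheme, the field names, the string concatenation standing in for phonology, and the sample stem “dog” are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Constituent:
    form: str
    category: str                      # e.g. "N", "V", "A"
    level: int                         # X-bar level: 0, 1, ...
    features: frozenset = frozenset()  # the "categorial signature"


@dataclass(frozen=True)
class Morpheme:
    phonology: str
    selects: tuple    # (category, level) the morpheme may merge with
    creates: tuple    # (category, level) of the constituent it heads
    adds: frozenset   # features contributed to the result

    def merge(self, base):
        # The subcategorization feature checks category and level only.
        if (base.category, base.level) != self.selects:
            raise ValueError(f"/{self.phonology}/ cannot attach to {base}")
        category, level = self.creates
        return Constituent(base.form + self.phonology, category, level,
                           base.features | self.adds)


# English plural /z/: selects N zero, creates N-bar, adds +plural.
PLURAL_Z = Morpheme("z", ("N", 0), ("N", 1), frozenset({"+plural"}))
dog = Constituent("dog", "N", 0)
dogs = PLURAL_Z.merge(dog)
print(dogs.form, dogs.level, sorted(dogs.features))  # dogz 1 ['+plural']
```

Note that the selects and creates slots mention only category and level, while the adds slot holds the features; in this sketch nothing ties the two together, so the pairing of +plural with the N zero to N-bar step is simply listed in the entry.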

For the Lieber theory, and for templatic theories in general, there is no explanatory connection between the location of morphemes carrying f-features in the “extended projection” of a lexical category like N or V and the f-features themselves. Why a morpheme carrying a plural feature should attach to an N zero and create an N-bar, blocking out any other N zero-attaching morpheme, is a stipulation. The organization of the morphemes is not specified in the syntax by the f-features, since the syntax of the hierarchical structure of morphemes cares about the categories, not these features, and the morphemes are not, essentially, “of” a category. They produce a category via merger, but that’s independent of the nature of the features they carry, in principle.

As soon as you have, like MAS, a system in which the syntax organizes morphemes via the f-features, constrained by selectional features, you have a system with Late Insertion in the Distributed Morphology sense. Again, as we will explore in future posts, the alternatives to Late Insertion are templatic theories of morphology (and syntax), but these deny the central insight behind Cinque hierarchies and generalizations about f-sequences in the “extended projection” of lexical items. A templatic system, like Lieber’s, does not claim that the distribution of constituents is determined by their categories/features.

The one-feature-per-morpheme assumption shared by C & K and by NanoSyntax runs into an explanatory problem that DM at least sidesteps by allowing bundles of features under a terminal node in the syntax. Consider the way in which unary (non-binary) features in a feature hierarchy both yield categories of, say, gender and number, and capture the markedness relationship among the genders and numbers. Suppose that we use the first gender feature for masculine nouns, which are the least marked (and perhaps default) gender in some language. An additional gender feature hierarchically above the masc feature might give us neuter gender, in a three-way gender system in which masculine and neuter share exponents of case and number (as in Slavic). Finally, a third gender feature on top of the other two would yield feminine gender. Within a subsequence of an f-seq for the “extended projection” of a noun, one wouldn’t need to label these gender features; their values come from the markedness hierarchy within the gender region. It’s the sub-f-seqs within the gender region that have values: a single feature is masculine, two features is neuter, and three features is feminine.

Similarly, suppose we have a number system in the same language with three values, singular, dual and plural. Singular would be least marked, with a single feature in the number sub-f-seq, plural might be more marked, with two features, and dual might be the most marked, with three features. Again, the features themselves would not have values; the values come from the number of features within the number sub-f-sequence.

But now we can see clearly that the specific morphemes (for C & K) or features (for NanoSyntax) are not themselves ordered within an f-seq. Rather, it’s the fields of features, here gender and number, that are ordered. The features within each field are “ordered” in a sense, but really, for the syntax, a language would just specify how many features it allows within each field – it’s up to the phonology and the semantics to “interpret” the features in terms of number and gender, and to do so just based on the number of features in a field.
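
The counting logic of the last three paragraphs can be summarized in a short Python sketch. It assumes exactly the hypothetical language described above (masculine < neuter < feminine; singular < plural < dual); the names INTERPRETATION and interpret are mine, introduced only for illustration.

```python
# Values are read off the NUMBER of unlabeled features in each field,
# following the hypothetical markedness hierarchies described above.
INTERPRETATION = {
    "gender": {1: "masculine", 2: "neuter", 3: "feminine"},
    "number": {1: "singular", 2: "plural", 3: "dual"},
}


def interpret(field_counts):
    """Map each field's feature count to its value.

    `field_counts` records only how many features the syntax supplied
    in each field; the features themselves carry no labels.
    """
    return {field: INTERPRETATION[field][count]
            for field, count in field_counts.items()}


print(interpret({"gender": 2, "number": 3}))
# {'gender': 'neuter', 'number': 'dual'}
```

All the syntax contributes in this sketch is the field name and a count, which is the sense in which it is the fields, and not the individual features, that are ordered.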

We’ve seen that “late insertion” doesn’t distinguish among DM, C & K, and NanoSyntax, and now we can see that ordering based on classes of features, rather than individual features, doesn’t distinguish among these theories either. All these theories require the equivalent of phrase structure rules to correctly distribute fields of features within hierarchical syntactic structures, followed by (or parallel with) principles of vocabulary insertion that realize the features phonologically. The adoption of a sign-based theory of morphemes along with the assumption of a single formal feature per morpheme seems to make the principle of Vocabulary Insertion very simple for C & K. However, complications arise immediately, essentially concerning the distribution of zero morphemes. Consider what they need to rule out “oxens,” for example. NanoSyntax explores a rather more complicated theory of Vocabulary Insertion, but it would be fair to say that, unlike DM and C & K, NanoSyntacticians spend little effort showing how NanoSyntax interacts with SyntaxSyntax (becoming at present more of a theory of NanoMorphology).

Missing from all three approaches is a theory of phrase-structure; that is, a theory of how to exploit the generalizations expressed in Cinque-style f-sequences to generate syntactic structures that conform to them. I’ll write more about this problem in a future post.
___________

1. They appeal to a generalization about “blocking” of multiple exponence across positions that Halle & Marantz (1993, “Distributed morphology and the pieces of inflection”) debunked in their discussion of Anderson’s blocking principles in A-Morphous Morphology. In any case, to split plural into two plural heads across the f-seq is to claim they don’t have the same interpretation and thus shouldn’t trigger any kind of blocking (and a single f-feature per morpheme makes it difficult to claim that the plurals are “similar,” since similarity here would imply feature decomposition to allow the two plurals to share a feature).