
Reducing Lexical Categories to Two Syntactic Heads: Implications for Causative Alternations

The following is a précis of a talk I’ll be delivering remotely in Japan in November.  Questions are invited.

 

The problem:  Although the most common so-called “lexical causatives” take unaccusative verbs like “open” as their base, many languages allow lexical causatives of unergative verbs.  These are identified as “lexical” rather than syntactic because they (generally) exhibit the following properties:  an overt affix on the verb (rather than a periphrastic construction), a lack of complete productivity, a monoclausal syntax, and a failure of the causee to pass tests for agentivity that the corresponding subject of the base unergative verb would pass.  Although the subjects of unergatives pattern with the subjects of transitive verbs in these languages, the causatives of unergatives are syntactically identical to the causatives of unaccusative verbs, and the causee of an unergative cannot be expressed in the manner of the causee of a transitive verb when transitives serve as the source for a causative.  In fact, in some languages transitives generally resist the lexical causative while unergatives do not.  Here we see some data from Georgian, from Nash (2020), that illustrate the pattern of interest.  (1a) is the causative of an unaccusative, (1b) the causative of an unergative, and (1c) the causative of a transitive.  In the aorist, the subject surfaces in the ergative case and the direct object in the nominative.

 

(1)  a.  keti-m msxal-i      ga=a-xm-o

       Keti-ERG pear-NOM prev=VAM-dry-AOR.3sg

‘Keti dried the pear.’

 

        b.  keti-m gogo          a-varjiš-a

       Keti-ERG girl.NOM VAM-exercise-AOR.3sg

‘Keti made the girl exercise.’

 

       c.  keti-m gogo-s otax-i da=a-lag-eb-in-a

       Keti-ERG girl-DAT room-NOM prev=CAUS-tidy-TS-CAUS-AOR.3sg

‘Keti made the girl tidy up the room.’

 

This problematic behavior of unergatives under lexical causativization has been noted in recent years, manifesting in a variety of ways, for a number of languages, including Samoan (Tollan 2018), Kipsigis (Kouneli 2021), Quechua (Myler 2022), Plains Cree (Tollan & Oxford 2018), and Malayalam (Krishnan and Sarma, submitted).  The data were already well described in Relational Grammar (see, e.g., Harris 1981 on Georgian), where the behavior of unergatives fell under the general rule of Causative Clause Union.  The intuition behind the mechanics of Causative Clause Union is straightforward.  When a causative matrix clause (X causes Y) collapses with the embedded clause (Y), the structure may contain multiple competitors for subjecthood and objecthood of the combined clause.  The causer gets precedence for subject of the causative, leaving a possible embedded subject without a grammatical relation.  Unaccusative subjects, which are underlying objects, pattern with transitive objects and have privileged call on the object relation.  One might expect unergative subjects to pattern with transitive subjects, which generally become indirect objects or optional oblique arguments, but the statement of Clause Union allows unergative subjects to acquire the object relation, if this is not already borne by an argument of the embedded clause.

 

Although Causative Clause Union describes the facts in, e.g., Georgian, this “solution” was a stipulation of the pattern, rather than an explanation of it.  For syntacticians in the Chomskyan tradition, the problem is how to keep the external argument of an unergative from sharing the fate of the external argument of a transitive, when embedded under a causative head.  A popular approach postulates that the unergative external argument occupies a special subject position as an extra or high specifier of vP, rather than a specifier of voiceP, where transitive subjects sit (see e.g. Kouneli 2021).  While this puts the unergative subject in a relation with v that might resemble that of an unaccusative theme, in contrast to a transitive subject, it does so by breaking the theory.  Recent developments have led to a constrained and coherent theory of the syntax and semantics of v and voice (see e.g. Wood and Marantz 2017), and the extra subject position does not work even as an extension of this theory.  Rather, the proposal just seems a restatement of the problem: unergative subjects behave like transitive subjects outside the causative construction but like unaccusative subjects inside.

 

In this paper, I will describe an extension (really a simplification) of the theory of functional categories in Wood and Marantz that seems to provide an explanation of the behavior of unergatives in lexical causatives.  This approach will involve adopting and adapting the proposals of Legate (2014) concerning the connection between the semantic notions associated with external argumenthood (like “causer” and “agent”) and the syntactic positions into which bearers of external argument-type roles are introduced.

 

The Framework:  Wood and Marantz propose reducing the functional categories that introduce DP arguments, like voice, poss(essor), appl, and P, to a single functional head i*.  This head introduces a DP into the syntactic tree structure and relates it to another phrase.  For appl and voice, the phrase is a complement to the i*, with the introduced DP (the applied or external argument) as the specifier of i*.  For a P, the directionality is reversed, and the introduced DP is the complement to the i*, with the related phrase Merged second.  The “lexical” categorizing heads n, v, and a fell outside the system, and Wood & Marantz did not propose any analysis that related these heads to each other or to the properties of i*.  This implied that the set of items that could introduce DPs – v and i* – had nothing essential in common (P was reduced to i*).

 

Following work by Armoskaite (2011), Shushurin (2021), and Newman (2021), Marantz (2022) proposed a further reduction of the system.  Verbs were analyzed as a structure with two i* heads, one above the other.  The lower i* head introduces the direct object (if there is one), with voice (also i*) taking this i*P with its direct object and introducing the external argument in its spec position.  (Intransitive verbs would involve an i* head that introduced no DP.)  i* itself is identified as a transitivity head, with values for transitivity indicating that it does or does not introduce a DP or is ambivalent about introducing a DP (+Trans, -Trans, øTrans).  This trivalent Trans feature recapitulates the trivalent voice of Kastner (2020) and others:  since voice and v now are the same feature, Trans, we expect them to exhibit the same range of values.  What we call a “verb,” then, is a transitivity head over a transitivity head.  The lower half of a verb is identical to the structure of a Preposition, with the difference between verbs and prepositions mostly a matter of their extended projections (if you project Aspect and Tense, you’re a verb, e.g.), as related to or conditioned by the root that adjoins to the i*.  Little n is a gender feature and little a – for “nominal” type adjectives of the Indo-European type, not adjectives of the Japanese type that behave like a special class of intransitive verbs – is an unvalued gender feature that will receive gender from a (modified) noun (the other, verbal, type of adjectives are treated like intransitive verbs syntactically, as far as this reduction to a minimal set of category features is concerned).
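To make the shape of this reduced system concrete, here is a minimal sketch in Python.  It is purely illustrative and mine, not anything proposed in Marantz (2022): a single head i* carries a trivalent Trans value, a “verb” is just one i* head stacked on another, and the lower half of a verb has the shape of a P.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Trans(Enum):
    PLUS = "+Trans"    # must introduce a DP
    MINUS = "-Trans"   # must not introduce a DP
    NULL = "øTrans"    # ambivalent about introducing a DP


@dataclass
class Node:
    label: str                      # "i*", "i*P", "DP", or a root like "√DRY"
    trans: Optional[Trans] = None   # only i* heads carry a Trans value
    children: List["Node"] = field(default_factory=list)


def istar(trans: Trans, complement: Node,
          specifier: Optional[Node] = None, root: Optional[str] = None) -> Node:
    """Build an i*P: the head (optionally with an adjoined root), its complement,
    and, if a DP is introduced there, that DP in the specifier."""
    head = Node("i*", trans=trans)
    if root:
        head.children.append(Node(root))     # the root adjoins to the i* head
    phrase = Node("i*P", children=[head, complement])
    if specifier:
        phrase.children.insert(0, specifier)
    return phrase


# A "verb" is a transitivity head over a transitivity head.
# Lexical causative of unaccusative "dry" (cf. Georgian (1a)): Keti dried the pear.
lower = istar(Trans.PLUS, complement=Node("DP: the pear"), root="√DRY")
causative = istar(Trans.PLUS, complement=lower, specifier=Node("DP: Keti"))

# The lower half of the verb has the shape of a P: an i* taking a DP complement
# (the P's related phrase is omitted here); the difference between verbs and
# prepositions then lies mostly in the extended projection built on top.
preposition = istar(Trans.PLUS, complement=Node("DP: the table"), root="√ON")
```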

 

In this theory, “voice” and “v” are the same head and are distinguished configurationally, not featurally.  Because there is only a single argument introducing head, the theory of the way that certain meanings – certain theta roles – are distributed in syntactic structure cannot rely on statements such as, “voice projects the external argument of the verb.”  Following the lead of, e.g., Legate (2014), the theory requires that we separate the event structure denoted by the roots of nouns, verbs and adjectives from the semantic interpretation of syntactic structure.  In Marantz (2022) I explain this separation, relying on Jim Wood’s (2022) analysis of complex event nominalizations in Icelandic.  The root of an unergative verb like “cry,” for example, will point to a crying event that includes a crier.  But that information in itself does not tell us where a crier will be projected in a sentence.  The distribution of event arguments depends on the interaction of event semantics with the rules of interpretation for syntactic structure.  In the lexical causative of unergative “cry,” with the root “cry” adjoined to a transitivity head below the transitivity head that will introduce the causer, the “crier” can be introduced as the complement to the lower transitivity head.  Here it will receive the structural interpretation of an entity that undergoes a caused change of state, but also the role of the “crier.”  The interpretation of X cry-CAUSE Y is that X made Y into a crier (from a non-crier), i.e., that X caused Y to begin to cry.  The caused change of state semantics is entirely a function of the syntactic structure, not the root, but the “crier” role of the root is compatible with the object position in such a structure.
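As a rough illustration of this division of labor, consider the following sketch (my own gloss in Python, not a formal semantics from the paper): the caused-change-of-state part of the meaning is read off the configuration of two +Trans heads, while the root contributes only the event type and its participant role.

```python
# Hypothetical roots and the participant roles their event semantics makes available.
ROOT_INFO = {
    "√CRY": {"event": "crying", "role": "crier"},
    "√EXERCISE": {"event": "exercising", "role": "exerciser"},
}


def interpret_lexical_causative(root: str, causer: str, causee: str) -> str:
    """Compose structural semantics with root semantics for X root-CAUSE Y."""
    info = ROOT_INFO[root]
    # From the structure alone: the complement of the lower +Trans head undergoes
    # a caused change of state; the specifier of the higher +Trans head is the causer.
    structural = f"{causer} causes {causee} to undergo a change of state"
    # From the root: the resulting state is that of being the root's participant.
    lexical = f"{causee} becomes a {info['role']}, i.e. {causee} begins {info['event']}"
    return structural + "; " + lexical


print(interpret_lexical_causative("√CRY", "X", "Y"))
# X causes Y to undergo a change of state; Y becomes a crier, i.e. Y begins crying
```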

 

In the context of a theory of derivational morphology, Marantz (2022) elaborates on what we expect from root morphemes in the morphophonology.  Specifically, functional heads seem happy with phonologically zero Vocabulary Items; perhaps the phonologically null realization is the universal default for such morphemes.  Roots, on the other hand, may be zero, but roots in general lack aggressive homophony.  One expects there to be only a very limited set of zero roots in any given language, just like one expects there to be only a very limited set of roots whose phonological form is /khæt/.  The implication of this theory is that when a head is generally overt in a language, it most likely is a root rather than a functional morpheme.  The persistent overt realization of the “causative” morpheme in causatives of unergatives cross-linguistically suggests that the morpheme is a spell-out of a “causative” or “agentive” root attached to a transitivity (voice/v) head, rather than a Vocabulary Item realizing voice or v itself.

 

The Proposal:  Lexical causatives – causatives that do not introduce an additional event modifiable by temporal adverbials and in which a causee does not pass tests for agentivity – contain no more (and no fewer) functional heads than other (“regular”) transitive verbs in a language:  essentially, a transitivity head with a +Trans feature and a lower transitivity head also with a +Trans feature.  The interpretation of causative constructions requires transitivity here.  The lower phrase is interpreted as a change-of-state predicate such that the causer, the external argument of the verb, causes the object to undergo a change of state.  This interpretation of the lexical causative is enforced by the semantics of a root that is adjoined to the higher transitivity head (the “voice”) and is often identified as the “causative” morpheme.

 

In my conference talk, I’ll review why this analysis correctly limits the appearance of unergative “agents” in the direct object position to transitive causative constructions, in contrast to the themes of unaccusative verbs, which may appear as direct objects in intransitive sentences.  We’ll also review why unergative causees may not, in general, be expressed in the manner of the causees of transitive verbs under causativization.

 

Of particular interest are the features that might be associated with the overt “causative” morpheme across languages that allow lexical causatives of unergative verbs.  Following the work of Nash (2020) and of Krishnan and Sarma (2022), I will explore the morphology of Georgian and of Malayalam with the goal of explaining how the distribution of overt morphemes in these languages supports the claim that “causative” morphology isn’t spelling out a causative head or a voice head in most cases.  In Malayalam in particular, the distribution of overt morphology is telling.  When an unergative is not overtly marked, its overtly marked lexical causative behaves as explained above, in parallel to the causatives of unaccusatives.  However, overtly marked unergatives pattern with transitives under causativization.  We will see that this behavior follows from the role of the overt morphology in controlling the expression of the external argument of an unergative root.

 

References

 

Armoskaite, S. (2011). The destiny of roots in Blackfoot and Lithuanian (Doctoral dissertation, University of British Columbia).

 

Harris, A. C. (1981). Georgian syntax: A study in relational grammar. Cambridge Studies in Linguistics 33. Cambridge University Press.

 

Kastner, I. (2020). Voice at the interfaces: The syntax, semantics, and morphology of the Hebrew verb. Language Science Press.

 

Kouneli, M. (2021). High vs. low external arguments: Evidence from Kipsigis. Talk presented at SAIAL, University of Potsdam, April 15, 2021.

 

Krishnan, G.G., & Sarma, V.M.  (2022).  Unlocking verbal forms in Malayalam:  Past tense is key.  Submitted.

 

Legate, J. A. (2014). Voice and v: Lessons from Acehnese (Vol. 69). MIT Press.

 

Marantz, A. (2022). Rethinking the syntactic role of word formation. In N. Bonet et al. (Eds.), Building on Babel’s Rubble (pp. 293-316). PUV, Université Paris 8.

 

Myler, N.  (2022).  Argument Structure and Morphology in Cochabamba Quechua (with occasional comparison with other Quechua varieties).  Boston University ms.

 

Nash, L. (2020). Causees are not agents. In Perspectives on causation (pp. 349-394). Springer, Cham.

 

Newman, E. S. B. (2021). The (in)distinction between wh-movement and c-selection (Doctoral dissertation, Massachusetts Institute of Technology).

 

Shushurin, P. (2021). Nouns, Verbs and Phi-Features (Doctoral dissertation, New York University).

 

Tollan, R. (2018). Unergatives are different: Two types of transitivity in Samoan. Glossa: A Journal of General Linguistics, 3(1).

 

Tollan, R., & Oxford, W. (2018). Voice-less unergatives: Evidence from Algonquian. In Proceedings of WCCFL (Vol. 35, pp. 399-408).

 

Wood, J., & Marantz, A. (2017). The interpretation of external arguments. The verbal domain, 255-278.

 

Wood, J. (2022).  Icelandic nominalizations and allosemy. Yale University book ms. lingbuzz/005004.

The Revenge of Phrase-Structure Rules

Part One: No Escape from Late Insertion

In an interesting proposal about the connection between Morphology and Syntax, Collins and Kayne (“Towards a theory of morphology as syntax.” Ms., NYU (2020)) outline a sign-based theory of Morphology, one that lacks Late Insertion. That is, Collins and Kayne (for “Morphology as Syntax” or MAS) propose that morphemes are signs: connections between phonology and formal features, where the latter would serve as input to semantic interpretation. The formal features of a morpheme determine its behavior in the syntax. They further propose, along with NanoSyntax, that each morpheme carries a single formal feature. By denying Late Insertion, they are claiming that the morphemes are not “inserted” into a node bearing their formal features, where this node has previously been merged into a hierarchical syntactic structure, but rather that the morphemes carry their formal features into the syntax when they merge into a structure, providing them to the (phrasal) constituent that consists of the morpheme and whatever constituent the morpheme merges with.

From the moment that linguists started thinking about questions concerning the connections between features of constituents and their distributions, they found that the ordering and structuring of elements and constituents in the syntax depended on the categories of these elements, not the specific items. Thus, phrase structure rules that traffic in category labels. For example, noun phrases (or DPs, or some such) appear in subject and object positions; in general, the distribution of nominal phrases like “John” or “the ball” is determined by their identification as noun phrases, not their particular lexical content. Similarly, within a language, the organization of morphological heads is associated with what Collins and Kayne call their formal features (like “tense”), not with the lexical items themselves. In fact Collins and Kayne assume that the hierarchical positioning of morphemes is governed by something like Cinque hierarchies, i.e., hierarchies of formal features that reflect cross-linguistic hierarchical ordering regularities. The literature has recently been calling such hierarchies f-seqs, for a sequence of functional categories (in theories that adopt some version of Kayne’s Linear Correspondence Axiom, a linear sequence also completely determines a hierarchical structure, where left in the sequence = higher in a tree). Tense might be higher than aspect in such f-seqs/hierarchies, for example.

But if the hierarchical organization of morphemes is determined by their formal features in a theory, then that theory is endorsing “late insertion,” i.e., the independence of the syntactic organization of morphemes from anything but their formal features. Technically, let’s break down this issue into two possible theoretical claims; the examples in Collins and Kayne’s work suggest that they are endorsing the second claim, which is more obviously a late insertion approach, but perhaps they really endorse the first one. The first possible claim is that there is only one morpheme in each language for each formal feature; that is, there is no contextual allomorphy, no choice of morphemes for expression of a formal feature that depends on the context of the morpheme (with respect to other morphemes). In their analysis of irregular plurals like “oxen,” C and K argue that -en and the regular plural -s actually express different formal features that occupy different positions in the f-seq (the universal hierarchy of formal features) of nominal features. This analysis predicts “oxens” as the plural of “ox,” since items in different positions can’t be in complementary distribution, and C and K propose a terribly unconvincing account of why we don’t say oxens in English. But more crucially, they assume that the morpheme -en includes selectional features that limit its merger to certain roots/stems. This suggests that there are multiple morphemes in English for the inner plural formal feature, with contextual allomorphy; most stems “take” the zero allomorph of the inner plural (or the zero allomorph selects a set of stems that includes the majority of English nominal roots).

This is precisely the second possible claim about the way C and K’s morphemes might interact with the principles that determine the hierarchical structure of formal features: the features are ordered by the syntax, somehow invoking a Cinque hierarchy of f-features, but the particular morpheme that instantiates an f-feature is determined by selectional features. But now we’ve recreated the approach of Distributed Morphology, at least for the core property of Late Insertion. The syntax organizes the (abstract) morphemes by category, then the morphophonology inserts particular vocabulary items that instantiate the features of the categories, respecting selectional requirements. The main difference between DM and MAS on this view, then, would be the assumption of one feature per terminal node in MAS – DM allows a bundle of features at each terminal node in the syntax.
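To see how little daylight there is between the two on this point, here is a toy sketch of late insertion with contextual allomorphy (my own construction in Python, not C and K’s machinery or the actual DM formalism): the syntax supplies an abstract [plural] position, and the choice between -en and the elsewhere item -s is made afterwards, by selectional/contextual restrictions on the stem.

```python
from typing import Callable, NamedTuple


class VocabularyItem(NamedTuple):
    exponent: str
    feature: str
    context: Callable[[str], bool]   # contextual restriction on the stem
    specificity: int                 # more specific items win the competition


# Hypothetical vocabulary for the English plural; the stem list is simplified.
VOCABULARY = [
    VocabularyItem("-en", "plural", lambda stem: stem in {"ox"}, specificity=2),
    VocabularyItem("-s", "plural", lambda stem: True, specificity=1),  # elsewhere item
]


def insert(feature: str, stem: str) -> str:
    """Late insertion: realize the feature with the most specific vocabulary item
    whose contextual restriction the stem satisfies."""
    candidates = [vi for vi in VOCABULARY if vi.feature == feature and vi.context(stem)]
    best = max(candidates, key=lambda vi: vi.specificity)
    return stem + best.exponent


print(insert("plural", "ox"))    # ox-en
print(insert("plural", "cat"))   # cat-s
```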

It is possible to organize morphology (and syntax) around sign-morphemes (connections between f-features and phonology) without Late Insertion. This describes the grammatical theory in Lieber’s Deconstructing Morphology (1992) (and elsewhere). I will expand a bit in a later post on how Lieber’s system is templatic and inspired by (true) X-bar syntax. But for present purposes, it’s sufficient to point out the basics. Each morpheme has three essential syntactic features, in addition to its phonological form. First, it indicates what category of constituent/phrase it may merge with; this is a selectional feature (formally a subcategorization feature, since the selectional features form subcategories of morphemes “categorized” by the second type of feature). Second, it indicates what category of constituent/phrase it creates (what the label is of the phrase/word it heads). And, finally, it includes a set of features that it adds to the constituent that it’s merging with. The categories are familiar – e.g., N, V, A. And the categories include levels, as in X-bar theory, so a morpheme may attach to N level zero and create N level 1. Crucially, for the most part the categories are distinct from the features carried by the morphemes. For the N category, these features might include person, gender, number, case, definiteness, etc. The plural /z/ in English, then, might select for category N zero, create a category N-bar, and add a +plural feature to the “categorial signature” of the N zero to which it attaches.
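A minimal sketch of such a sign-morpheme, under my own simplifying assumptions (Python again, purely illustrative): each morpheme pairs a phonological form with (i) what it attaches to, (ii) what it creates, and (iii) the features it adds to the categorial signature.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Constituent:
    category: str                    # e.g. "N"
    level: int                       # X-bar level: 0, 1 ("N-bar"), ...
    phonology: str
    signature: Dict[str, str] = field(default_factory=dict)  # person, gender, number, ...


@dataclass
class Morpheme:
    phonology: str
    attaches_to: Tuple[str, int]     # (i) subcategorization: what it merges with
    creates: Tuple[str, int]         # (ii) category and level of the result
    adds: Dict[str, str]             # (iii) features added to the categorial signature

    def merge(self, base: Constituent) -> Constituent:
        assert (base.category, base.level) == self.attaches_to, "subcategorization not met"
        cat, level = self.creates
        return Constituent(cat, level, base.phonology + self.phonology,
                           {**base.signature, **self.adds})


# English plural /z/: selects N level zero, creates N-bar, adds +plural.
plural_z = Morpheme(phonology="z", attaches_to=("N", 0), creates=("N", 1),
                    adds={"number": "+plural"})

dog = Constituent(category="N", level=0, phonology="dɔg")
print(plural_z.merge(dog))   # an N-bar /dɔgz/ whose signature now includes +plural
```

Nothing in this object ties the +plural feature itself to the position where the morpheme attaches; that link is carried entirely by the stipulated subcategorization frame, which is the point of the next paragraph.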

For the Lieber theory, and for templatic theories in general, there is no explanatory connection between the location of morphemes carrying f-features in the “extended projection” of a lexical category like N or V and the f-features themselves. Why a morpheme carrying a plural feature should attach to an N zero and create an N-bar, blocking out any other N zero-attaching morpheme, is a stipulation. The organization of the morphemes is not specified in the syntax by the f-features, since the syntax of the hierarchical structure of morphemes cares about the categories, not these features, and the morphemes are not, essentially “of” a category – they produce a category via merger, but that’s independent of the nature of the features they carry, in principle.

As soon as you have, like MAS, a system in which the syntax organizes morphemes via the f-features, constrained by selectional features, you have a system with Late Insertion in the Distributed Morphology sense. Again, as we will explore in future posts, the alternatives to Late Insertion are templatic theories of morphology (and syntax), but these deny the central insight behind Cinque hierarchies and generalizations about f-sequences in the “extended projection” of lexical items. A templatic system, like Lieber’s, does not claim that the distribution of constituents is determined by their categories/features.

The one feature per morpheme assumption shared by C & K and by NanoSyntax runs into an explanatory problem that DM at least sidesteps by allowing bundles of features under a terminal node in the syntax. Consider the way in which unary (non-binary) features in a feature hierarchy both yield categories of, say, gender and number, and capture the markedness relationships among the genders and numbers. Suppose that we use the first gender feature for masculine nouns, which are the least marked (and perhaps default) gender in some language. An additional gender feature hierarchically above the masc feature might give us neuter gender, in a three-gender system in which masculine and neuter share exponents of case and number (as in Slavic). Finally, a third gender feature on top of the other two would yield feminine gender. Within a subsequence of an f-seq for the “extended projection” of a noun, one wouldn’t need to label these gender features; their values come from the markedness hierarchy within the gender region. It’s the sub-f-seqs within the gender region that have values – a single feature is masculine, two features is neuter, and three features is feminine.

Similarly, suppose we have a number system in the same language with three values, singular, dual and plural. Singular would be least marked, with a single feature in the number sub-f-seq, plural might be more marked, with two features, and dual might be the most marked, with three features. Again, the features themselves would not have values; the values come from the number of features within the number sub-f-sequence.

But now we can see clearly that the specific morphemes (for C & K) or features (for NanoSyntax) are not themselves ordered within an f-seq. Rather, it’s the fields of features, here gender and number, that are ordered. The features within each field are “ordered” in a sense, but really, for the syntax, a language would just specify how many features it allows within each field – it’s up to the phonology and the semantics to “interpret” the features in terms of number and gender, and to do so just based on the number of features in a field.
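A purely illustrative encoding of this division of labor (mine, not a proposal from the post): the syntax specifies only a count of unary features per field, and the phonology and semantics read values off the counts.

```python
# Markedness order within each field: 1 feature = least marked, 3 = most marked.
GENDER_BY_COUNT = {1: "masculine", 2: "neuter", 3: "feminine"}
NUMBER_BY_COUNT = {1: "singular", 2: "plural", 3: "dual"}

FIELDS = {"gender": GENDER_BY_COUNT, "number": NUMBER_BY_COUNT}


def interpret_field(field_name: str, feature_count: int) -> str:
    """Read a field's value off the number of unary features it contains."""
    return FIELDS[field_name][feature_count]


# The f-seq orders the fields themselves (say, gender below number), not the
# individual features; a given noun form just specifies a count per field.
noun_form = {"gender": 2, "number": 3}   # neuter dual, in this hypothetical language
print({f: interpret_field(f, n) for f, n in noun_form.items()})
# {'gender': 'neuter', 'number': 'dual'}
```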

We’ve seen that “late insertion” doesn’t distinguish among DM, C & K, and NanoSyntax, and now we can see that ordering based on classes of features, rather than individual features, doesn’t distinguish among these theories either. All these theories require the equivalent of phrase structure rules to correctly distribute fields of features within hierarchical syntactic structures, followed by (or parallel with) principles of vocabulary insertion that realize the features phonologically. The adoption of a sign-based theory of morphemes along with the assumption of a single formal feature per morpheme seems to make the principle of Vocabulary Insertion very simple for C & K. However, complications arise immediately, essentially concerning the distribution of zero morphemes. Consider what they need to rule out oxens, for example. NanoSyntax explores a rather more complicated theory of Vocabulary Insertion, but it would be fair to say that, unlike DM and C & K, NanoSyntacticians spend little effort showing how NanoSyntax interacts with SyntaxSyntax (becoming at present more of a theory of NanoMorphology).

Missing from all three approaches is a theory of phrase-structure; that is, a theory of how to exploit the generalizations expressed in Cinque-style f-sequences to generate syntactic structures that conform to them. I’ll write more about this problem in a future post.
___________

1 They appeal to a generalization about “blocking” of multiple exponence across positions that Halle & Marantz (1993, “Distributed morphology and the pieces of inflection”) debunked in their discussion of Anderson’s blocking principles in A-Morphous Morphology. In any case, to split plural into two plural heads across the f-seq is to claim they don’t have the same interpretation and thus shouldn’t trigger any kind of blocking (and a single f-feature per morpheme makes it difficult to claim that the plurals are “similar,” since similarity here would imply feature decomposition to allow the two plurals to share a feature).

Grammar and Memorization: The Jabberwocky Argument

Memorized vs. Computed

 

I have previously written about why I believe that distinctions in the literature between words (or sentences) that are “stored” as units vs. words (or sentences) that are “computed” are not well articulated.  I claimed, instead, that, in a sense that is crucial for understanding language processing, all words (and all sentences) are both stored AND computed, even those words and sentences a speaker has never encountered before.  A speaker stores or memorizes all of the infinitely many words (and sentences) in his/her language by learning a grammar for the language.  Attempts in the literature to distinguish the stored words from the computed ones fail to be clear about what it means to store or compute a word – particularly about what it means to memorize a word.  I claimed that, as we become clear on how grammars can be used to recognize and produce words (and sentences), any strict separation between the stored and the computed disappears.

 

Despite my earlier efforts, however, I find that I have not convinced my audience.  So here I’ll follow a line of argument suggested to me by Dave Embick and try again.

 

Let’s start with Jabberwocky, by Lewis Carroll of course (1871, this text from Wikipedia):

 

‘Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

“Beware the Jabberwock, my son!
The jaws that bite, the claws that catch!
Beware the Jubjub bird, and shun
The frumious Bandersnatch!”

He took his vorpal sword in hand:
Long time the manxome foe he sought—
So rested he by the Tumtum tree,
And stood awhile in thought.

And as in uffish thought he stood,
The Jabberwock, with eyes of flame,
Came whiffling through the tulgey wood,
And burbled as it came!

One, two! One, two! And through and through
The vorpal blade went snicker-snack!
He left it dead, and with its head
He went galumphing back.

“And hast thou slain the Jabberwock?
Come to my arms, my beamish boy!
O frabjous day! Callooh! Callay!”
He chortled in his joy.

‘Twas brillig, and the slithy toves
Did gyre and gimble in the wabe;
All mimsy were the borogoves,
And the mome raths outgrabe.

 

The poem is full of words that the reader might not recognize and/or be able to define.  Quick quiz:  which words did Carroll make up?

 

Not so easy, at least for me.  Some words that Carroll was the first to use (as far as lexicographers know) have entered the language subsequently, e.g., vorpal (sword).  What about chortle?  Are you sure?  How about gyre?  Gimble?  Beamish?  Whiffling?

 

The fact is that when we encounter a word in context, we use our knowledge of grammar, including our knowledge of generalizations about sound/meaning connections, to assign a syntax and semantics to the word. (Chuang, Yu-Ying, et al. “The processing of pseudoword form and meaning in production and comprehension: A computational modeling approach using linear discriminative learning.” Behavior Research Methods (2020): 1-32.)  Suppose the word is one we have not previously encountered, but it is already in use in the language.  Can we tell it’s a “real” word as opposed to Jabberwocky?  That the word has found a place in the language probably means that it fits with generalizations in the language, including those about correlations between sound and meaning and between sound and syntactic category.  Children must be in this first-encounter position all the time when they’re listening – and I doubt that many of them are constantly asking, is that really a word of English?  Suppose, now, that the new-to-us word in question is actually not yet a word in use in English, as was the case for the first readers of Jabberwocky encountering chortle.  In the course of things, there’s no difference between encountering in context a word that is in use but that you haven’t heard yet, provided it fits the grammar of your language, and one that the speaker made up but that fits the grammar of your language equally well.  Lewis Carroll made up great words, extremely consistent with English, and many of them stuck.

 

Speakers of English have internalized a phonological grammar (a phonology) that stores our knowledge of the well-formedness of potentially infinite strings of phonemes.  Such a phonology includes an inventory of sounds (say, an inventory of phonemes) and the principles of phonotactics – the sounds’ legal combinations.  The phonology – the phonotactic grammar – stores (and generates) all the potential words of the language, but doesn’t distinguish possible from “actual” words by itself.  Are the “actual” words distinguished as phoneme-strings carrying the extra feature [+Lexical Insertion], as Morris Halle once claimed for morphologically complex words that are in use as opposed to potential but not “actual” words (Halle, M. (1973). Prolegomena to a theory of word formation. Linguistic Inquiry, 4(1), 3-16)?  It’s not particularly pertinent to the question at hand whether people can say, given a string of letters or phonemes in isolation, “this is a word of my language.”  Experimental subjects are asked to do this all the time in lexical decision experiments, and some are surprisingly accurate, as measured against unabridged dictionaries or large corpora.  Most subjects are not so accurate, however, as one can see from examining the English Lexicon Project’s database of lexical decisions – 85% correct is fairly good for both the words (correct response is yes) and pronounceable non-words (correct response is no) in that database.  Lexical Decision is a game probing recognition memory – can I recover enough of my experiences with a letter or phoneme string to say with some confidence that I encountered it before in a sentence?  A better probe of our knowledge of potential and actual words is placing the strings in sentential context – the Jabberwocky probe.  Do we think a Jabberwocky word is a word in our language?  Here we see that our judgments are graded, with no clear intuition corresponding to a binary word/non-word classification.
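The sense in which a phonotactic grammar “stores” every potential word can be made concrete with a toy sketch (mine, and far cruder than any real phonology): the grammar below accepts chortle and gimble exactly as it accepts bog, and nothing in it records which of the accepted strings happen to be in use.

```python
# Toy syllable inventories; a real phonology would state these over phonemes and
# with much richer phonotactic constraints.
ONSETS = {"b", "br", "ch", "g", "sl", "t", "w"}
NUCLEI = {"a", "ay", "i", "o", "or"}
CODAS = {"", "g", "llig", "mble", "rtle", "ve"}


def well_formed(word: str) -> bool:
    """Accept any onset + nucleus + coda string built from the toy inventories."""
    return any(word == onset + nucleus + coda
               for onset in ONSETS for nucleus in NUCLEI for coda in CODAS)


# The grammar generates (and so "stores") familiar and unfamiliar words alike;
# only "bnick", which violates the toy phonotactics, is rejected.
for w in ["bog", "chortle", "gimble", "bnick"]:
    print(w, well_formed(w))
```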

 

For phonology, it’s somewhat clear what we mean when we say that the generative grammar “stores” the forms of potential and existing words in the language.  The consequence of this for the Chomskyan linguist (committed to the principle that there are not separate competence and performance grammars) is that the phonological grammar is used in recognizing and producing the words.  Committed Chomskyans, like me, expect, at a first pass, that phonotactic well-formedness will always play a role in word recognition and production – “knowing” a word doesn’t exempt it from obligatory “decomposition” via the grammar in use into, e.g., phonemes, and analysis via the phonotactic grammar.  “Retrieving” the phonological form of a word from memory and “generating” it from the grammar become the same process.

 

What, then, is the difference between words like chatter and Jabberwocky words like chortle?  Although our grammar will assign a meaning to any well-formed possible word, without sentential or other context, the meaning might be vague.  As we experience words in context, we can develop sharper accounts of their meaning, perhaps primarily via word co-occurrences.  The sharpness of a semantic representation isn’t a property of the phonological grammar, but it is a property of the grammar as a whole.  For linguists, a “whole” grammar includes, in addition to a syntax that organizes morphemes into hierarchical tree structures and a phonology that maps the syntactic structure into a structured sequence of prosodic units like phonological words, also what Chomsky calls a language’s “externalization” in the conceptual system, i.e., in this case the meaning of words.

 

In important ways, words are like human faces to human speakers.  We have internalized a grammar of faces that allows us to recognize actual and potential faces.  We store this grammar, at least partially, in what is called the Fusiform Face Area.  Recognizing faces (as faces) involves obligatory decomposition into the elements of a face (like the eyes, ears, and nose) whose grammatical combinations the face grammar describes.  For faces, we don’t call the faces of people that we haven’t seen “potential” or “pseudo” faces – they’re just faces, and the faces of people that we have encountered (and can recall as belonging to people we’ve seen or met) we call “familiar” faces.  For words, I propose we adopt the same nomenclature – words and potential words should just be “words,” while words for which we push the “yes” button in Lexical Decision experiments should be called “familiar words.”

 

Note that, for written words, there’s an even greater parallel between words and faces.  Our orthographic grammar, describing the orthotactics of the language, generates – and thus stores – all the orthographic forms of the words of the language.  From neuroscientific studies, we know that the orthographic grammar – and thus the orthographic forms of words – is (at least partially) stored in an area of the brain adjacent to the Fusiform Face Area (called the “Visual Word Form Area”), and the recognition of written words follows a processing stream and time frame parallel to those for the recognition of faces.  One can speculate (as I will in a future post) that the phonological grammar, and thus the phonological forms of words (really morphemes, of course), live in secondary auditory cortices on the superior temporal lobe, where auditory word recognition parallels the recognition of faces and visual word forms, with the interesting complication that the recognition process plays out over time, as the word is pronounced.

 

[To be continued…..]