Author: Alec Marantz

Teaching Harley & Noyer (1999)

If we look at Halle & Marantz (1993) for the origins of Distributed Morphology, we see a framework designed to show that inflectional morphemes should be pieces, distributed syntactically and realized phonologically after the syntax. In a sense, Halle & Marantz (1993) is an immediate expansion of Chomsky’s Syntactic Structures, as I will Blog about later. In his analysis of the English auxiliary system employing “affix hopping,” Chomsky does syntactic word formation and late insertion for inflectional morphemes, in much the same way as DM (nihil sub sole novum, particularly in Noam’s shadow).

But Harley & Noyer (1999), in their introduction to DM, actually emphasize thinking that went into my 1997 “No Escape from Syntax” paper and my 2000 WCCFL talk that spawned the heavily cited “Words” manuscript. Thus they present DM as essentially anti-“lexicalist.” Their introduction concentrates on issues surrounding derivational morphemes and roots, rather than inflection, and one sees there the strands of thought that Heidi will pursue in her later work.

Before turning to the “lexicalism” that DM was “anti-” in the 1990s, I should clarify two uses of “lexicalist,” only one of which is relevant to this post. On the non-relevant reading, a “lexicalist” endorses the notion that morphemes are “lexical items” in the sense of units that relate or contain both sound and meaning, with “meaning” broadly construed. This is the “morphemes are signs” position that often gets derided in the literature (see e.g. recent work by Blevins). It’s in this sense of “lexicalist” that, e.g., Lieber in Deconstructing Morphology is a lexicalist and Kayne is a lexicalist. These guys are serious, and their work cannot and should not be dismissed by waving at reduplication, zero morphs, patterns of syncretism in paradigms, etc. It’s great for the field that serious linguists pursue this lexicalist hypothesis.

However, the lexicalism of anti-lexicalism in the 1990s was the MIT-style lexicalism that was being explored in the Lexicon Project: a position associated with, e.g., certain versions of Lexical Morphology and Phonology, a position inspired by Wasow’s work contrasting lexical and syntactic rules, a position driving early versions of Lexical-Functional Grammar, and a position also endorsed by Chomsky at various points (see the discussion at the end of Halle & Marantz 1993). This view said that there was a difference between word formation before syntax (in the lexicon) and word formation that might be post-syntactic, and that the syntax operated on morphologically complex words from the lexicon, rather than on morphemes. There was a sense in the air that a nexus of Chomsky’s “Remarks on Nominalization,” Kiparsky’s Lexical Morphology and Phonology, Wasow’s Lexical vs. Syntactic Rules, Lieber’s work on word formation, and Levin’s work on lexical semantics was creating a coherent and compelling picture of a Lexicon for generative grammar. The “No Escape” paper was meant to pop the bubble specifically by questioning the correlation of properties associated with “wordhood” that underlay the apparent consensus on the Lexicon. To perhaps oversimplify the conclusions of that paper, I argued that the “special behavior” that was claimed to distinguish lexical properties from syntactic properties was better understood as the local determination of properties of roots in the context of the first category node merged above them. Phonological wordhood per se was largely irrelevant to the syntactic, semantic, and morphological properties of a language.

By the end of the 1990s, any notion of a transtheoretical consensus on a “lexicon” had vanished, and arguments for versions of lexicalism became more nuanced – less vulnerable to the Wreck-It Ralph treatment of “No Escape…” There’s less of a notion that a striking set of correlations converges on properties of a word as opposed to a syntactic phrase, and perhaps more of a notion that the lexicon allows for the unruly (the “lawless” for Di Sciullo and Williams) whereas the syntax plays by the rules. I hope to Blog on more recent anti-anti-Lexicalist positions later in the semester (e.g. papers by Kiparsky and by Bruening). As previewed in the Harley & Noyer article, the 2000s saw an emphasis on the syntactic treatment of derivational morphology and of uncategorized roots of words. This work was and is not specifically or essentially anti-lexicalist – that depends on the particulars of one’s theory of the lexicon. The work does, however, reject the notion that word formation is ever “lawless” – the adoption of syntactic word formation, along with the strict locality implications that go along with this adoption in certain theoretical worlds, is supposed to reduce the wiggle room for morphological analysis. So, less “anti-lexicalist” and more “pro-decomposition into minimal syntactic units organized hierarchically and subject to the locality constraints visible in syntax” or some such.

Contextual Allosemy in DM

So, Neil Myler and I are supposed to be writing a chapter on the topic of Contextual Allosemy for a DM volume. I thought I could Blog what I think is at stake here, so that the enormous Blogosphere can let us know if we’re missing anything. All three of you readers. To our minds, the topic of contextual allosemy divides in two: contextual meanings of roots, and contextual meanings of functional morphemes. Both types of contextual allosemy, whether or not they reduce to a single phenomenon, should be subject to two sorts of locality constraints. Within the first phase in which they meet the interfaces, the trigger of allosemy – the context for contextual allosemy – must be structurally local to the target item whose meaning is being conditioned. In Marantz 2013 (Marantz, A. (2013). Locality domains for contextual allomorphy across the interfaces. Distributed Morphology Today: Morphemes for Morris Halle, 95–115), I suggested that the locality constraint here was adjacency, where semantically null items are invisible to the computation of the relevant “next to” relationship. Additionally, since the meaning of an element should be computed when it first hits the semantic interface, anything outside its first phase of interpretation could not serve to trigger a special meaning.
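To make these two constraints concrete, here is a minimal sketch in Python – a toy formalization, not anyone’s actual implementation. Morphemes carry a phonological exponent, a semantic value (None for semantically null), and an index for their first interpretation phase; all the names here (Morpheme, adjacent_modulo_null, can_condition_allosemy) are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

@dataclass
class Morpheme:
    name: str
    phon: Optional[str]  # phonological exponent; None = phonologically null
    sem: Optional[str]   # semantic value; None = semantically null
    phase: int           # index of the first phase (spell-out domain) it belongs to

def adjacent_modulo_null(seq: List[Morpheme], trigger: int, target: int,
                         is_null: Callable[[Morpheme], bool]) -> bool:
    """True if every morpheme between trigger and target is null, hence invisible
    to the relevant "next to" relationship."""
    lo, hi = sorted((trigger, target))
    return all(is_null(m) for m in seq[lo + 1:hi])

def can_condition_allosemy(seq: List[Morpheme], trigger: int, target: int) -> bool:
    # Phase condition: anything outside the target's first interpretation
    # phase cannot trigger a special meaning.
    if seq[trigger].phase != seq[target].phase:
        return False
    # Adjacency condition: semantically null items are invisible.
    return adjacent_modulo_null(seq, trigger, target, lambda m: m.sem is None)
```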

If Embick is right (and perhaps Embick and I are), roots need to be categorized in the syntax – they won’t emerge bare at the semantic interface. So in a sense roots are always subject to contextual allosemy; they don’t have a bare semantic value. For functional morphemes, we’re inspired by Neil’s work on possession, where the little v that will be pronounced “have” is given a null interpretation in predicate possessive constructions. What’s suggested in Marantz 2013 is that contextual allosemy for functional morphemes involves a toggle between a specific meaning – say, introducing an event variable for little v – and no meaning. The “no meaning” option creates situations in which a phonologically overt (but semantically null) morpheme fails to intervene between a trigger of contextual allosemy and a root subject to allosemy, even though the morpheme intervenes phonologically (and thus would block contextual allomorphy between the trigger and the root).
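Continuing the toy sketch above, this asymmetry falls out from which nullness test one uses for invisibility: a semantic test for allosemy, a phonological test for allomorphy. The flat ROOT–AFFIX–TRIGGER sequence below is a schematic stand-in, not Neil’s actual possession structure.

```python
# Toy structure: ROOT - AFFIX - TRIGGER, with AFFIX overt but semantically null.
seq = [
    Morpheme("ROOT",    phon="root", sem="ROOT",    phase=0),
    Morpheme("AFFIX",   phon="af",   sem=None,      phase=0),
    Morpheme("TRIGGER", phon="tr",   sem="TRIGGER", phase=0),
]

# Allosemy: the semantically null AFFIX is invisible, so TRIGGER counts as
# "next to" ROOT and can condition a special meaning.
print(can_condition_allosemy(seq, trigger=2, target=0))           # True

# Allomorphy: AFFIX is phonologically overt, so it intervenes and blocks
# TRIGGER from conditioning ROOT's pronunciation.
print(adjacent_modulo_null(seq, 2, 0, lambda m: m.phon is None))  # False
```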

I’ve been thinking more about this topic in light of phonological work by my colleague Juliet Stanton with Donca Steriade (Stanton, J., & Steriade, D. (2014). Stress windows and Base Faithfulness in English suffixal derivatives. Handout). S&S argue that, in English derivational morphology, the determination of the pronunciation of a derived form may depend on the pronunciation of a form of the root morpheme not included in the (cyclic) derivation of the form. For example, the first vowel of “atomicity” finds its quality, as a secondarily stressed vowel, in the form “atom” – the first vowel of its stem, “atomic,” is a reduced schwa from which the necessary value for stressed “a” in “atomicity” cannot be determined. If we’re thinking in DM terms, the adjective “atomic” should constitute a phase for phonological and semantic interpretation, after which the underlying vowel of “atom” in “atomic” would no longer be accessible, e.g., in the phase where the noun “atomicity” is processed.

This argument assumes, reasonably, that “atomicity” has “atomic” as its base. The -ity ending is potentiated by -ic, and the derivation of a noun in -ity from an adjective in -ic is perhaps even productive. But is “atomicity” derived from “atomic” semantically?

Here’s the online definition of “atomic” in the sense most relevant to “atomicity”:

adjective
1. relating to an atom or atoms. “the atomic nucleus”
   - CHEMISTRY (of a substance) consisting of uncombined atoms rather than molecules. “atomic hydrogen”
   - of or forming a single irreducible unit or component in a larger system. “a society made up of atomic individuals pursuing private interests”

Here’s “atomicity”:

noun
1. CHEMISTRY the number of atoms in the molecules of an element.
2. the state or fact of being composed of indivisible units.

Note that it’s “atomic individuals” and the “atomicity of society,” not the “atomicity of individuals” or “atomic society” (“atomic society” is post-apocalyptic). I think one can make the case that both “atomic” and “atomicity” (here in their non-nuclear, non-chemistry meanings) are semantically derived directly from “atom.”

Perhaps, then, the non-cyclicity of “atomicity” phonologically is paralleled by its non-cyclicity semantically, as would need to be the case in a strict interpretation of derivation by phase within DM. We would need -ic NOT to trigger a phase, meaning it could not be the realization of a little a node. I believe we’d need to commit to a theory in which the phonological forms of most derivational affixes are realizations of roots, not of category-determining heads. So -ic in “atomicity” could then be a root attached to a category-neutral head that does not trigger a phase. This conclusion, that derivational affixes include phonologically contentful but a-categorical roots, has already been argued for by Lowenstamm (on phonological grounds) and by De Belder (on syntactic and semantic grounds). De Belder specifically claims that -ic does not have an inherent category; we can point to words like “music,” “attic,” “traffic,” “mimic,” etc., alongside words that are N/Adj ambiguous like “agnostic,” “stoic,” “mystic,” etc.

In conclusion, although the within-phase domains of contextual allosemy and contextual allomorphy might diverge – because null morphemes don’t count as interveners for the trigger/target relation, and what’s null in the phonology may differ from what’s null in the semantics – the phases that define the biggest domains for contextual allosemy/allomorphy might be the same. Standard DM assumes they are: one phase to rule them all.

The Canonical What I Learned On My Summer Vacation Post: SNL in Helsinki

In retrospect, we can identify the beginnings of contemporary neurolinguistics with Neville et al. 1991 (Neville, H., Nicol, J. L., Barss, A., Forster, K. I., & Garrett, M. F. (1991). Syntactically based sentence processing classes: Evidence from event-related brain potentials. Journal of Cognitive Neuroscience, 3(2), 151–165). At the time, a dominant approach to language processing envisaged a system whereby people would predict the next word in a sentence by applying their statistical knowledge of word n-grams, based on their experience with language. Grammatical knowledge, as studied by linguists, was somehow an emergent property of string learning and not part of the cognitive model of linguistic performance. On this view, every ungrammatical string involved a point at which one’s string knowledge predicted zero likelihood of the encountered word, and ungrammaticality was on a continuum with unacceptability and very low Cloze probability. What the Neville et al. paper demonstrated was that different types of ungrammaticality had different neural “signatures,” and these differed from that of an unexpected word (with low Cloze probability). One can have many quibbles with this paper. The generalization from the specific sentence types studied to classes of syntactic violations (as in the title), for example, is suspect. But the paper launched a ton of work examining the connection between detailed aspects of sentence structure and neural responses as measured by EEG. In recent years, there has been a move away from violation studies toward work varying grammatical sentences along different dimensions, but experiments have consistently found correlations between interesting linguistic structure and brain responses.
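For concreteness, here is the shape of the n-gram computation behind that older picture – a minimal sketch under stated assumptions: a bigram model with add-one smoothing stands in for the larger n-grams actually used, and the corpus is a toy of my own invention. Cloze probability itself comes from human sentence completions; corpus-estimated n-gram surprisal is its standard computational proxy.

```python
import math
from collections import Counter

def bigram_surprisal(corpus_sents, sentence):
    """Per-word surprisal, in bits, under an add-one-smoothed bigram model."""
    unigrams, bigrams, vocab = Counter(), Counter(), set()
    for sent in corpus_sents:
        toks = ["<s>"] + sent.lower().split()
        vocab.update(toks)
        unigrams.update(toks[:-1])                # counts of each context word
        bigrams.update(zip(toks[:-1], toks[1:]))  # counts of adjacent pairs
    V = len(vocab)
    toks = ["<s>"] + sentence.lower().split()
    return [(w, -math.log2((bigrams[(prev, w)] + 1) / (unigrams[prev] + V)))
            for prev, w in zip(toks[:-1], toks[1:])]

corpus = ["the cat sat on the mat", "the dog sat on the rug"]
for w, bits in bigram_surprisal(corpus, "the cat sat on the rug"):
    print(f"{w:4s} {bits:5.2f}")  # higher bits = less expected continuation
```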

So it was a bit of a surprise to hear Ev Fedorenko, at the just-finished Neurobiology of Language meeting in Helsinki, claim that the “language network” in the brain wasn’t particularly interested in the sort of details that the electrophysiologists (ERP, MEG researchers) among us have been studying. In particular, she explicitly equated the type of computation involved in Cloze probability (lexical surprisal) with syntactic computations. Fedorenko’s gold standard for localizing the language network within an individual’s brain is an fMRI paradigm contrasting the brain’s response to listening to grammatical, coherent sentences with its response to listening to lists of pronounceable non-words. The activity in this network, for example, seems equally well predicted by syntactic and lexical surprisal modulations.

Given that the ERP/MEG literature details specific differences between, e.g., lexical prediction and grammatical computations, if Fedorenko’s language network were in fact responsible for language processing, then the same areas of the brain would have to be performing different computations – i.e., separation in brain space would perhaps not be informative for the neurobiology of language. Fedorenko was asked this question after her talk, but she didn’t understand it. However, Riitta Salmelin’s talk in the same session of the conference did address Fedorenko’s position. Salmelin has been investigating whether ERP/MEG responses in particular experimental paradigms might yield different localizations for the source of language processing than fMRI activation from identical paradigms. Her work demonstrates that this is in fact the case, and she presented some ideas about why. She also remarked to me at the conference that Fedorenko’s “language network” does not include areas of interest for linguistic processing that she studies with MEG.

Of interest for our Blog notes is the nature of Fedorenko’s “syntactic surprisal” measure – the one that is supposed to correlate with activity in the same network as lexical surprisal, where lexical surprisal is computed via word 5-grams. Fedorenko’s syntactic surprisal measure comes from a delexicalized probabilistic context free grammar, i.e., from a consideration of the likelihood of a syntactic structure independent of any lexical items. We asked in a previous post whether this kind of syntactic surprisal is likely to represent speakers’ use of syntactic knowledge for parsing words, given the importance of lexically specific selection in word processing, but the same question could be asked about sentence processing. A recent paper from our lab (Sharpe, V., Reddigari, S., Pylkkänen, L., & Marantz, A. (2018). Automatic access to verb continuations on the lexical and categorical levels: Evidence from MEG. Language, Cognition and Neuroscience, 34(2), 137–150) clearly separates predictions from verbs about the following syntactic category from predictions for the following word. However, the predictions are grounded in the identity of the verb, so this is a mix of lexical and syntactic prediction (the predictor is lexicalized but the predicted category is syntactic, modulo the prediction of particular prepositions). What is clear is that syntactic surprisal as measured by a delexicalized probabilistic context free grammar is not the be-all and end-all of possible variables that might be used to explore the brain for areas performing syntactic computation. In particular, the status of delexicalized syntactic structure for syntactic parsing is up in the air. Nevertheless, in a proper multiple regression analysis of MEG brain responses to naturalistic speech, I’m willing to go out on a limb and predict that different brain regions will be sensitive to word n-gram surprisal and syntactic surprisal, as measured via a delexicalized probabilistic context free grammar.
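To make the lexical vs. syntactic surprisal contrast concrete, the bigram machinery from the sketch above can be rerun over category sequences instead of words. This is a deliberate simplification: surprisal from a probabilistic context free grammar requires incremental prefix probabilities over trees (e.g., via an Earley parser), but the delexicalization step – discarding word identities and keeping only syntactic categories – is the same idea. The tag corpus below is hypothetical.

```python
# Delexicalized surprisal: word identities are thrown away; only the
# category sequences remain. Reuses bigram_surprisal from the sketch above.
tag_corpus = ["DET NOUN VERB PREP DET NOUN",
              "DET ADJ NOUN VERB DET NOUN"]

for tag, bits in bigram_surprisal(tag_corpus, "DET NOUN VERB DET NOUN"):
    print(f"{tag:4s} {bits:5.2f}")
```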

A final note of clarification: Fedorenko in her talk suggested that there were linguistic theories that might predict no clear separation between, I believe, word meaning and syntax. Thus somehow e.g. Jackendoff’s Parallel Architecture and Construction Grammar would predict a lack of separation between lexical and syntactic surprisal. For both Jackendoff and Construction Grammar – and all serious current linguistic frameworks I know of – the ontology of lexical semantics and the ontology of syntactic categories are distinct. So Jackendoff has parallel syntactic and semantic structures, not no distinction between syntax and word meaning. Construction Grammar is similar in this respect. The question of whether speakers use de-lexicalized probabilistic syntactic knowledge in processing is a question for any syntactic theory, and all theories I can think of would survive a yes or no answer.
