Sampson on Pustejovsky and Boguraev, "Lexical Semantics"

Lexical Semantics: The Problem of Polysemy.

Edited by James Pustejovsky and Branimir Boguraev. Clarendon Press, Oxford, 1996, 214pp. ISBN 0 19 823622 X. £14.99.

Reviewed by:

Geoffrey Sampson

University of Sussex

http://www.grsampson.net

After an introductory chapter by the editors, summarizing the chapters that follow, this book consists of five contributions by different authors on various aspects of the question of how word polysemy might be formalized in the context of computer systems for simulating human language understanding. The writers’ shared assumption is that it should be possible to design a formal system akin to the calculi of mathematical logic but very much more complex, so that it reflects not just the inferences we draw from the grammar of sentences (which, broadly, is what standard mathematical logic aims to do) but also the inferences we draw from vocabulary.

Such a system would need to provide alternative representations not only for accidental one-off homonyms, such as “bank as raised ground” versus “bank as financial institution”, but also for many systematic meaning distinctions running through large sets of words. Ann Copestake and Ted Briscoe’s example is “The south side of Cambridge voted Conservative”, where side out of context means a place but in context means the people who live there – and this is not an alternation specific to that individual word, but would apply equally to other ways of referring to inhabited areas. Many issues arise about how far such facts can be captured by general rules, how the rules should be expressed, and so forth.

The collection evidently originated as a special issue of a journal – we are not told which journal, but various clues betray the fact, most obviously Geoffrey Nunberg’s acknowledgment of the comments of “two anonymous referees for this journal” – which the publisher has reissued in book form in hopes of selling it twice to hard-pressed libraries. (Time was when Oxford University Press was above that sort of thing … )

The enterprise represented in this book strikes me as severely problematic, for reasons that I shall illustrate from Nicholas Asher and Alex Lascarides’s contribution “Lexical disambiguation in a discourse context” (though I could have made the same point from other chapters).

The body of Asher and Lascarides’s chapter is formidably technical. Try the following, for instance (to keep things simple for this Journal’s typesetter, I shall write “X.Y” for X subscripted with Y):

We can build a general multi-model for the language of feature structures L.fs (Blackburn 1992), and the language of CE L.> (where > is the default conditional connective): viz. a model for L.(fs, >). Nevertheless – and this is important – our formulation countenances no interaction between the CE logic of > (and a fortiori the nonmonotonic consequence relation $) on the one hand, and Blackburn’s modal operators on the other. So from the perspective of DICE, we can translate L.(fs, >) into L.> by treating each feature structure description and statement about subsumption relations, as atomic formulae of L.>.

“CE” stands for Commonsense Entailment and “DICE” for Discourse in Commonsense Entailment, which are formal systems that have been briefly introduced on the previous page, though they are never described in detail. I have substituted a dollar sign for a symbol of Asher and Lascarides’s that I have never seen before and cannot reproduce on my word processor, consisting of a vertical bar with two wavy horizontals to its right.

Now, before launching into these technicalities, Asher and Lascarides have introduced the problems they aim to address by discussing examples. Their main example concerns disambiguation of the word bar in the following text:

The judge asked where the defendant was. The barrister apologised, and said he was at the pub across the street. The court bailiff found him slumped underneath the bar.

Asher and Lascarides explain that the legal references, including the word barrister, would in isolation suggest that bar referred to “the courtroom bar”, but the text taken as a whole forces bar to be taken “in its pub sense”. They discuss this initially for four pages, and frequently revert to the example later.

The trouble with this is that, in the first place, there is no object in a court of law called a bar. The phrase “prisoner at the bar” may have had a literal use centuries ago, but in modern times it is merely a picturesque but empty phrase. The “bar” to which a barrister is notionally called is not located in a court but in a place where legal professionals meet for training and social purposes. Furthermore, in a drinking career spanning several decades I have yet to visit a pub where there would be any easy way in which a drinker could slump underneath the bar. A pub bar normally presents a solid vertical face to the paying customers.

In other words, the authors’ strenuous mathematical formalisms are erected on the basis of concrete linguistic examples that do not begin to work. Now, of course, this need not be a fatal criticism. The general point that Asher and Lascarides try to make about bar might equally have been made using dozens of alternative examples which would have worked perfectly well. Nevertheless, the contrast between the density of the formal analysis and the superficial nature of the linguistic example strikes me as indicative. The contributors to this book (not just Asher and Lascarides) are so keen to launch into detailed mathematical analysis that they do not give themselves time to stand back and ask themselves whether human language is really like that at all.

Two of the most influential philosophical works of the twentieth century are Ludwig Wittgenstein’s Philosophical Investigations, and Willard van Orman Quine’s “Two dogmas of empiricism”, published in 1953 and 1951 respectively. In the English-speaking world at least, what they say was for decades broadly accepted as correct; and both imply that the enterprise of Lexical Semantics is doomed. There is a good reason why traditional logic fails to address word meanings: that area of human linguistic behaviour is too underdetermined and too subject to unpredictably creative thought to be reduced to formal rules. Less attention is paid to Wittgenstein or to Quine nowadays than used to be the case, but so far as I am aware no-one has put forward new reasons to reject this point of view. Certainly the writers under review do not do so; they just bypass it.

In the 1990s, there are strong pressures on professional academics to involve themselves in research projects of the kind that cost money to carry out and generate tangible “deliverable output” in the shape of computer software, technical theorizing, or the like. It has become difficult to base a career on the kind of intellectual hygiene that was in fashion in British universities in the postwar decades, when the goal of education was seen largely as cultivating clear thinking. The writers under review identify many true facts about individual points of usage in English and other languages; and their assumption is that by assembling enough facts of this sort, and by being sufficiently clever in fitting them together, they can eventually achieve a computational system that simulates a significant part (at least) of human language understanding. I believe they are unconsciously closing their eyes to the reasons why that can never happen, because it would be too inconvenient to acknowledge them.

Along the way, some of them do come up with interesting points. In particular, Geoffrey Nunberg in his chapter “Transfers of meaning” discusses cases where grammar requires elements to co-refer, but sense seems to make them refer to contrasting categories – his (to my mind rather awkward) example is “Ringo squeezed himself into a narrow space”, with himself understood as the car Ringo is driving. According to Nunberg, the facts of usage imply that such examples are never what they seem: himself really does refer to Ringo the man, and the word that must be interpreted non-straightforwardly, in the car-parking scenario, is squeezed.

But it is notable that this point of Nunberg’s is made in plain English, without resort to formalism. That makes it quite unusual in this book, which employs one of the largest ranges of formal notations I have ever seen between a single pair of covers. Often, some unusual sign crops up on just one or two pages never to be seen again. Realistically, no outsider is likely to be able to follow the details. But this book is perhaps intended more for display than enlightenment.