The following online article has been derived mechanically from an MS produced on the way towards conventional print publication. Many details are likely to deviate from the print version; figures and footnotes may even be missing altogether, and where negotiation with journal editors has led to improvements in the published wording, these will not be reflected in this online version. Shortage of time makes it impossible for me to offer a more careful rendering. I hope that placing this imperfect version online may be useful to some readers, but they should note that the print version is definitive. I shall not let myself be held to the precise wording of an online version, where this differs from the print version. Published in J. of Literary Semantics 26.229–32, 1997.
Lexical Semantics: The Problem of Polysemy.
Edited by James Pustejovsky and Branimir Boguraev. Clarendon Press, Oxford, 1996,
214pp. ISBN 0 19 823622 X. £14.99.
Reviewed by:
Geoffrey Sampson
University of Sussex
http://www.grsampson.net
After an introductory chapter by the editors, summarizing
the chapters that follow, this book consists of five contributions by different
authors on various aspects of the question of how word polysemy might be
formalized in the context of computer systems for simulating human language
understanding. The writers’ shared
assumption is that it should be possible to design a formal system akin to the
calculi of mathematical logic but very much more complex, so that it reflects
not just the inferences we draw from the grammar of sentences (which, broadly,
is what standard mathematical logic aims to do) but also the inferences we draw
from vocabulary.
Such a system would need to provide alternative
representations not only for accidental one-off homonyms, such as “bank as raised ground” versus “bank as financial institution”,
but also for many systematic meaning distinctions running through large sets of
words. Ann Copestake and Ted
Briscoe’s example is “The south side of Cambridge voted Conservative”, where side out of context means a
place but in context means the people who live there – and this is not an
alternation specific to that individual word, but would apply equally to other
ways of referring to inhabited areas.
Many issues arise about how far such facts can be captured by general
rules, how the rules should be expressed, and so forth.
The collection evidently originated as a special issue of
a journal – we are not told which journal, but various clues betray the fact,
most obviously Geoffrey Nunberg’s acknowledgment of the comments of “two anonymous
referees for this journal” – which the publisher has reissued in book form in
hopes of selling it twice to hard-pressed libraries. (Time was when Oxford University Press was above that sort
of thing … )
The enterprise represented in this book strikes me as
severely problematic, for reasons that I shall illustrate from Nicholas Asher
and Alex Lascarides’s contribution “Lexical disambiguation in a discourse
context” (though I could have made the same point from other chapters).
The body of Asher and Lascarides’s chapter is formidably
technical. Try the following, for
instance (to keep things simple for this Journal’s typesetter, I shall write “X.Y” for X subscripted with Y):
We can build a general multi-model for
the language of feature structures L.fs (Blackburn 1992), and the language of
CE L.> (where > is the default conditional connective): viz. a model for L.(fs, >). Nevertheless – and this is important – our formulation
countenances no interaction between the CE logic of > (and a fortiori the nonmonotonic
consequence relation $) on the one hand, and Blackburn’s modal operators on the
other. So from the perspective of
DICE, we can translate L.(fs, >) into L.> by treating each
feature structure description and statement about subsumption relations, as
atomic formulae of L.>.
“CE” stands for Commonsense Entailment and “DICE” for
Discourse in Commonsense Entailment, which are formal systems that have been
briefly introduced on the previous page, though they are never described in
detail. I have substituted a dollar
sign for a symbol of Asher and Lascarides’s that I have never seen before and
cannot reproduce on my word processor, consisting of a vertical bar with two
wavy horizontals to its right.
Now, before launching into these technicalities, Asher
and Lascarides have introduced the problems they aim to address by discussing
examples. Their main example concerns
disambiguation of the word bar in the following text:
The judge asked where the defendant
was. The barrister apologised, and
said he was at the pub across the street.
The court bailiff found him slumped underneath the bar.
Asher and Lascarides explain that the legal references,
including the word barrister, would in isolation suggest that bar referred to “the courtroom
bar”, but the text taken as a whole forces bar to be taken “in its pub sense”. They discuss this initially for four
pages, and frequently revert to the example later.
The trouble with this is that, in the first place, there
is no object in a court of law called a bar. The phrase “prisoner at the bar” may have had a literal use
centuries ago, but in modern times it is merely a picturesque but empty
phrase. The “bar” to which a
barrister is notionally called is not located in a court but in a place where
legal professionals meet for training and social purposes. Furthermore, in a drinking career
spanning several decades I have yet to visit a pub where there would be any
easy way in which a drinker could slump underneath the bar. A pub bar normally presents a solid
vertical face to the paying customers.
In other words, the authors’ strenuous mathematical
formalisms are erected on the basis of concrete linguistic examples that do not
begin to work. Now, of course,
this need not be a fatal criticism.
The general point that Asher and Lascarides try to make about bar might equally have been
made using dozens of alternative examples which would have worked perfectly
well. Nevertheless, the contrast
between the density of the formal analysis and the superficial nature of the
linguistic example strikes me as indicative. The contributors to this book (not just Asher and Lascarides)
are so keen to launch into detailed mathematical analysis that they do not give
themselves time to stand back and ask themselves whether human language is
really like that at all.
Two of the most influential philosophical works of the
twentieth century are Ludwig Wittgenstein’s Philosophical Investigations, and Willard van Orman
Quine’s “Two dogmas of empiricism”, published in 1953 and 1951
respectively. In the
English-speaking world at least, what they say was for decades broadly accepted
as correct; and both imply that the enterprise of Lexical Semantics is doomed. There is a good reason why traditional
logic fails to address word meanings: that area of human linguistic behaviour
is too underdetermined and too subject to unpredictably creative thought to be
reduced to formal rules. Less
attention is paid to Wittgenstein or to Quine nowadays than used to be the
case, but so far as I am aware no-one has put forward new reasons to reject
this point of view. Certainly the
writers under review do not do so; they just bypass it.
In the 1990s, there are strong pressures on professional
academics to involve themselves in research projects of the kind that cost
money to carry out and generate tangible “deliverable output” in the shape of
computer software, technical theorizing, or the like. It has become difficult to base a career on the kind of
intellectual hygiene that was in fashion in British universities in the postwar
decades, when the goal of education was seen largely as cultivating clear
thinking. The writers under review
identify many true facts about individual points of usage in English and other
languages; and their assumption is that by assembling enough facts of this
sort, and by being sufficiently clever in fitting them together, they can eventually
achieve a computational system that simulates a significant part (at least) of
human language understanding. I
believe they are unconsciously closing their eyes to the reasons why that can
never happen, because it would be too inconvenient to acknowledge them.
Along the way, some of them do come up with interesting
points. In particular, Geoffrey
Nunberg in his chapter “Transfers of meaning” discusses cases where grammar
requires elements to co-refer, but sense seems to make them refer to
contrasting categories – his (to my mind rather awkward) example is “Ringo
squeezed himself into a narrow space”, with himself understood as the car
Ringo is driving. According to
Nunberg, the facts of usage imply that such examples are never what they
seem: himself really does refer to Ringo
the man, and the word that must be interpreted non-straightforwardly, in the
car-parking scenario, is squeezed.
But it is notable that this point of Nunberg’s is made in
plain English, without resort to formalism. That makes it quite unusual in this book, which employs one
of the largest ranges of formal notations I have ever seen between a single
pair of covers. Often, some
unusual sign crops up on just one or two pages, never to be seen again. Realistically, no outsider is likely to
be able to follow the details. But
this book is perhaps intended more for display than enlightenment.