Geoffrey Sampson

The Computational Analysis of English:
a corpus-based approach

edited by Roger Garside, Geoffrey Leech, and Geoffrey Sampson

Introduces a range of techniques for automatically analysing real-life English text and extracting useful information from it, based on statistical data derived from large machine-readable collections of language samples, or corpora.

Some critical comment:

It is a great relief to read a book like this, which is based on real texts rather than upon the imaginary language, sharing a few word forms with English, that is studied at MIT and some other research institutes ... I heartily recommend this book ... [It] is a testimony to the superiority of experience over fantasy.

— Michael Lesk (Manager, Computer Science Research Division, Bell Communications Research) in Computational Linguistics

Their success, which is considerable ... is largely due to their willingness to develop modes of analysis which are neither traditional nor particularly fashionable in today’s linguistics.

— W. Nelson Francis in Language

These comments are taken from book reviews published at the end of the 1980s, shortly after the book appeared. Probably no-one would describe our techniques as “unfashionable” today; computational linguistics has embraced the empirical, statistics-based approaches outlined in our book. But in the 1980s those methods were heresy, and this book was one significant factor in changing minds.

208 pp.

Published by Longman, 1987. New or used copies available via relevant British or American Amazon pages.

ISBN 0-608-03585-8

last changed 21 Jan 2025
