Corpus linguistics means research based on large machine-readable samples (“corpora”) of real-life written or spoken language usage. This is a branch of linguistics and of computational natural-language processing that was very much a minority hobby fifteen or twenty years ago. It has now widened out to become a major focus, both for advancing our intellectual understanding of human language and for developing economically valuable language engineering systems. This anthology reprints a collection of key articles in the field.
Many people are finding themselves newly drawn into corpus linguistics activities without much background knowledge of where the subject has come from, or what its overall shape is. In particular, people with a humanities background are often uncomfortable with more technical aspects of the subject, while researchers who are essentially computer scientists may know very little of the traditional linguistic ideas which underlie their work and which, often, justify the research projects that employ them.
Because the field was a minority interest until recently, classic papers which would help newcomers to “read themselves in” have tended to appear in obscure, hard-to-get-hold-of sources. By reprinting 42 key articles (with dates of first appearance ranging from 1952 to 2002), our book addresses this problem. In particular, we aim to give readers from both arts-based and technical backgrounds an accessible introduction to the other side of the subject. As well as a general introductory chapter, we have provided each article with an editorial introduction, putting it into context for the benefit of newcomers to the field.
We also include a leavening of papers that are beguiling as well as instructive. We hope our book may, among other things, help academics to “sell” corpus linguistics to their students.
The text of the original articles has been completely re-set for this collection, and tables and graphics have been professionally re-drawn (from sometimes crudely reproduced originals) to a common and high visual standard. This collection may often be the most convenient place to consult the papers included, even for those who have access to earlier editions.
By now, Corpus Linguistics is a recognized textbook on university courses in places as distant as California and France.
Some critical comment:
- easily accessible and thoroughly rewarding to read … This excellent book should be required reading for students and teachers involved in corpus-based research … an impressive volume
  Jonathan Clenton (Osaka University), on The LINGUIST List
- an ideal source … Beyond the selection of papers, the “value added” material in this collection is uniformly helpful and well done … a wonderful addition to the currently available textbooks on corpus linguistics
  Robert Malouf (San Diego State University), in Computational Linguistics
- an extremely valuable resource to own, not only for corpus linguists as reference, but also for those newly interested in the area to understand the wider field
  Ute Knoch (University of Auckland), on The LINGUIST List
- Your book is a source of inspiration for my students
  Geoffrey Williams, Université de Bretagne-Sud
- a volume to be highly recommended
  Milica Gačić (University of Zagreb), in Corpus Linguistics and Linguistic Theory
- a diverse yet accessible collection
  Jack Grieve (Northern Arizona University), in Corpora
Corpus Linguistics: readings in a widening discipline is published by Continuum, now an imprint of Bloomsbury Publishing, of London, Sydney, New York, and New Delhi. New or used copies are available via the British or American Amazon pages.
xv + 524 pp., 2004. ISBN (hardback) 0-8264-6013-5; (paperback) 0-8264-8803-X; also available as PDF e-book.
last changed 7 Dec 2020