1 A linguistic axiom challenged
This book responds to the fact that an idea which ranked for many decades as an unquestioned truism of linguistics is now coming under attack from many different directions. This introductory chapter sketches the history of the axiom, and goes on to draw together some of the diverse ways in which it is now under challenge.
For much of the twentieth century, linguistics was strongly attached to a principle of invariance of language complexity as one of its bedrock assumptions – and not just the kind of assumption that lurks tacitly in the background, so that researchers are barely conscious of it, but one that linguists were very given to insisting on explicitly. Yet it never seemed to be an assumption that linguists offered much justification for – they appeared to believe it because they wanted to believe it, much more than because they had reasons for believing it. Then, quite suddenly at the beginning of the new century, a range of researchers began challenging the assumption, though these challenges were so diverse that people who became aware of one of them would not necessarily link it in their minds with the others, and might not even be aware of the other challenges.
Linguists and non-linguists alike agree in seeing human language as the clearest mirror we have of the activities of the human mind, and as a specially important component of human culture, because it underpins most of the other components. Thus, if there is serious disagreement about whether language complexity is a universal constant or an evolving variable, that is surely a question which merits careful scrutiny. There cannot be many current topics of academic debate which have greater general human importance than this one.
2 Complexity invariance in early twentieth-century linguistics
When I first studied linguistics, in the early 1960s, Noam Chomsky’s name was mentioned, but the mainstream subject as my fellow students and I encountered it was the “descriptivist” tradition that had been inaugurated early in the twentieth century by Franz Boas and Leonard Bloomfield; it was exemplified by the papers collected in Martin Joos’s anthology Readings in Linguistics, which contained Joos’s famous summary of that tradition as holding “that languages can differ from each other without limit and in unpredictable ways” (Joos 1957: 96).
Most fundamental assumptions about language changed completely as between the descriptive linguistics of the first two-thirds of the twentieth century and the generative linguistics which became influential from the mid-1960s onwards. For the descriptivists, languages were part of human cultures; for the generativists, language is part of human biology – as Chomsky (1980: 134) put it, “we do not really learn language; rather, grammar grows in the mind”. The descriptivists thought that languages could differ from one another in any and every way, as Martin Joos said; the generativists see human beings as all speaking in essence the same language, though with minor local dialect differences (Chomsky 1991: 26). But the invariance of language complexity is an exception: this assumption is common to both descriptivists and generativists.
The clearest statement that I have found occurred in Charles Hockett’s influential 1958 textbook A Course in Modern Linguistics:
… impressionistically it would seem that the total grammatical complexity of any language, counting both morphology and syntax, is about the same as that of any other. This is not surprising, since all languages have about equally complex jobs to do, and what is not done morphologically has to be done syntactically. Fox, with a more complex morphology than English, thus ought to have a somewhat simpler syntax; and this is the case. (Hockett 1958: 180–1)
Although Hockett used the word “impressionistically” to avoid seeming to claim that he could pin precise figures on the language features he was quantifying, notice how strong his claim is. Like Joos, Hockett believed that languages could differ extensively with respect to particular subsystems: e.g. Fox has complex morphology, English has simple morphology. But when one adds together the complexity derived from the separate subsystems, and if one can find some way to replace “impressions” with concrete numbers, Hockett’s claim is that the overall total will always come out about the same.
Hockett justifies his claim by saying that “languages have about equally complex jobs to do”, but it seems to me very difficult to define the job which grammar does in a way that is specific enough to imply any particular prediction about grammatical complexity. If it really were so that languages varied greatly in the complexity of subsystem X, varied greatly in the complexity of subsystem Y, and so on, yet for all languages the totals from the separate subsystems added together could be shown to come out the same, then I would not agree with Hockett in finding this unsurprising. To me it would feel almost like magic.
3 Apparent counterexamples
And at an intuitive level it is hard to accept that the totals do always come out roughly the same. Consider for instance Archi, spoken by a thousand people in one village 2300 metres above sea level in the Caucasus. According to Aleksandr Kibrik (1998), an Archi verb can inflect into any of about 1.5 million contrasting forms. English is said to be simple morphologically but more complex syntactically than some languages; but how much syntactic complexity would it take to balance the morphology of Archi? – and does English truly have that complex a syntax? Relative to some languages I know, English syntax as well as English morphology seems to be on the simple side.
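A figure like Kibrik's becomes less mysterious once one remembers that paradigm size grows multiplicatively when inflectional categories cross-classify. The following is purely illustrative: the category names and sizes are hypothetical, chosen only to show how a total on the order of 1.5 million forms can arise from individually modest categories, not to describe the actual structure of Archi.

```python
# Paradigm size is the product of the sizes of independent, cross-classifying
# inflectional categories. All names and numbers below are hypothetical.
from math import prod

categories = {
    "stem alternations": 10,          # hypothetical
    "tense/aspect/mood": 30,          # hypothetical
    "agreement forms": 16,            # hypothetical
    "polarity/evidential marking": 8, # hypothetical
    "non-finite/derived forms": 40,   # hypothetical
}

total_forms = prod(categories.values())
print(f"{total_forms:,} contrasting forms")  # 1,536,000 with these toy figures
```

The point of the sketch is simply that five categories of quite ordinary size already yield a paradigm in the millions; nothing exotic is needed beyond cross-classification.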
Or consider the Latin of classical poetry, which is presumably a variety of language as valid for linguists’ consideration as any other. Not only did classical Latin in general have a far richer morphology than any modern Western European language – but, in poetry, one had the extraordinary situation whereby the words of a sentence could be permuted almost randomly out of their logical hierarchy, so that Horace could write a verse such as (Odes Book I, XXII):
namque me silva lupus in Sabina,
dum meam canto Lalagen et ultra
terminum curis vagor expeditis,
which means something like:
for a wolf flees from me while, unarmed, I sing about my Lalage and wander, cares banished, in the Sabine woods beyond my boundary
but expresses it in the sequence:
for me woods a wolf in the Sabine, while my I sing about Lalage and beyond my boundary cares wander banished, flees from unarmed
We know, of course, that languages with extensive case marking, such as Latin, tend to be free word order languages. But it seems to me that what linguists normally mean by “free word order” is really free constituent order – the constituents of a logical unit at any level can appear in any order, when case-endings show their respective roles within the higher-level unit, but the expectation is that (with limited exceptions) the logical units will be kept together as physical units. In Latin poetry that expectation is comprehensively flouted, creating highly complex relationships between the physical text and the sense-structure it encodes.
4 Ideological motives
For the descriptivist school, I believe the assumption of invariance of total language complexity as between different languages was motivated largely by ideological considerations. The reason why linguistics was worth studying, for many descriptivists, was that it helped to demonstrate that “all men are brothers” – Mankind is not divided into a group of civilized nations or races who think and speak in subtle and complex ways, and another group of primitive societies with crudely simple languages. They assumed that more complex languages would be more admirable languages, and they wanted to urge that the unwritten languages of the third world were fully as complex as the well-known European languages, indeed the third world languages often contained subtleties of expression having no parallel in European languages.
One point worth making about this is that it is not obvious that “more complex” should be equated with “more admirable”. We all know the acronym KISS, “Keep it simple, stupid”, implying that the best systems are often the simplest rather than the most complex ones. Whoever invented that acronym probably was not thinking about language structure, but arguably the principle should apply as much to our domain as to any other. We English-speakers do not seem to spend much time yearning to use languages more like Archi or poetic Latin.
Still, it is certainly true that the popular layman’s idea of language diversity was that primitive third-world tribes had crude, simple languages, whereas the languages of Europe were precision instruments. The descriptivists were anxious to urge that this equation just does not hold – unwritten third-world grammars often contain highly sophisticated structural features; and they were surely correct in making that point.
It is worth adding that the founders of the descriptivist school seem often to have been subtler about this than those who came after them, such as Hockett. Edward Sapir, for instance, did not say that all languages were equally complex – he did not believe they were, but he believed that there was no correlation between language complexity and level of civilization:
Both simple and complex types of language … may be found spoken at any desired level of cultural advance. When it comes to linguistic form, Plato walks with the Macedonian swineherd, Confucius with the head-hunting savages of Assam. (Sapir 1963: 219)
Franz Boas was more knowledgeable about detailed interrelationships between language, race, and culture than later descriptivist linguists, who tended to focus on language alone in a more specialized way, and Boas was responsible for the classic discussion of how third-world languages sometimes enforced precision about intellectual categories that European languages leave vague:
In Kwakiutl this sentence [the man is sick] would have to be rendered by an expression which would mean, in the vaguest possible form that could be given to it, definite man near him invisible sick near him invisible. … in case the speaker had not seen the sick person himself, he would have to express whether he knows it by hearsay or by evidence that the person is sick, or whether he has dreamed it. (Boas 1966: 39)
But Boas did not claim, as later linguists often have done, that there was no overall difference in the intellectual nature of the ideas expressed in the languages of culturally primitive and culturally advanced societies. Boas believed that there were important differences, notably that the languages of primitive cultures tended to lack expressions for abstract ideas and to focus almost exclusively on concrete things, but he wanted to say that this was because life in a primitive society gave people little practical need to think about abstractions, and whenever the need did arise, languages would quickly be adapted accordingly:
Primitive man … is not in the habit of discussing abstract ideas. His interests center around the occupations of his daily life … (op. cit.: 60)
… the language alone would not prevent a people from advancing to more generalized forms of thinking if the general state of their culture should require expression of such thought; … the language would be moulded rather by the cultural state. (op. cit.: 63)
5 Complexity invariance and generative linguistics
If we come forward to the generative linguistics of the last forty-odd years, we find that linguists are no longer interested in discussing whether language structure reflects a society’s cultural level, because generative linguists do not see language structure as an aspect of human culture. Except for minor details, they believe it is determined by human biology, and in consequence there is no question of some languages being structurally more complex than other languages – in essence they are all structurally identical to one another. Of course there are some parameters which are set differently in different languages: adjectives precede nouns in English but follow nouns in French. But choice of parameter settings is a matter of detail that specifies how an individual language expresses a universal range of logical distinctions – it does not allow for languages to differ from one another with respect to the overall complexity of the set of distinctions expressed, and it does not admit any historical evolution with respect to that complexity. According to Ray Jackendoff (1993: 32), “the earliest written documents already display the full expressive variety and grammatical complexity of modern languages”.
I have good reason to know that the generativists are attached to that assumption, after experiencing the reception that was given to my 1997 book Educating Eve, written as an answer to Steven Pinker’s The Language Instinct. Pinker deployed many different arguments to persuade his readers of the reality of a biologically inbuilt “language instinct”, and my book set out to refute each argument separately. I thought my book might be somewhat controversial – I intended it to be controversial; but I had not foreseen the extent to which the controversy would focus on one particular point, which was not specially central either in my book or in Pinker’s. I argued against Ray Jackendoff’s claim just quoted by referring to Walter Ong’s (1982) discussion of the way that Biblical Hebrew made strikingly little use of subordinate clauses, and also to Eduard Hermann’s belief (Hermann 1895) that Proto-Indo-European probably contained no clause subordination at all.
On the electronic Linguist List, this led to a furore. From the tone of some of what was written, it was clear that some present-day linguists did not just factually disagree with Ong and Hermann, but passionately rejected any possibility that their ideas might be worth considering.
This did not seem to be explainable in terms of politically-correct ideology – people who use words like “imperialism” or “racism” to condemn any suggestion that some present-day human societies may be less sophisticated than others are hardly likely to feel a need to defend the Jews of almost 3000 years ago, and still less the mysterious Indo-Europeans even further back in the past: those peoples cannot now be touched by twenty-first-century neo-imperialism. So it seemed that the generative doctrine of innate knowledge of language creates an intellectual outlook within which it becomes just unthinkable that overall complexity could vary significantly between language and language, or from period to period of the historical record. The innate cognitive machinery which is central to the generative concept of language competence is taken to be too comprehensive to leave room for significant differences with respect to complexity.
6 “Some issues on which linguists can agree”
Summing up, the idea that languages are similar in degree of complexity has been common ground between linguists of very different theoretical persuasions. In 1981 Richard Hudson made a well-known attempt to identify things that linguists all agreed about, and his list turned out to be sufficiently uncontentious that the updated version of it is nowadays promulgated as “Some issues on which linguists can agree” on the website of the UK Higher Education Academy. Item 2.2d on the list runs:
There is no evidence that normal human languages differ greatly in the complexity of their rules, or that there are any languages that are “primitive” in the size of their vocabulary (or any other part of their language [sic]), however “primitive” their speakers may be from a cultural point of view. (The term “normal human language” is meant to exclude on the one hand artificial languages such as Esperanto or computer languages, and on the other hand languages which are not used as the primary means of communication within any community, notably pidgin languages. Such languages may be simpler than normal human languages, though this is not necessarily so.)
Echoing Hockett’s comparison of English with Fox, Hudson’s item 3.4b states that “Although English has little inflectional morphology, it has a complex syntax …”
This axiom became so well established within linguistics that writers on other subjects have treated it as a reliable premiss. For instance, Matt Ridley’s popular book about recent genetic discoveries, Genome, illustrates the concept of instinct with a chapter uncritically retailing the gospel according to Noam Chomsky and Steven Pinker about language, and stating as an uncontentious fact that “All human people speak languages of comparable grammatical complexity, even those isolated in the highlands of New Guinea since the Stone Age” (Ridley 1999: 94). Or, at a more academic level, the philosopher Stephen Laurence in a discussion of the relationship between language and thought writes:
there seems to be good evidence that language isn’t just [sic] a cultural artefact or human “invention”. For example, there is no known correlation between the existence or complexity of language with cultural development, though we would expect there would be if language were a cultural artefact (Laurence 1998: 211)
– Laurence cites Pinker as his authority.
7 Complexity invariance over individuals’ lifespans
So far I have discussed the existence or non-existence of complexity differences between different languages: linguists either believe that no sizeable differences exist, or that what differences do exist do not correlate with other cultural features of the respective societies. That is not the only sense in which linguists have believed in complexity invariance, though. One can also hold that an individual’s language remains constant in complexity during his or her lifetime.
If the period considered were to include early childhood, that would be an absurd position: everyone knows that small children have to move through stages of single-word utterances and then brief phrases before they begin to use their mother tongue in the complex ways characteristic of adults. But the generative school believe that the language-acquisition period of an individual’s life is sharply separate from the period when the individual has become a mature speaker, and that in that mature period the speaker remains in a linguistic “steady state”. According to Chomsky:
children proceed through a series of cognitive states … [terminating in] a “steady state” attained fairly early in life and not changing in significant respects from that point on … Attainment of a steady state at some not-too-delayed stage of intellectual development is presumably characteristic of “learning” within the range of cognitive capacity. (Chomsky 1976: 119)
Chomsky’s word “presumably” seems to imply that he was adopting the steady-state hypothesis because he saw it as self-evidently plausible. Yet surely, in the sphere of day-to-day common-sense social discussion, we are all familiar with the opposite point of view: people say “you never stop learning, do you” as a truism.
And this is a case where the generative position on complexity invariance goes well beyond what was believed by their descriptivist predecessors. Leonard Bloomfield, for instance, held a view which matched the common-sense idea rather than Chomsky’s steady-state idea:
there is no hour or day when we can say that a person has finished learning to speak, but, rather, to the end of his life, the speaker keeps on doing the very things which make up infantile language-learning. (Bloomfield 1933: 46)
8 Complexity invariance between individuals
There is a third sense in which language complexity might or might not be invariant. Just as one can compare complexity between the languages of separate societies, and between the languages of different stages of an individual’s life, one can also compare complexity as between the idiolects spoken by different members of a single speech-community. I am not sure that the descriptivists discussed this issue much, but the generativists have sometimes explicitly treated complexity invariance as axiomatic at this level also:
[Grammar acquisition] is essentially independent of intelligence … We know that the grammars that are in fact constructed vary only slightly among speakers of the same language, despite wide variations not only in intelligence but also in the conditions under which language is acquired. (Chomsky 1968: 68–9)
Chomsky goes on to say that speakers may differ in “ability to use language”, that is, their “performance” levels may differ, but he holds that their underlying “competence” is essentially uniform. Later, he wrote:
[Unlike school subjects such as physics] Grammar and common sense are acquired by virtually everyone, effortlessly, rapidly, in a uniform manner … To a very good first approximation, individuals are indistinguishable (apart from gross deficits and abnormalities) in their ability to acquire grammar … individuals of a given community each acquire a cognitive structure that is … essentially the same as the systems acquired by others. Knowledge of physics, on the other hand, is acquired selectively … It is not quickly and uniformly attained as a steady state … (Chomsky 1976: 144)
In most areas of human culture, we take for granted that some individual members of society will acquire deeper and richer patterns of ability than others, whether because of differences in native capacity, different learning opportunities, different tastes and interests, or all of these. Language, though, is taken to be different: unless we suffer from some gross disability such as profound deafness, then (according to Chomsky and other generativists) irrespective of degrees of native wit and particular tastes and formative backgrounds we grow up essentially indistinguishable with respect to linguistic competence.
Admittedly, on at least one occasion Chomsky retreated fairly dramatically from this assumption of uniformity between individuals. Pressed on the point by Seymour Papert at a 1975 conference, Chomsky made remarks which seemed flatly to contradict the quotation given above (Piattelli-Palmarini 1980: 175–6). But the consensus among generative linguists has been for linguistic uniformity between individuals, and few generativists have noticed or given any weight to what Chomsky said to Papert on a single occasion in 1975. When Ngoni Chipere decided to work on native-speaker variations in syntactic competence for his MA thesis, he tells us that his supervisor warned him that it was a bad career move if he hoped to get published, and he found that “a strong mixture of ideology and theoretical objections has created a powerful taboo on the subject of individual differences”, so that “experimental evidence of such differences has been effectively ignored for the last forty years” (Chipere 2003: xv, 5).
9 Diverse challenges to the axiom
Now let us consider some of the ways in which the consensus has recently been challenged. I cannot pretend to survey developments comprehensively, and many of the challengers will be speaking for themselves in later chapters.
But it is worth listing some of the publications which first showed that the constant-complexity axiom was no longer axiomatic, in order to demonstrate that it is not just one facet of this consensus that is now being questioned or contradicted: every facet is being challenged simultaneously. (For further references that I have no room to discuss here, see Karlsson et al. 2008.)
10 Complexity growth in language history
For me, as it happened, the first item that showed me that my private doubts about complexity invariance were not just an eccentric personal heresy but a topic open to serious, painstaking scholarly research was Guy Deutscher’s year-2000 book Syntactic Change in Akkadian. Akkadian is one of the earliest languages to have been reduced to writing, and Deutscher claims that if one looks at the earliest recorded stages of Akkadian one finds a complete absence of finite complement clauses. What’s more, this is not just a matter of the surviving records happening not to include examples of recursive structures that did exist in speech; Deutscher shows that if we inspect the 2000-year history of Akkadian, we see complement clauses gradually developing out of simpler, non-recursive structures which did exist in the early records. And Deutscher argues that this development was visibly a response to new communicative needs arising in Babylonian society.
That is as direct a refutation as there could be of Ray Jackendoff’s statement that “the earliest written documents already display the full expressive variety and grammatical complexity of modern languages”. I do not know what evidence Jackendoff thought he had for his claim; he quoted none. It seemed to me that we had an aprioristic ideological position from Jackendoff being contradicted by a position based on hard, detailed evidence from Deutscher.
This was particularly striking to me because of the Linguist List controversy in which I had found myself embroiled. I had quoted other writers on Biblical Hebrew and Proto-Indo-European, and I was not qualified to pursue the controversy from my own knowledge, because I have not got the necessary expertise in those languages. One man who knew Biblical Hebrew well told me that Walter Ong, and therefore I also, had exaggerated the extent to which that language lacked clause subordination. But I have not encountered anyone querying the solidity of Deutscher’s analysis of Akkadian.
If the complexity invariance axiom is understood as absolutely as Stephen Laurence expressed it in the passage I quoted in section 6 above, it was never tenable in the first place. For instance, even the mainstream generative linguists Brent Berlin and Paul Kay, in their well-known cross-linguistic study of colour vocabularies, recognized that “languages add basic color terms as the peoples who speak them become technologically and culturally more complex” (Berlin and Kay 1969: 150), in other words there is a correlation between this aspect of language complexity and cultural development. But vocabulary size was not seen by linguists as an ideologically-charged aspect of language complexity. Grammatical structure was, so Deutscher was contradicting a significant article of faith.
The way that linguists were muddling evidence with ideology was underlined by a book that appeared in English just before the turn of the century: Louis-Jean Calvet’s Language Wars and Linguistic Politics (a 1997 translation of a French original published in 1987). You could not find a doughtier enemy of “linguistic imperialism” than Calvet; but, coming from a separate national background, his conception of what anti-imperialism requires us to believe about language structure turned out to differ from the consensus among English-speaking linguists. One passage (chapter 9) in his book, based on a 1983 doctoral dissertation by Elisabeth Michenot, discusses the South American language Quechua, and how it is being deformed through the influence of Spanish. According to Calvet, the Quechua of rural areas, free from Spanish contamination, has very little clause subordination; the “official” Quechua of the towns has adopted a richly recursive system of clause subordination similar to Spanish or English.
Many American or British linguists might take Calvet’s statement about rural Quechua as a shocking suggestion that this peasant society uses a crudely simple language; they would want to insist that rural Quechua-speakers have recursive structures at their disposal even if they rarely use them. For Calvet, the scenario was a case of domineering European culture deforming the structural properties which can still be observed in Quechua where it is free from imperialist contamination. I do not doubt the sincerity of any of these linguists’ ideological commitments, but surely it is clear that we need to establish the facts about variation in structural complexity, before it is reasonable to move on to debating their ideological implications?
11 Differences among individuals’ levels of linguistic competence
I have already mentioned Ngoni Chipere’s 2003 book Understanding Complex Sentences. Chipere sets out from a finding by the scholar who was his MA supervisor, Ewa Dąbrowska (1997), that adult members of a speech community differ in their ability to deal with syntactic complexity. Chipere quotes the English example:
The doctor knows that the fact that taking care of himself is necessary surprises Tom.
The grammatical structure here is moderately complicated, but any generative grammarian would unquestionably agree that it is a well-formed example of English. Indeed, the grammar is less tangled than plenty of prose which is in everyday use in written English. But Dąbrowska found that native speakers’ ability to understand examples like this varied fairly dramatically.
When she asked participants in her experiment to answer simple comprehension questions … she found that university lecturers performed better than undergraduates, who, in turn, performed better than cleaners and porters, most of whom completely failed to answer the questions correctly. (Chipere 2003: 2)
What is more, when Chipere carried out similar experiments, he
found, unexpectedly, that post-graduate students who were not native speakers of English performed better than native English cleaners and porters and, in some respects, even performed better than native English post-graduates. Presumably this was because non-native speakers actually learn the grammatical rules of English, whereas explicit grammatical instruction has not been considered necessary for native speakers … (Chipere 2003: 3)
If our mother tongue is simply part of our culture, which we learn using the same general ability to learn complicated things that we use to learn to play chess or keep accounts or program computers, then it is entirely unsurprising that brighter individuals learn their mother tongue rather better than individuals who are less intelligent, and even that people who go through explicit language training may learn a language better than individuals who are born into the relevant speech community and just pick their mother tongue up as they go along.
The linguistic consensus has been that mother-tongue acquisition is a separate issue. According to the generativists, “we do not really learn language”, using general learning abilities; “rather, grammar grows in the mind”. Descriptivist linguists did not believe that, but they did see native-speaker mastery as the definitive measure of language ability. For many twentieth-century linguists, descriptivist or generativist, I believe it would simply not have made sense to suggest that a non-native-speaker might be better at a language than a native speaker. Native-speaker ability was the standard; to say that someone knew a language better than its native speakers would be like saying that one had a ruler which was more precisely one metre long than the standard metre, in the days when that was an actual metal bar.
However, in connection with syntactic complexity there are quite natural ways to quantify language ability independently of the particular level of ability possessed by particular native speakers, and in those terms it is evidently not axiomatic that mother-tongue speakers are better at their own languages than everyone else.
12 Complexity growth during individuals’ lifetimes
The minor contribution I made myself to this process of challenging the linguistic consensus related to the idea of complexity invariance over individuals’ lifetimes. It concerned what some are calling structural complexity as opposed to system complexity: complexity not in the sense of richness of the set of rules defining a speaker’s language, but in the sense of how many cycles of recursion speakers produce when the rules they use are recursive.
The British National Corpus contains transcriptions of spontaneous speech by a cross-section of the British population; I took a subset and looked at how the average complexity of utterances, in terms of the incidence of clause subordination, related to the kinds of people the speakers were (Sampson 2001). It is well known that Basil Bernstein (1971) claimed to find a correlation between structural complexity and social class, though by present-day standards his evidence seems quite weak.
I did not find a correlation of that sort (the BNC data on social class are in any case not very reliable); but what I did find, to my considerable surprise, was a correlation with age.
We would expect small children to use simpler grammar than adults, and in the BNC data they did. But I had not expected to find that structural complexity goes on growing, long after the generativists’ putative steady state has set in. Forty-year-olds use, on average, more complex structures than thirty-year-olds. People over sixty use more complex structures than people in their forties and fifties. Before I looked at the data, I would confidently have predicted that we would not find anything like that.
13 New versus old and big versus small languages
Meanwhile, John McWhorter moved the debate about whether some languages are simpler than others in an interesting new direction, by arguing not only that new languages are simpler than old ones, but that big languages tend to be simpler than small ones.
Everybody agrees that pidgins are simpler than established, mother-tongue languages; but the consensus has been that once a pidgin acquires native speakers it turns into something different in kind, a creole, and creoles are said to be similar to any other languages in their inherent properties. Only their past history is “special”. McWhorter (2001a) proposed a metric for measuring language complexity, and he claimed that, in terms of his metric, creoles are commonly simpler than “old” languages. He came in for a lot of flak (e.g. DeGraff 2001) from people who objected to the suggestion that politically powerless groups might speak languages which in some sense lack the full sophistication of European languages.
But, perhaps even more interestingly, McWhorter has also argued (e.g. 2001b) that English, and other languages of large and successful civilizations, tend to be simpler than languages used by small communities and rarely learned by outsiders. Indeed, similar ideas were already being expressed by Peter Trudgill well before the turn of the century (e.g. Trudgill 1989). It possibly requires the insularity of a remote, impoverished village community to evolve and maintain the more baroque language structures that linguists have encountered. Perhaps Archi is not only a language of the high Caucasus but could only be a language of a place like that.
14 An unusually simple present-day language
Possibly the most transgressive work of all has been Dan Everett’s description of the Pirahã language of the southern Amazon basin (Everett 2005). Early Akkadian, according to Guy Deutscher, lacked complement clauses, but it did have some clause subordination. Pirahã in our own time, as Everett describes it, has no clause subordination at all, indeed it has no grammatical embedding of any kind, and in other ways too it sounds astonishingly crude as an intellectual medium. For instance, Pirahã has no quantifier words, comparable to all, some, or most in English; and it has no number words at all – even “one, two, many” is a stage of mathematical sophistication outside the ken of the Pirahã.
If the generativists were right to claim that human biology guarantees that all natural languages must be cut to a common pattern, then Pirahã as Dan Everett describes it surely could not be a natural human language. (It is reasonable to include the proviso about correctness – faced with such remarkable material, we must bear in mind that we are dealing with one fallible scholar’s interpretations.) Despite the oddity of their language, though, the Pirahã are certainly members of our species. The difference between Pirahã and better-known languages is a cultural difference, not a biological difference, and in the cultural sphere we expect some systems to be simpler than others. Nobody is surprised if symphonic music has richer structure than early mediaeval music, or if the present-day law of England requires many more books to define it than the Anglo-Saxon Common Law out of which it evolved. Cultural institutions standardly change in complexity over time, often growing more complex, sometimes becoming simpler. But we have not been accustomed to thinking that way about language.
15 Language “universals” as products of cultural influence
Finally, David Gil documents a point about isomorphism between exotic and European languages. Where non-Indo-European languages of distant cultures in our own time do seem to be structurally more or less isomorphic with European languages, this is not necessarily evidence for biological mechanisms which cause all human languages to conform to a common pattern. It may instead merely show that the “official” versions of languages in all parts of the world have nowadays been heavily remodelled under European influence.
Gil has published a series of comparisons (e.g. Gil 2005b) between the indigenous Indonesian dialect of the island of Riau and the formal Indonesian language used for written communication, which outsiders who study Indonesian are encouraged to see as the standard language of that country. Formal Indonesian does share many features of European languages which generative theory sees as necessary properties of any natural language. But colloquial Riau Indonesian does not: it lacks various characteristics that generativists describe as universal requirements. And Gil argues (if I interpret him correctly) that part of the explanation is that official, formal Indonesian is to some extent an artificial construct, created in order to mirror the logical properties of European languages.
Because the formal language has prestige, when a foreigner asks a Riau speaker how he expresses something, the answer is likely to be an example of formal Indonesian. But when the same speaker talks spontaneously, he will use the very different structural patterns of Riau Indonesian.
In modern political circumstances Gil’s argument is very plausible. Consider the opening sentence of the United Nations Charter, which in English contains 177 words, consisting of one main clause with three clauses at one degree of subordination, eight clauses at two degrees of subordination, four clauses at three degrees of subordination, and one clause at four degrees of subordination (square brackets delimit finite and angle brackets delimit non-finite clauses):
[We the peoples of the United Nations <determined <to save succeeding generations from the scourge of war, [which twice in our lifetime has brought untold sorrow to mankind]>, and <to reaffirm faith in fundamental human rights, in the dignity and worth of the human person, in the equal rights of men and women and of nations large and small>, and <to establish conditions [under which justice and respect for the obligations <arising from treaties and other sources of international law> can be maintained]>, and <to promote social progress and better standards of life in larger freedom>, and for these ends <to practise tolerance and live together in peace with one another as good neighbours>, and <to unite our strength <to maintain international peace and security>>, and <to ensure, by the acceptance of principles and the institution of methods, [that armed force shall not be used, save in the common interest]>, and <to employ international machinery for the promotion of the economic and social advancement of all peoples>>, have resolved <to combine our efforts><to accomplish these aims>].
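The depth counts just given can be checked mechanically from the bracket annotation: treating [ and < as opening a clause and ] and > as closing one, each clause’s degree of subordination is simply the nesting depth at which its opening bracket appears, with the main clause at degree zero. A minimal sketch of such a counter (the function name and the short example sentence are my own, purely illustrative):

```python
from collections import Counter

def clause_depths(annotated):
    """Count clauses at each degree of subordination in a
    bracket-annotated sentence, where [ ] delimit finite clauses
    and < > delimit non-finite ones.  The main clause is degree 0."""
    counts = Counter()
    depth = 0
    for ch in annotated:
        if ch in "[<":
            counts[depth] += 1   # a clause opens at the current depth
            depth += 1
        elif ch in "]>":
            depth -= 1
    return dict(counts)

# A short illustrative example (not from the Charter):
example = "[I know [that the man <standing there> left]]"
print(clause_depths(example))   # {0: 1, 1: 1, 2: 1}
```

Applied to the annotated Charter sentence above, this procedure reproduces the figures stated in the text: one main clause, three clauses at one degree of subordination, eight at two degrees, four at three degrees, and one at four degrees.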
Although the sentence was composed by speakers of modern European or European-derived languages (specifically, Afrikaans and English), it would translate readily enough into the Latin of 2000-odd years ago – which is no surprise, since formal usage in modern European languages has historically been heavily influenced by Latin models.
On the other hand, the early non-European language I know best is Old Chinese; so far as I can see, it would be quite impossible to come close to an equivalent of this sentence in that language (cf. Sampson 2006). Old Chinese did have some clause-subordination mechanisms, but they were extremely restricted by comparison with English (Pulleyblank 1995: e.g. 37, 148ff.). However, if the community of Old Chinese speakers were living in the 21st century, their leaders would find that it would not do to say “You can say that in your language, but you can’t say it in our language”. In order to survive as a society in the modern world they would have to change Old Chinese into a very different kind of language, in which translations were available for the UN Charter and for a great deal of other Western officialese.
And then, once this new language had been invented, generative linguists would come along and point to it as yet further corroboration of the idea that human beings share innate cognitive machinery which imposes a common structure on all natural languages. A large cultural shift, carried out in order to maintain a society’s position vis-à-vis more powerful Western societies, would be cited as proof that a central aspect of the society’s culture never was more than trivially different from Western models, and that it is biologically impossible for any human society to be more than trivially different with respect to cognitive structure. Obviously this scenario is purely hypothetical in the case of Old Chinese of 3000 years ago. But I believe essentially that process has been happening a lot with third-world languages in modern times.
That turns the generative belief that all languages are similar in structural complexity into something like a self-fulfilling prophecy. What European or North American linguists count as the “real language” of a distant part of the world will be the version of its language which has been remodelled in order to be similar to European languages. Because of the immense dominance nowadays of European-derived cultures, most or all countries will have that kind of version of their language available, and for a Western linguist who arrives at the airport and has to spend considerable time dealing with officialdom, that will be the version most accessible to study. It takes a lot of extra effort to penetrate to places like Riau, where language varieties are spoken that put the axiom of constant language complexity to a more searching test.
If we are concerned about the moral duty to respect alien cultures, what implies real lack of respect is this insistence on interpreting exotic cognitive systems as minor variations on a European theme (Sampson 2007) – not the recognition that languages can differ in many ways, including their degree of complexity.
16 A melting iceberg
The traditional consensus on linguistic complexity was accepted as wholly authoritative for much of the past century. Developments of the last few years, such as those surveyed above, suggest that it now finds itself in the situation of an iceberg that has floated into warm waters.
In the example, the logical constituency of the Latin original can be inferred straightforwardly from that of my English translation, except that inermem, “unarmed”, belongs in the original to the main rather than the subordinate clause – a more literal translation would run “… flees from unarmed me …”.
The word “just” in the first line of the Laurence quotation seems to imply that products of cultural evolution will be simpler than biological structures. This may well be a tacit assumption among proponents of the “language instinct” concept more generally, but if so it is a baseless one. The English legal system, for instance, with its common-law foundation overlain by massive corpora of statute and case law, its hierarchy of courts, its rules of evidence, arrangements for legal education, dress codes, and so forth is certainly a product of cultural rather than biological evolution, but no-one would describe it as simple.