The following online article has been derived mechanically from an MS produced on the way towards conventional print publication. Many details are likely to deviate from the print version; figures and footnotes may even be missing altogether, and where negotiation with journal editors has led to improvements in the published wording, these will not be reflected in this online version. Shortage of time makes it impossible for me to offer a more careful rendering. I hope that placing this imperfect version online may be useful to some readers, but they should note that the print version is definitive. I shall not let myself be held to the precise wording of an online version, where this differs from the print version.

Published in The Linguistic Review, vol. 19 (special issue, “A Review of ‘The Poverty of Stimulus Argument’”, edited by Nancy Ritter), pp. 73–104, 2002.




Exploring the richness of the stimulus[1]




Geoffrey Sampson


School of Cognitive and Computing Sciences

University of Sussex


Brighton BN1 9QH, England




1.  Introduction


To my mind it is both a common-sense assumption, and one supported by all the empirical evidence I am aware of, that the languages people grow up speaking reflect the structural properties of the language-samples they hear (or, later, read) produced by other members of their language community.  I do not understand what leads a number of linguists to doubt this.  Apart from some matters of detail, I fully agree with the points Pullum & Scholz are making; working as I do at a distance, both geographically and in terms of academic subject affiliation, from the community of “poverty of stimulus” theorists, I do not find it easy to see why it should be necessary to go to the lengths Pullum & Scholz do in order to demonstrate that the language-acquirer’s experience is as rich as the language acquired in response to it.


However, the choice of this topic for a special journal issue is good evidence that, in the present climate of opinion, such detailed demonstrations must be needed.  That being so, I can offer evidence which strengthens Pullum & Scholz’s exposition at points where they suggest that it is less robust than it might be.


The two most significant cases of alleged stimulus poverty discussed by Pullum & Scholz seem to be the ones they treat under the headings “auxiliary sequences” and “auxiliary-initial clauses” – the former being significant because Kimball (1973) expressed his claim about the corresponding “lacuna” in unusually explicit and seemingly empirical terms, the latter because it has been by far the most widely discussed case in the linguistic literature.


Discussing Kimball’s claim that complex auxiliary sequences are “vanishingly rare” in English, Pullum & Scholz give evidence that they are not rare in a set of texts they have examined.  But they describe their corpus as “in some ways unsuitable”.  Most of Pullum & Scholz’s texts are novels from the literary canon (their one spoken sample, from Ronald Reagan’s inaugural address as American President, may or may not have been scripted).  Sceptics might object (Janet Dean Fodor and Carrie Crowther in their contribution to this special issue do object) that the structures found in literary writing are not representative of those occurring in the speech heard by young children; in some respects this is certainly true.  On “auxiliary-initial clauses”, most of Pullum & Scholz’s evidence is taken from the text of the Wall Street Journal, which self-evidently is not representative of the language samples available to children during the early years of intensive language acquisition.


In the first part of this article, I study the incidence of both auxiliary sequences and what Pullum & Scholz call “auxiliary-initial clauses” in everyday, spontaneous speech, such as young children are routinely exposed to.  The corpus I used for this purpose was the 4.2 million word “demographically-sampled speech” section of the British National Corpus (“BNC/demographic”), which represents the spoken output of the UK population during the past decade.[2]  It was generated by supplying recording equipment to a set of individuals selected to be representative of the national population with respect to sex, age, social class, and region, and inviting them to record all speech events they took part in on two days, one a weekday and the other a Saturday or Sunday.  The bulk of the material is social conversation.


In later sections of the article I consider some related conceptual issues which Pullum & Scholz chose not to cover.



2.  The term “auxiliary verb”


Before proceeding to the substance of the analysis, let me get out of the way one difference of terminology between Pullum & Scholz and myself.  Pullum & Scholz use the term “auxiliary verb” in what is to me an unusual manner, so that for them an “auxiliary verb” can be the main and sole verb of a clause.  Their discussion of “auxiliary-initial clauses” relates to the English yes/no-question construction, whereby:


           if the first verb of a declarative main clause (whether or not it is the sole verb of the clause) is a modal or a form of BE, DO, or (in some cases) HAVE, it is moved to the beginning of the clause


           if it is another verb, say X, the verb DO, inflected to match X, is placed at the beginning, and X is reduced to its base form




(1)  a.  He is in the kitchen  ~  Is he in the kitchen?


        b.  James will have left by now  ~  Will James have left by now?


        c.  Millie shook his hand  ~  Did Millie shake his hand?


I would call a verb “auxiliary” only if it is followed by another verb, so I would not want to describe all the questions in (1) as “auxiliary-initial”.  I shall refer to questions constructed in this way as “verb-fronted” constructions.  The particular cases of verb-fronting which have been used in poverty-of-the-stimulus arguments are cases where, in the declarative (unfronted) order, the verb is preceded by a constituent which consists of or contains a finite subordinate clause – a “complex constituent”, I shall say.



3.  Auxiliary sequences


Kimball identified a particular kind of verb group – one containing all three of the auxiliary markings modal, perfect, and progressive – which he believed to be “vanishingly rare” in real-life usage, implying (for him) that it was remarkable that members of the English speech-community succeeded in acquiring the ability to produce such groups.


Conducting a rough-and-ready search of BNC/demographic for this type of verb group, I found 61 clear examples and three debatable cases.  A few of the clear cases are the following.  Here and below, extracts from BNC/demographic are cited with their three-character BNC filename followed by five-digit “s-unit” number.  In brackets I give (so far as the information is available) place[3] and date of utterance; five-character BNC speaker identifier, followed by speaker’s sex, age, occupation, and regional dialect;[4] and the context of the discourse from which the example is taken.


(2)   a.  somebody must have been overtaking {unclear}[5]  KBJ.01188

(Nottingham, Jan 1992; PS1DS, male, 47, area union organizer, Midlands dialect; driving to airport)


        b. ... it’s what he should’ve been doing  KDL.00812

(Woking, Surrey, Nov/Dec 1991; PS0P0, female, 44, doctor, SW England dialect; shopping)


        c.  I’d have been shitting myself!  KPG.04412

(Greater London,[6] date unknown; PS555, female, 14, schoolgirl, SE England dialect; conversation with friends)


        d. Well should have been really {pause} taking two of her strong tablets she says  KBC.04010

(Sale, Cheshire, Apr 1992; PS1AA, male, 61, teacher, Northern England dialect; at home, housework and cooking)


Pullum & Scholz express the frequency of the construction in their literary corpus as one per three to four thousand sentences, but sentences are not well-defined units in spontaneous speech.  To estimate what their figure means in terms of words, I counted the lengths of sentences drawn at random from three of the adult novels they quote, and from two novels for children.  The mean length was 12.5 words; if this is representative, Pullum & Scholz would be estimating instances of the auxiliary sequence in their literary corpus at one in every 40,000–50,000 words.
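For concreteness, the conversion can be written out as a trivial calculation.  (This is only a sketch: the 12.5-word mean is, as just explained, an estimate sampled from a handful of novels.)

```python
# Converting Pullum & Scholz's sentence-based frequency (one instance per
# three to four thousand sentences) into a word-based frequency, using the
# estimated mean sentence length of 12.5 words described in the text.
MEAN_SENTENCE_LENGTH = 12.5  # words per sentence (an estimate, not a given)

low = 3_000 * MEAN_SENTENCE_LENGTH    # words per instance, lower end
high = 4_000 * MEAN_SENTENCE_LENGTH   # words per instance, upper end
print(f"one instance per {low:,.0f}-{high:,.0f} words")
```

The result, roughly one instance per 40,000–50,000 words, is the figure used for comparison below.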


The figure of 61 clear cases in BNC/demographic is not much less – about one per 70,000 words.  And in any case my count is not an accurate assessment, but only a lower bound on the number of instances in BNC/demographic.  There are certainly more.  I manually checked each case listed by the program I used, so I know that each case counted is genuine (cases such as It would have been disgusting anyway KD0.03070, where disgusting is functioning as adjective rather than participle, were excluded).  But the program will have missed cases, for many reasons.  The British National Corpus is not a “clean” research resource.  The transcribers made many spelling mistakes, and in particular what is standardly written have or -’ve following a modal is very often transcribed as of (e.g. must of, should of – this is a common British spelling error).  The semi-automatic technique used to annotate BNC words with wordclass tags performed indifferently on the spoken material, one consequence being that enclitic -’d is often tagged as had in cases where it stands for would.  Either of these factors would have caused my program to miss relevant examples, and there are other problems.  (I have already stumbled on a 62nd example purely by chance, which my software missed because of the should of problem; unless this was an extraordinary stroke of luck, there must be many further examples buried amid the millions of words which my eyes have never scanned.)
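To illustrate the kind of normalization the should of problem calls for, here is a sketch in Python – my own illustration, not the software actually used on the corpus.  I deliberately omit can (and similar ambiguous forms) from the modal list, since a crude rule of this kind would otherwise “correct” phrases like a can of beans; this is one reason why automatically-retrieved cases must be checked by hand.

```python
import re

# Modals after which transcribed "of" (or "'ve") standardly stands for "have";
# "can" is omitted because of noun uses such as "a can of beans".
MODALS = r"(?:must|should|would|could|might|may|shall|will)"

def normalise(s):
    # Rewrite "modal + of" and "modal+'ve" as "modal + have", so that a
    # search for modal + perfect sequences does not miss misspelt cases.
    return re.sub(rf"\b({MODALS})(?:'ve| of)\b", r"\1 have", s,
                  flags=re.IGNORECASE)

print(normalise("somebody must of been overtaking"))
```

Even with such a rule in place, each retrieved case still needs manual vetting, for the reasons given above.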


The CHRISTINE Corpus, Stage I (“CHRISTINE/I”)[7] contains a 2% random sample of BNC/demographic which has been cleaned up with respect to issues like spelling mistakes and homograph identification, and equipped with annotations marking grammatical structure according to the very explicit and detailed SUSANNE scheme (Sampson 1995).  Under the SUSANNE scheme, verb groups (sequences of main verb possibly preceded by auxiliaries) are classified as containing or not containing each of six qualifiers:



            past tense


            modal (i.e. presence of one or another of the modal verbs)




            perfect (have + past participle)


            progressive (be + present participle)


            passive (be + past participle)



In connexion with other research (Sampson forthcoming) I have studied statistics on combinations of these six qualifiers within those CHRISTINE/I verb groups that occur in finite clauses, where neither the verb group itself nor the clause is a second or subsequent conjunct (in which case its constituency might be affected by Co-ordination Reduction), and excluding tag questions, cases where the verb group is split by Subject-Auxiliary Inversion, “semi-auxiliaries” (Quirk et al. 1985: §3.47), and cases where the main verb is “understood”. CHRISTINE/I contains 9430 verb groups meeting these criteria (I shall call them “primary verb groups”) in about 80,500 “full words” (excluding hesitation phenomena, etc.).  The incidence of various combinations of qualifiers within this data set is predicted fairly well by a simple model in which each of the six qualifiers has an independent probability of occurrence, and the probability of a combination is the product of the probabilities of the separate qualifiers forming the combination.  There does not seem to be a systematic avoidance of multiple qualifiers, beyond the obvious point that the product of fractional probabilities must always be a smaller fraction.  This model predicts that groups containing each of the qualifiers modal, perfect, and progressive should occur at the rate of about 5.6 per 10,000 primary verb groups.  That would imply a frequency about three times higher than suggested by Pullum & Scholz.
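The structure of the independence model is simple enough to state in a few lines of code.  The per-qualifier probabilities below are illustrative values chosen to reproduce the 5.6-per-10,000 prediction; they are not the actual CHRISTINE/I estimates, which I do not reproduce here.

```python
# Independence model: the probability of a qualifier combination is the
# product of the individual qualifier probabilities.  The three figures
# below are hypothetical stand-ins, not the CHRISTINE/I estimates.
p_modal, p_perfect, p_progressive = 0.14, 0.08, 0.05

p_combination = p_modal * p_perfect * p_progressive
rate_per_10k = p_combination * 10_000
print(f"predicted: {rate_per_10k:.1f} per 10,000 primary verb groups")
```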


Even taking the minimum figure of 61 cases for the incidence in BNC/demographic, the estimates quoted by Pullum & Scholz from Hart & Risley (1995) for children’s total exposure to language imply that linguistically deprived three-year-olds from “families on welfare” should have encountered 145 instances of the relevant auxiliary sequences – about one a week on average.  (The figure projected from the model of the preceding paragraph would give the “welfare three-year-old” 655 examples – one every day or two.)  Children from more fortunate backgrounds would have heard proportionately more instances.  It is not clear why even the lowest of these figures is appropriately described as “vanishingly rare”, or as insufficient for learning the construction. 
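The arithmetic behind these projections is easily reconstructed.  The ten-million-word exposure figure for the “welfare” three-year-old is an assumption on my part, read back from the 145-instance figure rather than taken directly from Hart & Risley; the model-based figure differs from 655 only through rounding.

```python
# Projecting corpus frequencies onto a child's assumed total exposure.
CORPUS_WORDS = 4_200_000          # size of BNC/demographic
CLEAR_CASES = 61                  # lower bound on modal+perfect+progressive
EXPOSURE_WORDS = 10_000_000       # assumed "welfare" exposure by age three
# Model rate: 5.6 per 10,000 primary verb groups, with 9430 such groups
# in about 80,500 full words of CHRISTINE/I, converted to a per-word rate.
MODEL_RATE_PER_WORD = 5.6e-4 * 9430 / 80_500

lower_bound = EXPOSURE_WORDS * CLEAR_CASES / CORPUS_WORDS
model_based = EXPOSURE_WORDS * MODEL_RATE_PER_WORD
print(round(lower_bound), round(model_based))
```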


So far as I know, it is uncontroversial that people often begin to use a new word after encountering it only once or a handful of times; and this must be based on experience.  (There is no possibility of suggesting that the link between e.g. the phoneme string /kA:/ and the concept “car” is innately available to a child, because closely-related languages use different phoneme strings – German Auto, Danish bil – and the concept did not exist for our quite recent biological ancestors.)  Is there a reason why learning a new grammatical construction from experience should require far greater exposure than learning a new word?  If so, I have never seen the argument spelled out.


Note that I do not assume here that children actually do need to learn each possible auxiliary combination separately by hearing examples of that specific combination.  Prima facie it seems equally plausible, as suggested in the closing paragraphs of Pullum & Scholz’s §3.2, that children might infer the possibility of combinations which they have not heard, as a consequence of the simplest generalizations they can induce in order to account for the combinations they have heard.  But my point here is that, even if children did have to learn the modal + perfect + progressive construction by hearing instances of that particular construction, they would succeed, because they will hear plenty of instances.



4.  Verb-fronted constructions


I turn now to the constructions which Pullum & Scholz refer to as “auxiliary-initial clauses”.  Poverty of stimulus theorists claim that children will typically not hear examples of questions formed by fronting verbs which, in the corresponding declarative statements, are preceded by complex constituents; and hence that they will have no basis other than innate linguistic knowledge for deciding that the language they are acquiring uses a “structure-dependent” rather than a “structure-independent” question rule.


Here, if we ask what happens in spontaneous spoken English it turns out that the data depend heavily on whether we consider examples like Pullum & Scholz’s (24), e.g. Will those who are coming raise their hands? (where the complex constituent is the subject of the fronted verb), or examples like their (25), e.g. If you don’t need this, can I have it? (where the main clause is preceded by an adverbial clause).



5.  Initial adverbial clauses


Taking the latter case first, there are plenty of examples in BNC/demographic.  I have not attempted to carry out a search that is even close to exhaustive; that would not be easy with this particular grammatical pattern and this corpus resource, but it is not necessary – one does not need many examples to refute a claim expressed in language as strong as “A person might go through much or all of his life without ever having been exposed to relevant evidence” (Chomsky in Piattelli-Palmarini 1980: 40).  I searched for cases where the adverbial clause begins with if (in fact my search pattern required the BNC orthographic transcription to use If, capitalized, and a question mark, so for instance cases like … and if X, would you … would have been missed).   Provided we count only cases of yes/no rather than wh-questions, there are 22 clear cases; some examples are:


(3)   a.  If I can eat them all tonight can I have them?  KD5.05079

(Ipswich, Suffolk, Feb 1992; PS0JX, male, 27, technician, no dialect information; conversation in unspecified setting)


        b. If I’ve got four of something can I throw them away now?  KE0.00474

(Broadstone, Dorset, Feb 1992; PS0SX, no personal information; at home)


        c.  If it, if it was a {pause} {unclear} match on a Tuesday this week, would you have played?  KBF.13426

(Woking, Surrey, Nov/Dec 1991; PS04V, male, 37, HGV driver, SE England dialect; at home)


        d. If we put the old name do we get one mark?  KD0.06756

(Woking, Surrey, Nov/Dec 1991; PS0HX, female, 70+, retired, SW England dialect; at home)
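As an indication of how crude the search was, the orthographic filter amounts to little more than the following – a schematic re-creation of the pattern described above, not the software actually run over the BNC transcriptions; genuine hits were then vetted manually, for instance to separate yes/no from wh-questions.

```python
import re

# An s-unit counts as a candidate if it begins with capitalized "If" and
# ends with a question mark; anything else (including "... and if X,
# would you ...?") is missed, as noted in the text.
PATTERN = re.compile(r"^If\b.*\?\s*$")

examples = [
    "If I can eat them all tonight can I have them?",
    "and if X, would you?",           # missed: "If" not clause-initial
    "If we put the old name do we get one mark?",
    "If you let me do it, fine.",     # not a question
]
hits = [s for s in examples if PATTERN.match(s)]
print(hits)
```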


Arguably wh-questions are also relevant, since they too normally involve fronting an auxiliary, which again must be the auxiliary of the main clause rather than the one in the preceding adverbial clause.  (Since I do not share the intuition of nativist linguists that the evidence available to language learners contains “lacunae”, I am not wholly clear what the lacuna is believed to be in this case, and hence precisely what types of example are relevant for demonstrating its non-existence.)  If this class of example is relevant, there are a further 23 cases in BNC/demographic (discounting the small number of cases such as If you let me do it, what happens?, where no auxiliary is moved).  One of the 23 examples is:


(4)       If you were given the choice what would you like to eat?  KBW.05644

(Redditch, Worcs., Mar 1992; PS087, female, 34, teacher/housewife, Midlands dialect; at home, lunch and afternoon activities)


As well as clear cases there are 14 debatable cases where it is not certain whether or not the wording amounts to a relevant example, for instance:


(5)       If you like, ain’t you got none?  KBR.00024

(Andover, Hants., Feb 1991; PS10D, female, age unknown, housekeeper, SE England dialect; at home)


The context of (5) gives no help in guessing whether it should be interpreted as “if you like X, haven’t you got any X?” (in which case it would be a relevant example) or as the broken-off beginning of something like “if you like, [I will do X]”, followed by “haven’t you got any Y?” as a change of tack (in which case it would not be relevant).


Taking just the figure of 22 clear yes/no cases (ignoring the likelihood that further cases were missed because of the crudity of my search pattern), if we use Hart & Risley’s middle figure of 20 million words in three years as a typical rate of exposure to speech throughout life, and if we translate Chomsky’s “much or all of [a person’s] life” as a period of 50 years, we see that BNC/demographic suggests an estimate of about 1700 instances experienced, in the case of adverbial clauses introduced by if – together, presumably, with comparable numbers of instances introduced by several other common subordinating conjunctions. 
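The projection can be reconstructed as follows.  The 50-year span and the assumption of a uniform lifetime exposure rate are, of course, deliberate simplifications, as in the text.

```python
# Lifetime exposure estimate: 22 clear cases in 4.2M words of
# BNC/demographic, scaled by Hart & Risley's middle figure of 20 million
# words per three years, over an assumed 50-year span.
CORPUS_WORDS = 4_200_000
CLEAR_CASES = 22
WORDS_PER_YEAR = 20_000_000 / 3
LIFETIME_YEARS = 50

lifetime_words = WORDS_PER_YEAR * LIFETIME_YEARS
instances = lifetime_words * CLEAR_CASES / CORPUS_WORDS
print(round(instances))
```

The result is on the order of 1,700 instances, for if-clauses alone.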


If the typical rate of exposure is of this order of magnitude, why should it be thought plausible that some people encounter no examples at all?  What proportion of the population are alleged never to hear yes/no questions with initial adverbial clauses, and what evidence have we showing that these unfortunates themselves use the construction normally?



6.  Complex preverbal subjects in speech


Turning to cases like Pullum & Scholz’s examples (24), where the complex constituent is the subject of the fronted verb:  here the picture suggested by spontaneous speech data is considerably more complicated.


In the first place, in the spontaneous speech data available to me, it is true that yes/no questions like Is the dog that is in the corner hungry?, formed by fronting a main-clause verb which in the corresponding declarative would follow a complex subject, are vanishingly rare.  In rough searches of BNC/demographic I have found none; as always, the possibility exists that examples are present but were missed through some shortcoming of the search patterns used, but for the sake of argument I am happy to concede that the 4.2 million words of BNC/demographic may not contain a single case.[8]  Consequently, I believe Pullum & Scholz may be mistaken in thinking it implausible that a child could “reach kindergarten” without exposure to sentences like their examples (24), and in suggesting that three-year-olds will typically have encountered thousands of examples.  A kindergarten entrant could well have experienced only spontaneous speech, as opposed to written language or genres of speech (such as bedtime stories read from books) influenced by written language, where such sentences do occur.


However, absence of these patterns from BNC/demographic can be interpreted in alternative ways. 


The “poverty-of-stimulus interpretation” would hold that this kind of question is perfectly idiomatic in spontaneous spoken English.  When the communicative situation makes it appropriate to utter a question corresponding to a statement such as The man who is tall is in the room, “the child unerringly forms the question [Is the man who is tall in the room], not [Is the man who tall is in the room ]” (Chomsky 1976: 31); but the absence of examples in BNC/demographic confirms that this achievement must be based on little or no experience of precedents. 


The other interpretation is that, systematically, spontaneous speech avoids this structural pattern, and that is why there are no examples in the corpus – it is not an idiomatic form of question in spontaneous spoken English, though in written English it is.


One would expect questions like Pullum & Scholz’s examples (24) to be idiomatic only if the corresponding declaratives are so.  We have known since Yngve (1960, 1961 – cf. Sampson 1997) that English has a tendency to avoid structures in which “heavy” constituents occur other than as the last constituent of their containing construction, and that the language uses various syntactic devices, such as extraposition, to avoid the need for such structures.  In spontaneous speech this tendency seems to be stronger than in writing, so that preverbal subjects are overwhelmingly single words, usually pronouns, or at most short phrases not including subordinate clauses.


Complex subjects can of course occur post-verbally, but in that case the main verb and the first verb of the sentence coincide, so the issue about the question-formation rule does not arise.  Examples of complex subjects following the verb are:


(7)   a.  yeah but there’s stuff you put on before you gotta paint a certain square amount of area  KB6.00586

(Bristol, Dec 1991; PS02D, female, 28, housewife, SW England dialect; at home)


        b. well it – it wasn’t funny that day before I went to the {unclear} KB2.02622

(Doncaster, West Riding, Jan/Feb 1992; PS01V, female, 63, retired, Northern England dialect; at market, shopping and talking to friends)


Example (7a) is a case where there-insertion would be used to place the complex subject after the verb even in written English.  In (7b), the device of representing the complex logical subject by a preverbal it and postposing the “real subject” to a position after the verb would probably not be resorted to in writing (one might be more likely to write That day before I went to the … was not funny), but the device is very characteristic of speech.


Declaratives in which complex subjects precede verbs do occur in spontaneous speech, but they are infrequent and most of them fall into a few limited patterns.  The 80,500 words of CHRISTINE/I contain just 44 cases.  (It is not practical to search for this type of structure in the full BNC/demographic corpus, which lacks structural annotation.)  In 38 of the 44 cases, the main-clause verb is is or was used equatively.  Of these 38 cases, 29 are of one of the forms:


all + relative clause is/was


the only noun + relative clause is/was


antecedentless relative clause is/was


for instance:


(8)   a.  all you can do is {pause} put your belly up  KBJ.00999

(Nottingham, Jan 1992; PS1DV, female, 11, schoolgirl, Midlands dialect; two 11-year-old girl friends playing at the home of one of them)


        b. the only people who’ll get the Conservative parties back in is the snobby people who work at Oxford and that  KDX.00785

(South Shields, Co. Durham, Apr 1992; PS1CK, male, 22, butcher and baker, Northern England dialect; at home, preparing to listen to General Election results broadcast)


        c.  what you could really do is just erm sort of erm {pause} have a hammer {pause} necklace[9]  KD2.02915 

(Doncaster, West Riding, Jan/Feb 1992; PS0J7, female, 23, trainee typist, Northern England dialect; at home, reading a letter and making plans to go to America)


A minority example where the main verb is not equative is or was is:


(9)       every watch that comes on has to do a drill  KBJ.01274

(speaker and context details the same as for (2a) above)


I have no theories about why complex preverbal subjects occur in such specialized patterns in spontaneous speech, but for present purposes it is sufficient to note that those are the facts.


Each declarative in (8-9) asserts a proposition which it would be reasonable to question, at least if first and second person pronouns are swapped.  The “poverty-of-stimulus interpretation” is apparently committed to claiming that idiomatic spoken questions corresponding to these statements would be:


(10) a.  is all I can do put my belly up?


        b. is the only people who’ll get the Conservative parties back in the snobby people who work at Oxford and that?


        c.  is what I could really do just sort of have a hammer necklace?


        d. does every watch that comes on have to do a drill?



However, the data include other ways of constructing yes/no questions.  One is to add a tag question to the unchanged declarative sequence.  Another is to front the main-clause verb but replace the subject with that or a personal pronoun, appending the full subject at the end of the clause as a separate tone group.  (With equative sentences there is logically no difference between querying whether A = B and querying whether B = A; when one of the equated terms is a quantified expression such as all …, the only …, it is usual in questions to make this term the complement rather than the subject.)  Thus an alternative to the poverty-of-stimulus interpretation would be to claim that the forms of (10) are systematically avoided in spontaneous speech, and that the only idiomatic questions corresponding to (8-9) are sequences such as:


(11) a.  all I can do is put my belly up, isn’t it?


        b. is that all I can do, put my belly up?


        c.  the only people who’ll get the Conservative parties back in is the snobby people who work at Oxford and that, aren’t they?


        d. is that the only people who’ll get the Conservative parties back in, the snobby people who work at Oxford and that?


        e.  is that what I could really do, just sort of have a hammer necklace?


        f.  every watch that comes on has to do a drill, don’t they?


        g. do they have to do a drill, every watch that comes on?  [or perhaps “do they all have to …”]



For these question-types there seem to be abundant examples in the speech data.


If it is true that spoken language systematically avoids structures like (10), it is not difficult to suggest possible reasons.  In the great majority of cases, where the sole main-clause verb is the single word is or was, these structures involve sequences of two multi-word constituents immediately adjacent to one another, which is difficult for a hearer to process because there is no reliable cue to locate the boundary between the constituents.  That could have led to an across-the-board convention whereby question formation by verb fronting is not applied to clauses with complex preverbal subjects, even in the minority of cases such as every watch that comes on has to do a drill where this processing problem would not arise.  Of course, this in itself does not establish that examples (10) are unidiomatic – it only provides a possible explanation for their non-idiomatic status, if indeed they are unidiomatic.  My impression is that one would not normally hear questions like (10) used in spontaneous speech, but I would not wish to give weight to that intuitive feeling.[10]  Science is about observable events at particular times and places, not about somebody’s “intuitions”.


However, I do not understand what kind of evidence could be used by poverty-of-stimulus theorists to argue that (10) are good idiomatic examples of spontaneous spoken English – as they need to.  So far as the evidence available to me goes, these structures do not occur in spontaneous speech.  (If the structures did occur in the corpus, the poverty of stimulus argument would not get off the ground with respect to them.)  Why believe that these structures are missing from the data despite the fact that speakers routinely use them, rather than believe that they are missing because speakers systematically avoid them?


Incidentally, in view of the claim that speakers are able “unerringly” to produce verb-fronted questions of patterns they have not heard, it is perhaps relevant to mention that the data suggest that mature speakers find this construction a difficult one even in simple cases.  Strikingly often, speakers embark on a verb-fronted question but break off before it is complete and substitute a tag question, as in:


(12) a.  Isn’t it two {pause} it’s two oh something isn’t it?  KCH.01597

(Great Driffield, East Riding, Apr 1992; PS1BS, female, 40, optician/student, Northern England dialect; at home, playing with toys)


        b. Is that, is that {pause} it’s not a Catholic bible is it?  KCL.02209

(Redditch, Worcs., Mar 1992; PS0F8, male, 53, engineer, dialect unknown; at home, general talk in kitchen area)


        c.  Is i- that’s the one with the picture of a man on the front isn’t it?  KR1.00604

(place and date unknown; PS59U, male, 12, schoolboy, SE England dialect; at home)[11]


I have not devised a statistical method for testing whether verb-fronted questions are broken off significantly more often than other constructions, but impressionistically their failure rate seems unusually high.  If speakers have difficulty forming verb-fronted questions when the subject is a simple pronoun, it might be all the more difficult to produce them with complex subjects, where there is more wording between the fronted verb and its declarative-clause position.  This could be another reason, additional to the processing problem for the hearer, which has led to a systematic avoidance in spontaneous spoken English of verb-fronted questions with complex preverbal subjects.


I cannot claim that this latter type of question never occurs in spontaneous conversation, because I have recorded a single case from my personal experience.  But in that case the speaker used the wrong, structure-independent version of the question rule (which is why I noted the utterance down in writing at the time).  On 18 Oct 1999, in my house, a native English-speaking conversational partner (female, 51, housewife/voluntary ambulance driver, SE England dialect) told me that one advantage of her kind of voluntary work was that she never needed to ask herself:


(13)     Am what I doing is worthwhile?


I would not suggest that this utterance was other than a very unusual phenomenon; but it takes only one instance to refute a claim such as “It is certainly true that children never make mistakes about this kind of thing” (Chomsky in Piattelli-Palmarini 1980: 114).  (Note that Chomsky’s statement meant “even children who have not completed the task of language-acquisition never produce errors like (13)”.  It did not mean “children do not produce such errors, though mature speakers may” – that would make no sense in terms of Chomsky’s nativist theory of language.)



7.  Written v. spoken usage


In written English, verb-fronted questions with complex preverbal subjects are normal.  Since this point seems uncontentious, I have not tried to carry out even a rough count of their frequency in the 90 million word written-language section of BNC, but here are the first few examples I found via specialized searches of portions of the corpus:


(14) a.  Did the fact that he is accompanied by a doctor on the campaign trail help to lose him last week’s TV showdown with Clinton?  CAT.00742

(Punch magazine, 1992)


        b. Did Mr Mortimer, 69, who has an Equity card, enjoy himself?  CBC.08606

(Today newspaper, 1992)


        c.  “Is the lady who plays Alice a child or a teenager?” asked my six-year-old.[12]  B03.00647

(Alton Herald newspaper, Farnham, Surrey, 1992)


        d. Is a clause which is known to be unenforceable in certain circumstances an unreasonable one?  J6T.00908

(R. Christou, Drafting Commercial Agreements, Longman, 1993)


        e.  Will whoever is ripping the pages out of the Stoney new route book please grow up.  CG2.01379

(Climber and Hill Walker magazine, George Outram & Co., Glasgow, 1991)



It seems that the ability to produce these patterns must be one of the skills acquired in the process of moving from being an articulate speaker to being, in addition, a literate writer.  (Example (13) in §6 above would be explained as a case where a literate adult became confused when trying to use in speech a construction normally restricted to writing.)  The fact that people typically do acquire this skill does nothing to support the poverty-of-stimulus argument, unless it can be shown that some people are not exposed to relevant models in the language they encounter during literacy-acquisition.  But I have seen no attempt to document such lack of exposure.  Sampson (1999: 41-2) quoted relevant examples from material available to children; I even found a particular example (the line from Blake’s “Tyger”) which I believe is encountered by a high proportion of English schoolchildren, either read themselves or read aloud to them by a teacher, at ages when their own written output is quite limited in structural complexity. 


My two examples did not amount to a systematic refutation of the poverty-of-stimulus argument with respect to verb-fronted structures with complex preverbal subjects.  Before it is possible to construct such a refutation, we would need to know precisely what prediction the poverty-of-stimulus theorists are making in this area.  It seems that they must implicitly be claiming that there is some specific domain of spoken and/or written language – as it might be, the texts of primary-school reading books – which would need to contain examples of these structures, for speakers to learn to produce them without innate knowledge of language, but which in practice does not contain examples, or contains fewer than some crucial frequency of examples.  Only when these linguists specify the domain (and explain why just this domain and this frequency are crucial) will it become possible and worthwhile to conduct systematic tests of their lacuna prediction.


It seems conceivable that the extensive theoretical literature documented by Pullum & Scholz which has centred round sentences of the type Is the man who is tall in the room? has stemmed from linguists noticing, and misconstruing, the fact that they never hear such structures in everyday speech although they are aware that these structures are legal in the written language.  It is sobering to reflect that a tradition of scholarly discourse which claims to have such far-reaching psychological and philosophical implications may have been based on nothing more than a misunderstanding of the fairly banal fact that there are differences between the grammatical norms of colloquial and literary English.



8.  Experience adequate though unneeded


In the preceding sections I have argued that, when individuals produce verb-fronted questions, it is either clear from corpus evidence that they will have been exposed to plenty of models enabling them to choose between alternative hypotheses about the nature of the construction, or at the very least, for the special type of question which occurs only in literate usage, we have been given no evidence of lack of exposure at the period when the structure is being learned, and there is some evidence pointing the other way. 


As with the case of complex auxiliary sequences, though, note that I do not myself assume that individuals without specific innate knowledge of language would need such experience in order to choose the correct form of the question rule.  I have argued elsewhere (Sampson 1999: 111-19, 122ff.) that considerations about the nature of learning as a general process, not related to language-acquisition in particular, predict that a person acquiring the verb-fronted question construction would adopt the “structure-dependent” rather than “structure-independent” version of the rule, whether or not he had encountered crucial examples.  But my central point in the present context is that, if language-learners needed experience of crucial cases to decide between structure-dependent and structure-independent variants of the question rule, they would succeed in acquiring the structure-dependent variant, because they would have the relevant experience.



9.  The Crain & Nakayama experiments


Crain & Nakayama (1987, and cf. Nakayama 1987) studied the complex preverbal subject issue experimentally.  They elicited yes/no questions from children between the ages of 3;2 and 5;11 in response to prompts such as Ask Jabba if the boy who is watching Mickey Mouse is happy; they found that (with different frequencies at different ages) children sometimes produced correct forms such as (15a) and sometimes produced various incorrect forms, one example being (15b), but never produced the kind of incorrect form predicted by the “structure-independent hypothesis”, such as (15c).  Crain & Nakayama offered this as support for the theory of innate linguistic knowledge.


(15) a.  Is the boy that is watching Mickey Mouse happy?


        b. Is the boy who’s watching Mickey Mouse is happy?


        c.  Is the boy who watching Mickey Mouse is happy?


This might be answered along the lines of the preceding section, by saying that structure-dependence can be predicted from general considerations about the nature of learning.  But that point may be redundant, because the finding that verb-fronted yes/no questions with complex subjects are systematically absent from spontaneous speech makes it difficult to draw any particular conclusion from Crain & Nakayama’s experiments.


The BNC/demographic data suggest that, although (15a) is correct in written English, none of the question-types in (15) is an idiomatic utterance by the norms of spontaneous spoken English, which one would expect to be overwhelmingly the dominant genre of linguistic experience for at least the younger subjects in the Crain & Nakayama experiments.  In spontaneous speech one would expect to find either (16a), which is functionally equivalent to (15a), or possibly (16b), which differs from (15a) by implying the expected answer.


(16) a.  Is he happy, the boy who’s watching Mickey Mouse?


        b. The boy who’s watching Mickey Mouse is happy, isn’t he?


Although Crain & Nakayama give figures for their subjects’ production of “correct” questions of the (15a) pattern, and incorrect questions of various patterns, they do not mention any cases at all where the subjects uttered questions of the (16a-b) patterns.  This seems to imply that, whatever the subjects were doing, they were not asking questions in the way that was most natural for them.  Apparently the experimental situation must have involved an element of training the children in literary question-formation.  Crain & Nakayama’s paper does not give enough detail to show what features of the experimental situation led the children to avoid (16)-type patterns in favour of (15a-b)-type patterns.  Without knowing that, one cannot know whether those same features external to the children also explain their avoidance of the (15c) pattern.



10.  Poverty of the stimulus asserted inexplicitly


One criticism Pullum & Scholz make of the poverty of stimulus theory is that its proponents have grossly exaggerated the number of different properties of language for which arguments have been published that the properties are successfully acquired despite inadequate experience. According to Neil Smith (1999: 42), “A glance at any textbook shows that half a century of research in generative syntax has uncovered innumerable such examples”; but, after combing the literature, Pullum & Scholz say that the four cases they discuss are “all the candidates we know of”.


Comments like Smith’s are certainly hyperbolical. Nevertheless, I believe the range of published poverty-of-stimulus arguments may be more diverse than Pullum & Scholz suggest. I shall briefly discuss one further case which has a significant feature distinguishing it from Pullum & Scholz’s four examples.


Derek Bickerton (1981: 172-5) reanalyses data originally published by Antinucci & Miller (1976) about Italian children’s acquisition of verb morphology. He finds that children “early in their third year” systematically use an Italian tense distinction (“perfective v. imperfective”) to express a semantic contrast (“punctual” v. “non-punctual”) which is not the contrast to which that morphological distinction corresponds in the adult system that they are destined eventually to acquire. Bickerton treats this as support for his “linguistic bioprogram” theory, according to which language structure develops progressively in a biologically-determined sequence, such that children at this early stage are bound to express the punctuality concept rather than other time-related distinctions, whatever distinctions are expressed in the adult language they are in the process of acquiring.


Shirai & Andersen (1995) take issue with Bickerton, using data about children’s acquisition of the somewhat comparable contrast between simple past and progressive verb forms in English. They find that in the speech of mothers to children in their second and third years, the past/progressive distinction in practice shows a strong correlation with semantic distinctions, similar to Bickerton’s “punctuality”, which are not the distinction that this formal contrast is seen to express when one examines the more diverse range of usage of adult English as a whole, within which “Motherese” is one specialised genre. The children’s speech reflects this correlation in their mothers’ speech, though in a more absolute form.


Now there are clearly many ways in which one might question the force of Shirai & Andersen’s findings as a refutation of Bickerton. I do not know how close the match is between the English and Italian linguistic structures and children’s behaviours; and even if the match is close, there remains a question about why Motherese contains the usage bias described by Shirai & Andersen. (The suggestion seems to be that talk between mothers and small children naturally revolves round a limited domain of proposition-types, within which some of the logical categories that are important in the adult language in general do not arise – but Shirai & Andersen do not spell this out to any extent.) I would not be qualified to offer a definitive resolution of this particular debate; my reason for discussing it is to draw attention to the structure of Bickerton’s argument. He uses the poverty-of-stimulus concept to support a nativist account of language acquisition, yet he never explicitly asserts that the stimulus is impoverished.


The Bickerton passage is a poverty-of-stimulus argument. Bickerton finds that small children consistently use Italian in a certain way, which resembles phenomena he has found in creole languages, and he invites readers to conclude that the Italian children speak that way because of an innate “bioprogram”. Clearly this argument could work only if it is assumed that the Italian children’s usage is not a reflection of usage in the speech they are exposed to. But Bickerton does not actually tell us that the children’s experience lacks the morphological/semantic correlation observed in their output. So far as I have seen, the nearest he comes to saying this is:  “this does not reflect anything in Italian grammar; all Italian verbs, whether punctual or nonpunctual, activity or change-of-state verbs, have both perfective and imperfective past tenses” (Bickerton 1981: 174). But that is a statement about the grammatical possibilities available in the total spectrum of Italian genres, including written language, and types of spoken communication which two-year-olds are very unlikely to encounter. Nobody on any side of the theoretical debate suggests that a small child’s linguistic experience comprises a “fair sample” of all the kinds of grammatical structure he will eventually learn to deal with.


I make no criticism of Bickerton for failing to examine whether Italian Motherese is representative of the full range of Italian grammatical possibilities.  No researcher can hope to eliminate all possible avenues of counter-argument in advance. What is interesting about this case is that readers should be willing to accept an argument based on a poverty-of-stimulus assumption without that assumption even being made explicit.


One might think that, on learning that young Italian children use tenses in a way that differs from the way they are used in general adult Italian, virtually every reader would immediately assume that there must be something about the kind of Italian that young children hear which pushes them that way; and that readers would consider the possibility that the discrepancy arises from something which children themselves bring to the acquisition task only if it were shown convincingly that there is nothing special in Italian Motherese which could account for it.


The favourable reception given to Bickerton’s argument implies that, for many linguists over the last twenty years, this is incorrect. They have been so willing to believe that children learn specific things without experience that, faced with an unexpected interim learning outcome like the Italian children’s use of tenses to express punctuality, they will embrace the conclusion “this must reflect innate knowledge” without even asking “could this reflect the nature of their experience?” Yet, as Shirai & Andersen have shown, in a case like Bickerton’s it is at least entirely possible that biases in experience may explain the data (though I do not suggest that Shirai & Andersen make a watertight case against Bickerton’s bioprogram explanation).


That confirms the point in my opening paragraph that, in the current climate of opinion, it is worthwhile to produce detailed studies such as the one in the earlier sections of this article, demonstrating that children get plenty of experience of the structures they learn to produce. It seems odd that such studies should be necessary. The kind of intellectual response I sketched in the last paragraph feels rather as though a child were to run indoors and announce “Dad, the new neighbours have got twins!”, and the father were to think “it’s remarkable what children innately know about neighbours” without even considering the possibility that the child might have been round and met the newcomers. But we must always set off from where we happen to be.  If linguists currently are disposed to assume that children lack evidence for the things they learn, then studies showing that they have abundant evidence are a priority.



11.  Changing research possibilities


If poverty-of-stimulus arguments can be as inexplicit as the one I have quoted from Bickerton, then Pullum & Scholz’s survey of four cases is likely not to be exhaustive, although their four examples may well be the most explicit cases in the literature.  I surmise that the linguistic literature probably contains significantly more (explicit or implied) claims that children’s data lacks evidence bearing on some structural feature than it contains refutations of such claims by reference to empirical evidence.  But, if true, that would not imply that the unchallenged claims are correct.  One would expect to find such an imbalance, because it is far easier and less time-consuming to make a speculative claim than to assemble the evidence needed to refute it, even when that evidence does exist.


It is understandable that linguists a quarter-century ago were to some extent forced to speculate about what does or does not occur in real-life usage, because hard data on any genre of usage were scarcely available. The 1973 passage quoted from Kimball is commendable in attempting to back up its claim about vanishing rarity with empirical evidence, rather than just making unsupported assertions such as “the belief that in every case relevant … evidence has been provided surely strains credulity” (Chomsky 1976: 213), or “you can go over a vast amount of data of experience without ever finding” a relevant example (Chomsky in Piattelli-Palmarini 1980: 115), in both cases referring to the task of acquiring the correct verb-fronted question rule.  Kimball reported failure to find a single example of the auxiliary sequence he was discussing in a  “computerized sample of more than a million sentences from educated, written text”, and in view of Pullum & Scholz’s findings in their corpus this lack does seem surprising and in need of explanation.  (We are not told exactly what pattern was searched for – it is all too easy for a mechanical search for some common pattern to deliver a nil return because of a tiny inadvertent error in the search specification; and Kimball gives no more information than is quoted here about the machine-readable corpus he used.  A corpus of “more than a million sentences” is surprisingly large for the date in question; the largest corpus generally available in 1973 was the Brown Corpus, which was compiled so as to be representative of an identified domain but contained only one million words.  Kimball’s much larger collection must have been one that was not widely circulated, and possibly was in some way grammatically unrepresentative.)   
Kimball also quotes a colleague who had noticed relevant examples fewer than a dozen times in eight years of conversation; again this represents an attempt to found the assertion of stimulus poverty on empirical evidence, though in this case the low observed frequency is easily explained (in practice it is not possible reliably to detect examples of a complex construction in unrecorded conversations in which one is a participant). 


In 1973 Kimball was, I think, unusual in going as far as this to found his claims about stimulus poverty on empirical evidence.  But that was about the latest date at which it was arguably excusable for other linguists to discuss such issues purely speculatively.  The machine-readable Brown Corpus had been available since 1964, with other corpora becoming available in the 1970s and later (the London-Lund Corpus of Spoken English was distributed in 1980); in the mid-1970s the introduction of Unix with its grep facility made it child’s play for anyone with access to a university computing environment to execute rough searches of corpora for patterns like the one discussed by Kimball. (A quick and dirty grep search of the million-word Brown Corpus for modal + perfective + progressive combinations gives five examples; the crude technique used cannot guarantee that these are the only examples in Brown, but they are enough to refute a claim that the construction is vanishingly rare.) For at least the last twenty years there has been no justification for making unsupported controversial claims about what does or does not occur in real-life usage (though Pullum & Scholz quote cases where such claims have been made in publications up to the year 2000). 
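The kind of rough pattern search described above can be sketched in a few lines.  The following is a minimal illustration, not a reconstruction of any actual search: the sentences are invented, and only the tag mnemonics (md = modal, hv = “have”, ben = “been”, vbg = present participle, vbn = past participle) follow the conventions of the tagged Brown Corpus.

```python
import re

# Toy fragment in the style of the tagged Brown Corpus (word/TAG pairs).
# These sentences are invented for illustration; only the tag mnemonics
# (md, hv, ben, vbg, vbn) follow Brown's actual tagging conventions.
sample = """
the/at results/nns may/md have/hv been/ben skewed/vbn by/in errors/nns
she/pps might/md have/hv been/ben working/vbg late/rb
he/pps could/md have/hv been/ben sleeping/vbg ./.
it/pps will/md rain/vb tomorrow/nr ./.
"""

# Modal + perfective + progressive: a word tagged md, then have/hv,
# been/ben, then any word tagged vbg (present participle).  The first
# sample line has a past participle (vbn) instead, so it is not matched.
pattern = re.compile(r"\S+/md have/hv been/ben \S+/vbg")

hits = pattern.findall(sample)
for h in hits:
    print(h)
```

On this toy sample the search returns the two genuine modal + perfective + progressive sequences and correctly passes over the perfective-passive and simple-modal lines; a grep run over the real corpus files would follow the same logic.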


Even today there are particular genres of language, and particular types of structural patterning, which cannot be adequately monitored with the currently-available panoply of annotated corpora. But the scientifically responsible thing to do in such cases is to refrain from making claims until suitable data become available, rather than to erect theories of psychology on speculations about what the data will show when we have them. 



12.  Negative evidence


So far, I have discussed allegations about poverty of the positive evidence which shows the child what it is legitimate to say (or write) in his mother tongue.  But poverty-of-stimulus theorists discuss another kind of shortcoming in the evidence about language available to children:  negative evidence, identifying word-strings that do not have a use in the language, is said to be not just insufficient but completely absent.


Pullum & Scholz explicitly refrain from dealing with the negative evidence issue (“we are not going to discuss cases based merely on positivity”, their §1.1).  But the issue merits some attention here.  For many nativist linguists, lack of negative evidence in the child’s data-set is a key argument for innate knowledge of language structure, possibly even more important than alleged deficiencies in the positive evidence.  (Marcus 1993, in a discussion of the negative evidence issue, gives a useful list of publications in which its implications have been considered by various linguists from 1971 up to the date when he was writing.)


Mastering a language, it is said, means learning (among other things) what can be said in the language and what cannot be said – learning how to divide the class of all possible strings of words from the vocabulary into a class of grammatical sequences and a class of ungrammatical sequences, sometimes called “word salad” (cf. Chomsky 1957: 13).  If the evidence available to a child never identifies particular strings as ungrammatical, how can he learn to draw this distinction correctly?  Why, for instance, does he not end up with a very simple grammar according to which any string is grammatical?


To these questions there is a straightforward answer; but it is a very different sort of answer from the one I have given, above, to the claim that the child’s positive evidence is inadequate.


The nub of the “negative evidence” issue is the idea that the child’s evidence about his mother tongue is asymmetrical.  The experience of hearing a string uttered by one of his elders shows that that string is grammatical; but there is no corresponding type of experience which shows the child that a particular string is ungrammatical.  Adults speaking to children do not produce ungrammatical strings accompanied by indicators equivalent to the asterisk used in linguistics books, to show the child that those strings should be disallowed by the grammar which the child is in the process of formulating.  Adults merely fail to utter ungrammatical strings.  But the child will only encounter a finite subset of the range of grammatical strings, so failure to hear a particular string does not show it to be ungrammatical; it might be one of the many grammatical strings which simply happen not to have featured in the individual child’s sample.


Some writers have queried whether the child’s experience is as wholly asymmetrical as this suggests.  Even if adults do not utter “non-sentences” and identify them as such, children will make mistakes in speaking, and adults may correct these mistakes.  For a child to have a mistaken utterance corrected would be negative evidence as much as if the non-sentence had been produced by an adult, and arguably it might be more useful evidence, because it would be targeted on an aspect of the child’s grammar which needs further work.  And even if adults do not often explicitly correct children’s linguistic errors, they might react in ways (e.g. a puzzled facial expression) which show the child, inexplicitly but nevertheless effectively, that a particular utterance is erroneous.


However, various linguists who have studied adult/child interactions from this point of view (cf. Gropen et al. 1989: 203-4 and references cited there; Marcus 1993) claim that the child’s experience really is asymmetrical.  Children do not get feedback that is systematically correlated with errors in their linguistic output, and if they do occasionally get some negative feedback it is fairly clear that the child’s language-acquisition process makes little use of it.


From the tenor of earlier sections of this article, readers may suppose that my strategy will be to argue that the consensus of the previous paragraph is mistaken, and that linguists have failed to notice the existence of rich sources of negative feedback.  That is not my position.  I have no special expertise concerning adult/child interaction in language acquisition, but I find the claims that adults do not provide children with useful feedback in response to linguistic errors reasonably convincing.  In any case, even if we suppose that no negative evidence whatever is available to children, I do not believe any inference could be drawn that children need innate knowledge of language structure in order to succeed at language-acquisition.


If the child has no innate knowledge about language structure, then, in seeking to make sense of the language he is exposed to, he is in a situation analogous to the adult scientist seeking to understand some previously unexplored domain of natural phenomena.  Now, in the latter case it is clear that negative evidence is completely absent.  Logically it would be possible in the language-acquisition case that adults might utter ungrammatical strings to children, using a special tone of voice meaning “this is a starred string”, or (more plausibly) might show through their body language that some of the child’s outputs are ungrammatical – though, in reality, these things probably do not happen to any great extent.  But it is not even logically possible that a scientist could observe violations of laws of nature, accompanied by indicators that these are negative rather than positive examples of the phenomenon he aims to understand.  If an event violates a putative law of nature, that shows that there is a mistake in the formulation of the law.  Laws of nature are not like laws of the land, which are sometimes obeyed but sometimes violated.


The logic of how scientific discovery proceeds using exclusively positive evidence and no negative evidence is well understood by philosophers of science.  It was classically expounded by Sir Karl Popper (e.g. Popper [1934] 1968, 1963); as with any scholar, there are controversies about some aspects of Popper’s ideas, but at the very general level which is relevant here I believe there is little disagreement.  Most people understand that scientists attempt to formulate simple principles which succeed in accounting for the confusing welter of individual data in a given domain of study.  Popper’s insight was that simplicity is not enough, and that scientific theories must also be strong, in the sense that they must rule out large ranges of logically-possible events (while allowing all events that have actually been observed).  If scientific theorizing did not seek to maximize theory strength, then very simple theories could be formulated for any domain, saying in essence “Any conjunction of observations is possible”.  This would be akin to a child responding to positive linguistic evidence by formulating a grammar which implies that any string of words is grammatical.  A good scientific theory says something like “Satellites travel in ellipses (and only in ellipses)”:  this tells us something worth knowing, because while being compatible with the body of existing satellite observations it rules out a huge range of logical possibilities involving orbits of other shapes.


If children have no initial knowledge about specific domains, such as human language structure, then we must ascribe to them a disposition to respond to whatever phenomena they encounter after conception and birth using the same very general maxims which guide the scientist’s search for explanatory theories in unexplored domains.  In particular, we must expect children to respond to linguistic data by seeking to formulate strong grammars.  They do not need explicit negative evidence about ungrammatical strings, because they will aim to formulate sets of grammatical rules which characterize as many strings as possible as ungrammatical while permitting the various strings which have occurred in the positive data-set.


As with all subjects, there are complications, which I have skated over here.  I have discussed these ideas in more detail elsewhere (Sampson 1975: ch. 4; 2001: ch. 8).  But, in essence, this is the explanation of why children born with no innate knowledge of language structure do not need negative evidence in order to acquire a mother tongue.


It is noteworthy (cf. Bowerman 1988: 77) that some linguistic theorists have attributed a predisposition to choose “strong” grammars to children, treating this as a specifically language-related trait (the “Subset Principle”).  But philosophy of science suggests that it would be impossible to formulate successful theories in any domain without a disposition to maximize theory-strength.  If human beings have that general disposition, it is redundant also to postulate a language-specific Subset Principle.



13.  The Fodor & Crowther response


Since Janet Dean Fodor and Carrie Crowther’s contribution takes up some of the same themes that I have discussed above (and they quote other writing by me), I should like to comment on those aspects of their article.  I perforce quote their wording as it appears in the pre-final draft supplied to me in April 2001, as I was putting the finishing touches to my own contribution.


Some of Fodor & Crowther’s preliminary remarks serve only to confuse the issue.  They say that Pullum & Scholz, and I, quote Blake’s “Tyger” poem “as a useful source for learners of English” – we do not, of course.  They describe Pullum & Scholz as “using the Wall Street Journal as a source of information about language input to small children”, and do not shy away from using the words “absurdity” and “silliness” in this connexion.


When I introduced Blake’s poem into this debate, I was careful to say “Of course, I am not suggesting that experience of this particular line of poetry is actually decisive” for the child’s learning of the relevant grammatical point (Sampson 1999: 41).  The point was rather that (since many, probably most, English schoolchildren encounter the poem) this one example alone goes a long way towards undermining Chomsky’s assertion that it “strains credulity” to believe that every speaker has been provided with such evidence.


Pullum & Scholz point out, repeatedly, that the data sources they use are not ideal.  That does not mean that they are irrelevant.  Examples from the literary canon, or from the Wall Street Journal, are valid counterexamples to Chomsky’s claims that “A person might go through much or all of his life without ever having been exposed to relevant evidence”, “you can go over a vast amount of data of experience without ever finding such a case” (Chomsky in Piattelli-Palmarini 1980: 40, 115).  Those claims do not refer narrowly to spoken language heard by children.  But Pullum & Scholz acknowledge that it would be better to do this kind of research using corpora of “household conversation”.  They did not do so, because such resources were not available to them.  I hope that my British National Corpus data fill that gap adequately.  (For legal reasons, the BNC was not available outside the European Union before the publication of the new World Edition, a matter of weeks before the time at which I write.)


The fact that British linguists have for years been easily able to test usage claims against millions of words of transcribed everyday conversation, while a corresponding American National Corpus is at the time of writing only at the planning stage, says something interesting about the priorities of the discipline of linguistics as practised in the USA.  Pullum & Scholz cannot be held individually responsible for that resource lack.


Another preliminary point is that I do not understand Fodor & Crowther when, in the same passage, they describe “misparsings by children of perfectly grammatical adult utterances” as “defective” input to learners.  If the utterances are grammatical, presumably as input they are not defective, whatever mistakes children may make in dealing with their input.  But this is perhaps a side issue.


The central theme of Fodor & Crowther’s article is that they find it inappropriate for Pullum & Scholz to focus on the argument from poverty of the stimulus in the sense that Fodor & Crowther call “POPS”, the argument from “poverty of the positive stimulus” – which is very explicitly what Pullum & Scholz do.  Fodor & Crowther say “we show that the argument … as defined by Pullum & Scholz covers only POPS and excludes PONS [their term for what I have called lack of negative evidence]”; but it is redundant for Fodor & Crowther to “show” this, since Pullum & Scholz themselves spell the point out at length in the passage of their §1.1 which begins “we are not going to discuss cases based merely on positivity” – the latter being their term corresponding to Fodor & Crowther’s “PONS”.


Earlier in the same section, Pullum & Scholz note that the proponents of linguistic nativism deploy a wide variety of arguments, and they clearly identify the particular argument which they aim to address.  They, and I, quote a number of passages where Chomsky committed himself to the claim that crucial kinds of positive evidence are typically missing from the linguistic data available to a child or even to an adult throughout life; and their §3.4 gives a (non-exhaustive) list of references to many other linguists who have echoed this claim of Chomsky’s over the subsequent decades.


Because linguistic nativists resort to such diverse arguments, I have become familiar over the years with a scenario in which one patiently devotes time and effort to explaining why some apparently-significant nativist argument is in fact fallacious, only to be told that that particular argument was never meant very seriously anyway, so refuting it leaves the nativist ideology unshaken.  I have tried to counter this strategy elsewhere by combing the literature to identify all the different arguments deployed at various times by leading linguistic nativists, and assembling refutations of each of them between a single pair of covers (Sampson 1999).  But that was a book-length work and cannot be recapitulated as a whole here.  Pullum & Scholz, and I, have limited ourselves on this occasion mainly to discussing the specific topic to which (as I understand it) this special issue was intended to be devoted, namely the very influential argument that Fodor & Crowther call the argument from “poverty of the positive stimulus”.


In §12, above, I did discuss what Fodor & Crowther call the argument from “poverty of the negative stimulus”, the nativist argument which they feel Pullum & Scholz ought to have focused on.  I outlined the essence of my longer treatment in Sampson (2001: ch. 8).  In a footnote, Fodor & Crowther comment briefly on that publication, but their remark “simplicity metrics have a bad history of biting the hand that applies them” is too allusive for me to interpret easily.  (“Metrics” is Fodor & Crowther’s word; I do not recall this term appearing in relevant contexts in my writing or in Karl Popper’s expositions of the logic of scientific discovery, on which my discussion was based.)  If Fodor & Crowther are implying rejection of Popperian philosophy of science in general, this would surely need to be developed at a length going beyond the scope of a journal issue concerned with a different topic.


Fodor & Crowther are entitled to feel little personal interest in what Pullum & Scholz call the “APS” – the argument from poverty of the (positive) stimulus.  But that argument, as we have seen, features heavily in the literature of linguistic nativism.  In intellectual life generally, if a group urges the public to believe X because Y, and X is controversial and Y fallacious, it is normally appropriate to put time into spelling out what is wrong with Y.  Fodor & Crowther apparently see the “APS” as a special case in this respect, so that effort spent refuting it is effort foolishly wasted, but the only passage I can find in their article which explains that attitude is the remark “The assumption of many linguists and psycholinguists that language acquisition is guided by innate linguistic knowledge (Universal Grammar, UG) does not rest on APS (rhetoric aside)”.


I am not sure what Fodor & Crowther mean by their parenthetical “rhetoric aside”.  Are they suggesting that, although Chomsky repeatedly appeals to APS in his arguments for innate linguistic knowledge, and many other linguists have followed him in doing so, sophisticated members of the research community understand that these parts of Chomsky’s writings are to be discounted as empty verbal flourishes, so that serious critics ought to limit their discussion to other aspects of his œuvre?  If so, I (and perhaps others) would be grateful for an authoritative checklist showing which of Chomsky’s pronouncements are to be taken seriously and which are to be set aside as mere Clintonisms that depend on what one means by “is” and “the”.  When Chomsky asserts that “A person might go through much or all of his life without ever having been exposed” to specified constructions, before reading Fodor & Crowther I had taken it for granted that he meant this seriously, although it is surely false.  Actually, despite Fodor & Crowther, I still believe that Chomsky meant what he wrote; and I am sure that many other linguists have thought he was right.


Fodor & Crowther go on to state that the linguistic nativist (“UG”) hypothesis “would be virtually untouched if it were discovered that APS is … just plain wrong.  That is:  the evidence for UG would not be significantly eroded if every time a child utters or comprehends some type of sentence for the first time, an adult has previously uttered an instance of that sentence type in the child’s hearing”.


In other words, with respect to the issue that Pullum & Scholz, and I, are discussing, Fodor & Crowther do not disagree.  They hold that the hypothesis of linguistic nativism is independent of the alleged “poverty of the positive stimulus”.  Pullum & Scholz, and I, do not believe that the positive stimulus is impoverished.  All of us agree that poverty of the positive stimulus does not provide a reason to believe in linguistic nativism.  It is good to have Fodor and Crowther on board.



14.  Stimulus poverty and linguistic nativism


In one respect Pullum & Scholz seem unduly kind towards the poverty of stimulus theorists. We have seen that those theorists claim that the child’s linguistic evidence is inadequate in order to argue for linguistic nativism – the hypothesis that human beings genetically inherit innate knowledge of language structure, so that for many aspects of the mother-tongue-acquisition task they do not need experience. Pullum & Scholz are careful to say they do not challenge linguistic nativism “directly, or claim that it is false”.


As a matter of logic it is true that when p entails q, showing that p is false does not show that q is false. On the other hand, stimulus poverty is one of the main arguments that has been put forward to make linguistic nativism seem plausible; David Lightfoot (1981: 165) sees it as Chomsky’s sole argument for linguistic nativism. As a matter of scientific balance of probability rather than abstract logic, if q is a novel and surprising proposition, p is its main evidential support, and p is shown to be false, I believe most people would conclude that q is probably wrong too (and would be justified in doing so). Thus an article like Pullum & Scholz’s is in effect an argument against linguistic nativism, whether presented as such or not.


But it does not matter very much whether a refutation of the poverty of stimulus concept would be fatal for linguistic nativism, because it is easy to show that linguistic nativism is wrong.


The hallmark of traits which are genetically inherited is that they show constancy across the species with respect to features which logically and in terms of the individual’s fitness might be other than they are. Consider teeth, for instance. All human beings, irrespective of ancestry and of cultural matters such as diet, inherit a set of 32 teeth. Logically it would not be contradictory to imagine a person with more or fewer teeth, and in terms of function presumably a few more or fewer, with compensating modifications in shape to fit the jaws, might enable people to eat just as efficiently; but the number is 32, because the human genome happens to specify that number. Now consider a clearly cultural human trait, say the games and pastimes people play: cricket, rugby, and bridge, the Basques’ pelota, mah-jongg in China, and so on. Here we find great differences between cultures that have evolved independently.  Someone who had grown up in one culture but was suddenly transplanted to a different one would have to put time and effort into learning both the rules of legal play and the strategies and techniques of successful play for games in the new culture, just as young members of that culture must. These differences in knowledge are unrelated to genetic origin:  descendants of slaves transported a few generations ago from Africa to the West Indies have been among the finest cricketers in the history of the game, while descendants of slaves transported, perhaps from the very same African villages, to the United States would commonly be nonplussed about what actions were expected if they suddenly found themselves in the middle of a cricket match.


If one asks which of these two scenarios offers a better analogue to human language, it seems too obvious to be worth saying that it is the second. Separate human cultures have developed languages which are very different from one another, so that someone who travels to another country cannot understand what is said or written there, before he has put time and effort into learning the language of the new country. Ability to use a particular language depends not at all on biological descent but entirely on individual experience; a child of Chinese ancestry brought up in an English-speaking community will be a fluent speaker of English, not of Chinese, and vice versa. The different languages of the world have no common properties, beyond very general features that could hardly be otherwise in a system serving human communicative needs (for instance, every human language has words to refer to things that are crucial to human life, such as rain and food).



15.  Contingent linguistic universals


Admittedly, for a period this last point was challenged by linguistic nativists, who suggested that, despite the differences between them, all human languages did share substantial common features which were not logically necessary or required by the communicative function of language – they were more like the fact that people have 32 teeth, a feature which we could easily imagine being otherwise but which happens to be enforced by human genetics. I shall use the phrase contingent linguistic universals to refer to hypothetical properties which are common to all human languages but which are not part of what is required for a system to be a successful human communicative system, and are not a natural consequence of the environmental pressures which have moulded all languages.


Traditionally, linguists believed that there are no contingent linguistic universals. Martin Joos is well-known for his reference (1957: 96) to “the American … tradition that languages could differ from each other without limit and in unpredictable ways”, and I do not believe this intellectual tradition was in fact limited to America. More recently, with the emergence of the “Minimalist Programme”, even generative linguists, who have been the only group to argue for the existence of contingent linguistic universals, seem to be giving that belief up. To quote Peter Culicover (1999: 137-8):[13]


At least from the Aspects theory through Principles and Parameters theory it has often been remarked that the syntax of natural language has some surprising, or at least abstract, non-obvious properties.  One example is the transformational cycle, another is rule ordering, another is the ECP, and so on.  Such properties are not predictable on the basis of “common sense”, and do not appear in any way to be logically necessary.  The fact that they appear to be true of natural language thus tells us something … about the architecture of the language faculty …  Or so the argument goes.

With the M[inimalist] P[rogramme] we see a shift to a deep skepticism about formal devices of traditional syntactic theory that are not in some sense reducible to “virtual conceptual necessity”.


But between those dates there have been many linguists who believed in the reality of contingent linguistic universals; undoubtedly some still do.


However, it is very difficult to identify in the writings of such linguists specific properties which genuinely fulfil both requirements for the status of contingent linguistic universals – they could fail to apply to a system without that system thereby being obviously unsuitable to serve the tasks of human language, and yet they in fact do apply to all known human languages. I have surveyed this literature as exhaustively as I could manage (Sampson 1999: ch. 4), and I concluded that no candidate for the status of contingent linguistic universal survives scrutiny.


Sometimes, the rhetoric of contingent linguistic universals seems to depend on the fact that most English-speaking readers have limited knowledge of languages other than English, and certainly of non-Indo-European languages, so that they have little direct feeling for how different from one another human languages can in fact be. Probably the most influential recent proponent of the belief in contingent linguistic universals is Steven Pinker.  Seeking to persuade readers that children could not acquire their elders’ language without innate knowledge, he writes, for instance (Pinker 1994: 288):


logically speaking, an inflection could depend on whether the third word in the sentence referred to a reddish or a bluish object, whether the last word was long or short, whether the sentence was being uttered indoors or outdoors, and billions of other fruitless possibilities that a grammatically unfettered child would have to test for.


The suggestion is that no actual human language has a structural property as remote as these from the features (such as subject, object, tense, number) which are relevant to the grammar rules of familiar European languages, yet if human beings did not inherit innate linguistic knowledge, there would be nothing to prevent such languages existing. Many readers who are native English speakers with perhaps schoolboy knowledge of French or Spanish have doubtless found it easy to go along with Pinker’s exposition. But the truth is that, although Pinker proposes only three allegedly “innately impossible” properties, there do exist languages with a property very like at least one of these. A number of Australian aboriginal languages (see Dixon 1980: 58ff.) have the property that speakers must switch to a radically different sublanguage, with different vocabulary and to some extent different rules, when they are in the presence of their mother-in-law (or persons equivalent to mothers-in-law in the local kinship system). This seems a very close match to Pinker’s hypothetical outdoor/indoor language:  “mother-in-law v. no mother-in-law” and “indoors v. outdoors” are variables equally remote from the familiar European grammatical features.


It is no defence of Pinker’s point here to say that his specific prediction was that no community would use a language with the “indoor/outdoor property”, and the “mother-in-law property” shared by a number of Australian languages is different, so these languages do not refute Pinker’s prediction. There are only a few thousand languages spoken in the world, and they are grouped into a much smaller number of language-families whose members are similar to one another for reasons of cultural history, independent of biology. That means that the actual languages are dotted very sparsely through the space of possible languages; there are far too few independent languages to populate that space densely. Consequently, the fact that no real language exists with some very specific property does nothing to show that that property is a contingent linguistic universal enforced by Man’s biological inheritance – it is equally plausible that the limited number of real languages just happens by chance not to include one with that property. Pinker’s discussion of a hypothetical indoor/outdoor language had some force, because he implicitly invited the reader to accept that no language with the indoor/outdoor property or any property similar to it exists, and the concept “languages with properties like the indoor/outdoor property” covers a large portion of the space of possible languages.  Absence of any real language from that entire portion might indeed suggest that biology rules such languages out. But real languages are not absent from that portion of language space.


This was a case where an alleged contingent linguistic universal was false, but only data on little-known and relatively inaccessible languages showed that it was false. Some claims that have been made about contingent linguistic universals are falsifiable from data about our own language. Again I can illustrate from Pinker’s writing.


At one point Pinker (1994: 142-3) discusses plurals of “headless” compound nouns – compounds such as sabre-tooth, which is not a kind of tooth but a kind of tiger.  Pinker explains that human beings have innate mechanisms, which he describes in some detail, that ensure that headless compounds form their plurals regularly even if the last root, when occurring outside the compound, has an irregular plural:  thus teeth, but sabre-tooths.


If the generalization about headless-compound plurals were true, it might or might not amount to an argument for innate linguistic knowledge.  But it is not true.  I picked at random foot as another English root with an irregular plural, and very quickly found cases of Blackfoot Indians referred to as Blackfeet, and pinkfoot geese referred to as pinkfeet (Sampson 1999: 98).


Pinker has attempted to respond to this point (1999: 172),[14] suggesting in essence that someone who refers to Blackfeet or pinkfeet may be thinking of the referents of the words as feet rather than complete organisms, so that for such a speaker these compounds are not “headless”.  But this suggestion is either absurd or circular.  If it means that speakers really believe that American Indians or geese are disconnected feet, it is absurd – such a speaker would behave in bizarre ways, for instance not speaking to a Blackfoot since a foot has no organ of hearing, and if use of the form Blackfeet were associated with insanity of that order we would certainly have heard about it independently of linguistics.  If, much more probably, Pinker knows that speakers who say Blackfeet or pinkfeet are as aware as anyone else that these words refer to complete humans and birds, then his suggestion that they think of them as feet seems to be circular:  we would have no independent way of deciding how someone thinks of Blackfoot Indians or pinkfoot geese other than seeing how they pluralize the words.  What Pinker wrote in 1994 amounted to a clear, testable prediction:  when any language contains a word of a specifiable type, universally speakers will inflect it in a specified way.  In 1999 he is reduced to saying that speakers will either inflect a word of the relevant type in the specified way, or not.


Despite considerable searching, I have found no proposed example of a contingent linguistic universal in the literature which seems better founded than the examples I have discussed above. (These examples were candidates that are problematic because they are not universal. In some other cases, the problem is that the candidate linguistic property is not genuinely contingent. Space does not permit discussion of that type of case here, but they are dealt with at length in Sampson (1999).)  Lack of contingent linguistic universals implies lack of specific innate linguistic knowledge.



16.  Conclusion


We know what the spectrum of human languages would have to look like, if human language-acquisition were largely controlled by detailed innate knowledge of language. All human beings would speak either the same language, or a range of very similar languages, comparable in the real world to members of a language subfamily such as the Romance languages. The grammar rules and the vocabulary would both be largely constant; there might be minor differences akin to the differences between Spanish and Italian, just as there are differences of skin colour, blood-group distribution, etc., among different branches of the human race, but if so the linguistic differences would be inherited – a child of (say) Spanish-speaking parents brought up by (say) Italian speakers would speak Spanish and would never acquire Italian.  In this situation there really would be contingent linguistic universals. For instance, it might be true that all human languages had grammatical gender; no language was a tone language; in all languages the word for “fish” began with the phoneme /p/. (I have chosen hypothetical illustrative examples suggested by my limited knowledge of the Romance subfamily.)


The linguistic situation in the real world could hardly be more different from this. Everything we know about the languages of the world shows us that they are products of cultural evolution, not of biology.


If so, it follows that individuals learn their mother tongue from experience; which in turn implies that experience is an adequate source of data for language acquisition. When that last proposition has been challenged, we have seen that the challenge has failed.



Antinucci, F. & Ruth Miller  (1976)  “How children talk about what happened”.  Journal of Child Language 3.167-89.

Bickerton, D.  (1981)  Roots of Language.  Karoma (Ann Arbor, Mich.).

Bowerman, Melissa  (1988)  “The ‘no negative evidence’ problem: how do children avoid constructing an overly general grammar?”  In J.A. Hawkins (ed.), Explaining Language Universals, Blackwell (Oxford), 73-101.

Chomsky, A.N.  (1957)  Syntactic Structures.  Mouton (‘s-Gravenhage).

Chomsky, A.N.  (1976)  Reflections on Language.  Temple Smith (London).

Crain, S. & M. Nakayama  (1987)  “Structure dependence in grammar formation”.  Language 63.522-43.

Culicover, P.W.  (1999)  “Minimalist architectures”.  Journal of Linguistics 35.137-50.

Dixon, R.M.W.  (1980)  The Languages of Australia.  Cambridge University Press (Cambridge).

Gropen, J., S. Pinker, et al.  (1989)  “The learnability and acquisition of the dative alternation in English”.  Language 65.203-57.

Hart, Betty & T.R. Risley  (1995)  Meaningful Differences in the Everyday Experiences of Young Children.  P.H. Brookes (Baltimore).

Joos, M., ed.  (1957)  Readings in Linguistics.  American Council of Learned Societies (New York).

Kimball, J.P.  (1973)  The Formal Theory of Grammar.  Prentice-Hall (Englewood Cliffs, N.J.).

Lightfoot, D.  (1981)  Review of Sampson, Liberty and Language.  Journal of Linguistics 17.160-73.

Marcus, G.F.  (1993)  “Negative evidence in language acquisition”.  Cognition 46.53-85.

Nakayama, M.  (1987)  “Performance factors in subject-auxiliary inversion by children”.  Journal of Child Language 14.113-25.

Piattelli-Palmarini, M.  (1980)  Language and Learning: The Debate Between Jean Piaget and Noam Chomsky.  Routledge & Kegan Paul (London).

Pinker, S.  (1994)  The Language Instinct.  Penguin Press (London).

Pinker, S.  (1999)  Words and Rules: The Ingredients of Language.  Weidenfeld & Nicolson (London).

Popper, K.R.  ([1934] 1968)  The Logic of Scientific Discovery, rev. edn.  Hutchinson (London).  (English translation of a book originally published in 1934.)

Popper, K.R.  (1963)  Conjectures and Refutations: The Growth of Scientific Knowledge.  Routledge & Kegan Paul (London).

Quirk, R., S. Greenbaum, et al.  (1985)  A Comprehensive Grammar of the English Language.  Longman (London).

Sampson, G.R.  (1975)  The Form of Language.  Weidenfeld & Nicolson (London).

Sampson, G.R.  (1995)  English for the Computer: The SUSANNE Corpus and Analytic Scheme.  Clarendon Press (Oxford).

Sampson, G.R.  (1997)  “Depth in English grammar”.  Journal of Linguistics 33.131-51.  Reprinted as ch. 4 of Sampson (2001).

Sampson, G.R.  (1999)  Educating Eve: The “Language Instinct” Debate (revised edn).   Continuum (London and New York).

Sampson, G.R.  (2001)  Empirical Linguistics.   Continuum (London and New York).

Sampson, G.R.  (forthcoming)  “Regional variation in the English verb qualifier system”.

Shirai, Y. & R.W. Andersen  (1995)  “The acquisition of tense-aspect morphology: a prototype account”.  Language 71.743-62.

Smith, N.V.  (1999)  Chomsky: Ideas and Ideals.  Cambridge University Press (Cambridge).

Trudgill, P.  (1990)  The Dialects of England.  Blackwell (Oxford).

Yngve, V.H.  (1960)  “A model and an hypothesis for language structure”.  Proceedings of the American Philosophical Society 104.444-66.

Yngve, V.H.  (1961)  “The depth hypothesis”.  In R. Jakobson (ed.), Structure of Language and Its Mathematical Aspects (Proceedings of Symposia in Applied Mathematics, 12), American Mathematical Society (Providence, R.I.), 130-8.  Reprinted as ch. 8 of F.W. Householder (ed.), Syntactic Theory I: Structuralist, Penguin (Harmondsworth, Mddx).



[1] I am grateful to Anna Babarczy for comments on a draft of this article.  Responsibility for its shortcomings is mine alone.

[2] For the British National Corpus, see

[3] Places are identified using “geographical” (pre-1974) county boundaries.

[4] Dialect classifications use a coarse scheme which divides England into four regions, North, Midlands, South West, and South East (with boundaries following the top-level isoglosses in Trudgill 1990: 63, Map 18), and treats the other three British nations as a region each, rather than the scheme used in the BNC, which is finer-grained but unsatisfactory (see §4.3 of the CHRISTINE Corpus documentation file).

[5] The notation {unclear} means that the word overtaking was followed by material which the transcriber could not make out – but its identity is irrelevant to the point at issue.

[6] No more precise indication of location is given for this division of the BNC file, but other divisions involving the same speakers are attributed to Stoke Newington.

[7]→CHRISTINE Corpus.

[8] There are a small number of wh-questions meeting this criterion, for instance:


(6)       where’s all the parts that fell apart?  KCA.02666

(Llanbradach, Glams., Jan 1992; PS0DL, male, 32, unemployed, Welsh dialect; at home)


The 80,500 words of CHRISTINE/I contain five such cases.

[9] In context it appears that a hammer necklace is a phrase meaning “a necklace made of hammers”.

[10] Question (10d) would be normal in written English; (10a-c) would, I believe, not occur even in writing, for a variety of reasons.

[11] I assume that poverty-of-stimulus theorists would accept a 12-year-old boy as a “mature speaker” with respect to grammar.  These scholars commonly believe in a biologically-determined “critical period” for language acquisition which terminates at puberty; the current average age for onset of puberty among UK males is less than twelve.  If the example is rejected, I could give further examples from older speakers.

[12] If the quotation were exact, this example would of course be a striking refutation, by a small child, of my generalization that spontaneous speech does not contain this type of structure.  But people who write reviews of children’s puppet shows for small-town newspapers are not under an obligation to quote such remarks verbatim.  I believe the child will actually have said something like Is that a child …?, and this was converted into the relevant type of structure in the process of compiling the written review.

[13] I am grateful to James Hurford for drawing this quotation to my attention.

[14] Pinker does not explicitly refer to my counter-argument; but I take it that the passage I cite must be intended as an answer to my point, since it would stretch coincidence too far to suppose that he has independently chosen the examples Blackfeet and pinkfeet which I introduced into the debate.