Berwick, R. C., Pietroski, Paul, Yankama, Beraca & Chomsky, Noam (2011). Poverty of the Stimulus Revisited. Cognitive Science, 35, 1207--1242.

Author  =      {Berwick, Robert C. and Pietroski, Paul and Yankama, Beraca and Chomsky, Noam},
Title   =      {Poverty of the Stimulus Revisited},
Journal =      {Cognitive Science},
Year    =      {2011},
Volume  =      {35},
Pages   =      {1207-1242},
Month   =      {April}

Author of the summary: Michael Bell, 2012, Michael_c_bell@hotmail.com

Cite this paper for:

   - Innate schematism of the mind associated with language acquisition
   - Structure dependence of grammatical rules
   - Empirical foundations of grammar
   - Refutation of string substitution for acquisition of language
   - Refutation of Bayesian model selection of grammars
   - Analysis of learning from bigrams, trigrams, and neural networks


Which expression-generating procedures do children acquire, and how do children acquire these procedures?

The article considers how environmental stimuli underdetermine developmental outcomes.

Many poverty of the stimulus (POS) arguments are based on knowledge acquired with very limited experience.

Innate structure dependence of grammatical rules for language acquisition:

It is shown that infants can, with ease, acquire languages by picking out “language-related” data from the other, complicating data that external stimuli provide.

Language acquisition depends on four typically interacting factors:

1)	Innate, domain-specific factors
2)	Innate, domain-general factors
3)	External stimuli, such as nutrition, modification of visual input in very early life, exposure to distinct languages such as Japanese versus English, or the like; and
4)	Natural law, for example, physical constraints such as those determining that dividing cells form spheres rather than rectangular prisms

	Of these, (1) is said to be crucial, while (2) and (3) weigh differently depending on the language being acquired. The argument also relies on the relation between declarative sentences and “polar interrogatives” (yes-no questions).

"Language acquisition is a process of acquiring a capacity to produce just the valid word strings of a language (1212)"

This means that acquiring a language is more than just pairing word strings with interpretations of those words. An example of this is discussed in the article and is outlined below:


(5a) Can eagles that fly eat?
(5b) Eagles that fly can eat
(5c) Eagles that can fly eat

(5a) asks whether eagles that fly can eat, whereas (5c) states that eagles that can fly do eat.

(5b) is the direct answer to the question in (5a), whereas (5c) means something different. But how do speakers of English know that the question posed in (5a) is unambiguous, that is, that it asks about (5b) and never about (5c)? There must be some underlying grammatical structure that cannot be seen by simply analyzing the data. The next example, outlined below, explains this phenomenon further.

(10) The boy saw the man with binoculars
(10a) The boy [saw [the [man [with binoculars]]]]
(10b) The boy [[saw [the man]] [with binoculars]]

This example shows that the same word string has two structures: either the man had binoculars, as in (10a), or the boy used binoculars to see the man, as in (10b).
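
The two bracketings in (10) can be compared mechanically. The following is a minimal Python sketch, not taken from the article; the helper names `parse_brackets` and `flatten` are hypothetical and chosen only for illustration. It reads each bracketed form into nested lists and checks that the two readings spell out the same word string while differing in structure.

```python
def parse_brackets(text):
    """Turn a bracketed string such as 'The boy [saw [the man]]' into nested lists."""
    tokens = text.replace("[", " [ ").replace("]", " ] ").split()
    stack = [[]]  # stack[-1] is the constituent currently being built
    for tok in tokens:
        if tok == "[":
            stack.append([])
        elif tok == "]":
            if len(stack) == 1:
                raise ValueError("unbalanced closing bracket")
            done = stack.pop()
            stack[-1].append(done)
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise ValueError("unbalanced opening bracket")
    return stack[0]


def flatten(tree):
    """Return the words of a nested-list tree in left-to-right order."""
    words = []
    for node in tree:
        if isinstance(node, list):
            words.extend(flatten(node))
        else:
            words.append(node)
    return words


reading_a = parse_brackets("The boy [saw [the [man [with binoculars]]]]")
reading_b = parse_brackets("The boy [[saw [the man]] [with binoculars]]")

# Same word string, different grouping of "with binoculars".
same_string = flatten(reading_a) == flatten(reading_b)
same_structure = reading_a == reading_b
print(same_string, same_structure)  # prints: True False
```

In the first reading “with binoculars” is grouped with “man”; in the second it is grouped with the whole phrase “saw the man”. A learner that only sees the flat word string has no direct access to this difference.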

This same form of language construction also appears in languages exhibiting a different word order (such as VSO).

Three recent proposals that aim to promote learning from contingent experience are discussed (and refuted) in this article:

1)	String substitution for acquisition
2)	Bayesian model selection of grammars (Perfors et al, PTR’s model)
3)	Learning from bigrams, trigrams and neural networks

String substitution for acquisition: (Harris, 1951; Clark and Eyraud, 2007)

Given sentences like:
(37a) Men are happy
(37b) Are men happy?
(37c) Are men who are tall happy?
(37d) *Are men who tall are happy?

Comparing these examples, (37b) is the question that corresponds to (37a), formed by exchanging the positions of “men” and “are”. The substitution approach treats two strings as interchangeable when they occur in the same context; this is called weak substitutability. It does not, however, work with regard to (37c) and (37d).

Substituting strings in that context does not lead to a coherent result, and the approach is refuted in this article for the following two reasons:

1)   It does not work for English, Dutch, or other natural languages
2)   It does not address the original POS question as to which interpretations and grammatical forms could be competently paired with word strings
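
The idea of weak substitution discussed above can be made concrete with a small Python sketch. It assumes the usual reading of the term, in which two strings are weakly substitutable when they share at least one context (the same words to the left and to the right). The function names and the two-sentence toy corpus are hypothetical; the second sentence is not one of the article's numbered examples.

```python
from itertools import combinations


def contexts(sentence):
    """Yield (substring, (left, right)) for every non-empty contiguous substring."""
    words = tuple(sentence.split())
    n = len(words)
    for i in range(n):
        for j in range(i + 1, n + 1):
            yield words[i:j], (words[:i], words[j:])


def weakly_substitutable_pairs(corpus):
    """Return pairs of distinct substrings that share at least one context."""
    by_context = {}
    for sentence in corpus:
        for sub, ctx in contexts(sentence):
            by_context.setdefault(ctx, set()).add(sub)
    pairs = set()
    for subs in by_context.values():
        for u, v in combinations(sorted(subs), 2):
            pairs.add((u, v))
    return pairs


# Toy data (an assumption for illustration only).
corpus = ["men are happy", "men who are tall are happy"]
pairs = weakly_substitutable_pairs(corpus)

# "men" and "men who are tall" both occur before "are happy".
print((("men",), ("men", "who", "are", "tall")) in pairs)  # prints: True
```

A learner of this kind would then be free to replace “men” with “men who are tall” elsewhere, for example in (37b) to obtain (37c). The sketch only finds the shared contexts; it says nothing about which interpretations go with the resulting strings, which is the second objection listed above.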

Bayesian model selection of grammars: (Perfors et al [PTR], 2011)

"There is an evaluation metric for grammars, a proxy for grammar size that estimates a grammar’s a priori probability as directly proportional to the number of symbols in that grammar. The priori probability is then adjusted by how well it "fits" the data it receives:"

"The a priori grammar probability is multiplied by that grammar’s empirical likelihood"

The result of this computation is the posterior probability of a particular grammar. The most highly valued grammar, and the one selected, is the one with the highest posterior probability.

>What this means is that you take into account the number of symbols in a grammar (its size) and how well that grammar fits the data, among other factors, and you obtain a single value at the end: the posterior probability of that particular grammar. Innate language structures in our mind are then modified to accommodate these results in order to comprehend the language.
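
The selection step described above (prior multiplied by likelihood, highest posterior wins) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the two candidate grammars, and all of the numbers are made up, and the priors are simply given as inputs rather than computed from grammar size as in PTR's model.

```python
import math


def posterior_over_grammars(candidates):
    """Map each grammar name to its posterior probability.

    `candidates` maps a name to (prior, likelihood). Both values are
    hypothetical inputs here, not figures from PTR or the article.
    """
    log_scores = {
        name: math.log(prior) + math.log(likelihood)
        for name, (prior, likelihood) in candidates.items()
    }
    # Normalize in log space to avoid underflow with tiny likelihoods.
    top = max(log_scores.values())
    log_total = top + math.log(sum(math.exp(s - top) for s in log_scores.values()))
    return {name: math.exp(s - log_total) for name, s in log_scores.items()}


candidates = {
    "grammar_A": (0.6, 1e-12),  # higher prior, poorer fit to the data
    "grammar_B": (0.4, 3e-12),  # lower prior, better fit to the data
}
posterior = posterior_over_grammars(candidates)
best = max(posterior, key=posterior.get)
print(best)  # prints: grammar_B
```

With these made-up numbers the better fit outweighs the lower prior, so `grammar_B` is selected. Note that the computation ranks grammars by how well they generate word strings; nothing in it pairs those strings with interpretations.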

>This proposal is refuted because it does not target the question of how children separate ‘language data’ from other external stimuli. Note also that PTR’s model does not accommodate examples like ‘Can eagles that fly eat?’, mentioned above in (5a).

Learning from bigrams, trigrams and neural networks: (Reali and Christiansen, RC)

RC proposed three models for the acquisition of yes-no questions:

1)	Bigram statistical model
2)	Trigram statistical model
3)	Simple recurrent neural network (SRN) model

Bigram model (Experiment 1):
-	Used child directed speech as training data
-	Bigram likelihood calculation successfully chose the correct grammatical form 96% of the time

   Refuted in this article because the model does not account for the fact that the words “who” and “that” are textually ambiguous, which could bias the data from the experiment.
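
To make the bigram likelihood calculation concrete, here is a minimal Python sketch. It is not RC's implementation: the four-sentence training corpus is invented, the function names are hypothetical, and add-one smoothing is an assumption chosen for simplicity. The model scores two candidate questions and prefers the one with the higher likelihood.

```python
import math
from collections import Counter


def train_bigrams(sentences):
    """Count unigrams and bigrams, with <s> and </s> marking sentence edges."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    return unigrams, bigrams


def log_likelihood(sentence, unigrams, bigrams):
    """Add-one smoothed bigram log-likelihood of a sentence."""
    vocab_size = len(unigrams)
    words = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


# Invented toy corpus standing in for child-directed speech (an assumption).
corpus = [
    "who is tall",
    "the man who is tall is happy",
    "is the man happy",
    "the boy who is here is happy",
]
unigrams, bigrams = train_bigrams(corpus)

good = log_likelihood("is the man who is tall happy", unigrams, bigrams)
bad = log_likelihood("is the man who tall is happy", unigrams, bigrams)
print(good > bad)  # prints: True
```

In this toy setting the grammatical question wins mainly because the pair “who is” is frequent, and one of its occurrences comes from the question “who is tall”, where “who” plays a different role. The counts are over written forms only, so the model cannot tell those uses apart.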

Trigram model:
-	Calculated sentence likelihoods according to 3-word frequencies
-	Likelihood calculated using the formula mentioned above for PTR’s model

   Refuted because further tests show that the high accuracy of the bigram and trigram models in distinguishing grammatical sentences from ungrammatical ones is mostly due to the accidental homography between pronouns and complementizers in English. What this means is that the same written form serves in both roles, so counts of word sequences cannot tell the two uses apart. This finding undermines the data supporting the experiment conducted.

Learning from a simple recurrent network:
-	Contained a hidden “context” layer
-	Network still relies on bigram statistics for distinguishing grammatical sentences from ungrammatical sentences

This, however, fails to cope with more complex strings of words, since words were mapped to a finite, restricted set of categories (including DET, PRON, V, and ADJ) and the network uses their order to predict the output.
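
The role of the hidden “context” layer can be illustrated with a small Python sketch of a simple recurrent network's forward pass. This is a sketch under loud assumptions, not RC's network: the weights are random and untrained, the hidden size of 5 is arbitrary, the class and helper names are hypothetical, and only the four categories named above are used.

```python
import math
import random

CATEGORIES = ["DET", "PRON", "V", "ADJ"]  # categories named in the summary


def one_hot(category):
    return [1.0 if c == category else 0.0 for c in CATEGORIES]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(values):
    top = max(values)
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class SimpleRecurrentNetwork:
    """Untrained Elman-style network; the context layer copies the previous hidden state."""

    def __init__(self, n_hidden=5, seed=0):
        rng = random.Random(seed)
        n = len(CATEGORIES)
        self.w_in = random_matrix(n_hidden, n, rng)
        self.w_context = random_matrix(n_hidden, n_hidden, rng)
        self.w_out = random_matrix(n, n_hidden, rng)
        self.context = [0.0] * n_hidden

    def step(self, category):
        """Consume one category and return a distribution over the next category."""
        from_input = matvec(self.w_in, one_hot(category))
        from_context = matvec(self.w_context, self.context)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_context)]
        self.context = hidden  # copied back for use at the next time step
        return softmax(matvec(self.w_out, hidden))


net = SimpleRecurrentNetwork()
for cat in ["DET", "ADJ", "V"]:
    prediction = net.step(cat)
print(len(prediction))  # prints: 4
```

The network only ever sees a sequence of category labels and predicts the next label from the current input plus its copied-back context. Because the weights here are untrained, the particular probabilities are meaningless; the point is the flow of information, which is over category order rather than hierarchical structure.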


The article rebuts recent challenges to POS arguments, showing that the recently proposed models cannot directly explain language acquisition.
