The web page for this research can be found at:
http://www.jimdavies.org/research/despina/
SIRRINE2 was originally made to model high-level reflective thinking,
where the smallest operations take a few seconds.
To evaluate the quality of a model, one must compare the results of
the model to human subject data. Cognitive psychology focuses on
performance (often accuracy) and reaction time, so it is important
that a cognitive architecture is able to make predictions of these
sorts.
A four-stage retrieval/decision model (Ashcraft & Battaglia 1978) has
also been suggested. Facts are functionally represented as a table,
and the RT is proportional to the distance traveled during the
search. The table is "stretched" for larger sums (a post-hoc
adjustment) to fit the exponentially increasing RT. Next a decision is
made (comparing the result found in the table to the stimulus in a
verification task). In this model the decision takes constant time for
true probes, but for false probes a time that decreases with the size
of the split (the difference between the correct and the presented
answer). This accounts for the split effect: people can verify that a
fact is false faster the further off the presented answer is. In fact,
large splits are rejected faster than true facts are accepted in some
cases. That is, 234+321=4 is identified as false faster than
234+321=555 is identified as true.
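The decision stage described above can be sketched in Python. All of the timing constants below are illustrative assumptions, not fitted values from Ashcraft & Battaglia:

```python
# Sketch of the decision stage in the four-stage model: accepting a
# true probe takes constant time; rejecting a false probe gets faster
# as the split grows. Constants are assumed, not fitted.

TRUE_MS = 150        # assumed time to accept a correct probe
BASE_MS = 400        # assumed rejection time at the smallest split
PER_UNIT_MS = 20     # assumed speed-up per unit of split
FLOOR_MS = 100       # assumed fastest possible rejection

def decision_time_ms(correct, probe):
    split = abs(correct - probe)
    if split == 0:
        return TRUE_MS
    return max(FLOOR_MS, BASE_MS - PER_UNIT_MS * split)
```

With these numbers a wildly wrong probe bottoms out at 100 ms, faster than the 150 ms acceptance of a true probe, matching the observation that large splits can beat true facts.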
In answer production tasks, the decision part does not take place.
The revised fast-access model accounts for the previous results by
making the following modification: the probability of retrieval
failure is a function of the size of the minimum addend. With this
change one can account for the exponential increase.
Modeling Accuracy
In arithmetic experiments, accuracy and reaction time are
measured. Right now Despina never makes any mistakes, so it is unable
to replicate the mistakes humans make. It either retrieves the answer
from memory or it counts up to get it, but in either case the correct
answer is guaranteed. Lebiere's ACT-R model (1998) makes mistakes by
retrieving the wrong chunk at some point. This happens because of
partial matching and differences in activation. If 3 + 1 is retrieved
much more often than 3 + 2, then when asked what 3 + 2 is, the 3 + 1
fact may be retrieved, because the correct fact is insufficiently
active and the incorrect fact matches the goal partially. The other
way the model could make a mistake is that it could retrieve the wrong
next number at some point while counting up.
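The partial-matching mechanism can be sketched as follows. The fact base, activations, penalty, and noise level are all illustrative assumptions, not Lebiere's fitted parameters:

```python
import random

# Sketch of ACT-R-style retrieval error via partial matching: a probe
# scores every fact by its activation minus a penalty per mismatched
# slot, plus noise; the highest-scoring fact wins -- sometimes the
# wrong one.

FACTS = {(3, 1): 4, (3, 2): 5}                 # tiny illustrative fact base
BASE_ACTIVATION = {(3, 1): 2.0, (3, 2): 0.5}   # assumed: 3+1 practiced more
MISMATCH_PENALTY = 1.0
NOISE_SD = 0.7

def retrieve(a, b, rng=random):
    best, best_score = None, float("-inf")
    for (x, y), answer in FACTS.items():
        mismatches = (x != a) + (y != b)
        score = (BASE_ACTIVATION[(x, y)]
                 - MISMATCH_PENALTY * mismatches
                 + rng.gauss(0, NOISE_SD))
        if score > best_score:
            best, best_score = answer, score
    return best
```

Asked for 3 + 2, this model usually answers 5 but sometimes answers 4, because the heavily practiced 3 + 1 fact partially matches the probe and its activation advantage can exceed the mismatch penalty.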
In the current version of SIRRINE2 there is no way to model this. In
Despina the table used for finding the appropriate chunk in memory is
deterministic in that the same question will always result in the same
fact being retrieved, if any fact is retrieved at all. There is a
similar deterministic lookup table for incrementing numbers too.
SIRRINE2 could be modified to allow this kind of error. The most
straightforward and uncontroversial way to do this would be to
introduce an activation level for concept-instances or values, and to
allow spreading activation to occur. In the underlying architecture
there could be a retrieval threshold set for how activated a
concept-instance must be. This might prevent the correct fact from
being retrieved at times. So doing this would have an effect on which
strategy gets used, and might produce trials where no answer is given,
but it would never result in a wrong answer.
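A minimal sketch of this point, with assumed activations and threshold: low activation blocks retrieval and forces the counting strategy, but with a deterministic table the answer stays correct either way.

```python
# Sketch: an activation threshold in a deterministic system changes
# which strategy runs, but cannot by itself produce a wrong answer.

FACTS = {(7, 5): 12, (1, 1): 2}
ACTIVATION = {(7, 5): 0.4, (1, 1): 1.8}   # assumed: 1+1 far more practiced
THRESHOLD = 1.0

def answer(a, b):
    key = (max(a, b), min(a, b))
    if key in FACTS and ACTIVATION[key] >= THRESHOLD:
        return FACTS[key], "retrieval"
    return a + b, "count-up"              # slower fallback, still correct
```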
To get a wrong answer a question must have the potential to retrieve
different concept-instances at different times. ACT-R's method of
partial matching seems to be a straightforward solution to the
problem. Partial matching of some kind seems necessary, even if only
implicitly, through retrieval by activation level. For example, the
question is activated in the task,
and that spreads activation to facts, and the fact most activated gets
retrieved. Even in this scenario a fact wouldn't become the most
activated unless it shared features of the question. This retrieval
method could allow errors in retrieval during counting as well.
Another way to get around the problem might be to have the table refer
to a list of things to return in the order of priority. In this way
the retrieval attempts might resemble the :by slot in a task. For
example, upon seeing the problem 1 + 2, the table would try to return
1 + 2 = 3 first, and failing that return 1 + 3 = 4. This would be
consistent with the way that a task has an unchangeable order in which
it tries different strategies. On the downside, it is unclear what the
rationale would be for which concept-instances would go in the list
and what the order of the items would be.
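The priority-list idea can be sketched as below; which facts appear in a list and in what order is exactly the open question noted above, so the list here is an illustrative assumption:

```python
# Sketch of the priority-list alternative: the table maps a problem to
# an ordered list of candidate facts, tried in order, much as a task's
# :by slot fixes the order in which strategies are tried.

TABLE = {
    (2, 1): [((2, 1), 3), ((3, 1), 4)],   # try 2+1=3 first, then 3+1=4
}

def retrieve(a, b, available):
    """Return the answer of the first candidate whose fact is
    available (e.g., sufficiently active); None if none is."""
    for fact, answer in TABLE.get((max(a, b), min(a, b)), []):
        if fact in available:
            return answer
    return None
```

An error arises when the first-choice fact is unavailable and a later candidate answers in its place.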
Modeling Reaction Time
SIRRINE2 makes no explicit reaction time predictions. Different
strategies take different numbers of steps, though, and the number of
steps could be taken as roughly proportional to the amount of time
taken. For example, one could say that the entire cycle required
to count up by one takes somewhere between 20 and 400 ms. Since people
often choose the biggest number and count up, such an interpretation
would result in a model that fairly accurately fits the data (recall
that the RT curve can be fairly well modeled by the minimum of the two
numbers).
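Such a step-count-to-RT mapping can be sketched directly; both constants below are assumptions, with the cycle cost chosen from within the 20-400 ms range mentioned above:

```python
# Sketch mapping step counts to RT: counting up from the larger addend
# takes min(a, b) cycles, each assumed to cost a fixed time.

BASE_MS = 500     # assumed fixed cost (encoding, failed retrieval)
CYCLE_MS = 400    # assumed cost of one count-up cycle

def predicted_rt_ms(a, b):
    return BASE_MS + CYCLE_MS * min(a, b)
```

The prediction depends only on the minimum addend, reproducing the MIN-model shape of the RT curve.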
However, simple retrieval also seems to take a variable amount of
time (Lebiere 1998). That is, you can
retrieve facts 1 + 1 = 2 and 5 + 2 = 7, but the latter will take longer
than the former. ACT-R solves this problem by making the
retrieval time for a chunk dependent on the activation level.
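ACT-R's retrieval latency equation makes this concrete: retrieval time is F * exp(-A), where A is the chunk's activation and F is a latency scaling factor (assumed to be 1.0 here):

```python
import math

# ACT-R's retrieval latency equation: time = F * exp(-A), so a more
# active chunk is retrieved faster.

F = 1.0   # latency factor (assumed value)

def retrieval_time(activation):
    return F * math.exp(-activation)
```

A heavily practiced fact like 1 + 1 = 2 would have high activation and thus a short retrieval time; a weaker fact like 5 + 2 = 7 would take longer.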
Modeling arithmetic shows the limits of SIRRINE2. But if we intend for
it to be a cognitive architecture then tasks of this sort will need to
be dealt with sooner or later. Perhaps if high- and low-level cognitive
tasks are considered from the start, the architecture will have a
better chance of avoiding assumptions early on that will show
themselves to be unworkable when the range of phenomena to be modeled
expands. In other words, we want to avoid making theoretical claims
now which will prevent SIRRINE2 from modeling low level tasks later.
This article shows that in order for SIRRINE2 to be capable of
successfully modeling arithmetic (and likely other tasks as well), it
requires the following: 1) It needs some way to make retrieval errors,
and 2) it needs to make theoretical claims about the time it takes to
do things. When these changes are made then SIRRINE2 will be a better
cognitive architecture, capable of modeling accuracy and reaction
time, two of the most common measures in cognitive psychology.
Ashcraft, M. H., & Battaglia, J. (1978). Cognitive arithmetic:
Evidence for retrieval and decision processes in mental
addition. Journal of Experimental Psychology: Human Learning and
Memory, 4, 527-538.
Ashcraft, M. H., & Stazyk, E. H. (1981). Mental addition: A test of
three verification models. Memory & Cognition, 9, 185-196.
Groen, G. J., & Parkman, J. M. (1972). A chronometric analysis of
simple addition. Psychological Review, 79, 329-343.
Lebiere, C. (1998). The dynamics of cognition: An ACT-R model of
cognitive arithmetic. Ph.D. dissertation, Carnegie Mellon University,
Computer Science Department. Technical Report CMU-CS-98-186.
Pittsburgh, PA.
Newell, A. (1990) Unified Theories of Cognition. Harvard University
Press. Cambridge, Massachusetts.
Stroulia, E. (1994). Failure-Driven Learning as a Model-Based
Self-Redesign. Ph.D. dissertation, Georgia Institute of Technology,
College of Computing.
Stroulia, E. & Goel, A. K. (1994). Reflective self-adaptive problem
solvers. In G. S. Luc Steels & W. V. de Velde (Eds.), Proceedings
of the 1994 European Conference on Knowledge Acquisition: A Future for
Knowledge Acquisition. Germany: Springer-Verlag.
Part 2: SIRRINE2
SIRRINE2 evolved from the reasoning shell of AUTOGNOSTIC (Stroulia
1994, Stroulia & Goel 1994). It has a semi-formal TMK language which
gives an agent explicit knowledge about what it can do and how. The
tasks specify what kind of information gets input and output. The
methods describe the control of subtasks, and the knowledge is the
data that gets manipulated.

Part 3: Cognitive architectures
A cognitive architecture is a high-level modeling language that makes
claims about basic cognitive processes. Models written in an
architecture are built using the primitive elements of cognition
allowed by the architecture. For example, in ACT-R, a cognitive
architecture from John Anderson's lab (1998), there are two kinds of
memory: Procedural (consisting of productions) and Declarative
(consisting of chunks of information).
Part 4: Arithmetic
The small part of arithmetic under study is addition. There have been
several models suggested for what people do when adding. In the MIN
model (Groen and Parkman 1972) people count up from the larger
number. The curve for reaction time (RT) can be fit to the increase of
the minimum number in the problem (Ashcraft & Stazyk 1981). The
SUM-squared (Ashcraft & Battaglia 1978) model was shown to be better
than MIN, because an exponentially increasing RT is difficult to
reconcile with an increment model. In adults, most facts are
retrieved, which suggested the Fast-access model (Groen and Parkman
1972): the facts not retrieved (about 5%) need to be counted at a
400ms/increment rate. The exponential increase in RT predicted by the
SUM-squared is problematic for this model.

Part 5: Despina
Despina has two methods for answering math questions: retrieval and
counting. If retrieval is impossible because the fact in question is
not in memory, then the system uses the counting method. Despina has
a large memory of all addition and subtraction facts involving the
numbers one through ten whose result is positive. Following is an
example of Despina's math facts:
(defconcept-instance math-fact_7+5=12-concept-instance
:domain-concept math-fact-domain-concept
:symbol fact_7+5=12-value)
(setf fact_7+5=12-value '(seven plus five twelve))
The lower number is always second, and the reverse is not represented
(5+7=12 is not in the memory, only 7+5=12).
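The lookup implied by this convention can be sketched in Python: each addition fact is stored once, larger addend first, so a lookup must normalize operand order (subtraction, being non-commutative, needs no normalization). The dict is a tiny illustrative stand-in for Despina's full table:

```python
# Sketch of order-normalized fact lookup over a two-entry stand-in
# for Despina's fact table.

FACTS = {(7, 5, "+"): 12, (7, 5, "-"): 2}

def lookup(a, b, op):
    if op == "+":
        return FACTS.get((max(a, b), min(a, b), "+"))
    return FACTS.get((a, b, "-"))
```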
If asked what the sum of 7 and 5 is, Despina checks its fact
memory. There is a table indexed by the numbers and operator, pointing
to the fact if it is in memory. If it is there, it is retrieved and
output. If not, then the system counts up from the larger number
using the following count-up-method.
(defmmethod count-up-method
:transitions
(deftransition
; normally you would find the min here and make
; that the start sum. For now we will assume that
; the minimum number is given second. So set-sum
; sets the sum to the first argument.
(:initial :start
:subtask set-sum-task
:succeed s1
:fail )
; how many so far starts at zero.
(:initial s1
:subtask set-how-many-so-far-task
:succeed s2
:fail )
; set-count-up-to sets count-up-to to the second arg
(:initial s2
:subtask set-count-up-to-task
:succeed s3
:fail )
; when they are equal, the sum is correct
(:initial s3
:subtask confirm-how-many-so-far-equals-count-up-to-task
:succeed :succeed
:fail s4)
; if the sum is not correct, then increment one to the sum.
(:initial s4
:subtask increment-sum-task
:succeed s5
:fail :fail)
; Also increment how-many-so-far.
; Then check to see if the sum is correct again.
(:initial s5
:subtask increment-how-many-so-far-task
:succeed s3
:fail :fail)))
It starts with the larger number as the sum, then keeps incrementing
until the number of times it has incremented equals the number being
added. When these numbers are equal, the method has succeeded.

Part 6: Discussion
Despina can attempt to retrieve an answer to a math question and
return the answer if it is found. If that method fails, it can try its
second strategy, counting up. Could Despina be used to model
experimental data?
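Despina's two strategies, tried in order, can be sketched in Python: attempt retrieval first, and on failure count up from the larger addend as in count-up-method above. The one-entry fact table is an illustrative assumption:

```python
# Sketch of Despina's control flow: retrieval, then counting up.

FACTS = {(2, 1): 3}   # tiny stand-in for the fact table

def count_up(a, b):
    total, so_far, to_add = max(a, b), 0, min(a, b)
    while so_far != to_add:   # confirm-how-many-so-far-equals-count-up-to
        total += 1            # increment-sum
        so_far += 1           # increment-how-many-so-far
    return total

def add(a, b):
    fact = FACTS.get((max(a, b), min(a, b)))
    return fact if fact is not None else count_up(a, b)
```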
Conclusion
SIRRINE2 was designed to model high-level thinking on the order of
seconds or longer. Arithmetic was chosen because it was a fairly
simple phenomenon with interesting properties (retrieval, multiple
strategies, etc.) and because it had already been modeled in ACT-R,
which allows comparison. Arithmetic may not be a fair test because
SIRRINE2 was not designed to model cognition at this level; we are
asking SIRRINE2 to do something it was not meant to. Soar and ACT-R
focus on theoretical claims of mental events that take a second or
less (Newell 1990, pp 80-1), leaving higher level mental events for
the modeler to program. SIRRINE2, on the other hand, can be looked at
as an architecture that approaches cognition from the other
direction. That is, SIRRINE2 makes claims about high level cognition,
leaving the low level details ambiguous. Unlike other architectures,
though, SIRRINE2 does not allow the modeler to specify the details of
the lower level cognition. The quality of SIRRINE2 as a
cognitive architecture should be determined through an examination of
the kinds of models it was meant to run.

Part 7: References
Anderson, J. & C. Lebiere (1998). The Atomic Components of
Thought. Lawrence Erlbaum: Mahwah, NJ.
Jim Davies (jim@jimdavies.org)
Last modified: Mon Apr 24 14:22:14 EDT 2000