Russell, S. (1997). Rationality and Intelligence. Artificial
Intelligence, 94, 57–77.
@article{russell1997rationality,
  author  = "Stuart J. Russell",
  title   = "Rationality and Intelligence",
  journal = "Artificial Intelligence",
  year    = "1997",
  volume  = "94",
  pages   = "57--77",
}
Author of the summary: Neil Conway, 2006, neilc NO-SPAM
This paper is available online: http://www.cs.berkeley.edu/~russell/papers/aij-cnt.pdf.
Cite this paper for:
bounded optimality, asymptotic bounded optimality (ABO):
definitions, motivations, and future directions
defining "rational" behaviour in the face of constraints on
execution time and machine resources
an overview of the definitions of rational behaviour as
proposed by different schools within AI
philosophical foundations of AI; what does the "intelligence" in
"artificial intelligence" mean?
metareasoning and agent design: how do we balance the time spent
on reasoning vs. metareasoning?
The "long-term goal" of AI is defined by the author to be "the
creation and understanding of intelligence" [p1]. But this raises
the question: how is "intelligence" defined? Can we find a
general way to characterize intelligent behaviour? If this cannot
be done, we risk the balkanization of AI into a set of related
specializations rather than an integrated field of study [p2].
The author proposes four candidate definitions of
intelligence: perfect rationality, calculative
rationality, metalevel rationality, and bounded optimality.
The definitions will be compared in terms of the class of
systems that satisfy the definition, and the sorts of research
directions that the definition encourages.
The author advocates the "agent-based" approach to AI. That is, AI
is considered to be the study of systems that "do the right
thing", for some definition of "right" [p3].
An agent is defined by the mapping from sequences of percepts to actions.
This approach to AI focuses on the agent's behaviour, without
making assumptions about the process used by the agent to
decide on what actions to take: "What counts in the first
instance is what the agent does, not necessarily what it
thinks, or even whether it thinks at all" [p3].
A perfectly rational agent always acts in such a way as to have
the maximum expectation of success given the information available
to it.
Constructing a perfectly rational AI is impossible, regardless of
the speed of computational devices: "physical mechanisms take time
to process information and select actions, hence the behaviour of
real agents cannot immediately reflect changes in the environment
and will generally be sub-optimal" [p6].
Perfect rationality "can be used to establish an upper bound on
the performance of any possible system," by comparing the
performance of the system with the theoretical performance of a
perfectly rational agent in the same environment [p5].
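As a concrete illustration of this decision rule, here is a minimal sketch with an invented outcome model and utilities (none of these names come from the paper): the perfectly rational agent picks the action that maximizes expected utility, entirely ignoring the time needed to compute that maximum.

```python
# Toy sketch of perfect rationality (all values invented for illustration).
ACTIONS = ["stay", "move"]

# outcome_model[action]: list of (probability, outcome) pairs
outcome_model = {
    "stay": [(1.0, "safe")],
    "move": [(0.7, "goal"), (0.3, "pit")],
}

utility = {"safe": 0.0, "goal": 10.0, "pit": -50.0}

def expected_utility(action):
    return sum(p * utility[o] for p, o in outcome_model[action])

def perfectly_rational_choice():
    # Note what is missing here: any account of how long this max takes.
    return max(ACTIONS, key=expected_utility)

print(perfectly_rational_choice())   # "stay" beats "move" (EU 0.0 vs about -8.0)
```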
"Calculative rationality is displayed by programs that, if
executed infinitely fast, would result in perfectly rational
behaviour" [p6]. This definition of rationality was used for
much of the early work in AI.
Problem: many calculatively rational agents need to solve
intractable decision problems (NP-hard problems and similar).
Correctly solving such problems can require enormous
computational effort; barring an unexpected breakthrough such as
a proof that P=NP, this is unlikely to change.
This violates the assumption that the program implementing this
behaviour can be executed infinitely fast, even given
enormous improvements in machine resources.
One workaround is to ignore reasoning tasks that require
exponential time (Levesque 1986, Levesque and Brachman 1987). (The
assumption that we can solve polynomial-time problems infinitely
quickly is at least somewhat more plausible.)
The most common solution is to employ various approximations and
domain-specific performance improvements to attempt to achieve
reasonable behaviour [p8]. Much AI research could be described
in this way.
This basically involves accepting that "intelligent agents will
be unable to act rationally in all circumstances" under this
definition of rational behaviour — most of these
techniques do not guarantee optimality [p8].
The "standard methodology" is to "first build calculatively
rational agents and then speed them up" [p22].
The author is skeptical of the value of this technique: the
algorithm we would choose if it could be executed infinitely
fast may bear little or no relation to the optimal algorithm
given finite time and resources [p22].
The preceding definitions of intelligence do not take into
account the time required by the agent to select a rational
action: they assume that the agent will choose the "right" action
instantaneously.
This is inconsistent with many realistic environments. The time
that the agent takes to choose an action may well be non-trivial,
and the time taken to make this decision often influences the
outcome. Metalevel rationality is defined by Good 1971 as "the
maximization of expected utility taking into account
deliberation costs".
Metalevel rationality is typically implemented via a metalevel
architecture [p8], where the agent is divided into two levels:
The "object level carries out computations concerned
with the application domain" [p8]. These are the computations
that allow the agent to perform whatever actions it is intended to perform.
The "metalevel is a second decision-making process whose
application domain consists of the object-level computations
themselves and the computational objects and states that they affect".
In the metalevel, object-level computations are viewed as
actions: their cost is the time taken to perform the action, and
the benefit is whatever improvement in decision quality the
action yields. Reasoning at the metalevel includes reasoning
about how long to spend making the decision, how to allocate the
available time among the available computational actions, and how
to order those actions.
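The computations-as-actions view can be sketched as a simple stopping rule. Everything below — the diminishing-returns gain function and the time-cost constant — is an invented toy, not the paper's formalism:

```python
# Toy metalevel control: each object-level computation is treated as an
# action whose cost is deliberation time and whose benefit is an assumed
# gain in decision quality. The metalevel stops deliberating once the next
# computation is no longer worth its time cost. All numbers are invented.

TIME_COST_PER_STEP = 0.5   # utility lost per unit of deliberation time

def expected_gain(step):
    # Assumed diminishing returns: later computations improve expected
    # decision quality less than earlier ones.
    return 4.0 / (step + 1)

def metalevel_deliberate(max_steps=20):
    steps = 0
    while steps < max_steps:
        # Metalevel decision: is one more object-level computation worth it?
        if expected_gain(steps) <= TIME_COST_PER_STEP:
            break
        steps += 1   # perform one more object-level computation
    return steps

print(metalevel_deliberate())   # stops after 7 computations
```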
One area of debate is whether metareasoning should incorporate
domain-specific knowledge and reasoning techniques.
In systems with domain-specific meta-reasoning, "metaknowledge
is a sort of 'extra' domain knowledge ... that one has to add
to an AI system to get it to work well" [p10].
However, we can reason about the cost and effect of
computations in a domain-independent fashion: "in principle, no
additional domain knowledge is needed to assess the benefits of
a computation" [p10].
Performing metareasoning from first principles is extremely
expensive, so most practical systems incorporate
domain-specific knowledge in some form. It might also be
possible to learn this knowledge from experiences with
computations in a particular application domain.
There is much room for future work in this area. For example,
current metalevel systems are myopic: they perform
metareasoning for a single computational action at a time. This
is not always ideal: "the value of a computation may not be
apparent as an improvement in decision quality until further
computations have been done" [p10].
The problem of selecting the optimal sequence of computational
actions is usually intractable — the author suggests that
machine-learning techniques might be employed to work around this.
Another problem with metalevel reasoning is that it does not
account for the time taken to perform metareasoning itself. "Since
metareasoning is expensive, it cannot be carried out optimally";
therefore, a tradeoff between time and optimality of metareasoning
must be made [p10].
This tradeoff cannot be made by introducing a "metametalevel",
as that raises the question of how much metametareasoning ought
to be done. Introducing more levels of reasoning just causes
"conceptual regress" [p11].
Bounded optimality was informally defined by Horvitz 1989 as "the
optimization of computational utility given a set of assumptions
about expected problems and constraints on resources" [p11].
These assumptions are defined by the machine the program is run on.
This definition builds on the complementary notion of
feasibility, as defined by Russell and Subramanian
1995. The set of all agent functions that can be implemented
by some program running on a specified machine is the set of
feasible agents for that machine. The best agent from the set
of feasible agents behaves with "bounded optimality" for the
specified machine [p11].
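The feasible-set construction can be illustrated with a deliberately tiny toy machine (the percepts, actions, and environment below are invented, not from the paper):

```python
import itertools

# Toy feasibility: suppose the "machine" can only run lookup-table programs
# over two percepts, so the feasible set is the four possible
# percept-to-action tables. The bounded-optimal agent for this machine is
# simply the best-scoring member of the feasible set.

PERCEPTS = ["hot", "cold"]
ACTIONS = ["open", "close"]

def score(program):
    # Assumed environment: "open" is correct when hot, "close" when cold.
    right = {"hot": "open", "cold": "close"}
    return sum(1 for p in PERCEPTS if program[p] == right[p])

feasible_agents = [
    dict(zip(PERCEPTS, actions))
    for actions in itertools.product(ACTIONS, repeat=len(PERCEPTS))
]

bounded_optimal = max(feasible_agents, key=score)
print(bounded_optimal)   # {'hot': 'open', 'cold': 'close'}
```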
One problem with bounded optimality results is that they are
specific to a single formal machine [p13].
This problem is analogous to that faced in the analysis of
algorithm performance: we typically want to prove performance
results about algorithms that are independent of the details of
a particular machine configuration. This is typically achieved
using the O() notation.
Russell and Subramanian 1995 introduce a similar idea to allow
the construction of robust bounded optimality analysis.
Worst-case asymptotic bounded optimality (ABO)
means roughly that the agent program "is basically along the
right lines if it just needs a faster (larger) machine to have
worst-case behavior as good as that of any other program in all
environments".
This is the definition of intelligence preferred by the author.
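One way to picture the "faster machine" clause, using runtimes as a crude stand-in for the paper's more general performance measure (the runtime functions and constants are invented):

```python
# Toy runtime analogy for worst-case ABO: program A counts as "along the
# right lines" if some constant machine speedup k makes its behavior at
# least as good as program B's on every input.

def runtime_A(n):
    return 5 * n + 10   # slower by a constant factor plus overhead

def runtime_B(n):
    return n

def matches_with_speedup(k, sizes):
    # Running A on a machine k times faster takes runtime_A(n) / k.
    return all(runtime_A(n) / k <= runtime_B(n) for n in sizes)

print(matches_with_speedup(20, range(1, 1000)))   # a 20x machine suffices
```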
The remainder of the paper sketches some areas for future research
related to bounded optimality and ABO agents.
Calculative rationality improvements
Much of the prior work on calculative rationality has assumed that
the environment is fully observable, deterministic, static, and discrete.
While there has been some work for defining rational behaviour in
more uncertain domains (such as POMDPs for stochastic,
partially-observable environments), the author suggests that more
fundamental work is needed here. Defining calculatively rational
behaviour for a more general class of environments would provide
the "bricks, beams, and mortar with which AI architects can build
the equivalent of cathedrals" [p14].
There is a crude methodology for proving that agent programs are
asymptotically bounded optimal:
Define a virtual machine that can execute a set of agent programs.
The programs are divided into subsets — in current
research, agents are often divided into "architectural classes"
based on their structural properties [p20].
Comparisons can be made between the best programs in each
subset. For example, one class of agents might be shown to be
asymptotically dominated by another class [p20].
For practical problems, this is difficult to do, and yields
results that are too specific to the details of the proof
methodology. A research goal is to enable bounded optimality
results that are more general [p17].
Earlier work has identified some techniques for optimizing the
temporal behaviour of an agent's components (e.g. Zilberstein and
Russell 1996). While these techniques are currently somewhat
primitive, the important point is that they are domain
independent: they are unrelated to the content of the components
they are optimizing [p17].
Therefore, these techniques "enable bounded optimality results
that do not depend on the specific temporal aspects of the
environment class" -- that is, they are domain-independent tools
for building ABO agents [p17].
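In the spirit of these domain-independent techniques, here is a small sketch of optimizing a time allocation from performance profiles of two anytime components; the profile shapes and the multiplicative composition rule are invented for illustration:

```python
import math

# Grid-search the split of a fixed deadline between two anytime components,
# given assumed performance profiles (solution quality vs. compute time).
# The optimization uses only the profiles, not the components' contents.

def profile_a(t):
    return 1 - math.exp(-t)        # fast-improving component

def profile_b(t):
    return 1 - math.exp(-0.5 * t)  # slow-improving component

def best_split(total, steps=1000):
    best_t, best_q = 0.0, -1.0
    for i in range(steps + 1):
        t = total * i / steps
        q = profile_a(t) * profile_b(total - t)  # assumed composition rule
        if q > best_q:
            best_t, best_q = t, q
    return best_t, best_q

t, q = best_split(10.0)
print(round(t, 2), round(q, 3))    # the slower component gets most of the time
```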
Another way to optimize an agent's temporal behaviour is via
machine learning techniques such as inductive learning,
reinforcement learning, and knowledge compilation (Russell and
Subramanian 1995, Laird et al. 1986).
Offline vs. online
An offline mechanism for creating bounded-optimal agents is
independent of the agent itself (and is therefore not subject to
bounded optimality constraints). A typical offline agent
constructor would create agents that are ABO for a specific
class of environments.
In online agent construction, the construction mechanism is
a part of the agent itself. This allows the agent to be more
flexible: an agent can be constructed that is bounded optimal for
whatever environment the agent finds itself in [p19].
Variable agent architecture
Agents that learn can adapt their behavior, but prior research on
adaptation has usually applied learning techniques only to select
components of the agent. Given finite memory, there are problems
where "almost complete revision of the agent's mental structure is
needed to achieve good performance" [p20] -- that is, learning
techniques would be applied to the entirety of the agent's mental structure.
Summary author's notes:
Page numbers are from downloaded version.