http://www.jimdavies.org/summaries/

- Rational metareasoning
- Decision theoretic meta-control
- Meta-greedy strategy
- Single-step assumption

It discusses how decision-theoretic principles can be applied to meta-reasoning, which assumptions about reasoning are needed, and what implications follow. More importantly, it asks what information can be used and which simplifications are necessary to do meta-reasoning.

- Computations as internal actions, to be selected among based on their expected utilities
- A computation's utility comes from its expected effects: the passage of time (and the accompanying environmental changes) and the revision of the agent's intended external actions

+ Computation cannot make differences

+ Utility of each outcome unknown

* Basic formula: `E[U([A])] = \sum_k P(W_k) U([A, W_k])`

* Value of computation: `V(S) = U([S]) - U(\alpha)`

* For a complete computation, `S` is followed by an external action: `U([S]) = U([\alpha_S, [S]])`

* For a partial computation: `U([S]) = \sum_T P(T) U([\alpha_T, [S.T]])`
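
The expected-utility and value-of-computation formulas above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names and all numbers are hypothetical, and utilities are assumed to be available as plain numbers.

```python
from typing import List, Tuple

Weighted = List[Tuple[float, float]]  # (probability, utility) pairs


def expected_utility(outcomes: Weighted) -> float:
    """E[U([A])] = sum_k P(W_k) * U([A, W_k])."""
    return sum(p * u for p, u in outcomes)


def utility_of_partial(continuations: Weighted) -> float:
    """U([S]) = sum_T P(T) * U([alpha_T, [S.T]]) over possible continuations T."""
    return expected_utility(continuations)


def value_of_computation(u_with_s: float, u_default: float) -> float:
    """V(S) = U([S]) - U(alpha)."""
    return u_with_s - u_default


# Hypothetical numbers, chosen only for illustration.
u_action = expected_utility([(0.25, 8.0), (0.75, 4.0)])
u_partial = utility_of_partial([(0.5, 6.0), (0.5, 2.0)])
v = value_of_computation(u_partial, 3.0)
```

A positive `v` means the computation is worth more than acting immediately on the default action `\alpha`.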

* Ideal control algorithm:

+ Keep performing the available computation `S` with the highest `V(S)` until all values are negative

+ Do action `\alpha`
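
The control loop just described can be sketched as follows. This is a toy rendering under assumptions that are not in the source: `V(S)` is taken to be exactly computable, each computation is run at most once, a value of exactly zero is treated as a reason to stop, and the names `ideal_control`, `study_left`, and `study_right` are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def ideal_control(
    computations: Iterable[str],
    value: Callable[[str], float],
    perform: Callable[[str], None],
    best_action: Callable[[], str],
) -> str:
    """Run the highest-valued computation until none has positive value, then act."""
    remaining: List[str] = list(computations)
    while remaining:
        best = max(remaining, key=value)
        if value(best) <= 0:  # assumption: stop on zero as well as negative value
            break
        perform(best)
        remaining.remove(best)  # assumption: each computation runs at most once
    return best_action()


# Toy problem: utility estimates for two external actions.
estimates: Dict[str, float] = {"left": 1.0, "right": 0.5}
# Hypothetical computations: (action revised, revised estimate, time cost).
effects: Dict[str, Tuple[str, float, float]] = {
    "study_right": ("right", 2.0, 0.3),
    "study_left": ("left", 1.1, 0.5),
}
performed: List[str] = []


def value(name: str) -> float:
    """V(S): utility of the best action after S, minus cost, minus current best."""
    action, revised, cost = effects[name]
    after = dict(estimates)
    after[action] = revised
    return max(after.values()) - cost - max(estimates.values())


def perform(name: str) -> None:
    action, revised, _ = effects[name]
    estimates[action] = revised
    performed.append(name)


def best_action() -> str:
    return max(estimates, key=lambda a: estimates[a])


chosen = ideal_control(effects, value, perform, best_action)
```

In this toy run only `study_right` has positive value, so it is the only computation performed before the agent acts.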

* Approximation is unavoidable: the utilities are not known in advance, so they have to be estimated. One simplification (approximation) that usually holds in this model is a separate time cost.

* Time and its cost, derivation as follows

+ Intrinsic utility: `U([A, [S]]) = U_I(A) - C(A, S)`

+ Ideally, `C(A, S) = C(S)`, independent of `A`

+ Further, assume only the duration of `S` matters: `C(S) = TC(|S|)`

+ Benefit of computation: `\Delta(S) = U(\alpha_S) - U(\alpha)`

+ The value of computation `S` is then: `V(S) = \Delta(S) - TC(|S|)`
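
The separation of benefit and time cost can be illustrated with a short sketch. The linear form of `TC` and its rate are assumptions made only for this example; the summary does not say what shape the time-cost function has.

```python
def time_cost(duration: float, rate: float = 0.5) -> float:
    """Hypothetical linear time cost TC(|S|) = rate * |S| (an assumption)."""
    return rate * duration


def benefit(u_after: float, u_default: float) -> float:
    """Delta(S) = U(alpha_S) - U(alpha)."""
    return u_after - u_default


def net_value(u_after: float, u_default: float, duration: float) -> float:
    """V(S) = Delta(S) - TC(|S|)."""
    return benefit(u_after, u_default) - time_cost(duration)


# The same benefit is worth having after a short computation but not a long one.
short_run = net_value(3.0, 2.0, duration=1.0)
long_run = net_value(3.0, 2.0, duration=4.0)
```

With these made-up numbers the benefit is identical in both cases, and only the duration decides whether the computation pays off.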

* Estimates and partial information

+ `\hat{U}^S([A]) = E[U([A]) | e]` for sequence `S`

+ Discussion: for non-axiomatic probabilities, [more like possibilities]

* Analysis for complete computations

+ From the previous model:
`\hat{V}^S(S_j) = E[(U(\alpha_{S_j}) - U(\alpha)) | e \wedge e_j] - TC(|S_j|)`

[This we speculate is the

* Simplifying assumptions

* Meta-greedy algorithms: expand one step, then estimate the ultimate effects; no commitment to which external action to take

* Single-step assumption: every computation is treated as complete, i.e. the agent is assumed to commit to an action after one step of computation
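
A meta-greedy choice under the single-step assumption can be sketched as below. This is a hedged illustration rather than the paper's algorithm: each candidate computation is assumed to come with a hypothetical list of possible outcomes (a probability and revised utility estimates) and a time cost, and both actions are evaluated under the revised estimates, following the conditional expectation in the estimated-value formula. All names and numbers are invented for the example.

```python
from typing import Dict, List, Optional, Tuple

Estimates = Dict[str, float]
Outcome = Tuple[float, Estimates]  # (probability, revised utility estimates)


def estimated_value(current: Estimates, outcomes: List[Outcome], cost: float) -> float:
    """Expected gain from switching to the revised best action, minus time cost."""
    alpha = max(current, key=lambda a: current[a])  # current default action
    gain = 0.0
    for prob, revised in outcomes:
        alpha_s = max(revised, key=lambda a: revised[a])  # best action after S_j
        gain += prob * (revised[alpha_s] - revised[alpha])
    return gain - cost


def meta_greedy_step(
    current: Estimates,
    candidates: Dict[str, Tuple[List[Outcome], float]],
) -> Optional[str]:
    """Pick the single computation with the highest positive estimated value."""
    best_name, best_value = None, 0.0
    for name, (outcomes, cost) in candidates.items():
        v = estimated_value(current, outcomes, cost)
        if v > best_value:
            best_name, best_value = name, v
    return best_name  # None means: stop computing and act


current = {"a": 1.0, "b": 0.5}
candidates = {
    # Might reveal that "b" is better than "a": can change the choice.
    "probe_b": ([(0.5, {"a": 1.0, "b": 2.0}), (0.5, {"a": 1.0, "b": 0.0})], 0.25),
    # Refines "a" but never changes which action is best: pure cost.
    "probe_a": ([(0.5, {"a": 1.5, "b": 0.5}), (0.5, {"a": 0.75, "b": 0.5})], 0.25),
}
choice = meta_greedy_step(current, candidates)
```

In this sketch a computation that cannot change the preferred action has zero expected gain, so it is never selected once its time cost is subtracted.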

* Partial computations

* Problem without a partial-computation formulation: credit assignment, separability of the benefit `\Delta`

* Qualitative behavior: Only compute when it helps

* Principles:

+ A computation affects only certain components of the internal structure (separability of j)

+ Changes to these components affect the agent's choice of external action in known ways (possibility for f)

* The formulas are confusing: it seems that the utility of an external action is independent of what the action is. It is used in chapters 4 and 5 as the starting point.

- none


Cognitive Science Summaries Webmaster: JimDavies (jim@jimdavies.org)

Last modified: Thu Apr 15 11:07:19 EDT 1999