@Book{Russell91,
  author    = "Stuart Russell and Eric Wefald",
  title     = "Do the Right Thing: Studies in Limited Rationality",
  publisher = "MIT Press",
  series    = "Artificial Intelligence",
  address   = "Cambridge, Mass.",
  year      = "1991",
}
It discusses Korf's existing RTA*, presents a new and somewhat simplified algorithm, SRTA*, which allows for more efficient search than RTA*, and then introduces DTA*, which couples the search algorithm with a decision-theoretic control method.
g(n) is the exact cost of getting from the current node to n by the best path found so far.
h(n) is the heuristic estimate of the cost of getting from n to the goal.
f(n) = g(n) + h(n) is the resulting estimate of the total cost of reaching the goal through n.

The heuristic value h(n) is admissible if it is a lower bound on the cost of getting from n to a goal state (i.e., it never overestimates the actual cost). It is consistent if it satisfies the triangle inequality h(s) <= c(s, s') + h(s'), where s' is a successor of s. Consistency is stronger than admissibility.
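Since the rest of the notes lean on these two properties, here is a small self-contained check in Python (the graph, its edge costs, and the h values are made up purely for illustration, not taken from the book):

import heapq

graph = {            # graph[s][s'] = c(s, s'), the edge cost
    'A': {'B': 1, 'C': 4},
    'B': {'C': 2, 'G': 5},
    'C': {'G': 1},
    'G': {},          # goal state
}
h = {'A': 3, 'B': 2, 'C': 1, 'G': 0}   # candidate heuristic

def true_cost_to_goal(start, goal='G'):
    # Dijkstra: exact cheapest cost from start to the goal.
    dist, frontier = {start: 0}, [(0, start)]
    while frontier:
        d, s = heapq.heappop(frontier)
        if s == goal:
            return d
        for t, c in graph[s].items():
            if d + c < dist.get(t, float('inf')):
                dist[t] = d + c
                heapq.heappush(frontier, (d + c, t))
    return float('inf')

# Admissible: h never exceeds the true cost to the goal.
admissible = all(h[s] <= true_cost_to_goal(s) for s in graph)
# Consistent: h(s) <= c(s, s') + h(s') for every edge.
consistent = all(h[s] <= c + h[t]
                 for s in graph for t, c in graph[s].items())
print(admissible, consistent)   # True True for these values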
Lemma 1 (Pearl) If the heuristic function h is consistent, then the f values are non-decreasing along any path away from the root. (One-step check: for a successor n' of n, f(n') = g(n) + c(n, n') + h(n') >= g(n) + h(n) = f(n), by the triangle inequality above.)
Lemma 2 If the heuristic function h is consistent and admissible, the error in the backed-up cost estimate cannot increase with search depth.
The algorithm can avoid cycles as long as (paraphrasing lemma 3) the update procedure in the outer loop has the property that the f value of each node in a cycle strictly increases between visits; specifically, it suffices that the new value of the current node be greater than or equal to that of the best child (lemma 4). Korf's RTA* simply updates the current node's f value with that of the second-best child. Alternative update procedures include best+1, best+10%, second-best+1, second-best+10%, and so on; see the sketch below. Since an accurate estimate of the second-best f value is then unnecessary, this allows a greater degree of pruning... which leads us to the next algorithm.
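Figure 5.7 itself is not reproduced in these notes; the following is my rough reconstruction of such a real-time outer loop with a pluggable update rule. The names successors, cost, H, and the hash table f are my own, not the book's, and every non-goal state is assumed to have successors:

def real_time_search(start, is_goal, successors, cost, H,
                     update='second-best'):
    f = {}                                   # learned values, overriding the static heuristic H
    s = start
    while not is_goal(s):
        # f estimate for each child: edge cost plus stored or static value.
        kids = sorted(((cost(s, t) + f.get(t, H(t)), t)
                       for t in successors(s)), key=lambda p: p[0])
        best_f, best = kids[0]
        # Update the current node's stored value before moving on.
        if update == 'second-best':          # Korf's RTA* update rule
            f[s] = kids[1][0] if len(kids) > 1 else float('inf')
        elif update == 'best+1':             # coarser rules satisfying lemma 4
            f[s] = best_f + 1
        elif update == 'best+10%':
            f[s] = 1.1 * best_f
        s = best                             # commit to the best-looking child
    return s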
The comparison between RTA* and SRTA* (section 5.2.5), unsurprisingly, shows SRTA* to be superior in several respects.
The idea behind DTA* is that, for the node-selection criterion, we wish to expand the leaf node whose expansion has the greatest expected utility, and for the stopping criterion, we wish to stop searching when no node expansion has non-negative expected utility.
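Schematically, the control loop looks like this (my sketch, not the book's figure 5.14; expected_net_utility is a stand-in for the book's estimate of the value of an expansion minus its time cost, whose derivation is the substance of the chapter and is not reproduced here):

def dta_inner_loop(frontier, expand, expected_net_utility):
    # frontier: the current list of leaf nodes under consideration
    while frontier:
        leaf = max(frontier, key=expected_net_utility)   # node selection
        if expected_net_utility(leaf) < 0:               # stopping criterion
            break        # no expansion is worth its time cost; commit to a move
        frontier.remove(leaf)
        frontier.extend(expand(leaf))
    return frontier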
Let alpha and beta1 be the current best move (the one with the smallest backed-up cost estimate) and the second-best move, respectively. No amount of searching can change our view that alpha is the best move to take unless it raises the backed-up cost estimate of alpha above the current backed-up estimate of beta1. If the new search is eventually successful (i.e., if alpha's heuristic estimate is raised above f'(beta1)), then all of the nodes in alpha's subtree with estimates lower than f'(beta1) will have been expanded anyway (which will have been a lot of work). If the new search eventually fails to change our minds about alpha, we will wish to find this out as quickly as possible, in order to avoid as much unnecessary search as possible. It follows that hardest-first search is rational: always expand the hardest alpha-leaf, i.e., the leaf descended from the current best move whose heuristic estimate currently has the lowest probability of exceeding f'(beta1) after expansion.
This probability depends, among other things, on the amount by which the leaf node's value would have to change. The probability of an error of a given size is usually a decreasing function of the size of the error; hence the best leaf will be the one with the minimum current estimate, i.e., the one whose current cost estimate equals f(alpha). In case of a tie among nodes with the minimum f estimate, the best leaf to expand will usually be the one with the minimum h estimate, since the probability of an error of a given size is also, in general, a decreasing function of h. Note that with uniform edge costs this will also be the deepest node among those with minimum f, since in that case g is equivalent to depth.
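In code, the resulting selection rule is just a lexicographic minimum over (f, h); a sketch, assuming each leaf object carries its g and h estimates:

def pick_alpha_leaf(alpha_leaves):
    # alpha_leaves: the frontier nodes descended from the best move alpha.
    # Minimum f = g + h, ties broken by minimum h (hardest-first rule).
    return min(alpha_leaves, key=lambda n: (n.g + n.h, n.h))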
The algorithm uses the same outer loop as SRTA* (figure 5.7); its inner loop is given in figure 5.14. A comparison between DTA* and SRTA* is given in section 5.4, with experiments on the 15-puzzle, on robotic path-planning, and on computer-vision applications.