SCHOOL OF COMPUTER SCIENCE AND SOFTWARE ENGINEERING
TECHNICAL REPORT 1998/18
Dynamic non-uniform abstractions for approximate planning in large structured stochastic domains
J Baum and A E Nicholson
The theory of Markov Decision Processes (MDPs) provides algorithms for generating an optimal policy. For large domains these algorithms become intractable, and approximate solutions become necessary. In this paper we extend previous work on approximate planning in large stochastic domains by using automatically generated non-uniform abstractions which exploit the structure of the state space. We consider a state space expressed as a cross product of sets, or dimensions. We obtain approximate solutions by varying the level of abstraction, selectively ignoring some of the dimensions in some parts of the state space. We describe a modification of a standard policy generation algorithm for the resulting non-Markovian decision process, which recalculates values for nearby states based on a locally uniform abstraction for each state. We present methods to automatically generate an initial abstraction based on the domain structure and to automatically modify the non-uniform abstraction. The changes to the abstraction are based on both the current policy and the likelihood of encountering particular states in the future, thereby taking into account the agent's changing circumstances.
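The central idea above, a state space formed as a cross product of dimensions with some dimensions selectively ignored in some regions, can be sketched as follows. This is a minimal illustrative example, not the paper's implementation; the dimensions, region test, and all names are hypothetical.

```python
# Sketch of a non-uniform abstraction over a cross-product state space.
# All dimensions, values, and the region test are illustrative assumptions.

from itertools import product

# Hypothetical dimensions of the state space.
DIMENSIONS = {
    "position": ["room_a", "room_b", "corridor"],
    "battery": ["low", "high"],
    "cargo": ["empty", "loaded"],
}

def full_state_space():
    """Enumerate the full cross-product state space as dicts."""
    names = list(DIMENSIONS)
    return [dict(zip(names, values))
            for values in product(*(DIMENSIONS[n] for n in names))]

def abstraction(state):
    """Non-uniform abstraction: the dimensions kept depend on the state.

    In the corridor region, cargo status is abstracted away; elsewhere
    all dimensions are retained. The abstract state is the tuple of
    retained (dimension, value) pairs.
    """
    if state["position"] == "corridor":
        kept = ("position", "battery")            # coarser region
    else:
        kept = ("position", "battery", "cargo")   # finest region
    return tuple((d, state[d]) for d in kept)

# Collapsing concrete states into abstract ones shrinks the state space:
# 3 * 2 * 2 = 12 concrete states map to 10 abstract states here, since
# the 4 corridor states collapse into 2.
abstract_states = {abstraction(s) for s in full_state_space()}
print(len(full_state_space()), "concrete states ->",
      len(abstract_states), "abstract states")
```

Planning over the abstract states trades solution quality for tractability; in the paper this trade-off is managed dynamically, refining or coarsening regions as the current policy and the agent's likely future states change.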