Planning under uncertainty is crucial to the success of many autonomous systems. An agent interacting in the real-world often has to deal with uncertainty due to unknown environment, noisy sensor measurements, and imprecise actuation. It also has to continuously adapt to circumstances as the world unfolds. Partially Observable Markov Decision Process (POMDP) is an elegant and general framework for modeling planning under such uncertainties. Unfortunately, solving POMDPs grows computationally intractable as the size of state, action, and observations space increase. This thesis examines useful subclasses of POMDPs and algorithms to solve them efficiently.
We look at informative path planning (IPP) problems where an agent seeks a minimum cost path to sense the world and gather information. IPP generalizes the wellknown optimal decision tree problem from selecting subset of tests to selecting paths. We present Recursive Adaptive Identification (RAId), a new polynomial time algorithm and obtain a polylogarithmic approximation bound for IPP problems without observation noise.
We also study adaptive stochastic optimization problems, a generalization of IPP from gathering information to general goals. In adaptive stochastic optimization problems, an agent minimizes the cost of a sequence of actions to achieve its goal under uncertainty, where its progress towards the goal can be measured by an appropriate function. We propose the marginal likelihood rate bound condition for pointwise submodular functions as a condition that allows efficient approximation for adaptive stochastic optimization problems. We develop Recursive Adaptive Coverage (RAC), a near-optimal polynomial time algorithm that exploits properties of the marginal likelihood rate bound to solve problems that optimize these functions. We further propose a more general condition, the marginal likelihood bound that contains all finite pointwise submodular monotone functions. Using a modified version of RAC, we obtain an approximation bound that depends on a problem specific constant for the marginal likelihood bound condition.
Finally, scaling up POMDPs is hard when the task takes many actions to complete. We examine the special case of POMDPs that can be well approximated using sequences of macro-actions that encapsulate several primitive actions. We give sufficient conditions for macro actions model to retain good theoretical properties of POMDP. We introduce Macro-Monte Carlo Value Iteration (Macro-MCVI), an algorithm that enables the use of macro actions in POMDP. Macro-MCVI only needs a generative model for macro actions, making it easy to specify macro actions for effective approximation.
|School:||National University of Singapore (Singapore)|
|School Location:||Republic of Singapore|
|Source:||DAI-B 77/06(E), Dissertation Abstracts International|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
supplemental files is subject to the ProQuest Terms and Conditions of use.