Many problems in robotics have unknown, stochastic, high-dimensional, and highly nonlinear dynamics, and pose significant challenges to both traditional control methods and reinforcement learning algorithms. Some of the key difficulties that arise in these problems are: (i) It is often difficult to write down, in closed form, a formal specification of the control task. For example, what is the objective function for "flying well"? (ii) It is often difficult to build a good dynamics model because of both data collection and data modeling challenges (similar to the "exploration problem" in reinforcement learning). (iii) It is often computationally expensive to find closed-loop controllers for high-dimensional, stochastic domains.
We describe learning algorithms with formal performance guarantees, which show that these problems can be efficiently addressed in the apprenticeship learning setting, that is, the setting in which expert demonstrations of the task are available. Our algorithms are guaranteed to return a control policy with performance comparable to the expert's. We evaluate performance on the same task and in the same (typically stochastic, high-dimensional, and nonlinear) environment as the expert.
Besides having theoretical guarantees, our algorithms have also enabled us to solve some previously unsolved real-world control problems. They have enabled a quadruped robot to traverse challenging, previously unseen terrain. They have significantly extended the state of the art in autonomous helicopter flight. Our helicopter has performed by far the most challenging aerobatic maneuvers performed by any autonomous helicopter to date, including maneuvers such as continuous in-place flips, rolls, and tic-tocs, which only exceptional expert human pilots can fly. Our aerobatic flight performance is comparable to that of the best human pilots.
|School Location:||United States -- California|
|Source:||DAI-B 69/10, Dissertation Abstracts International|
|Subjects:||Robotics, Artificial intelligence, Computer science|
|Keywords:||Artificial intelligence, Machine learning, Reinforcement learning, Robotics|