Dissertation/Thesis Abstract

Towards Temporal Semantic Scene Understanding
by Ghafarianzadeh, Mahsa, Ph.D., University of Colorado at Boulder, 2017, 152; 10271366
Abstract (Summary)

The ability to quickly and accurately understand pixel-level scene semantics is a key capability required for various robotics applications such as autonomous driving. Until now, the temporal aspect of this problem has been largely overlooked. This dissertation therefore studies the impact of temporal information on perception-related tasks and investigates whether including it is useful, specifically for semantic scene understanding.

In this thesis, we first propose a set of novel techniques for unsupervised spatio-temporal segmentation of video sequences to obtain regions that are coherent in space and time. We then extend our method to exploit other strong cues present in the scene, such as the depth signal or object parts, to further improve accuracy. The study of temporal data is bottlenecked by limited computing resources and by the lack of comprehensive labeled data. We tackle these issues by introducing a simple and efficient unsupervised label propagation algorithm that transfers the pixel-wise semantic labels from a ground-truth frame to its adjacent frames and produces auxiliary temporal ground truth. Finally, we take a further step towards the ultimate goal of holistic scene understanding and present a deep, recurrent multi-scale network that is capable of leveraging the temporal information present in video data. We show that our model can be easily extended to the related problem of prediction, estimating the expected semantics of the scene a small number of frames into the future. We achieve promising, state-of-the-art results on various datasets and show that our temporal approach outperforms the non-temporal baseline.

Indexing (document details)
Advisor: Sibley, Gabe
Committee: Blaschko, Matthew; Correll, Nikolaus; Heckman, Christoffer; Martin, James; Sibley, Gabe
School: University of Colorado at Boulder
Department: Computer Science
School Location: United States -- Colorado
Source: DAI-B 78/10(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Robotics, Artificial intelligence, Computer science
Keywords: 3D superpixels, Efficient spectral clustering, Semantic scene understanding, Spatio-temporal segmentation, Temporal object discovery, Video understanding
Publication Number: 10271366
ISBN: 9781369785784
Copyright © 2019 ProQuest LLC. All rights reserved.