Deep learning, in particular neural networks, achieved remarkable success in the recent years. However, most of it is based on supervised learning, and relies on ever larger datasets, and immense computing power. One step towards general artificial intelligence is to build a model of the world, with enough knowledge to acquire a kind of ``common sense''. Representations learned by such a model could be reused in a number of other tasks. It would reduce the requirement for labelled samples and possibly acquire a deeper understanding of the problem. The vast quantities of knowledge required to build common sense precludes the use of supervised learning, and suggests to rely on unsupervised learning instead.
The concept of uncertainty is central to unsupervised learning. The task is usually to learn a complex, multimodal distribution. Density estimation and generative models aim at representing the whole distribution of the data, while predictive learning consists of predicting the state of the world given the context and, more often than not, the prediction is not unique. That may be because the model lacks the capacity or the computing power to make a certain prediction, or because the future depends on parameters that are not part of the observation. Finally, the world can be chaotic of truly stochastic. Representing complex, multimodal continuous distributions with deep neural networks is still an open problem.
In this thesis, we first assess the difficulties of representing probabilities in high dimensional spaces, and review the related work in this domain. We then introduce two methods to address the problem of video prediction, first using a novel form of linearizing auto-encoders and latent variables, and secondly using Generative Adversarial Networks (GANs). We show how GANs can be seen as trainable loss functions to represent uncertainty, then how they can be used to disentangle factors of variation. Finally, we explore a new non-probabilistic framework for GANs.
|Commitee:||Bruna, Joan, Cho, Kyunghyun, Couprie, Camille, Fergus, Rob|
|School:||New York University|
|School Location:||United States -- New York|
|Source:||DAI-B 79/02(E), Dissertation Abstracts International|
|Subjects:||Artificial intelligence, Computer science|
|Keywords:||Computer vision, Generative adversarial networks, Machine learning, Unsupervised learning, Video prediction|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be