Variable splitting is a classic approach to decomposing a difficult optimization problem into an iterative sequence of easier sub-problems. This framework has found great success in solving signal processing problems across a vast variety of disciplines. Recently, several variable splitting techniques have been proposed for training deep neural networks, an extremely challenging optimization task. However, these algorithms are not scalable to real-world problem sizes due to their mainly offline nature. The first work presented in this dissertation is a stochastic, online alternating minimization algorithm for training deep neural networks via variable splitting, designed in collaboration with the IBM Thomas J. Watson Research Center. The proposed method demonstrates strong improvements over offline variable splitting approaches to neural network training, and also shows advantages over popular state-of-the-art backpropagation-based training algorithms.
After exploring how variable splitting can enhance deep learning, we ask a question in the other direction: how can deep learning techniques enhance variable splitting methods? One of the most widely used variable splitting frameworks, the Alternating Direction Method of Multipliers (ADMM), is an extremely powerful optimizer but can take many tens of iterations to reach a good approximate solution. We propose a framework that uses deep learning principles to accelerate ADMM on L1-regularized linear least squares problems. The learned algorithm is generated by the recently proposed technique known as "unrolling", which in our case means implementing truncated ADMM (i.e., ADMM run for a fixed number of iterations) as a neural network. Deep learning techniques can then be applied to strategically modify certain naturally arising linear operators of ADMM with data-driven corrections learned from a training dataset.
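To make the starting point concrete, the following is a minimal sketch of classical (unlearned) ADMM for the L1-regularized least squares (lasso) problem, min_x 0.5||Ax − b||² + λ||x||₁, using the standard splitting x = z. All names here are illustrative; this is the textbook baseline, not the learned algorithm of the dissertation.

```python
import numpy as np

def soft_threshold(v, t):
    # Elementwise soft-thresholding: the proximal operator of t * ||.||_1
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def admm_lasso(A, b, lam, rho=1.0, n_iter=100):
    """ADMM for min_x 0.5||Ax - b||^2 + lam ||x||_1 via the splitting x = z."""
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    # Cache the Cholesky factor of the x-update's (fixed) linear system
    Atb = A.T @ b
    L = np.linalg.cholesky(A.T @ A + rho * np.eye(n))
    for _ in range(n_iter):
        # x-update: solve (A^T A + rho I) x = A^T b + rho (z - u)
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: proximal step on the L1 term (soft-thresholding)
        z = soft_threshold(x + u, lam / rho)
        # scaled dual-variable update
        u = u + x - z
    return z
```

Unrolling fixes `n_iter` and treats each loop iteration as one network layer; the learned variant then replaces certain fixed linear operators above with trained ones.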
We empirically validate the resulting algorithm in a variety of sparse coding settings, i.e., where the goal is to provide a sparse representation of a signal with respect to a given dictionary. Of particular interest is the challenging multiple-dictionary setting given by morphological component analysis (MCA), a generalization of sparse coding in which the goal is to decompose a signal into additive parts such that each part has a distinct sparse representation within an appropriately chosen corresponding dictionary. Our algorithm demonstrates vast acceleration over common baselines in both settings. Finally, we present a theoretical framework showing that the resulting algorithm is still an instance of ADMM, but applied to a new, learned cost function. By extending a recently proposed Stochastic Alternating Optimization analysis framework, we show that a gradient descent step along this learned loss landscape is equivalent to a modified gradient descent step along the original loss landscape. In this framework, the achieved acceleration could potentially be explained by the network's ability to learn a correction to the gradient, yielding a direction of steeper descent.
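As a hedged illustration of the MCA setting (not the dissertation's learned solver), the decomposition x = Φ₁a₁ + Φ₂a₂ with both a₁ and a₂ sparse can be posed as ordinary sparse coding over the concatenated dictionary [Φ₁ | Φ₂]. The sketch below solves it with plain ISTA, a standard baseline; all names are illustrative.

```python
import numpy as np

def ista_mca(phi1, phi2, x, lam, n_iter=2000):
    """Morphological component analysis as sparse coding over the
    concatenated dictionary D = [phi1 | phi2], solved with plain ISTA.
    Returns the two additive components phi1 @ a1 and phi2 @ a2."""
    D = np.hstack([phi1, phi2])
    n1 = phi1.shape[1]
    a = np.zeros(D.shape[1])
    # Step size from the Lipschitz constant of the smooth term's gradient
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)          # gradient of 0.5||Da - x||^2
        v = a - step * grad               # forward (gradient) step
        a = np.sign(v) * np.maximum(np.abs(v) - lam * step, 0.0)  # prox step
    return phi1 @ a[:n1], phi2 @ a[n1:]
```

Successful separation relies on the two dictionaries being mutually incoherent, so that each morphological part is sparse in its own dictionary but not in the other.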
|Committee:||Choromanska, Anna, Voltz, Peter|
|School:||New York University Tandon School of Engineering|
|Department:||Electrical and Computer Engineering|
|School Location:||United States -- New York|
|Source:||DAI-B 81/2(E), Dissertation Abstracts International|
|Subjects:||Applied Mathematics, Artificial intelligence, Electrical engineering|
|Keywords:||Convex analysis, Deep learning, Machine learning, Optimization, Signal processing, Sparsity|