In today's world, data continue to grow in size and complexity, making big data analysis imperative. A popular approach to big data analysis is the use of low-dimensional matrix representations, which reduce data complexity and highlight significant patterns. One technique that has recently gained popularity and success is Nonnegative Matrix Factorization (NMF). NMF is not an exact factorization but a decomposition of data into low-rank and residual components: it represents a data array as the product of two low-rank factor matrices with nonnegative entries. In this thesis, we investigate NMF as a data analysis method for a general class of data, extend NMF analysis, and explore new applications.
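To make the factorization concrete, the following minimal NumPy sketch (the matrix sizes and rank are illustrative choices, not taken from the thesis) builds a nonnegative data matrix V that is exactly the product of two low-rank nonnegative factors W and H:

```python
import numpy as np

# A 6x5 nonnegative data matrix V approximated (here, exactly) as V = W @ H,
# where W is 6x2 and H is 2x5 with nonnegative entries, so rank(V) = 2.
rng = np.random.default_rng(0)
W_true = rng.random((6, 2))   # nonnegative left factor
H_true = rng.random((2, 5))   # nonnegative right factor
V = W_true @ H_true           # nonnegative and rank-2 by construction

print(V.shape)                    # (6, 5)
print(np.linalg.matrix_rank(V))   # 2
print(bool((V >= 0).all()))       # True
```

In practice V is only approximately low-rank, and NMF seeks nonnegative W and H minimizing a residual such as the Frobenius norm of V - WH.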
First, we discuss NMF as a reduced representation and describe the standard NMF algorithms of Lee and Seung, the algorithms that concretized the concept of NMF. However, the standard algorithms are slow to converge and may not reach a desirable solution. We develop an algorithm, based on the primal-dual active set method, that finds a better and more accurate representation. Second, a significant aspect of the NMF problem is determining the rank of the nonnegative factors. For this purpose, we develop a method based on the concept of NMF-singular values, and we compare it to the statistical Akaike Information Criterion.
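For reference, the standard Lee–Seung multiplicative updates mentioned above can be sketched as follows. This is an illustrative NumPy implementation of the well-known Frobenius-norm updates, not the thesis's primal-dual active set algorithm; the function name, iteration count, and damping constant `eps` are all choices made here for the example:

```python
import numpy as np

def nmf_multiplicative(V, r, iters=500, eps=1e-9, seed=0):
    """Lee-Seung multiplicative updates minimizing ||V - W H||_F^2.

    Elementwise multiply/divide updates preserve nonnegativity of W and H;
    eps guards against division by zero.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, r)) + eps
    H = rng.random((r, n)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H; stays >= 0
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W; stays >= 0
    return W, H

# Usage: factor an exactly rank-2 nonnegative matrix.
rng = np.random.default_rng(1)
V = rng.random((8, 2)) @ rng.random((2, 6))
W, H = nmf_multiplicative(V, r=2)
err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
print(err)  # relative error; decreases monotonically with the iterations
```

The updates decrease the objective monotonically but, as noted above, convergence can be slow and the iteration may stall at a poor stationary point, which motivates the alternative algorithm developed in the thesis.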
In summary, we advance NMF analysis conceptually and algorithmically, and extend it to new applications. In particular, in the case of convolution, the two factors have clear roles: convolution kernel and signal. Atoms are the prior information that characterizes the convolution kernel; for a point-spread function, the atoms are the weights that describe the kernel. Using proper atoms, we develop a method for blind deconvolution based on an NMF representation, obtaining an estimate of the signal as well as the kernel. In addition, we examine a triple NMF representation that increases the capability of the bilinear NMF for clustering. We advance this representation by imposing sparsity on the third factor, so that its nonzeros highlight significant features and confer more meaning on the clusters. Furthermore, we address the Principal Component Pursuit problem in terms of NMF: we develop an NMF method to find a decomposition that separates data into low-rank and sparse components.
School: North Carolina State University
School Location: United States -- North Carolina
Source: DAI-B 78/08(E), Dissertation Abstracts International
Subjects: Applied Mathematics, Operations research, Computer science
Keywords: Big data, Low-rank approximation, Machine learning, Nonnegative matrix factorization, Optimization
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved