# Dissertation/Thesis Abstract

The Nonnegative Matrix Factorization: Methods and Applications
by Landi, Amanda Kim, Ph.D., North Carolina State University, 2015, 120 pages; publication number 10586642
Abstract (Summary)

Data continue to grow in size and complexity, which makes big data analysis imperative. A popular approach to big data analysis is the use of low-dimensional matrix representations, which reduce data complexity and highlight significant patterns. One technique that has recently gained popularity and success is the Nonnegative Matrix Factorization (NMF). The NMF is not an exact factorization, but a decomposition of data into low-rank components and residual components. It represents a data array as the product of two low-rank factor matrices with nonnegative entries. In this thesis, we investigate the NMF as a data analysis method for general classes of data, extend NMF analysis, and explore new applications.
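In symbols, the representation described above can be written as follows. The notation is an assumption made here for illustration and is not taken from the thesis: $A$ is the $m \times n$ data array, $W$ and $H$ are the nonnegative factors, $k$ is the rank, and $R$ is the residual. The Frobenius-norm misfit shown is one common choice of objective; other misfit measures are also used.

```latex
A = WH + R, \qquad
W \in \mathbb{R}_{\ge 0}^{m \times k}, \quad
H \in \mathbb{R}_{\ge 0}^{k \times n}, \quad
k \ll \min(m, n),
\qquad
\min_{W \ge 0,\; H \ge 0} \; \tfrac{1}{2}\, \lVert A - WH \rVert_F^2 .
```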

First, we discuss the NMF as a reduced representation and describe the standard NMF algorithms by Lee and Seung. These are the algorithms that concretized the concept of NMF. However, the standard NMF algorithms are slow to converge and may not reach a desirable solution. We develop an algorithm, based on the primal-dual active set method, that finds a more accurate representation. Second, a significant aspect of the NMF problem is determining the rank of the nonnegative factors. For this purpose, we develop a method that takes advantage of the concept of NMF-singular values, and we compare this method to the statistical Akaike Information Criterion.
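For orientation, the standard multiplicative updates for the Frobenius-norm objective, commonly attributed to Lee and Seung, can be sketched in a few lines of NumPy. This is a minimal illustration of the baseline only, not the primal-dual active set algorithm developed in the thesis. The function name `nmf_multiplicative`, its parameters, and the small constant `eps` that guards against division by zero are all assumptions made here.

```python
import numpy as np


def nmf_multiplicative(A, rank, n_iter=200, eps=1e-10, seed=0):
    """Approximate a nonnegative matrix A by W @ H with W, H >= 0.

    Uses multiplicative updates for the Frobenius-norm objective.
    Returns the two factors and the residual norm after each iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Strictly positive random start: a zero entry would stay zero forever
    # under multiplicative updates.
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    errors = []
    for _ in range(n_iter):
        # Update H with W fixed, then W with the new H fixed.
        # eps in the denominator is a hypothetical safeguard, not part of
        # the textbook update.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
        errors.append(np.linalg.norm(A - W @ H, "fro"))
    return W, H, errors


# Small synthetic example: a nonnegative matrix of exact rank 3.
demo_rng = np.random.default_rng(1)
A_demo = demo_rng.random((20, 3)) @ demo_rng.random((3, 15))
W_demo, H_demo, err_demo = nmf_multiplicative(A_demo, rank=3)
```

Each update rescales the current entries by a nonnegative ratio, so the factors stay nonnegative without an explicit projection step. The residual norm does not increase from one iteration to the next, but progress can be slow, which is the behavior the abstract refers to.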

In summary, we advance NMF analysis conceptually and algorithmically, and extend it to new applications. In particular, in the case of convolution, the two factors have clear roles: convolution kernel and signal. Atoms are the prior information that classifies the convolution kernel. For the case of the point-spread function, atoms are the weights that describe the kernel. Using proper atoms, we develop a method for blind deconvolution based on an NMF representation, so that we obtain an estimate of the signal as well as the kernel. In addition, we examine the triple NMF representation to increase the capability of the bilinear NMF for clustering. We advance the representation by incorporating sparsity on a third factor, so that the nonzeros highlight significant features and lend more meaning to the clusters. Furthermore, we address the Principal Component Pursuit problem in terms of the NMF. That is, we develop an NMF method to find the decomposition that separates low-rank components and sparse components from data.

Indexing (document details)
Advisor: Ito, Kazufumi
Committee:
School: North Carolina State University
School Location: United States -- North Carolina
Source: DAI-B 78/08(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Applied Mathematics, Operations research, Computer science
Keywords: Big data, Low-rank approximation, Machine learning, Nonnegative matrix factorization, Optimization
Publication Number: 10586642
ISBN: 978-1-369-66206-1