Latent variable models are widely used to capture the underlying structures of the data, for example, Gaussian mixture models for speech recognition, stochastic block models for community detection and topic models for information retrieval. While alternative minimization based algorithms such as EM algorithm and Lloyd's algorithm performs well in practice, there has been little theoretical advancement in explaining the effectiveness of these algorithms. In this thesis, we investigate the performance of Lloyd's algorithm and EM algorithm on clustering two-mixture of Gaussians. With an initializer slightly better than random guess, we are able to show the linear converge of Lloyd's and EM iterations to the statistical optimal estimator. These results shed light on the global convergence of more general non-convex optimizations.
We generalized the results to arbitrary number of sub-Gaussian mixtures. Motivated by the Lloyd's algorithm, we propose new algorithms for other latent variable models including sparse gaussian mixture model, stochastic block model. biclustering model and Dawid-Skene model. The proposed algorithms are computationally efficient and shown to be rate-optimal under mild signal-to-noise ratio conditions. The highlight of our theoretical analysis is to develop new proof techniques to handle the dependency between iterations, which can be applied to other iterative algorithms with explicit iteration formulas.
|Advisor:||Zhou, Harrison H.|
|School Location:||United States -- Connecticut|
|Source:||DAI-B 79/05(E), Dissertation Abstracts International|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be