Data clustering, the search for hidden structure in data sets, is a field with many different methodologies, all of which work well in some situations and poorly in others. Because of this, there is growing interest in finding a consensus clustering solution that combines the results from a large number of clusterings of a particular data set. This large collection of solutions can be stored in a square matrix that is often nearly uncoupled, and through clever use of dynamical systems theory first published in 1961 by Herbert Simon and Albert Ando, a clustering method can be developed.
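As an illustration of how many clusterings might be stored in one square matrix, the sketch below counts, for each pair of data points, how many clusterings place them in the same cluster. This co-association construction is an assumption made for illustration; the abstract does not specify how the thesis builds its matrix, and the name `consensus_matrix` is hypothetical.

```python
def consensus_matrix(labelings):
    """Count how often each pair of points shares a cluster.

    labelings: list of label lists, one per clustering, all of length n.
    Returns an n x n symmetric matrix of counts (assumed construction,
    not necessarily the one used in the thesis).
    """
    n = len(labelings[0])
    S = [[0] * n for _ in range(n)]
    for labels in labelings:
        if len(labels) != n:
            raise ValueError("every clustering must label all n points")
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    S[i][j] += 1
    return S


# Three clusterings of four points; label values are arbitrary per clustering.
example = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
S = consensus_matrix(example)
```

When most clusterings agree on the groups, the large entries concentrate in diagonal blocks (after a suitable reordering of the points) and the off-block entries stay small, which is the sense in which such a matrix can be nearly uncoupled.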
This thesis will explain the rationale behind this new clustering method and then establish that it has a solid mathematical foundation. One of the key steps in this new method is converting a nearly uncoupled matrix to doubly stochastic form. Among the contributions of this thesis are a measure of near uncoupledness that can be applied to matrices both before and after that conversion, and rigorous proofs that the conversion to doubly stochastic form does not destroy the symmetry, irreducibility, or near uncoupledness of the original matrix.
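One well-known way to bring a matrix to doubly stochastic form is the Sinkhorn-Knopp iteration, which alternately rescales rows and columns until both sum to 1. The sketch below assumes that technique and a strictly positive symmetric input; the abstract does not say which balancing procedure the thesis uses, and `sinkhorn_knopp` is a hypothetical name.

```python
def sinkhorn_knopp(A, iterations=500):
    """Alternately normalize rows and columns of a positive square matrix.

    Assumes every entry of A is strictly positive, which guarantees
    convergence. This is an illustrative choice of balancing method,
    not necessarily the one used in the thesis.
    """
    n = len(A)
    P = [list(map(float, row)) for row in A]
    for _ in range(iterations):
        # Scale each row so that it sums to 1.
        for i in range(n):
            s = sum(P[i])
            P[i] = [v / s for v in P[i]]
        # Scale each column so that it sums to 1.
        col = [sum(P[i][j] for i in range(n)) for j in range(n)]
        P = [[P[i][j] / col[j] for j in range(n)] for i in range(n)]
    return P


# A small symmetric matrix with two strongly connected blocks.
A = [[3, 3, 1, 1],
     [3, 3, 1, 1],
     [1, 1, 3, 2],
     [1, 1, 2, 3]]
P = sinkhorn_knopp(A)
```

For a symmetric positive input the limit of this iteration is itself symmetric, which is consistent with the symmetry-preservation result described above, although each intermediate iterate is only approximately symmetric.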
Additionally, the connection between the second eigenvalue of an irreducible, symmetric, doubly stochastic matrix and the nearly uncoupled structure of that matrix will be rigorously proven, with the result being that examination of the second eigenvalue will play an essential role in the new clustering algorithm.
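To make the role of the second eigenvalue concrete, the following sketch estimates it for a symmetric doubly stochastic matrix using power iteration on the shifted matrix (P + I)/2, restricted to vectors orthogonal to the all-ones vector. The method, the function name `second_eigenvalue`, and the example matrix are assumptions chosen for illustration, not the procedure of the thesis.

```python
import math
import random


def second_eigenvalue(P, iterations=1000, seed=0):
    """Estimate the second-largest eigenvalue of a symmetric,
    doubly stochastic matrix P (illustrative method only).

    The largest eigenvalue is 1 with the all-ones eigenvector, so the
    iterate is kept orthogonal to it by subtracting its mean. Shifting
    to (P + I)/2 maps eigenvalues from [-1, 1] into [0, 1], so power
    iteration finds the largest remaining one rather than the largest
    in absolute value.
    """
    n = len(P)
    if n < 2:
        raise ValueError("need at least a 2 x 2 matrix")
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    mu = 0.0
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]            # project out the ones vector
        norm = math.sqrt(sum(v * v for v in x))
        if norm == 0.0:
            break
        x = [v / norm for v in x]
        y = [0.5 * (x[i] + sum(P[i][j] * x[j] for j in range(n)))
             for i in range(n)]
        mu = sum(x[i] * y[i] for i in range(n))  # Rayleigh quotient
        x = y
    return 2.0 * mu - 1.0                    # undo the shift


# Two tightly coupled pairs with weak links between them.
P = [[0.49, 0.49, 0.01, 0.01],
     [0.49, 0.49, 0.01, 0.01],
     [0.01, 0.01, 0.49, 0.49],
     [0.01, 0.01, 0.49, 0.49]]
lam2 = second_eigenvalue(P)
```

For this example the eigenvalues are 1, 0.96, 0, and 0, so the estimate is close to 0.96. A second eigenvalue near 1 alongside a clear two-block pattern is the kind of relationship the theory above makes precise.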
Actual clustering results will be presented to show that the intuitive notions and mathematical theory underlying this method do indeed produce high-quality clusterings.
Advisor: Meyer, Carl D.
School: North Carolina State University
School Location: United States -- North Carolina
Source: DAI-B 72/12, Dissertation Abstracts International
Keywords: Data clustering, Markov chains, Matrix balancing
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved