Clustering is a data analysis method used in a wide variety of research fields. Many clustering algorithms exist, and none can be considered universally better than the others. We describe several clustering methods, including hierarchical clustering and k-means clustering, as well as topological data analysis, which uses topology to infer structural information about a data set. We discuss how to assess the validity of clusters and how to choose an optimal clustering method, and conclude with how we used various clustering methods to analyze transcriptome data from the ciliate Oxytricha trifallax. For this data set, we describe its structure, how an optimal clustering was chosen, how the validity of the clusters was confirmed, and how biological information can be extracted using gene ontology.
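As a rough illustration of two ideas named in the abstract, k-means clustering and cluster validation, the following is a minimal sketch using only the Python standard library. It is not the procedure used in the thesis: the toy points, the function names `kmeans` and `silhouette`, and the choice of the mean silhouette width as the validity index are all assumptions made for illustration.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm on a list of coordinate tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep old centroid
        if new_centroids == centroids:
            break  # centroids stopped moving, so assignments are stable
        centroids = new_centroids
    return centroids, labels


def silhouette(points, labels):
    """Mean silhouette width; values near 1 suggest well-separated clusters."""
    clusters = sorted(set(labels))
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, p in enumerate(points):
        # Mean distance from p to the other members of each cluster.
        mean_dist = {}
        for c in clusters:
            others = [q for j, q in enumerate(points)
                      if labels[j] == c and j != i]
            if others:
                mean_dist[c] = sum(math.dist(p, q) for q in others) / len(others)
        a = mean_dist.get(labels[i])  # cohesion: own cluster
        if a is None:
            scores.append(0.0)  # singleton cluster: silhouette taken as 0
            continue
        b = min(d for c, d in mean_dist.items() if c != labels[i])  # separation
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy data: two small, well-separated groups in the plane.
points = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
          (5.0, 5.0), (5.2, 4.9), (4.9, 5.3)]
centroids, labels = kmeans(points, k=2)
score = silhouette(points, labels)
print(labels, round(score, 3))
```

One common way to compare candidate clusterings, under the same assumptions, is to repeat this for several values of `k` (or for several algorithms) and prefer the result with the highest mean silhouette width; other validity indices could be substituted in the same way.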
Advisor: | Jonoska, Natasha |
Committee: | Saito, Masahiko; Molla, Theodore |
School: | University of South Florida |
Department: | Mathematics and Statistics |
School Location: | United States -- Florida |
Source: | MAI 81/11(E), Masters Abstracts International |
Source Type: | DISSERTATION |
Subjects: | Mathematics |
Keywords: | Ciliates, Gene ontology, Hierarchical, K-means, Topology |
Publication Number: | 27741172 |
ISBN: | 9798643170372 |