Dissertation/Thesis Abstract

Clustering Methods for Gene Expression Data of Oxytricha Trifallax
by Houfek, Kyle, M.A., University of South Florida, 2020, 40; 27741172
Abstract (Summary)

Clustering is a data analysis method used across a wide range of research fields. Many clustering algorithms exist, and none can be considered universally better than the others. We describe several clustering methods, including hierarchical clustering and k-means clustering. We also describe topological data analysis and show how topology can be used to infer structural information about a data set. We discuss how to assess the validity of clusters and how to select an optimal clustering method, and we conclude by applying several clustering methods to transcriptome data from the ciliate Oxytricha trifallax. For this data set, we discuss its structure, how an optimal clustering was chosen, how the validity of the clusters was confirmed, and how biological information can be extracted using gene ontology.
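
As an illustration of the workflow the abstract outlines (cluster, then check validity to pick a clustering), the following is a minimal sketch and not the thesis's actual procedure. It assumes synthetic 2-D points in place of expression profiles, k-means (Lloyd's algorithm) with a deterministic farthest-point initialization, and the mean silhouette width as the validity index; the abstract does not say which index or parameters were used, and all function names here are hypothetical.

```python
import random
from math import dist


def farthest_point_init(points, k):
    """Assumed initialization: start at the first point, then repeatedly
    add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, max_iters=100):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(max_iters):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:  # assignments can no longer change
            break
        centers = new_centers
    return labels


def mean_silhouette(points, labels):
    """Average silhouette width; values near 1 indicate compact, separated clusters."""
    scores = []
    for i, p in enumerate(points):
        dists_by_cluster = {}
        for j, q in enumerate(points):
            if i != j:
                dists_by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = dists_by_cluster.pop(labels[i], None)
        if not own or not dists_by_cluster:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in dists_by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Synthetic stand-in data: three well-separated groups of 30 points each.
rng = random.Random(0)
blob_centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blob_centers for _ in range(30)]

# Score each candidate number of clusters and keep the best-scoring one.
scores = {k: mean_silhouette(points, kmeans(points, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print("silhouette by k:", {k: round(s, 3) for k, s in scores.items()})
print("selected k:", best_k)
```

The same selection loop could wrap a hierarchical clustering cut at different heights instead of k-means; only the function that produces `labels` would change.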

Indexing (document details)
Advisor: Jonoska, Natasha
Committee: Saito, Masahiko; Molla, Theodore
School: University of South Florida
Department: Mathematics and Statistics
School Location: United States -- Florida
Source: MAI 81/11(E), Masters Abstracts International
Subjects: Mathematics
Keywords: Ciliates, Gene ontology, Hierarchical, K-means, Topology
Publication Number: 27741172
ISBN: 9798643170372
Copyright © 2021 ProQuest LLC. All rights reserved.