Recent advances in microarray technology have enabled scientists to gather data on thousands of genes simultaneously. However, due to the complexity of genetic interactions, the functions of many genes remain unclear. The cause and progression of many diseases, such as cancer and Alzheimer’s, are increasingly attributed to the deregulation of critical genetic pathways. Data mining is now used extensively on biological datasets to infer gene function and to identify genetic biomarkers for disease prognosis and treatment. There is a considerable need for algorithms that explore and interpret the underlying microarray data from a biological perspective.
In this thesis, three areas of data mining in biological datasets are addressed. First, a new clustering algorithm has been designed that explores data from different biological perspectives. Most conventional clustering algorithms generate one set of clusters, irrespective of the biological context of the analysis. The new model generates multiple sets of clusters from a single dataset, each of which highlights a different biological context. Second, a new classification algorithm has been designed that uses gene pairings for cancer classification. This exploits the concept that gene pairs may be a better metric for cancer classification than single genes. Third, a meta-analysis of human and mouse cancer datasets is integrated with existing knowledge to highlight pathways that are closely associated with cancer.
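The gene-pair idea can be illustrated with a minimal sketch. This is not the thesis's algorithm, which the abstract does not specify. It is a simplified variant of the well-known top-scoring-pair approach, in which a pair of genes is scored by how differently the ordering of their expression values behaves in the two classes. The function names (`pair_score`, `fit_top_pair`, `predict`) and the toy expression values are assumptions made purely for illustration.

```python
from itertools import combinations

def pair_score(samples, labels, i, j):
    """Return P(x_i < x_j | class 0) - P(x_i < x_j | class 1)."""
    freq = {}
    for cls in (0, 1):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        freq[cls] = sum(1 for s in rows if s[i] < s[j]) / len(rows)
    return freq[0] - freq[1]

def fit_top_pair(samples, labels):
    """Pick the gene pair whose relative ordering best separates the classes."""
    n_genes = len(samples[0])
    i, j = max(combinations(range(n_genes), 2),
               key=lambda p: abs(pair_score(samples, labels, *p)))
    return i, j, pair_score(samples, labels, i, j)

def predict(model, sample):
    """Classify one sample from the ordering of the selected gene pair."""
    i, j, score = model
    below = sample[i] < sample[j]
    # A positive score means x_i < x_j is more typical of class 0.
    if score > 0:
        return 0 if below else 1
    return 1 if below else 0

# Hypothetical toy data: rows are samples, columns are three genes.
samples = [
    [1.0, 5.0, 3.0],
    [2.0, 6.0, 1.0],
    [0.5, 4.0, 2.5],
    [5.0, 3.5, 3.0],
    [6.0, 2.0, 7.0],
    [4.0, 0.5, 1.0],
]
labels = [0, 0, 0, 1, 1, 1]

model = fit_top_pair(samples, labels)
print(model[:2], predict(model, [1.5, 4.0, 9.0]), predict(model, [7.0, 2.0, 0.1]))
```

In this toy data no single gene is needed as an absolute threshold: the classifier relies only on whether gene 0 is expressed below gene 1 within the same sample, which is the kind of relative signal that a pair-based metric can capture and a single-gene metric cannot.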
|Advisor:||Bitzer, Donald L.; Heber, Steffen|
|School:||North Carolina State University|
|School Location:||United States -- North Carolina|
|Source:||DAI-B 70/05, Dissertation Abstracts International|
|Subjects:||Bioinformatics, Computer science|
|Keywords:||Classification, Clustering, Data mining, Meta-analysis, Microarrays|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved