This dissertation develops computational approaches to characterize the behavior of transcription factor proteins (TFs) and their target genes. TFs are the main regulators of gene expression, and understanding their behavior is essential to developing novel strategies to fight genetic disease. The goals of this work were 1) to address gaps in the literature regarding the significance of DNA binding preferences of TFs, 2) to develop benchmarks and effective algorithms for predicting TF target genes, and 3) to integrate diverse knowledge to predict known and novel protein functions for TFs and their target genes in two types of cancer. To address the first goal, binding site similarity algorithms were compared to other types of similarity. This comparison found that binding site similarity is a good indicator of structural similarity but a poor indicator of functional similarity. To address the second goal, data from the Encyclopedia of DNA Elements (ENCODE) were used to construct a set of target gene benchmarks for evaluating machine learning algorithms. These experiments established the Random Forest classifier as the most effective of the algorithms evaluated. Lastly, to address the third goal, a network flow simulation that integrated diverse data was used to cluster proteins based on their likelihood of interacting. The inferred pathways of the clusters confirmed known biological pathway associations with the two cancer types and predicted novel pathway associations that could form the basis of further experimental research.
|Advisor:||Loganantharaj, Rasiah Raja|
|Committee:||Chu, Chee-Hung, Raghavan, Vijay|
|School:||University of Louisiana at Lafayette|
|School Location:||United States -- Louisiana|
|Source:||DAI-B 74/12(E), Dissertation Abstracts International|
|Keywords:||Bioinformatics, Computational biology, Functional profiling, Heterogeneous datasets, Target genes, Transcription factors|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved