Dissertation/Thesis Abstract

Integrating Heterogeneous Datasets for Functional Profiling of Transcription Factors and Their Target Genes
by Bible, Paul W., Ph.D., University of Louisiana at Lafayette, 2013, 115; 3589955
Abstract (Summary)

In this dissertation, computational approaches are developed to characterize the behavior of transcription factor proteins (TFs) and their target genes. TFs are the main regulators of gene expression, and understanding their behavior is essential to developing novel strategies to fight genetic disease. The goals of this work were 1) to address gaps in the literature regarding the significance of the DNA binding preferences of TFs, 2) to develop benchmarks and effective algorithms for predicting TF target genes, and 3) to integrate diverse knowledge to predict known and novel protein functions for TFs and their target genes in two types of cancer. To address the first goal, binding site similarity algorithms were compared with other types of similarity. This comparison found that binding site similarity is a good indicator of structural similarity but a poor indicator of functional similarity. To address the second goal, data from the Encyclopedia of DNA Elements (ENCODE) were used to construct a set of target gene benchmarks for evaluating machine learning algorithms. These experiments established the Random Forest classifier as the most effective algorithm. Lastly, to address the third goal, a network flow simulation that integrated diverse data was used to cluster proteins based on their likelihood of interacting. The inferred pathways of the clusters confirmed known biological pathway associations with the two cancer types and predicted novel pathway associations that could form the basis of further experimental research.

Indexing (document details)
Advisor: Loganantharaj, Rasiah Raja
Committee: Chu, Chee-Hung; Raghavan, Vijay
School: University of Louisiana at Lafayette
Department: Computer Science
School Location: United States -- Louisiana
Source: DAI-B 74/12(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer science
Keywords: Bioinformatics, Computational biology, Functional profiling, Heterogeneous datasets, Target genes, Transcription factors
Publication Number: 3589955
ISBN: 9781303291364
Copyright © 2019 ProQuest LLC. All rights reserved.