In the era of genomics, data analysis models and algorithms that reduce large, complex data sets to meaningful information are integral to furthering our understanding of complex biological systems. Hidden Markov models are one such data analysis technique and have become the basis of many bioinformatics tools. Their success is primarily due to their conceptual simplicity and robust statistical foundation. Although hidden Markov models are among the most popular modeling techniques for classifying linear sequences of data, researchers have few software options for rapidly implementing the necessary modeling framework and algorithms. Most tools are still hand-coded because existing implementation solutions do not provide the ease of use or flexibility that researchers need to implement models in non-traditional ways. I have developed a free hidden Markov model C++ library and application, called StochHMM, that provides researchers with the flexibility to apply hidden Markov models to unique sequence analysis problems. It lets researchers rapidly implement a model using a simple text file while retaining the flexibility to adapt the model in non-traditional ways. In addition, it provides many features that are not available in any current HMM implementation tool, such as stochastic sampling algorithms, the ability to link user-defined functions into the HMM framework, and multiple ways to integrate additional data sources to make better predictions. Using StochHMM, we have been able to rapidly implement models for R-loop prediction and classification of methylation domains. The R-loop predictions uncovered the epigenetic regulatory role of R-loops at CpG promoters and in the 3' transcription termination of protein-coding genes. Classification of methylation domains in multiple pluripotent tissues identified epigenetic gene tracks that will help inform our understanding of epigenetic diseases.
|Advisor:||Korf, Ian F.|
|Committee:||Chedin, Frederic L., LaSalle, Janine M., Segal, David J.|
|School:||University of California, Davis|
|School Location:||United States -- California|
|Source:||DAI-B 75/03(E), Dissertation Abstracts International|
|Keywords:||Hidden Markov model, Methylation domains, R-loops|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved