Genome sequencing technologies have revolutionized biology over the past two decades, yet data analysis has lagged behind data production. In this thesis, we present a framework for analyzing genomic data more flexibly than previous techniques allow. First, the framework lets researchers design analyses that compare genomic samples directly instead of relying on reference-relative variant calls, as most current tools do. Second, we provide utilities that examine both assembly data and resequencing data in the same analysis, whereas previous tools were restricted to one or the other. Finally, the framework lets researchers incorporate alignments to arbitrarily many reference sequences into their analyses.
We describe FlexReseq, the software implementation of this framework. FlexReseq lets researchers customize resequencing analyses through a simple configuration file that defines positions of interest. We present results from applications of these tools, including genotyping strains of Plasmodium falciparum, finding diversity and divergence between strains of Anopheles gambiae, detecting inversions from assembly and alignment information for A. gambiae, and exploring resequencing analysis using alignments to multiple reference sequences.
Advisor: Emrich, Scott J.
School: University of Notre Dame
School Location: United States -- Indiana
Source: DAI-B 73/05, Dissertation Abstracts International
Subjects: Bioinformatics, Computer science
Keywords: Anopheles gambiae, Comparative genomics, Heterogeneous sequence data, Plasmodium falciparum
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved