Given the amino acid sequence of a protein, researchers often infer its structure and function by finding homologous, or evolutionarily-related, proteins of known structure and function. Since structure is typically more conserved than sequence over long evolutionary distances, recognizing remote protein homologs from their sequence poses a challenge.
We first consider all proteins of known three-dimensional structure, and explore how they cluster according to different levels of homology. An automatic computational method reasonably approximates a human-curated hierarchical organization of proteins according to their degree of homology.
Next, we return to homology prediction, based only on the one-dimensional amino acid sequence of a protein. Menke, Berger, and Cowen proposed a Markov random field model to predict remote homology for beta-structural proteins, but their formulation was computationally intractable on many beta-strand topologies.
We show two different approaches to approximate this random field, both of which make it computationally tractable, for the first time, on all protein folds. One method simplifies the random field itself, while the other retains the full random field, but approximates the solution through stochastic search. Both methods achieve improvements over the state of the art in remote homology detection for beta-structural protein folds.
|Commitee:||Berger, Bonnie, Hescott, Benjamin, Lin, Yu-Shan, Slonim, Donna|
|School Location:||United States -- Massachusetts|
|Source:||DAI-B 74/09(E), Dissertation Abstracts International|
|Subjects:||Bioinformatics, Computer science|
|Keywords:||Homology detection, Protein structure prediction|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be