Proteins are made of amino acid sequences that fold into complex 3-D structures. Because these 3-D structures make proteins functional, structure comparison provides a more sensitive way of performing homology database searches. Two of the greatest challenges in comparing protein structures are the complexity of the protein macromolecule and its size. Adding to the complexity, when the physical coordinates of each atom are obtained, the reference frame is usually chosen based on the most stable atom or amino acid, so the resulting coordinate system rests on an arbitrary reference frame.
The most popular 3-D structure comparison algorithms are computationally expensive, distance-based methods that require translation and rotation to achieve the best possible alignment before calculating the root mean square distance between the two macromolecules. A further limitation of these algorithms is their inability to identify local structural similarity between proteins that have very different global structures.
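The dependence on alignment can be illustrated with a minimal Python sketch. It is not taken from the dissertation: the four toy points and the helper names `rmsd`, `rotate_z`, and `translate` are assumptions made for illustration. The sketch computes the root mean square distance between a structure and a rigidly moved copy of itself without any superposition step, which shows why distance-based methods must first find the aligning rotation and translation.

```python
import math

def rmsd(a, b):
    """Root mean square distance between two equally sized coordinate lists."""
    if len(a) != len(b):
        raise ValueError("structures must have the same number of points")
    total = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(total / len(a))

def rotate_z(points, angle):
    """Rotate every point about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def translate(points, shift):
    """Shift every point by the vector `shift`."""
    return [(x + shift[0], y + shift[1], z + shift[2]) for x, y, z in points]

# Hypothetical toy "structure": four points, not real protein coordinates.
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.8)]

# The same structure expressed in a different, arbitrary reference frame.
moved = translate(rotate_z(toy, math.pi / 2), (10.0, -5.0, 2.0))

print(rmsd(toy, toy))    # identical frames
print(rmsd(toy, moved))  # same shape, different frame, no superposition
```

The second value is far from zero even though the two point sets have exactly the same shape, so a raw coordinate comparison is meaningless until the structures have been superimposed.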
In this work, I propose an algorithm for 3-D protein structure representation based on Triangular Spatial Relationships (TSR) between the structural constituents (amino acids) of the proteins. These structural representations act as structural fingerprints and are invariant to rotation and translation of the protein macromolecule; they therefore do not depend on the choice of reference frame. The structural fingerprints can be used to compare two structures locally and globally. In the dissertation, a novel global structural comparison method using TSR 3-D is introduced. TSR 3-D has been experimentally validated to show results comparable to those of some other standard methods, but with far less computation. A local structural comparison method for gaining insight into structural motifs is also proposed. These structural motifs are responsible for functional similarity between different proteins. TSR 3-D-based functional/structural motif discovery is shown to give better results than the standard, sequence-based methods. TSR 3-D-based structural features are shown to generate better hierarchical classification results on protein SCOP classification than the commonly used flat classification. Finally, a tool based on a NoSQL database using TSR 3-D has been designed for fast protein structure comparison.
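The invariance idea behind triangle-based fingerprints can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not the TSR key definition used in the dissertation, which the abstract does not give. The key built here (sorted, binned side lengths of every point triple), the `bin_width` parameter, the Jaccard score, and all function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def triangle_keys(points, bin_width=0.5):
    """Multiset of binned, sorted side lengths over all point triples.

    Assumed stand-in for a structural fingerprint, not the TSR key itself.
    """
    keys = Counter()
    for a, b, c in combinations(points, 3):
        sides = sorted((math.dist(a, b), math.dist(b, c), math.dist(a, c)))
        keys[tuple(round(s / bin_width) for s in sides)] += 1
    return keys

def jaccard(k1, k2):
    """Multiset Jaccard similarity between two fingerprints."""
    inter = sum((k1 & k2).values())
    union = sum((k1 | k2).values())
    return inter / union if union else 1.0

def rigid_move(points, angle, shift):
    """Rotate about the z-axis by `angle` radians, then translate by `shift`."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in points]

# Hypothetical toy coordinates, not real protein data.
toy = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.8)]
moved = rigid_move(toy, math.pi / 2, (10.0, -5.0, 2.0))
perturbed = toy[:3] + [(0.0, 3.8, 8.0)]  # last point displaced

print(jaccard(triangle_keys(toy), triangle_keys(moved)))      # same shape
print(jaccard(triangle_keys(toy), triangle_keys(perturbed)))  # changed shape
```

Because the keys depend only on distances between points, no superposition step is needed before two fingerprints are compared. Binning makes the keys tolerant of small coordinate noise, at the cost of possible flips for side lengths that fall close to a bin boundary.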
|Advisor:||Raghavan, Vijay V.|
|Committee:||Chu, Chee-Hung Henry; Miao, Jin; Xu, Wu|
|School:||University of Louisiana at Lafayette|
|School Location:||United States -- Louisiana|
|Source:||DAI-B 77/06(E), Dissertation Abstracts International|
|Subjects:||Biochemistry, Bioinformatics, Computer science|
|Keywords:||Hierarchical classification, NoSQL MongoDB, Protein structure comparison, Proteomics, Structural motif, Triangular spatial relationships|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved