Dissertation/Thesis Abstract

Integrating protein similarity networks and orthogonal information for understanding protein origins and function
by Barber, Alan Edgel, II, Ph.D., University of California, San Francisco, 2012, 602; 3553805
Abstract (Summary)

Biology's entrance into the genomic age has meant dramatic changes. Biologists once carried out painstaking, low-throughput experiments, but now often rely on massive high-throughput experimental centers and "big data". In modern biology, the quantity of data scientists can create vastly outstrips their ability to analyze it and understand its full meaning. This means that one of the most pressing current challenges is to create methods that can manage, organize and visualize massive datasets, with the goal of assisting biologists in creating and testing hypotheses.

The computational solution presented in this dissertation is the protein similarity network (PSN), along with its implementation and usage. These networks are constructed from an all-by-all pairwise comparison of a protein entity or feature, and the results of that comparison can be visualized as a network. These networks assist in showing proteins of interest within their context, whether sequence, structural or functional, and in creating hypotheses about how the data of interest relate to the much larger whole.
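
The all-by-all construction described above can be illustrated with a minimal sketch. It is not Pythoscape's API and not the method of the dissertation: the function names, the toy sequences, and the threshold are hypothetical, and the similarity score (the ratio from Python's standard difflib module) is only a stand-in for whatever pairwise comparison a real network would use.

```python
from difflib import SequenceMatcher
from itertools import combinations


def build_similarity_network(sequences, threshold):
    """Link every pair of proteins whose similarity score meets the threshold.

    `sequences` maps a (hypothetical) protein name to its sequence string.
    The score is difflib's ratio, an illustrative stand-in, not the measure
    used in the dissertation.
    """
    network = {name: {} for name in sequences}
    # All-by-all comparison: score every unordered pair exactly once.
    for a, b in combinations(sequences, 2):
        matcher = SequenceMatcher(None, sequences[a], sequences[b], autojunk=False)
        score = matcher.ratio()
        if score >= threshold:
            # Store the edge in both directions (undirected network).
            network[a][b] = score
            network[b][a] = score
    return network


def connected_components(network):
    """Group proteins into clusters of nodes reachable from one another."""
    seen = set()
    clusters = []
    for start in network:
        if start in seen:
            continue
        cluster = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in cluster:
                continue
            cluster.add(node)
            # Visit neighbours that are not yet part of this cluster.
            stack.extend(network[node].keys() - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters


# Toy input, invented for illustration only.
toy_sequences = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQL",
    "p3": "GGGGGGGGGG",
}
toy_network = build_similarity_network(toy_sequences, threshold=0.8)
toy_clusters = connected_components(toy_network)
```

In this toy case only p1 and p2 are similar enough to be joined by an edge, so p3 is left as an isolated node. Raising or lowering the threshold changes which proteins fall into the same cluster, which is what lets a network of this kind show a protein of interest in the context of its neighbours.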

First, Pythoscape, a novel software framework for the creation, modification and output of large PSNs, is presented. An overview of the framework's architecture is given, along with an example that uses the glutathione transferase superfamily to show the power of the framework in investigating the sequence and structure relationships of large protein superfamilies.

Second, an application of Pythoscape to the alkaline phosphatase superfamily is presented. PSNs are used to generate evolutionary hypotheses for this large protein superfamily. These networks, in conjunction with phylogenetic trees, are used to propose an evolutionary model that can annotate protein function more accurately and that also demonstrates the complexity of evolution in large, mechanistically diverse enzyme superfamilies.

Finally, an application of Pythoscape to the kinase superfamily is presented. PSNs are used to study how members of this superfamily are targeted by caspases, proteases that are activated during apoptosis. This preliminary research demonstrates that sequence similarity and function do not always track each other, and that other orthogonal sources of information may be necessary for accurate annotation.

Indexing (document details)
Advisor: Babbitt, Patricia C.
Committee: Sali, Andrej, Wells, James A.
School: University of California, San Francisco
Department: Pharmaceutical Sciences and Pharmacogenomics
School Location: United States -- California
Source: DAI-B 74/06(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Biochemistry, Bioinformatics
Keywords: Alkaline phosphatase superfamily, Apoptosis, Caspase, Computational biology, Protein evolution, Protein origins, Protein similarity networks
Publication Number: 3553805
ISBN: 9781267934628