Metabolomics, a relatively late entrant in the 'omics' pyramid, aims to capture a complete snapshot of the metabolome of an organism at a given point in time. Recent advances in mass spectrometry have allowed the simultaneous detection of hundreds of metabolites in a single sample. However, metabolomics data suffer from high dimensionality, high correlations, and the presence of unknown metabolites. In my Ph.D. dissertation, I employ machine learning techniques and graphical models to analyze and untangle some of these complexities in metabolomics data from Drosophila melanogaster.
In chapter 1, I introduce the challenges in metabolomics data analysis and outline the dissertation. In chapter 2, I use the Random Forest algorithm to identify the key metabolites that best differentiate flies fed a high-fat diet from flies fed a normal diet. I found that flies on a high-fat diet had an upregulated omega fatty acid oxidation pathway. Furthermore, I analyzed differences in network structure between high-fat-diet and normal-diet flies using Gaussian Graphical Models. The edge symmetric difference between the two networks was 0.786, indicating substantially different topologies.
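To make the edge symmetric difference concrete, here is a minimal sketch in Python. The abstract does not define the normalization, so this sketch assumes the count of edges present in exactly one network divided by the count of edges present in either network; the dissertation may use a different definition. The function names and the toy metabolite labels (m1 to m5) are hypothetical.

```python
def edge_set(edges):
    """Turn undirected edges into a set of frozensets, dropping self-loops."""
    return {frozenset(e) for e in edges if len(set(e)) == 2}


def normalized_edge_symmetric_difference(edges_a, edges_b):
    """Share of edges found in exactly one of the two networks.

    Assumed normalization: |A xor B| / |A or B|. A value of 0 means the
    edge sets are identical; 1 means they share no edges.
    """
    a, b = edge_set(edges_a), edge_set(edges_b)
    union = a | b
    if not union:
        return 0.0  # two empty networks are treated as identical
    return len(a ^ b) / len(union)


# Toy networks (not data from the dissertation).
high_fat = [("m1", "m2"), ("m2", "m3"), ("m3", "m4")]
normal = [("m2", "m1"), ("m3", "m5"), ("m4", "m5")]

score = normalized_edge_symmetric_difference(high_fat, normal)
print(score)
```

In this toy case the two networks share one edge out of five distinct edges, so four of the five edges appear in only one network.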
Chapter 3 shows the use of Bayesian networks to predict metabolic networks from untargeted metabolomics data. The resulting networks were then compared to known metabolic networks of various organisms in KEGG. I found that the generated Bayesian networks had a degree distribution, secondary motif composition, and short path length distribution similar to those of the known KEGG metabolic networks. Thus, I demonstrate that Bayesian network analysis can be applied to untargeted metabolomics data to generate data-driven network models whose underlying characteristics resemble those of known metabolic networks.
In chapter 4, I present FlyNet, a multilayer network database conceptualized and constructed for storing and visualizing complex network data. FlyNet integrates the metabolome with the genome and the proteome to facilitate integrative studies in Drosophila melanogaster. As an example, I show how the betweenness of gene and protein nodes changes in a multilayer setting compared to a single-layer analysis. Furthermore, I show how FlyNet can be used to query possible relationships between genes and metabolites across different biological layers.
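The change in betweenness between single-layer and multilayer analysis can be illustrated with a small sketch. It does not use FlyNet or its data: the node names, the toy edges, and the choice to flatten all layers into one undirected, unweighted graph are assumptions made for illustration. Betweenness is computed with Brandes' algorithm, unnormalized.

```python
from collections import defaultdict, deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an unweighted graph."""
    bc = dict.fromkeys(adj, 0.0)
    for s in adj:
        order = []                        # nodes in order of BFS discovery
        preds = {v: [] for v in adj}      # shortest-path predecessors
        sigma = dict.fromkeys(adj, 0)     # number of shortest paths from s
        dist = dict.fromkeys(adj, -1)
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(adj, 0.0)
        while order:                      # accumulate from farthest node back
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2 for v, c in bc.items()}


# Hypothetical gene layer: a chain g1 - g2 - g3.
gene_layer = [("g1", "g2"), ("g2", "g3")]
# Hypothetical cross-layer links: gene g1 to protein p1, p1 to metabolite m1.
inter_layer = [("g1", "p1"), ("p1", "m1")]

single = betweenness(build_adjacency(gene_layer))
multi = betweenness(build_adjacency(gene_layer + inter_layer))
print(single["g1"], multi["g1"])
```

In the gene layer alone, g1 is an endpoint and lies on no shortest path between other nodes. Once the protein and metabolite links are added, g1 connects p1 and m1 to the rest of the gene chain, so its betweenness rises.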
|Advisor:||Reed, Laura K.|
|Committee:||Staudhammer, Christina; Chen, Yuhui; Kocot, Kevin; Pienaar, Jason|
|School:||The University of Alabama|
|School Location:||United States -- Alabama|
|Source:||DAI-B 82/1(E), Dissertation Abstracts International|
|Keywords:||Bayesian network, Biological networks, Drosophila melanogaster, Metabolic networks, Metabolomics, Probabilistic graphical models|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved