The goal of untargeted metabolomics is to profile metabolism by measuring as many metabolites as possible. A major advantage of the untargeted approach is the detection of unexpected or unknown metabolites. These metabolites have chemical structures, metabolic pathways, or cellular functions that have not been previously described. Hence, they represent exciting opportunities to advance our understanding of biology. This approach, however, also adds considerable complexity to the analysis of metabolomic data: an individual signal cannot be readily identified as a unique metabolite. As such, a major challenge faced by the untargeted metabolomic workflow is extracting the analyte content from a dataset. Successful applications of metabolomics bypass this limitation by discarding the 99% of the dataset that is not statistically altered between sample groups. This widely accepted approach to untargeted metabolomics is functional for a very narrow set of applications, but critically, it fails to provide a comprehensive view of metabolism.
The primary thrust of this dissertation work is to overcome this fundamental barrier in metabolomic experiments and extract the unique analyte content from metabolomic datasets. To this end, three algorithms were developed.
(i) We first developed the Warpgroup algorithm to refine the features detected in replicate samples. Peak detection performed on replicate samples is highly inconsistent. Warpgroup considers all replicates in concert to determine a set of consensus signals or features – integrations that are supported by all replicates. This process improves quantitation and significantly reduces the artifact content of the dataset.
(ii) Mz.unity was then developed so that one can search for any specified mass-peak relationship. Features in metabolomic data are highly degenerate, and available annotation approaches have been limited to a small subset of possible degeneracies. Mz.unity addresses this deficiency. This advance enabled the systematic evaluation of complex and cross-polarity adducts as well as a context-based relationship recovery approach.
(iii) The credentialing approach was developed to experimentally filter non-biological features and recover a reproducible set of biological features. While great effort had been undertaken to minimize the contribution of contaminants and informatic error to features, it was clear that many mistakes were still being made.
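To make the idea of searching for a specified mass-peak relationship, as in (ii), more concrete, the following is a minimal Python sketch. It is not the Mz.unity implementation; the function name `find_related_pairs`, the tolerance defaults, and the example feature values are all assumptions chosen for illustration. It reports pairs of co-eluting features whose m/z difference matches a requested mass delta within a ppm tolerance.

```python
from itertools import combinations


def find_related_pairs(features, delta, ppm=10.0, rt_tol=2.0):
    """Return index pairs (lo, hi) with mz[hi] - mz[lo] close to `delta`.

    features: list of (mz, rt) tuples; rt in seconds (assumed unit).
    delta:    mass difference to search for, in Da.
    ppm:      m/z tolerance in parts per million of the heavier feature.
    rt_tol:   maximum retention-time difference for co-elution.
    """
    pairs = []
    for i, j in combinations(range(len(features)), 2):
        # Order the pair so that `lo` is the lighter feature.
        lo, hi = (i, j) if features[i][0] <= features[j][0] else (j, i)
        mz_lo, rt_lo = features[lo]
        mz_hi, rt_hi = features[hi]
        mz_tol = mz_hi * ppm * 1e-6
        if abs((mz_hi - mz_lo) - delta) <= mz_tol and abs(rt_hi - rt_lo) <= rt_tol:
            pairs.append((lo, hi))
    return pairs


# Hypothetical feature list (m/z, retention time); values are illustrative only.
features = [
    (181.0707, 300.0),  # e.g. an [M+H]+ signal
    (203.0526, 300.5),  # e.g. the matching [M+Na]+ signal
    (182.0741, 300.2),  # e.g. the 13C isotope of the first signal
    (250.1000, 120.0),  # unrelated feature
]

NA_MINUS_H = 21.98194   # approximate mass difference between Na+ and H+ adducts
C13_SPACING = 1.003355  # mass difference between 13C and 12C

sodium_pairs = find_related_pairs(features, NA_MINUS_H)
isotope_pairs = find_related_pairs(features, C13_SPACING)
print(sodium_pairs, isotope_pairs)
```

Because the delta is a parameter rather than a fixed adduct table, the same search can be repeated for any relationship of interest, which is the property the abstract attributes to Mz.unity. The pairwise loop is quadratic in the number of features and is meant only to show the idea.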
The developed algorithms were then applied, in concert, to an untargeted analysis of Escherichia coli. Together, the application of these algorithms provided the first comprehensive picture of metabolomic dataset composition. Strikingly, the technologies suggest that the tens of thousands of signals detected in a typical untargeted metabolomic dataset correspond to fewer than 1,500 analytes, a result that has large implications for the design and interpretation of untargeted metabolomic experiments.
This work constitutes a key advance in our understanding of metabolomic science, and the contributions enable more robust untargeted analyses of metabolism. Together, these concepts establish a clear course for the future development of a comprehensive metabolomic data analysis platform and bring the promise of truly untargeted metabolomics into view.
|Advisor:||Patti, Gary J.|
|Committee:||Gross, Michael L., Johnson, Steven L., Pless, Robert, Schaefer, Jacob, Schedl, Tim|
|School:||Washington University in St. Louis|
|School Location:||United States -- Missouri|
|Source:||DAI-B 78/09(E), Dissertation Abstracts International|
|Subjects:||Analytical chemistry, Biochemistry, Bioinformatics|
|Keywords:||Chromatography, Degeneracy, Mass spectrometry, Metabolite, Metabolomics, Untargeted|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved