Dissertation/Thesis Abstract

The author has requested that access to this graduate work be delayed until 2018-04-21. After this date, this graduate work will be available on an open access basis.
Mapping Analyte-Signal Relations in LC-MS Based Untargeted Metabolomics
by Mahieu, Nathaniel Guy, Ph.D., Washington University in St. Louis, 2017, 202; 10270783
Abstract (Summary)

The goal of untargeted metabolomics is to profile metabolism by measuring as many metabolites as possible. A major advantage of the untargeted approach is the detection of unexpected or unknown metabolites. These metabolites have chemical structures, metabolic pathways, or cellular functions that have not been previously described. Hence, they represent exciting opportunities to advance our understanding of biology. This advantage, however, also adds considerable complexity to the analysis of metabolomics data: an individual signal cannot be readily identified as a unique metabolite. As such, a major challenge faced by the untargeted metabolomic workflow is extracting the analyte content from a dataset. Successful applications of metabolomics bypass this limitation by discarding the 99% of the dataset that is not statistically altered between sample groups. This widely accepted approach to untargeted metabolomics is functional for a very narrow set of applications, but critically, it fails to provide a comprehensive view of metabolism.

The primary thrust of this dissertation work is to overcome this fundamental barrier in metabolomic experiments and extract the unique analyte content from metabolomic datasets. To this end, three algorithms were developed.

(i) We first developed the Warpgroup algorithm to refine the features detected in replicate samples. Peak detection performed on replicate samples is highly inconsistent. Warpgroup considers all replicates in concert to determine a set of consensus signals or features – integrations that are supported by all replicates. This process improves quantitation and significantly reduces the artifact content of the dataset.

(ii) Mz.unity was then developed so that one can search for any specified mass-peak relationship. Features in metabolomic data are highly degenerate, and available annotation approaches have been limited to a small subset of possible degeneracies. Mz.unity addresses this deficiency. This advance enabled the systematic evaluation of complex and cross-polarity adducts as well as a context-based relationship recovery approach.

(iii) The credentialing approach was developed to experimentally filter non-biological features and recover a reproducible set of biological features. While great effort had been undertaken to minimize the contribution of contaminants and informatic error to features, it was clear that many mistakes were still being made.
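The idea in item (ii) of searching for a specified mass-peak relationship can be illustrated with a minimal sketch. This is not the Mz.unity implementation; the function name `find_mass_relations`, the ppm tolerance, and the example values are assumptions made for illustration only. The sketch looks for pairs of features whose m/z values differ by a user-specified mass, here the approximate difference between a sodium adduct and a proton adduct of the same analyte (about 21.98194 Da).

```python
from bisect import bisect_left, bisect_right


def find_mass_relations(mzs, delta, ppm=5.0):
    """Return index pairs (i, j) where mzs[j] is approximately mzs[i] + delta.

    Hypothetical helper: the tolerance is expressed in ppm of the
    expected (target) m/z value.
    """
    # Sort the feature indices by m/z so candidates can be found by binary search.
    order = sorted(range(len(mzs)), key=lambda k: mzs[k])
    sorted_mz = [mzs[k] for k in order]

    pairs = []
    for pos, i in enumerate(order):
        target = sorted_mz[pos] + delta
        tol = target * ppm * 1e-6
        lo = bisect_left(sorted_mz, target - tol)
        hi = bisect_right(sorted_mz, target + tol)
        for q in range(lo, hi):
            j = order[q]
            if j != i:
                pairs.append((i, j))
    return pairs


# Illustrative m/z values (assumed, not taken from the dissertation).
features = [181.07066, 203.05260, 150.0]
NA_MINUS_H = 21.98194  # approximate mass difference, sodium vs. proton adduct
print(find_mass_relations(features, NA_MINUS_H))
```

Because the mass difference is a parameter rather than a fixed list of adducts, the same search can in principle be repeated for any relationship of interest, which is the flexibility item (ii) describes.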

The developed algorithms were then applied, in concert, to an untargeted analysis of Escherichia coli. Together, the application of these algorithms provided the first comprehensive picture of metabolomic dataset composition. Strikingly, the technologies suggest that the tens of thousands of signals detected in a typical untargeted metabolomic dataset correspond to fewer than 1,500 analytes – a result that has large implications for the design and interpretation of untargeted metabolomic experiments.

This work constitutes a key advance in our understanding of metabolomic science, and the contributions enable more robust untargeted analyses of metabolism. Together, these concepts establish a clear course for the future development of a comprehensive metabolomic data analysis platform and bring the promise of truly untargeted metabolomics into view.

Indexing (document details)
Advisor: Patti, Gary J.
Committee: Gross, Michael L., Johnson, Steven L., Pless, Robert, Schaefer, Jacob, Schedl, Tim
School: Washington University in St. Louis
Department: Chemistry
School Location: United States -- Missouri
Source: DAI-B 78/09(E), Dissertation Abstracts International
Subjects: Analytical chemistry, Biochemistry, Bioinformatics
Keywords: Chromatography, Degeneracy, Mass spectrometry, Metabolite, Metabolomics, Untargeted
Publication Number: 10270783
ISBN: 9781369717556