Dissertation/Thesis Abstract

Combining Experimental and <i>In Silico</i> Methods for Comprehensive Compound Dereplication of Natural Products for Mass Spectrometry Based Metabolomics
by Vaniya, Arpana, Ph.D., University of California, Davis, 2017, 257; 10624215
Abstract (Summary)

Metabolomics is a rapidly growing field in “omics” research in which metabolites are analyzed in biological systems. Over the past decade, mass spectrometry (MS) based metabolomics has been used for its superior analytical performance to reveal how these biological systems respond to genetic and environmental changes. MS is both sensitive and selective, and it is capable of providing comprehensive information for metabolic profiling when combined with separation methods such as liquid chromatography (LC-MS) or gas chromatography (GC-MS). However, in untargeted metabolomics, identification of small molecules is the bottleneck. In the research described here, I have combined in silico and experimental methods for compound dereplication of natural products using MS-based metabolomics.

Chapter 1 addresses the advancement of fragmentation and mass spectral trees used for unknown metabolite identification. Tools for metabolite identification developed over the past 10 years are discussed, including algorithms, software, mass spectral libraries, and databases that implement fragmentation and mass spectral trees. Due to the inherent complexity of natural products in plants and microbes, unknown compound identification is increasingly difficult and limiting. Resolving this problem requires better computational tools and informative data such as those acquired by multi-stage mass spectrometry (MSn). MSn yields more fragmentation data and allows for more complex structural elucidation, as needed for compounds with positional isomers. The limitation of using tandem mass spectrometry (MS/MS) alone is that many ions are shared between positional isomers, so full structural information is not available to elucidate an unknown metabolite. Fragmentation and mass spectral trees both describe the fragmentation processes of a metabolite and aid in fragmentation rule generation and substructure identification. The major difference between them is that fragmentation trees use elemental compositions to describe the fragmentation process, whereas mass spectral trees (or ion trees) use precursor and product ion spectra from MSn mass spectral acquisition. As a result, recent years have seen a large increase in efforts to develop MSn > 2 data and tools for both structure elucidation and spectral annotation with the use of fragmentation and mass spectral trees.

Chapter 2 describes the research and development of iTree, an MSn mass spectral tree library of plant natural products, and its use in compound identification of natural products. In metabolomics, mass spectral library searching is a standard method for compound identification, more correctly known as compound dereplication. Mass spectral libraries are either freely or commercially available and can contain both experimental and in silico MS/MS reference spectra. The coverage of MSn > 2 reference spectra is much smaller in many of these MS/MS libraries and databases. To date, the largest MSn > 2 libraries are HighChem and mzCloud, which also support mass spectral trees. The chemical coverage of such libraries and databases is very low in comparison to the number of known compounds. iTree was developed to expand the coverage of fragmentation spectra for natural products. iTree contains more than 2,000 natural products and more than 9,000 ion tree spectra annotated with in silico generated substructures from both Mass Frontier 7.0 and CFM-ID. iTree is freely available through MassBank of North America (MoNA), an open-access mass spectral database. Because of the high number of natural products, and specifically flavonoid aglycones, previously published fragmentation rules were studied and validated. A new rule was proposed for flavanonols: a loss of –CCO that occurs specifically in this class. In addition, iTree was used to profile secondary metabolites in the roots and nodules of the host plant Datisca glomerata. More than 100 natural products were identified by combining LC-MSn, high resolution LC-MS/MS, and ion tree analysis using iTree. Overall, iTree has been shown to facilitate metabolite identification for plant natural products.

Although MSn > 2 data are more useful for complex structural elucidation, the predominant data used in untargeted metabolomics are MS/MS. For this reason, in silico tools that focus on the interpretation of MS and MS/MS spectra alone must be evaluated. In Chapters 3 through 5, I discuss how the Critical Assessment of Small Molecule Identification (CASMI) has allowed for such an evaluation by presenting unknown challenge data sets to the metabolomics community to evaluate the tools and methods they currently use for unknown compound identification. The results submitted by each user are compared and discussed to provide greater insight into how in silico tools can be further improved to advance unknown compound identification methods and increase their accuracy.

Chapter 3 focuses specifically on the performance of MS-FINDER, a software tool that uses MS and MS/MS spectra for structural elucidation of unknown compounds, as presented in Category 1 of CASMI 2016. (Abstract shortened by ProQuest.)

Indexing (document details)
Advisor: Fiehn, Oliver
Committee: Lebrilla, Carlito B., Siegel, Justin B.
School: University of California, Davis
Department: Chemistry
School Location: United States -- California
Source: DAI-B 79/03(E), Dissertation Abstracts International
Subjects: Chemistry, Analytical chemistry
Keywords: Fragmentation trees, Mass spectral trees, Mass spectrometry, Metabolomics, Multi-stage mass spectrometry (MSn), Unknown compound identification
Publication Number: 10624215
ISBN: 9780355462043