Dissertation/Thesis Abstract

Denoising amplicon-based metagenomic data
by Gaspar, John M., Ph.D., University of New Hampshire, 2014, 118; 3581214
Abstract (Summary)

Reducing the effects of sequencing errors and PCR artifacts has emerged as an essential component in amplicon-based metagenomic studies. Denoising algorithms have been written that can reduce error rates in mock community data, in which the true sequences are known, but they were designed to be used in studies of real communities. To evaluate the outcome of the denoising process, we developed methods that do not rely on a priori knowledge of the correct sequences, and we applied these methods to a real-world dataset. We found that the denoising algorithms had substantial negative side-effects on the sequence data. For example, in the most widely used denoising pipeline, AmpliconNoise, the algorithm that was designed to remove pyrosequencing errors changed the reads in a manner inconsistent with the known spectrum of these errors, until one of the parameters was increased substantially from its default value.

With these shortcomings in mind, we developed a novel denoising program, FlowClus. FlowClus uses a systematic approach to filter and denoise reads efficiently. When denoising real datasets, FlowClus provides feedback about the process that can be used as the basis to adjust the parameters of the algorithm to suit the particular dataset. FlowClus produced a lower error rate compared to other denoising algorithms when analyzing a mock community dataset, while retaining significantly more sequence information. Among its other attributes, FlowClus can analyze longer reads being generated from current protocols and irregular flow orders. It has processed a full plate (1.5 million reads) in less than four hours; using its more efficient (but less precise) trie analysis option, this time was further reduced, to less than seven minutes.

Indexing (document details)
Advisor:
Commitee:
School: University of New Hampshire
School Location: United States -- New Hampshire
Source: DAI-B 75/10(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Bioinformatics
Keywords: 16S rRNA, Denoising, Metagenomics, Microbiome, Pyrosequencing
Publication Number: 3581214
ISBN: 978-1-321-10118-8
Copyright © 2019 ProQuest LLC. All rights reserved. Terms and Conditions Privacy Policy Cookie Policy
ProQuest