Dissertation/Thesis Abstract

Algorithms for Assembly Consolidation and Prediction of Large-Scale Genome Structures
by Zhu, Shenglong, Ph.D., University of Notre Dame, 2019, 168; 28187463
Abstract (Summary)

Genome structure is the order and orientation of pieces of DNA comprising a genome, which contains the information of life. With advances in DNA sequencing technology and the now massive availability of sequence data, the study of genome structure cannot easily be carried out without efficient and expressly designed algorithms. In this dissertation, we study three genome structure-related problems: structural error correction of draft genome assemblies, inversion prediction, and operon prediction. Our work with draft genome assemblies explores a novel Maximum Alternating Path Cover (MAPC) model to improve genome correctness and downstream analysis. Our work on inversion prediction aims to predict and catalog inversions by exploring the well-known Range Maximum Query model and Max-Cut model for what we call “global” inversions, and the novel Rectangle Clustering model and Representative Rectangle Prediction model for more localized inversions. For operon prediction, we again apply the MAPC model (with improved algorithms and theoretical analysis), coupled with a novel Intro-Column Exclusive Clustering model, to predict and catalog operons in closely related species. Evaluated using both simulated and real genome data, our algorithms and implementations have shown substantial promise for accurate computational analysis of genome structure in significantly shorter time.

Indexing (document details)
Advisor: Chen, Danny Z., Emrich, Scott J.
Committee:
School: University of Notre Dame
School Location: United States -- Indiana
Source: DAI-A 82/3(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer science, Information science, Genetics
Keywords: DNA sequencing, Maximum Alternating Path Cover, Operon prediction
Publication Number: 28187463
ISBN: 9798664790856
Copyright © 2020 ProQuest LLC. All rights reserved.