Dissertation/Thesis Abstract

Comparative Analysis of Long-Read Transcriptome Assembly Pipelines
by Dubocanin, Danilo, M.S., University of California, Santa Cruz, 2020, 38; 28092891
Abstract (Summary)

Long-read sequencing can overcome some of the barriers in transcriptome assembly that plague short-read-based technologies. Due to their short length, short reads fail to span entire transcripts, which makes it difficult to discern proper splice junctions. Conversely, long reads can span entire transcripts end-to-end and can thus circumvent issues in inferring splice junctions. Multiple long-read transcriptome assembly pipelines have been developed in recent years, but there is no comprehensive analysis comparing them. Some of these pipelines implement novel approaches to generating transcriptomes from long reads, while others adapt methods originally developed for short-read-based transcriptome assembly. We show that there are significant differences among transcriptomes assembled from the same data using different assembly pipelines. Our analysis further shows that high-level summary statistics can be misleading about transcriptome quality, and it underscores the importance of using internalized controls to validate transcriptome assemblies.

Indexing (document details)
Advisor: Vollmers, Chris
Committee: Corbett-Detig, Russ; Shariati, Ali
School: University of California, Santa Cruz
Department: Biomolecular Engineering and Bioinformatics
School Location: United States -- California
Source: MAI 82/4(E), Masters Abstracts International
Subjects: Bioinformatics, Cellular biology, Health sciences, Genetics, Histology
Keywords: 3rd-generation sequencing, Isoforms, Long-read sequencing, RNA-seq, Transcriptomics, Assembly pipelines
Publication Number: 28092891
ISBN: 9798684694240
Copyright © 2021 ProQuest LLC. All rights reserved.