Long-read sequencing can overcome some of the barriers to transcriptome assembly that plague short-read technologies. Because of their length, short reads fail to span entire transcripts, which makes it difficult to identify the correct splice junctions. In contrast, long reads can span entire transcripts end to end and thus circumvent the problem of inferring splice junctions. Multiple long-read transcriptome assembly pipelines have been developed in recent years, but there is no comprehensive analysis comparing them. Some of these pipelines implement novel approaches to generating transcriptomes from long reads, while others adapt methods originally developed for short-read transcriptome assembly. We show that there are significant differences between transcriptomes assembled from the same data with different assembly pipelines. Our analysis further shows that high-level summary statistics can be misleading about transcriptome quality, and it highlights the importance of using internal controls to validate transcriptome assemblies.
|Committee:||Corbett-Detig, Russ; Shariati, Ali|
|School:||University of California, Santa Cruz|
|Department:||Biomolecular Engineering and Bioinformatics|
|School Location:||United States -- California|
|Source:||MAI 82/4(E), Masters Abstracts International|
|Subjects:||Bioinformatics, Cellular biology, Health sciences, Genetics, Histology|
|Keywords:||3rd-generation sequencing, Isoforms, Long-read sequencing, RNA-seq, Transcriptomics, Assembly pipelines|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved