Dissertation/Thesis Abstract

The role of read depth in the design and analysis of sequencing experiments
by Robinson, David Garrett, Ph.D., Princeton University, 2015, 164 pages; Publication Number 3713946
Abstract (Summary)

The development of quantitative sequencing technologies, such as RNA-Seq, Bar-Seq, ChIP-Seq, and metagenomics, has offered great insight into molecular biology. Proper design and analysis of these experiments require statistical models and techniques that consider the specific nature of sequencing data, which typically consists of a matrix of read counts per feature. An issue of particular importance to the development of these methods is the role of read depth in statistical accuracy and power. The depth of an experiment affects the power to draw biological conclusions, meaning that experimental design must weigh the tradeoff among cost, power, and the number of samples examined. Similarly, per-gene read depth affects each gene's power and accuracy, and must be taken into account in any downstream analysis.

Here I explore many facets of the role of read depth in the design and analysis of sequencing experiments, and offer computational and statistical methods for addressing them. To assist in the design of sequencing experiments, I present subSeq, which examines the effect of depth in an experiment by subsampling reads to simulate lower depths. I use this method to examine the extent of read saturation across a variety of RNA-Seq experiments, and demonstrate a statistical model for predicting the effect of increasing depth in any experiment. I consider intensity-dependence in a technology comparison between microarrays and RNA-Seq, and show that the variance added by RNA-Seq depends more on depth than the variance in microarrays depends on fluorescence intensity. I demonstrate that Bar-Seq data shares these depth-dependent properties with RNA-Seq and can be analyzed by the same tools, and further provide suggestions on the appropriate depth for Bar-Seq experiments. Finally, I show that per-gene read depth can be taken into account in multiple hypothesis testing to improve power, and introduce the method of functional false discovery rate (fFDR) control.
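The idea of subsampling reads to simulate lower depths can be illustrated with a minimal sketch. This is not the subSeq implementation or its interface. The function name `subsample_counts`, the choice to keep each read independently with a fixed probability (binomial thinning), and the toy count matrix are all assumptions made for illustration.

```python
import random


def subsample_counts(counts, proportion, seed=None):
    """Simulate a lower read depth from a matrix of read counts.

    counts     -- list of rows (features), each a list of per-sample read counts
    proportion -- fraction of reads to keep, between 0 and 1
    seed       -- optional seed so the subsample is reproducible

    Assumption: each read is kept independently with probability
    `proportion`, so every count is thinned binomially.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must be between 0 and 1")
    rng = random.Random(seed)
    return [
        # Count how many of the c original reads survive the thinning.
        [sum(1 for _ in range(c) if rng.random() < proportion) for c in row]
        for row in counts
    ]


# Hypothetical toy matrix: 2 features (rows) by 3 samples (columns).
toy_counts = [[10, 0, 5], [200, 50, 1]]

# Simulate the same experiment at decreasing fractions of the original depth.
for p in (1.0, 0.5, 0.1):
    print(p, subsample_counts(toy_counts, p, seed=0))
```

Running a downstream analysis on each thinned matrix, and comparing the results with those from the full-depth data, gives a rough picture of how much the conclusions depend on depth. Looping over individual reads keeps the sketch dependency-free but would be slow on real data, where a vectorized binomial draw would be the more practical choice.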

Indexing (document details)
Advisor: Storey, John D.
Committee: Singh, Mona, Troyanskaya, Olga
School: Princeton University
Department: Quantitative Computational Biology
School Location: United States -- New Jersey
Source: DAI-B 76/12(E), Dissertation Abstracts International
Subjects: Genetics, Statistics, Bioinformatics
Keywords: Differential expression, Experimental design, False discovery rate, RNA-Seq, Read depth, Sequencing
Publication Number: 3713946
ISBN: 978-1-321-90912-8
Copyright © 2021 ProQuest LLC. All rights reserved.