The advent of high-throughput DNA sequencing has vastly accelerated transcriptome-wide profiling of RNA, revealing thousands of new noncoding RNA genes in humans and across the phylogenetic tree. Many of these noncoding RNAs are similar in length and processing to messenger RNAs and are referred to as long noncoding RNAs (lncRNAs). Some lncRNAs were identified decades earlier and have genetic and biochemical evidence for function, e.g., the Xist RNA, the master regulator of X-chromosome inactivation in female mammals. However, the functions (or lack thereof) of many lncRNA genes remain unclear, and the detailed mechanisms of lncRNAs with known functions are often unknown as well.
Beyond the identification of new RNA genes, high-throughput sequencing has also allowed biochemical methods that were traditionally read out for one target RNA at a time to be adapted to a transcriptome-wide scale, sometimes revealing new types of information or making it possible to study RNAs within complex or in vivo samples. This enables unprecedented characterization of the activities of both noncoding RNA genes and regulatory regions within messenger RNAs, providing potentially critical information. Each new assay brings specific analysis challenges, including data normalization, scale of interpretation, statistical overdispersion, and limited numbers of replicate experiments.
In this thesis, I have developed and applied computational and statistical methods to aid the interpretation of new technologies for the study of noncoding RNA. In the first chapter, I review the state of the field for the study of lncRNAs and the general analysis challenges that arise in interpreting high-throughput sequencing data. In the second and third chapters, I describe preliminary work from my PhD analyzing two technologies developed by collaborators: Capture Hybridization of RNA Targets (CHART), used to reveal the spreading pattern of the Xist RNA across the X chromosome (ch. 2); and separation of labeled RNA populations using improved disulfide chemistry for the study of RNA dynamics (ch. 3). In the fourth chapter, I develop a new analysis method to model the statistical overdispersion of RNA chemical probing data and apply this model to investigate how variability in chemical probing data contributes to the resulting RNA secondary structure predictions. The methods described here may facilitate the use of these technologies for integrative analysis to help distinguish candidate lncRNAs, and specific regions within them, for further study, as well as RNA regulatory regions in which mutations may cause disease.
|Advisor:||Gerstein, Mark; Simon, Matthew D.|
|Committee:||Kluger, Yuval; Breaker, Ronald R.|
|Department:||Molecular Biophysics and Biochemistry|
|School Location:||United States -- Connecticut|
|Source:||DAI-B 81/3(E), Dissertation Abstracts International|
|Subjects:||Bioinformatics, Molecular biology, Biochemistry|
|Keywords:||Chemical probing, lncRNA, Noncoding, Overdispersion, RNA|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved