This thesis focuses on methodologies and applications of module-based analysis (MBA) in omics studies to investigate the relationships of phenotypes and biomarkers, e.g., SNPs, genes, and metabolites. As an alternative to traditional single–biomarker approaches, MBA may increase the detectability and reproducibility of results because biomarkers tend to have moderate individual effects but significant aggregate effect; it may improve the interpretability of findings and facilitate the construction of follow-up biological hypotheses because MBA assesses biomarker effects in a functional context, e.g., pathways and biological processes. Finally, for exploratory “omics” studies, which usually begin with a full scan of a long list of candidate biomarkers, MBA provides a natural way to reduce the total number of tests, and hence relax the multiple-testing burdens and improve power.
The first MBA project focuses on genetic association analysis that assesses the main and interaction effects for sets of genetic (G) and environmental (E) factors rather than for individual factors. We develop a kernel machine regression approach to evaluate the complete effect profile (i.e., the G, E, and G-by-E interaction effects separately or in combination) and construct a kernel function for the Gene-Environmental (GE) interaction directly from the genetic kernel and the environmental kernel. We use simulation studies and real data applications to show improved performance of the Kernel Machine (KM) regression method over the commonly adapted PC regression methods across a wide range of scenarios. The largest gain in power occurs when the underlying effect structure is involved complex GE interactions, suggesting that the proposed method could be a useful and powerful tool for performing exploratory or confirmatory analyses in GxE-GWAS.
In the second MBA project, we extend the kernel machine framework developed in the first project to model biomarkers with network structure. Network summarizes the functional interplay among biological units; incorporating network information can more precisely model the biological effects, enhance the ability to detect true signals, and facilitate our understanding of the underlying biological mechanisms. In the work, we develop two kernel functions to capture different network structure information. Through simulations and metabolomics study, we show that the proposed network-based methods can have markedly improved power over the approaches ignoring network information.
Metabolites are the end products of cellular processes and reflect the ultimate responses of biology system to genetic variations or environment exposures. Because of the unique properties of metabolites, pharmcometabolomics aims to understand the underlying signatures that contribute to individual variations in drug responses and identify biomarkers that can be helpful to response predictions. To facilitate mining pharmcometabolomic data, we establish an MBA pipeline that has great practical value in detection and interpretation of signatures, which may potentially indicate a functional basis for the drug response. We illustrate the utilities of the pipeline by investigating two scientific questions in aspirin study: (1) which metabolites changes can be attributed to aspirin intake, and (2) what are the metabolic signatures that can be helpful in predicting aspirin resistance. Results show that the MBA pipeline enables us to identify metabolic signatures that are not found in preliminary single-metabolites analysis.
|School:||North Carolina State University|
|School Location:||United States -- North Carolina|
|Source:||DAI-B 76/07(E), Dissertation Abstracts International|
|Subjects:||Genetics, Statistics, Bioinformatics|
|Keywords:||Kernel machine, Module based analysis, Network structure, Omics, Statistical analysis|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be