Cancer research has made tremendous progress in understanding the basic biology of tumors. One of the key insights that has informed work in this area is the recognition that a tumor is an evolutionary system, in which individual cells undergo a process of rapid mutation and selection leading to a progression in phenotypes and, typically, aggressiveness of the tumor. Tumor phylogenetics is a strategy for interpreting the evolution of tumors using computer algorithms for phylogenetics, i.e., the inference of evolutionary trees. The approach takes advantage of a large body of phylogenetic theory and algorithms, developed primarily for inferring evolution among species, to interpret complex tumor data sets as evidence for evolutionary processes. The result is a tumor phylogeny, or phylogenetic tree, a reconstruction of the sequences of mutations that cells within a tumor or class of tumors accumulate over the course of their progression. The goals of finding such trees are to better interpret heterogeneity within and among tumors, identify and classify tumor subtypes with possible underlying mechanisms of action, learn markers of progression for key steps in tumor evolution, and enable predictive modeling of likely tumor progression steps that may ultimately assist in diagnosis and treatment.
In this dissertation, we present a computational framework for reconstructing phylogenies from genome-scale tumor array and sequencing data. We first present a novel phylogenetic pipeline for building tumor phylogenies from whole-genome copy number variation data. The pipeline comprises computational unmixing to resolve heterogeneity in genomic data from tumors, a statistical method for progression marker discovery, a statistical method for data discretization, character-based phylogeny reconstruction, and analysis of the resulting trees to assess their biological significance. We then describe HMM-CNA, an improved model that uses a custom multi-sample hidden Markov model (HMM) to discover, from cohorts of patient tumor copy number data, progression markers that are especially relevant for phylogeny reconstruction. We next present a novel strategy for building phylogenies from single-cell sequencing data by inferring features that accurately capture the composition of the individual genome sequences and distinguish among stages of tumor progression. We demonstrate these contributions on simulated data and on human breast tumor biopsy and cell line data, assuming a maximum parsimony model of evolution. Finally, we discuss future directions for building a more realistic model of tumor evolution by integrating patterns in genome structural changes with the functional elements they encode. We close with a discussion of recent research, current trends, and the challenges and opportunities facing the field.
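To make the maximum parsimony criterion concrete, the following is a minimal sketch of Fitch's small-parsimony algorithm applied to discretized copy number calls. It is an illustration only, not the dissertation's implementation. The sample names, the three-state coding (-1 loss, 0 neutral, 1 gain), the two candidate trees, and the function names `fitch_score` and `parsimony_score` are all assumptions made for the example. Fitch's algorithm also treats states as unordered, which is a simplification for copy number data.

```python
from typing import Dict, Set, Tuple, Union

# A rooted binary tree is either a leaf name or a pair of subtrees.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_score(tree: Tree, states: Dict[str, int]) -> Tuple[Set[int], int]:
    """Fitch bottom-up pass for a single character.

    Returns the candidate state set at the root of `tree` and the minimum
    number of state changes needed to explain the leaf states.
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch_score(left, states)
    right_set, right_cost = fitch_score(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change is needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one state change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, profiles: Dict[str, Tuple[int, ...]]) -> int:
    """Sum the Fitch cost over all markers (characters)."""
    n_markers = len(next(iter(profiles.values())))
    total = 0
    for i in range(n_markers):
        states = {name: profile[i] for name, profile in profiles.items()}
        total += fitch_score(tree, states)[1]
    return total


# Hypothetical discretized calls at three markers: -1 loss, 0 neutral, 1 gain.
profiles = {
    "normal": (0, 0, 0),
    "early": (1, 0, 0),
    "late_a": (1, 1, 0),
    "late_b": (1, 1, -1),
}

# Two hypothetical candidate topologies over the same four samples.
tree_a = (("normal", "early"), ("late_a", "late_b"))
tree_b = (("normal", "late_b"), ("early", "late_a"))

print(parsimony_score(tree_a, profiles))  # 3
print(parsimony_score(tree_b, profiles))  # 4
```

Under this toy data, `tree_a` needs fewer state changes than `tree_b`, so a maximum parsimony search would prefer it. A full reconstruction would also search over tree topologies rather than score two fixed candidates.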
|Committee:||Kingsford, Carl, Murphy, Robert, Shackney, Stanley|
|School:||Carnegie Mellon University|
|School Location:||United States -- Pennsylvania|
|Source:||DAI-B 75/02(E), Dissertation Abstracts International|
|Subjects:||Biostatistics, Bioinformatics, Oncology|
|Keywords:||Copy number variation, Phylogenetics, Tumor evolution, Tumor progression|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved