Dissertation/Thesis Abstract

Using the Birth-Death Process to Infer Changes in the Pattern of Lineage Gain and Loss
by Hallinan, Nathaniel Malachi, Ph.D., University of California, Berkeley, 2011, 203; 3498974
Abstract (Summary)

The birth-death process has been used to study the evolution of a wide variety of biological entities from genes to species. Here, I develop methods to investigate how the birth-death process varies under three very different circumstances: changes in the pattern of taxon diversification through time; the effect of whole genome duplications on the pattern of chromosome gain and loss; and changes in the pattern of gene gain and loss on branches of a taxon tree.

Previous work had shown how to calculate the distributions of the number of lineages and of branching times for a reconstructed constant-rate birth-death process that started with one or two reconstructed lineages at some time or ended with some number of lineages in the present. In chapter 2, I expand that work to include any time-variable birth-death process that starts with any number of reconstructed lineages and/or ends with any number of reconstructed lineages at any time. I also introduce the discrete-time birth-death process, which operates as an efficient and accurate numerical solution to any time-variable birth-death process and allows for the analytical incorporation of sampling and mass extinctions. Furthermore, I show how to simulate random trees under any of these models.
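
As a rough illustration of the discrete-time idea, the Python sketch below steps a time-variable birth-death process forward in small intervals and records the number of lineages after each step. It is not the dissertation's algorithm: it tracks the full process rather than the reconstructed one, it ignores sampling and mass extinctions, and the function name, its parameters, and the rule that each lineage splits with probability birth_rate(t) * dt or dies with probability death_rate(t) * dt are assumptions made for this example.

```python
import random


def simulate_lineage_counts(n0, birth_rate, death_rate, t_max, dt, seed=None):
    """Return lineage counts at times 0, dt, 2*dt, ..., t_max.

    birth_rate and death_rate are functions of time (hypothetical interface).
    In each step every lineage independently splits, dies, or persists.
    """
    rng = random.Random(seed)
    n_steps = int(round(t_max / dt))
    counts = [n0]
    n = n0
    for step in range(n_steps):
        t = step * dt
        p_birth = birth_rate(t) * dt
        p_death = death_rate(t) * dt
        if p_birth < 0 or p_death < 0 or p_birth + p_death > 1.0:
            raise ValueError("dt is too large (or a rate is negative) at t=%r" % t)
        births = 0
        deaths = 0
        for _ in range(n):
            u = rng.random()
            if u < p_birth:
                births += 1
            elif u < p_birth + p_death:
                deaths += 1
        n = n + births - deaths
        counts.append(n)
    return counts


if __name__ == "__main__":
    # Illustrative rates only: speciation slows down after t = 1.
    counts = simulate_lineage_counts(
        n0=2,
        birth_rate=lambda t: 1.0 if t < 1.0 else 0.5,
        death_rate=lambda t: 0.2,
        t_max=2.0,
        dt=0.01,
        seed=42,
    )
    print(counts[::50])
```

Smaller values of dt make the step probabilities a closer approximation to continuous-time rates, at the cost of more steps.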

In order to compare phylogenetic trees to these models, I use these methods to calculate two statistics that describe the fit of a set of branching times to any time-variable birth-death model: the maximum likelihood, which can be compared to the distribution of the maximum likelihood for a random sample of trees or to the maximum likelihood of other birth-death models using the Akaike Information Criterion; and the Kolmogorov-Smirnov test, which is based on the fact that the branching times should be independently and identically distributed under many time-variable birth-death models. I also demonstrate two new methods for visualizing the distribution of branching times: the lineage through time null plot uses a heat map to show the distribution of the number of lineages at different times; and the waiting time null plot does the same for waiting times between branching times.
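
The two statistics named above can be sketched in a few lines of Python. The AIC formula (2k minus twice the maximized log-likelihood) and the one-sample Kolmogorov-Smirnov statistic are standard; the function names, the example log-likelihoods, and the uniform `model_cdf` used in the demonstration are placeholders for this example, not values or distributions taken from the dissertation. The sketch returns only the KS statistic and does not compute a p-value.

```python
def aic(max_log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln(L_max). Lower is better."""
    return 2 * n_params - 2 * max_log_likelihood


def ks_statistic(samples, model_cdf):
    """One-sample Kolmogorov-Smirnov statistic.

    Largest gap between the empirical CDF of the samples and model_cdf,
    checked just before and just after each sorted sample point.
    """
    xs = sorted(samples)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one sample")
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = model_cdf(x)
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


if __name__ == "__main__":
    # Hypothetical maximized log-likelihoods for two competing models.
    print(aic(-120.5, n_params=2), aic(-118.9, n_params=4))

    # Hypothetical branching times rescaled to [0, 1], tested against a
    # uniform CDF that stands in for the model's branching-time CDF.
    times = [0.05, 0.2, 0.35, 0.6, 0.9]
    print(ks_statistic(times, model_cdf=lambda x: x))
```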

In chapter 3, I describe a likelihood model in which the number of chromosomes in a genome evolves according to a Markov process with three stochastic rates: a rate of chromosome duplication and a rate of chromosome loss that are proportional to the number of chromosomes in the genome; and a rate of whole genome duplication that is constant. I implemented software that calculates the maximum likelihood under this model for a phylogeny of taxa in which the chromosome counts are known. I compared the maximum likelihood of a model in which the genome duplication rate is free to vary to that of one in which it is fixed at zero using the Akaike Information Criterion, in order to determine if a model with whole genome duplications is a good fit for the data.
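
The three rates described above can be written down as a small continuous-time Markov chain. The Python sketch below is an illustration under stated assumptions, not the dissertation's software: the function and parameter names are invented here, the simulation follows a single lineage rather than a phylogeny, and the rule that a genome never drops below one chromosome is an assumption added so the chain stays well defined.

```python
import random


def transition_rates(n, dup_rate, loss_rate, wgd_rate):
    """Map each reachable chromosome count to its rate of being reached from n.

    Single-chromosome gain and loss scale with n; whole genome
    duplication (n -> 2n) happens at a constant rate.
    """
    rates = {}

    def add(target, rate):
        if rate > 0:
            rates[target] = rates.get(target, 0.0) + rate

    add(n + 1, dup_rate * n)
    if n > 1:  # assumption: at least one chromosome is always retained
        add(n - 1, loss_rate * n)
    add(2 * n, wgd_rate)  # for n == 1 this coincides with n + 1
    return rates


def simulate_chromosome_count(n0, dup_rate, loss_rate, wgd_rate, t_max, seed=None):
    """Simulate the chromosome count along one lineage for t_max time units."""
    rng = random.Random(seed)
    n, t = n0, 0.0
    while True:
        rates = transition_rates(n, dup_rate, loss_rate, wgd_rate)
        total = sum(rates.values())
        if total == 0:
            return n
        t += rng.expovariate(total)  # waiting time to the next event
        if t >= t_max:
            return n
        targets = list(rates)
        weights = [rates[k] for k in targets]
        n = rng.choices(targets, weights=weights)[0]


if __name__ == "__main__":
    # Illustrative rates only.
    print(transition_rates(12, dup_rate=0.01, loss_rate=0.02, wgd_rate=0.05))
    print(simulate_chromosome_count(12, 0.01, 0.02, 0.05, t_max=10.0, seed=7))
```

Fixing `wgd_rate` at zero gives the simpler model without whole genome duplications, which has one fewer free parameter.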

In chapter 4, I develop a method that uses the gene family tree to infer changes in the process of gene gain and loss on a taxonomic tree. This method relies on calculating the probability of a gene tree given a taxon tree and a set of birth-death parameters by which that gene tree evolves on the taxon tree. I use a reversible-jump MCMC to sample from the joint posterior distribution of a set of birth-death parameters and assignments of those parameters to the branches of a taxon tree given a gene tree and a taxon tree. Different assignments are compared using Bayes factors. I use simulations to show that this method has much more power than a method that relies only on counts of gene family members to determine whether a gene family evolved by a different process on a pair of taxon branches, and whether that difference is a consequence of differences in the birth rate or the death rate.
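
One common way to turn reversible-jump MCMC output into a Bayes factor is to divide the posterior odds of two parameter assignments, estimated from how often the chain visits each, by their prior odds. The Python sketch below shows that calculation only; whether the dissertation estimates Bayes factors this way is not stated in the abstract, and the assignment labels, the sample counts, and the default equal priors are hypothetical.

```python
from collections import Counter


def bayes_factor(sampled_assignments, a, b, prior_a=0.5, prior_b=0.5):
    """Bayes factor for assignment a over b: posterior odds / prior odds.

    sampled_assignments holds one label per retained MCMC sample, naming the
    parameter-to-branch assignment the chain occupied at that sample.
    """
    counts = Counter(sampled_assignments)
    if counts[a] == 0 or counts[b] == 0:
        raise ValueError("both assignments must be visited at least once")
    posterior_odds = counts[a] / counts[b]
    prior_odds = prior_a / prior_b
    return posterior_odds / prior_odds


if __name__ == "__main__":
    # Hypothetical chain: 800 samples with separate birth-death parameters on
    # two branches, 200 samples with one shared parameter set.
    samples = ["separate"] * 800 + ["shared"] * 200
    print(bayes_factor(samples, "separate", "shared"))
```

The estimate is unreliable when either assignment is visited only a handful of times, so rarely sampled assignments call for longer chains.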

In section 4.5 I expand my method to include uncertainty in the gene tree topology, by using a set of gene alignments as my data rather than the fully resolved gene tree. Under this implementation I calculate the probability of those sequences given the gene tree, in addition to the probability of the gene tree given the taxon tree. I modify the reversible-jump MCMC so that it now samples from the posterior distribution of the nucleotide evolution parameters and the gene trees, in addition to the birth-death parameters and their assignments to the branches of the taxon tree. I demonstrate the use of this method on two real gene families found in the Bilateria. (Abstract shortened by UMI.)

Indexing (document details)
Advisor: Lindberg, David R.
Committee: Aldous, David; Huelsenbeck, John P.
School: University of California, Berkeley
Department: Integrative Biology
School Location: United States -- California
Source: DAI-B 73/07(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Biostatistics, Systematic
Keywords: Gene duplication, Gene family diversification, Genome duplication, Lineage through time, Phylogeny
Publication Number: 3498974
ISBN: 9781267228086