Dissertation/Thesis Abstract

Latent Variable Modeling and Causal Inference in Population-Structured Genetics
by Cabreros, Irineo C., Ph.D., Princeton University, 2020, 219; 27546470
Abstract (Summary)

Nonrandomly mating populations, referred to as structured populations, are commonly encountered in genetic studies. A common characteristic of structured populations is that separate subpopulations differ systematically in their genetic attributes. In a global sample of unrelated individuals, for example, allele frequencies typically differ between geographically defined subpopulations. Two analytical goals when studying datasets exhibiting population structure are: (i) characterizing population structure and (ii) identifying causal gene-trait relationships in its presence. This work comprises two complementary projects, one for each of these goals.

In the first project, we introduce a computationally efficient algorithm for fitting the admixture model of population structure. The central strategy of our algorithm, which we call ALStructure, is to first estimate the latent linear subspace of admixture components and then search for models within this subspace that satisfy the probabilistic constraints of the admixture model. We find that ALStructure typically outperforms preexisting methods both in accuracy and speed under a wide array of simulated and real datasets.
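To make the phrase "latent linear subspace" concrete, the following is a minimal sketch of the standard admixture model, not of ALStructure itself. It assumes the usual formulation in which the matrix of individual-specific allele frequencies factors as F = PQ, with genotypes drawn as Binomial(2, F). The function name, dimensions, and prior choices (uniform allele frequencies, Dirichlet admixture proportions) are illustrative assumptions and do not come from the dissertation.

```python
import numpy as np


def simulate_admixture(m=200, n=50, d=3, seed=0):
    """Simulate genotypes under a basic admixture model (illustrative only).

    m: number of SNPs, n: number of individuals, d: number of ancestral
    populations. All defaults are arbitrary choices for this sketch.
    """
    rng = np.random.default_rng(seed)
    # P[i, k]: frequency of the reference allele at SNP i in ancestral population k.
    P = rng.uniform(0.05, 0.95, size=(m, d))
    # Q[k, j]: proportion of individual j's ancestry from population k.
    # Each column lies on the probability simplex (non-negative, sums to 1).
    Q = rng.dirichlet(np.ones(d), size=n).T
    # F[i, j]: individual-specific allele frequency; a convex combination of
    # the columns of P, so every entry stays in [0, 1].
    F = P @ Q
    # Genotypes count reference alleles (0, 1, or 2) at each SNP.
    X = rng.binomial(2, F)
    return P, Q, F, X


P, Q, F, X = simulate_admixture()
# F is m-by-n but has rank at most d: its rows lie in the row space of Q.
print(F.shape, np.linalg.matrix_rank(F))
```

Under this model the rows of F lie in the d-dimensional row space of Q, which is the kind of low-dimensional linear structure the abstract describes estimating before the probabilistic constraints (entries of F in [0, 1], columns of Q on the simplex) are imposed.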

In the second project, we show how the random process of meiosis can be leveraged as a form of experimental randomization capable of uncovering causal relationships between genes and traits in the presence of population structure. We introduce novel tests based on parent-child trio data developed within the causal framework of potential outcomes. Additionally, we evaluate the causal properties of the popular transmission-disequilibrium test (TDT). We describe and assess the feasibility of assumptions under which each of these procedures is a test of a causal property, which we define as causal linkage. To enable this project, we first provide a detailed discussion of the connection between causality and measure-theoretic probability by constructing causal models on probability spaces.
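For readers unfamiliar with the TDT mentioned above, the following sketch computes the classic form of the statistic. It is the textbook McNemar-type test on transmissions from heterozygous parents to affected children, not the novel trio-based tests introduced in the dissertation. The function name and the example counts are hypothetical.

```python
import math


def tdt(b, c):
    """Classic transmission-disequilibrium test (illustrative sketch).

    b: number of heterozygous parents who transmitted the allele of interest
       to an affected child.
    c: number of heterozygous parents who transmitted the other allele.
    Returns the chi-square statistic (1 degree of freedom) and its p-value.
    """
    if b < 0 or c < 0 or b + c == 0:
        raise ValueError("need non-negative counts with at least one transmission")
    stat = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom:
    # P(Z^2 > stat) = erfc(sqrt(stat / 2)) for standard normal Z.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: 60 transmissions of the allele versus 40 non-transmissions.
stat, p_value = tdt(60, 40)
print(stat, p_value)
```

Under Mendelian transmission each heterozygous parent passes on either allele with probability 1/2, so b and c should be roughly equal when the null hypothesis holds. This is the sense in which meiosis acts as a randomization device.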

Indexing (document details)
Advisor: Storey, John D
Committee: Fan, Jianqing; Engelhardt, Barbara; Akey, Joshua
School: Princeton University
Department: Applied and Computational Mathematics
School Location: United States -- New Jersey
Source: DAI-B 81/8(E), Dissertation Abstracts International
Subjects: Statistics, Genetics, Biostatistics
Keywords: Causal inference, Latent variables, Population genetics, Population structure, Statistical genetics, Statistics
Publication Number: 27546470
ISBN: 9781392529638
Copyright © 2021 ProQuest LLC. All rights reserved.