Washington, DC is experiencing a generalized HIV-1 epidemic, defined by the World Health Organization as one affecting >1% of the population, despite a recent reduction in HIV-1 prevalence from 2.5% in 2013 to 1.9% in 2016 and 2017. Next-generation sequencing (NGS) technologies have been used to sequence many viruses, including HIV-1, and have facilitated the direct sequencing of the individual viral strains within a host (i.e., the intra-host population). NGS can augment phylodynamics, the study of the interaction between epidemiological and evolutionary processes within and among populations, by facilitating the identification and interrogation of newly emerging and rapidly expanding transmission clusters. Furthermore, because individual variants (or haplotypes) can be reconstructed from the NGS reads, NGS can enhance the identification of drug-resistant mutations circulating within the population. Viral populations may contain a pool of variants that are resistant to antiretroviral drugs or that help the virus evade the immune system. Assessing phylodynamics with haplotypes may reveal additional or unique transmission clusters among individuals or identify viral strains that dominate a local viral population.
Limited research has been completed on the HIV-1 epidemic in Washington, DC concerning viral transmission, the accumulation and/or spread of drug-resistant mutations, and estimates of viral diversity within the population. To date, no studies have implemented NGS techniques to study the virus in this region. Two important limitations have prevented phylodynamics from being integrated into clinical viral studies that use NGS data. First, NGS generates a massive amount of data, which can be computationally expensive and cumbersome to handle, especially when assembling the short NGS reads (75-300 bp) into viral genomes. Second, the current practice for assembling viral NGS data is to summarize thousands of these short reads in a single consensus sequence, thus obscuring valuable intra-patient genome diversity. While a number of tools exist to reconstruct haplotypes from viral NGS data, their performance has been minimally evaluated, and they have rarely been applied to viral phylodynamic studies. A computational method that integrates the processing of viral NGS reads with haplotype reconstruction, in order to capture the sequence variants present within the intra-host population, has yet to be developed. In this dissertation, I will address these challenges regarding intra-patient genome diversity by developing a computational approach to process viral NGS data. I will evaluate the existing viral haplotype reconstruction tools and incorporate the best-performing tool into this computational approach to provide insights into the molecular epidemiology of HIV-1 in Washington, DC.
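The second limitation, a single consensus sequence hiding intra-patient diversity, can be illustrated with a toy sketch. It is not the dissertation's method: the reads are invented, the function names `majority_consensus` and `haplotype_frequencies` are hypothetical, and each read is assumed to be pre-aligned and to span the whole window, which real short reads generally do not (that gap is what haplotype reconstruction tools have to bridge).

```python
from collections import Counter

# Toy, already-aligned reads covering the same 8-base window (invented data).
reads = [
    "ACGTACGT",
    "ACGTACGT",
    "ACGTACGT",
    "ACGAACGT",  # minority variant: A instead of T at index 3
    "ACGAACGT",
]


def majority_consensus(aligned_reads):
    """Return the most common base at each column of equal-length reads."""
    columns = zip(*aligned_reads)
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)


def haplotype_frequencies(aligned_reads):
    """Return each distinct full-length read with its relative frequency."""
    counts = Counter(aligned_reads)
    total = len(aligned_reads)
    return {hap: n / total for hap, n in counts.items()}


consensus = majority_consensus(reads)
frequencies = haplotype_frequencies(reads)

print("consensus: ", consensus)
for haplotype, freq in frequencies.items():
    print(f"haplotype: {haplotype}  frequency: {freq:.2f}")
```

In this toy case the consensus reports only the majority sequence, while the per-haplotype tally still shows that two of the five reads carry the variant.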
|Advisor:||Crandall, Keith A., Pérez-Losada, Marcos|
|Committee:||Nixon, Douglas F., Hahn, Andrea, Ortí, Guillermo, Chiappinelli, Katherine B.|
|School:||The George Washington University|
|School Location:||United States -- District of Columbia|
|Source:||DAI 81/11(E), Dissertation Abstracts International|
|Subjects:||Bioinformatics, Molecular biology, Public health|
|Keywords:||Computational biology, Haplotypes, HIV, Phylodynamics, Viral evolution|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved