Drug-induced long QT syndrome (diLQTS) can cause seemingly healthy patients to experience cardiac arrest, specifically Torsades de Pointes (TdP), which may be fatal. Clinical decision support systems (CDSS) support better prescribing, in part by issuing alerts that warn of a drug’s potential harm. LQTS may be either genetic or acquired. Thirteen distinct genetic mutations have already been identified for hereditary LQTS. Because hereditary and acquired LQTS share similar clinical symptoms, it is reasonable to assume that both have a genetic component. The goal of this study is to identify genetic risk markers for diLQTS and TdP. These markers will be used to develop a statistical DSS for clinical applications and the prevention of genetic-related heart disease. We use data from a genome-wide association study conducted by the Pharmacogenomics of Arrhythmia Therapy subgroup of the Pharmacogenetics Research Network, focused on subjects with a history of diLQTS or TdP after taking medication. The data were made available for general research use by the National Center for Biotechnology Information (NCBI). The data comprise 831 patients, 172 of whom are diLQTS and TdP cases. The 620,901 initial markers were screened with a preliminary t-test (α=0.01), and the resulting feasible set of 5,754 markers associated with diLQTS was used to build predictive models aimed at preventing TdP. The modeling methods were ensemble logistic regression, elastic net, random forests, artificial neural networks, and linear discriminant analysis. Using all 5,754 markers, accuracy across these methods ranged from 76.84% to 90.29%, with artificial neural networks the most accurate.
Finally, variable importance algorithms were applied to the ensemble logistic regression, elastic net, and random forests models to extract a smaller subset of genetic markers suitable for building the proposed DSS. Using a subset of 61 markers, accuracy ranged from 76.59% to 87.00%, with ensemble logistic regression the most accurate model. Using a subset of 22 markers, accuracy ranged from 74.24% to 82.87%, with the single hidden layer neural network (using the subset of markers extracted from the ensemble bagged logistic model) the most accurate model.
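The marker screening step described in the abstract (a per-marker t-test at α=0.01, keeping only markers that pass) can be illustrated with a minimal sketch. This is not the thesis code. It assumes genotypes coded as 0/1/2 allele counts, uses a Welch two-sample t-statistic, and approximates the t distribution with a standard normal, which is only reasonable for large groups. The function names `welch_t_pvalue`, `screen_markers`, and `make_toy_data` are hypothetical, and the data are synthetic.

```python
import random
from statistics import NormalDist, mean, variance


def welch_t_pvalue(cases, controls):
    """Two-sided p-value for a Welch two-sample t-statistic.

    Assumption: the t distribution is approximated by a standard normal,
    which is reasonable only when both groups are large.
    """
    se2 = variance(cases) / len(cases) + variance(controls) / len(controls)
    if se2 == 0.0:
        return 1.0  # no variation in either group, so nothing to test
    t = (mean(cases) - mean(controls)) / se2 ** 0.5
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


def screen_markers(genotypes, is_case, alpha=0.01):
    """Return the indices of markers whose p-value falls below alpha.

    genotypes: one row per subject, one column per marker (0/1/2 counts).
    is_case:   one boolean per subject (True = case, False = control).
    """
    kept = []
    for j in range(len(genotypes[0])):
        cases = [row[j] for row, c in zip(genotypes, is_case) if c]
        controls = [row[j] for row, c in zip(genotypes, is_case) if not c]
        if welch_t_pvalue(cases, controls) < alpha:
            kept.append(j)
    return kept


def make_toy_data(n_subjects=400, n_markers=50, seed=0):
    """Synthetic data (hypothetical): only marker 0 is tied to case status."""
    rng = random.Random(seed)
    is_case = [rng.random() < 0.2 for _ in range(n_subjects)]
    genotypes = []
    for c in is_case:
        # Null markers: allele frequency 0.3 regardless of case status.
        row = [sum(rng.random() < 0.3 for _ in range(2)) for _ in range(n_markers)]
        # Marker 0: higher allele frequency among cases.
        row[0] = sum(rng.random() < (0.7 if c else 0.3) for _ in range(2))
        genotypes.append(row)
    return genotypes, is_case


if __name__ == "__main__":
    genotypes, is_case = make_toy_data()
    print("markers kept:", screen_markers(genotypes, is_case))
```

On the toy data the associated marker should survive screening, along with the occasional null marker that passes by chance. The surviving indices would then be the input columns for whichever classifier is fitted next.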
|Committee:||Suaray, Kagba, Zhou, Tianni|
|School:||California State University, Long Beach|
|Department:||Mathematics and Statistics|
|School Location:||United States -- California|
|Source:||MAI 58/01M(E), Masters Abstracts International|
|Keywords:||Machine learning, Torsades de Pointes|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved