Dissertation/Thesis Abstract

Studying Low Complexity Structures in Bioinformatics Data Analysis of Biological and Biomedical Data
by Causey, Jason L., Ph.D., University of Arkansas at Little Rock, 2017, 100; 10750808
Abstract (Summary)

Biological, biomedical, and radiological data tend to be large, complex, and noisy. Gene expression studies contain expression levels for thousands of genes and hundreds or thousands of patients. Chest Computed Tomography (CT) images used for diagnosing lung cancer consist of hundreds of 2-D image “slices”, each containing hundreds of thousands of pixels. Beneath the size and apparent complexity of many of these data are simple and sparse structures. These low complexity structures can be leveraged into new approaches to biological, biomedical, and radiological data analyses. Two examples are presented here. First, we present SparRec (Sparse Recovery), a new framework for imputation of GWAS data based on a matrix completion (MC) model that takes advantage of the low rank and small number of co-clusters of GWAS matrices. SparRec is flexible enough to impute meta-analyses with multiple cohorts genotyped on different sets of SNPs, even without a reference panel. Compared with Mendel-Impute, another MC method, our low-rank-based method achieves similar accuracy and efficiency even with up to 90% missing data; our co-clustering-based method has advantages in running time. MC methods are shown to have advantages over statistics-based methods, including Beagle and fastPhase. Second, we demonstrate NoduleX, a method for predicting lung nodule malignancy from chest CT data, based on deep convolutional neural networks. For training and validation, we analyze >1000 lung nodules in images from the LIDC/IDRI cohort and compare our results with classifications provided by four experienced thoracic radiologists who participated in the LIDC project. NoduleX achieves high accuracy for nodule malignancy classification, with an AUC of up to 0.99, commensurate with the radiologists’ analysis.
Whether they are leveraged directly or extracted using mathematical optimization and machine learning techniques, low complexity structures provide researchers with powerful tools for taming complex data.
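
The low-rank idea behind matrix completion can be illustrated with a minimal sketch. This is not SparRec's algorithm, which the abstract does not specify; it is a generic iterative truncated-SVD scheme, assumed here purely for illustration. The function name `impute_low_rank`, the toy real-valued matrix (real GWAS matrices hold discrete genotype calls), the chosen rank, and the iteration count are all hypothetical.

```python
import numpy as np

def impute_low_rank(matrix, observed, rank=2, n_iters=200):
    """Illustrative low-rank imputation (hypothetical helper, not SparRec).

    Alternates between truncating the current estimate to the given rank
    and restoring the entries that were actually observed.
    """
    # Start by filling missing entries with the mean of the observed ones.
    filled = np.where(observed, matrix, matrix[observed].mean())
    for _ in range(n_iters):
        # Best rank-r approximation of the current estimate via SVD.
        u, s, vt = np.linalg.svd(filled, full_matrices=False)
        low_rank = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
        # Keep observed entries fixed; update only the missing ones.
        filled = np.where(observed, matrix, low_rank)
    return filled

# Toy data (assumption): a synthetic rank-2 matrix with about 30% of entries hidden.
rng = np.random.default_rng(0)
truth = rng.normal(size=(60, 2)) @ rng.normal(size=(2, 40))
observed = rng.random(truth.shape) > 0.3

estimate = impute_low_rank(truth, observed, rank=2)

# Relative error measured on the hidden entries only.
hidden = ~observed
rel_error = np.linalg.norm((estimate - truth)[hidden]) / np.linalg.norm(truth[hidden])
print(f"relative error on hidden entries: {rel_error:.4f}")
```

The sketch only works because the toy matrix really is low rank; when that assumption fails, the truncation step discards signal rather than noise.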

Indexing (document details)
Advisor: Huang, Xiuzhen
Committee: Hong, Huixiao; Su, Hung-Chi; Walker, Karl; Yang, Mary
School: University of Arkansas at Little Rock
Department: Bioinformatics
School Location: United States -- Arkansas
Source: DAI-B 79/10(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Bioinformatics
Keywords: Convolutional neural network, GWAS imputation, Low rank, Lung cancer
Publication Number: 10750808
ISBN: 9780355969870