This study compared several resampling methods for both the minority and majority classes to address the problem of highly skewed, imbalanced data. In the initial experiments, deep learning models outperformed other machine learning approaches; the final experiments all used the same deep learning architecture of 9 layers (1 input, 7 hidden, 1 output). Experiment 1 used the Adam optimizer, 100 epochs, a batch size of 1000, and a 75%/25% train-test split, whereas Experiment 2 used the Nadam optimizer, 700 epochs, a batch size of 1000, and a 64%/16%/20% train-validation-test split. In addition, unlike Experiment 1, Experiment 2 reported results on the original data produced by the techniques. The purpose of Experiment 1 was to build basic models for an initial test of all the sampling methods. Experiment 2, which included a validation set, was structured as a more rigorous examination.
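The 64%/16%/20% proportions are consistent with holding out 20% of the data for testing and then reserving 20% of the remainder for validation (0.8 × 0.2 = 0.16). The abstract does not say how the split was actually produced, so the following is only a minimal sketch under that assumption; the function name `three_way_split` and its parameters are hypothetical.

```python
import random

def three_way_split(rows, test_frac=0.20, val_frac_of_rest=0.20, seed=0):
    """Shuffle rows, hold out a test set, then carve a validation set from the rest.

    Assumption: the validation fraction applies to what remains after the
    test set is removed, so 20% of the remaining 80% is 16% of the total.
    """
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)

    n_test = round(len(shuffled) * test_frac)
    test, rest = shuffled[:n_test], shuffled[n_test:]

    n_val = round(len(rest) * val_frac_of_rest)
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test

# Example with 1000 placeholder records (integers stand in for data rows).
train, val, test = three_way_split(range(1000))
print(len(train), len(val), len(test))  # 640 160 200
```

Fixing the seed keeps the three partitions reproducible across runs, which matters when several sampling methods are compared on the same held-out data.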
These experiments demonstrate that the simple resampling methods, minority resampling and the minority and majority bootstrap, outperform the pre-built methods in Python. These results suggest that the majority of the pre-built methods, as well as doing nothing (‘No balancing’), overtrain on one of the two classes.
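The abstract names the two simple methods but does not define them. One plausible reading, offered here only as an illustration, is that minority resampling draws minority records with replacement until the classes are the same size, and that the minority and majority bootstrap draws an equal-sized sample with replacement from each class. The function names, class sizes, and the per-class sample size below are all hypothetical.

```python
import random

def oversample_minority(majority, minority, seed=0):
    """Assumed 'minority resampling': pad the minority class with
    records drawn with replacement until it matches the majority size."""
    rng = random.Random(seed)
    extra = rng.choices(minority, k=len(majority) - len(minority))
    return majority, minority + extra

def bootstrap_both(majority, minority, n_per_class, seed=0):
    """Assumed 'minority and majority bootstrap': draw n_per_class
    records with replacement from each class independently."""
    rng = random.Random(seed)
    return (rng.choices(majority, k=n_per_class),
            rng.choices(minority, k=n_per_class))

# Toy skewed data set: 950 negative and 50 positive placeholder records.
majority = [("neg", i) for i in range(950)]
minority = [("pos", i) for i in range(50)]

maj_a, min_a = oversample_minority(majority, minority)
maj_b, min_b = bootstrap_both(majority, minority, n_per_class=500)
print(len(maj_a), len(min_a), len(maj_b), len(min_b))  # 950 950 500 500
```

In a sketch like this, resampling would normally be applied to the training partition only, so that the validation and test sets keep the original class ratio; whether the thesis did so is not stated in the abstract.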
|School:||The George Washington University|
|Department:||Bioinformatics and Molecular Biochemistry|
|School Location:||United States -- District of Columbia|
|Source:||MAI 82/7(E), Masters Abstracts International|
|Subjects:||Bioinformatics, Medicine, Artificial intelligence|
|Keywords:||Bioinformatics, Deep neural network, Machine learning, Medicine, Neural network|