Dissertation/Thesis Abstract

Automated Feature Engineering for Deep Neural Networks with Genetic Programming
by Heaton, Jeff, Ph.D., Nova Southeastern University, 2017, 201; 10259604
Abstract (Summary)

Feature engineering augments the feature vector of a machine learning model with calculated values designed to improve the accuracy of the model's predictions. Research has shown that the accuracy of models such as deep neural networks, support vector machines, and tree/forest-based algorithms sometimes benefits from feature engineering. These engineered features are usually created by expressions that combine one or more of the original features. The best structure for an engineered feature depends on the type of machine learning model in use. Previous research demonstrated that different model families benefit from different types of engineered features. Random forests, gradient-boosting machines, and other tree-based models might not see the same accuracy gain that an engineered feature gives neural networks, generalized linear models, or other dot-product-based models on the same data set.
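As an illustration of what augmenting a feature vector can look like, the following is a minimal Python sketch. The `augment` helper and the `ratio` and `product` features are hypothetical names chosen for this example and are not taken from the dissertation; they only show how expressions over the original features yield extra columns.

```python
from typing import Callable, List, Sequence

# An engineered feature maps one row of original features to a single value.
Feature = Callable[[Sequence[float]], float]


def augment(rows: Sequence[Sequence[float]],
            engineered: Sequence[Feature]) -> List[List[float]]:
    """Return copies of the rows with each engineered value appended."""
    return [list(row) + [f(row) for f in engineered] for row in rows]


# Hypothetical engineered features for a two-column data set.
def ratio(row: Sequence[float]) -> float:
    return row[0] / row[1]


def product(row: Sequence[float]) -> float:
    return row[0] * row[1]


rows = [[6.0, 3.0], [10.0, 4.0]]
augmented = augment(rows, [ratio, product])
# Each row keeps its original values and gains a ratio and a product column.
```

The original columns are left untouched, so a model trained on the augmented rows still sees the raw features alongside the engineered ones.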

This dissertation presents a genetic programming-based algorithm that automatically engineers features that increase the accuracy of deep neural networks for some data sets. To be effective, a genetic programming algorithm must prioritize the search space and efficiently evaluate what it finds. The algorithm in this dissertation faced a potential search space composed of all possible mathematical combinations of the original feature vector. Five experiments were designed to guide the search process so that it efficiently evolves good engineered features. The result of this dissertation is an automated feature engineering (AFE) algorithm that is computationally efficient, even though a neural network is used to evaluate each candidate feature. This approach gave the algorithm a greater opportunity to specifically target deep neural networks in its search for engineered features that improve accuracy. Finally, a sixth experiment empirically demonstrated the degree to which this algorithm improved the accuracy of neural networks on data sets augmented by the algorithm's engineered features.
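The general idea of evolving candidate feature expressions can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the dissertation's AFE algorithm: the dissertation evaluates each candidate with a neural network, whereas this sketch substitutes the absolute Pearson correlation with the target as a cheap stand-in score. The operator set, the mutation scheme, the depth limit of 3, and all function names are likewise assumptions made for the example.

```python
import math
import operator
import random


def safe_div(a, b):
    """Protected division: return 0.0 instead of dividing by (near) zero."""
    return a / b if abs(b) > 1e-9 else 0.0


OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": safe_div}


def random_expr(n_features, rng, depth=2):
    """Random tree: an int leaf is a feature index, a tuple is (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)
    op = rng.choice(sorted(OPS))
    return (op,
            random_expr(n_features, rng, depth - 1),
            random_expr(n_features, rng, depth - 1))


def evaluate(expr, row):
    """Compute the value of an expression tree for one row of features."""
    if isinstance(expr, int):
        return row[expr]
    op, left, right = expr
    return OPS[op](evaluate(left, row), evaluate(right, row))


def mutate(expr, n_features, rng, depth=3):
    """Replace one randomly chosen subtree, keeping total depth at most 3."""
    if isinstance(expr, int) or depth == 0 or rng.random() < 0.3:
        return random_expr(n_features, rng, min(depth, 2))
    op, left, right = expr
    if rng.random() < 0.5:
        return (op, mutate(left, n_features, rng, depth - 1), right)
    return (op, left, mutate(right, n_features, rng, depth - 1))


def fitness(expr, rows, targets):
    """Stand-in score (assumption): |Pearson correlation| with the target."""
    try:
        xs = [evaluate(expr, r) for r in rows]
        n = len(xs)
        mx, my = sum(xs) / n, sum(targets) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, targets))
        vx = sum((x - mx) ** 2 for x in xs)
        vy = sum((y - my) ** 2 for y in targets)
        score = abs(cov) / math.sqrt(vx * vy)
    except (OverflowError, ZeroDivisionError):
        return 0.0  # constant or numerically unstable candidates score zero
    return score if math.isfinite(score) else 0.0


def evolve(rows, targets, generations=30, pop_size=40, seed=0):
    """Keep the best quarter each generation and refill with mutated copies."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    # Seed the population with the raw features so they act as a baseline.
    population = list(range(n_features))
    population += [random_expr(n_features, rng)
                   for _ in range(max(0, pop_size - n_features))]
    for _ in range(generations):
        population.sort(key=lambda e: fitness(e, rows, targets), reverse=True)
        survivors = population[: pop_size // 4]
        children = [mutate(rng.choice(survivors), n_features, rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=lambda e: fitness(e, rows, targets))
```

Calling `evolve(rows, targets)` on a list of numeric rows and a matching list of targets returns the best expression tree found, for example a tuple such as `("/", 0, 1)` meaning feature 0 divided by feature 1. Because the best candidates survive each generation unchanged, the returned expression never scores worse than the best raw feature under this stand-in measure. Swapping the correlation score for a model-based evaluation would bring the sketch closer in spirit to the approach the abstract describes, at a much higher computational cost per candidate.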

Indexing (document details)
Advisor: Cannady, James D.
Committee: Cerkez, Paul; Mukherjee, Sumitra
School: Nova Southeastern University
Department: Computer Science (CISC, CISD)
School Location: United States -- Florida
Source: DAI-B 78/08(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Artificial intelligence, Computer science
Keywords: Deep neural network, Feature engineering, Genetic programming
Publication Number: 10259604
ISBN: 9781369660012
Copyright © 2019 ProQuest LLC. All rights reserved.