With PQDT Open, you can read the full text of open access dissertations and theses free of charge.
About PQDT Open
Search
COMING SOON! PQDT Open is getting a new home!
ProQuest Open Access Dissertations & Theses will remain freely available as part of a new and enhanced search experience at www.proquest.com.
Questions? Please refer to this FAQ.
Feature engineering is a process that augments the feature vector of a machine learning model with calculated values that are designed to enhance the accuracy of a model's predictions. Research has shown that the accuracy of models such as deep neural networks, support vector machines, and tree/forest-based algorithms sometimes benefit from feature engineering. Expressions that combine one or more of the original features usually create these engineered features. The choice of the exact structure of an engineered feature is dependent on the type of machine learning model in use. Previous research demonstrated that various model families benefit from different types of engineered feature. Random forests, gradient-boosting machines, or other tree-based models might not see the same accuracy gain that an engineered feature allowed neural networks, generalized linear models, or other dot-product based models to achieve on the same data set.
This dissertation presents a genetic programming-based algorithm that automatically engineers features that increase the accuracy of deep neural networks for some data sets. For a genetic programming algorithm to be effective, it must prioritize the search space and efficiently evaluate what it finds. This dissertation algorithm faced a potential search space composed of all possible mathematical combinations of the original feature vector. Five experiments were designed to guide the search process to efficiently evolve good engineered features. The result of this dissertation is an automated feature engineering (AFE) algorithm that is computationally efficient, even though a neural network is used to evaluate each candidate feature. This approach gave the algorithm a greater opportunity to specifically target deep neural networks in its search for engineered features that improve accuracy. Finally, a sixth experiment empirically demonstrated the degree to which this algorithm improved the accuracy of neural networks on data sets augmented by the algorithm's engineered features.
Advisor: | Cannady, James D. |
Commitee: | Cerkez, Paul, Mukherjee, Sumitra |
School: | Nova Southeastern University |
Department: | Computer Science (CISC, CISD) |
School Location: | United States -- Florida |
Source: | DAI-B 78/08(E), Dissertation Abstracts International |
Source Type: | DISSERTATION |
Subjects: | Artificial intelligence, Computer science |
Keywords: | Deep neural network, Feature engineering, Genetic programming |
Publication Number: | 10259604 |
ISBN: | 978-1-369-66001-2 |