Dissertation/Thesis Abstract

Fast Optimization Algorithms for AUC Maximization
by Natole, Michael, Jr., Ph.D., State University of New York at Albany, 2020, 119; 27835831
Abstract (Summary)

Stochastic optimization algorithms such as stochastic gradient descent (SGD) are well suited to large-scale data analysis because they update the model sequentially and at low per-iteration cost. Much of the existing work focuses on optimizing accuracy; however, accuracy is known to be an inappropriate measure for class-imbalanced data. Area under the ROC curve (AUC) is a standard metric for measuring classification performance in this setting. Developing stochastic learning algorithms that maximize AUC in lieu of accuracy is therefore of both theoretical and practical interest. AUC maximization is challenging, however, because the learning objective is defined over pairs of instances from opposite classes. Existing methods can overcome this issue and achieve online processing, but at the cost of higher space and time complexity. In this thesis, we develop two novel stochastic algorithms for AUC maximization. The first is an online method referred to as SPAM. In comparison to the previous literature, the algorithm can be applied to non-smooth penalty functions while achieving a convergence rate of O(log T / T). The second is a batch learning method referred to as SPDAM, for which we establish a linear convergence rate given a sufficiently large batch size. We demonstrate the effectiveness of both algorithms on standard benchmark data sets as well as on data sets for anomaly detection tasks.
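To make the pairwise structure mentioned in the abstract concrete, the following is a minimal sketch of the standard empirical AUC: the fraction of (positive, negative) pairs that a scoring function ranks correctly, with ties counted as one half. It is not the SPAM or SPDAM algorithm from the thesis. The function name `pairwise_auc` and the label convention (+1 for positive, -1 for negative) are assumptions made for illustration.

```python
def pairwise_auc(scores, labels):
    """Empirical AUC as the fraction of correctly ranked (positive, negative) pairs.

    Assumes labels are +1 (positive) or -1 (negative); ties count as 1/2.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one instance of each class")

    correct = 0.0
    # The double loop visits every opposite-class pair, so the cost grows
    # with len(pos) * len(neg) rather than with the number of instances.
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Imbalanced toy data: 1 positive and 9 negatives.
labels = [1] + [-1] * 9
# A trivial model that gives every instance the same score
# (equivalently, predicts "negative" for everything).
scores = [0.0] * 10

accuracy = sum(1 for y in labels if y == -1) / len(labels)
print(accuracy)                      # 0.9
print(pairwise_auc(scores, labels))  # 0.5
```

In this toy case the trivial model reaches 90% accuracy while its AUC is 0.5, the value of a random ranking, which illustrates why accuracy can mislead on imbalanced data. The nested loop over opposite-class pairs also shows why a naive treatment of the objective scales poorly with the amount of data.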

Indexing (document details)
Advisor: Ying, Yiming
Committee: Feng, Yunlong; Reinhold, Karin; Goldfarb, Boris
School: State University of New York at Albany
Department: Mathematics and Statistics
School Location: United States -- New York
Source: DAI-B 81/11(E), Dissertation Abstracts International
Subjects: Mathematics
Keywords: AUC optimization, Batch learning, Binary classification, Machine learning, Online learning
Publication Number: 27835831
ISBN: 9798643174950
Copyright © 2020 ProQuest LLC. All rights reserved.