
Dissertation/Thesis Abstract

Nonconvex Min-Max Optimization in Deep Learning: Algorithms and Applications
by Liu, Mingrui, Ph.D., The University of Iowa, 2020, 186; 28031308
Abstract (Summary)

Nonconvex min-max optimization has received increasing attention in modern machine learning, especially in the context of deep learning. Examples include stochastic AUC maximization with deep neural networks and Generative Adversarial Nets (GANs), which correspond to nonconvex-concave and nonconvex-nonconcave min-max problems, respectively. The classical theory of min-max optimization focuses mainly on the convex-concave setting and is therefore not applicable to deep learning applications with nonconvex min-max formulations. A natural question arises: how can one design provably efficient algorithms for nonconvex min-max problems in deep learning? To answer this question, this dissertation focuses on the following four concrete aspects:

First, we consider the problem of stochastic AUC maximization with deep neural networks as predictive models. Building on the saddle-point reformulation of a surrogate loss of AUC, the problem can be cast as a nonconvex-concave min-max problem. We exploit the Polyak-Łojasiewicz (PL) condition, which has been proved and observed in deep learning, and develop new stochastic algorithms with a faster convergence rate and a more practical step-size scheme.
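The saddle-point reformulation mentioned above can be illustrated on a toy problem. The sketch below, assuming a 1-D linear scorer h(x) = w*x rather than a deep network, follows the well-known min-max reformulation of the squared AUC surrogate loss: the primal variables (w, a, b) are updated by stochastic gradient descent while the dual variable alpha is updated by ascent. The data and hyperparameters are synthetic choices for illustration, not the dissertation's experimental setup.

```python
import random

# Min-max reformulation of the squared AUC surrogate (sketch, linear scorer):
#   min_{w,a,b} max_alpha  E[ (1-p)(h(x)-a)^2 * 1[y=+1]
#                            + p(h(x)-b)^2   * 1[y=-1]
#                            + 2(1+alpha)(p*h(x)*1[y=-1] - (1-p)*h(x)*1[y=+1])
#                            - p(1-p)*alpha^2 ]
# where h(x) = w*x and p = P(y = +1).

random.seed(0)
# Toy 1-D data: positives centered at +2, negatives at -2.
data = [(2.0 + random.gauss(0, 0.5), +1) for _ in range(50)] + \
       [(-2.0 + random.gauss(0, 0.5), -1) for _ in range(50)]
p = 0.5                      # fraction of positives
w, a, b, alpha = 0.0, 0.0, 0.0, 0.0
eta = 0.05                   # step size (illustrative choice)

for _ in range(300):
    x, y = random.choice(data)
    h = w * x
    if y == +1:
        gw = (1 - p) * 2 * (h - a) * x - 2 * (1 + alpha) * (1 - p) * x
        ga = -(1 - p) * 2 * (h - a)
        gb = 0.0
        galpha = -2 * (1 - p) * h - 2 * p * (1 - p) * alpha
    else:
        gw = p * 2 * (h - b) * x + 2 * (1 + alpha) * p * x
        ga = 0.0
        gb = -p * 2 * (h - b)
        galpha = 2 * p * h - 2 * p * (1 - p) * alpha
    # Descend on the primal variables, ascend on the dual variable.
    w -= eta * gw
    a -= eta * ga
    b -= eta * gb
    alpha += eta * galpha

# Empirical AUC: fraction of positive-negative pairs ranked correctly.
pos = [x for x, y in data if y == +1]
neg = [x for x, y in data if y == -1]
auc = sum(w * xp > w * xn for xp in pos for xn in neg) / (len(pos) * len(neg))
print(round(auc, 2))
```

Because the data are well separated, the learned scorer drives the empirical AUC toward 1; the dissertation's contribution is obtaining fast rates for this kind of primal-dual update when h is a deep network satisfying a PL condition.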

Second, we develop a first-order convergence theory for weakly-convex-weakly-concave min-max problems. We propose an algorithmic framework motivated by the inexact proximal point method: the weakly monotone variational inequality (VI) corresponding to the original min-max problem is solved by approximately solving a sequence of strongly monotone VIs, each constructed by adding a strongly monotone mapping to the original gradient mapping. We prove non-asymptotic first-order convergence to a nearly stationary point of the original min-max problem in polynomial time.
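The inexact proximal point mechanism can be sketched on a toy problem. The example below is an illustrative construction, not the dissertation's algorithm: it uses the bilinear game f(x, y) = x*y, whose gradient mapping F(x, y) = (y, -x) is monotone but not strongly monotone. Adding gamma*(z - z_k) makes each inner VI strongly monotone, so a few inner gradient steps solve it approximately, and the sequence of proximal centers z_k converges to the saddle point.

```python
# Gradient mapping of f(x, y) = x*y: F(z) = (df/dx, -df/dy) = (y, -x).
def F(z):
    x, y = z
    return (y, -x)

gamma = 1.0   # strong-monotonicity parameter added at each outer step
eta = 0.1     # inner step size
z_k = (1.0, 1.0)

for _ in range(20):                  # outer proximal-point iterations
    z = z_k
    for _ in range(100):             # approximately solve the strongly
        gx, gy = F(z)                # monotone VI: F(z) + gamma*(z - z_k) = 0
        z = (z[0] - eta * (gx + gamma * (z[0] - z_k[0])),
             z[1] - eta * (gy + gamma * (z[1] - z_k[1])))
    z_k = z                          # next proximal center

norm = (z_k[0] ** 2 + z_k[1] ** 2) ** 0.5
print(norm)  # distance to the saddle point (0, 0)
```

Each outer step contracts the distance to the saddle point by roughly gamma / sqrt(gamma**2 + 1), whereas plain gradient descent-ascent on this F would spiral outward; the dissertation's analysis handles the harder weakly monotone case with inexact inner solves.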

Third, we consider a class of nonconvex-nonconcave min-max problems with a focus on GAN training. Although adaptive gradient methods with alternating updates empirically work well for training GANs, they require expensive tuning efforts, lack theoretical convergence guarantees, and may diverge in practice. To bridge this gap, we design an adaptive gradient algorithm for training GANs with provably faster convergence than its non-adaptive counterpart.
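The divergence issue can be seen on the simplest min-max toy problem. The sketch below is illustrative only: it contrasts plain simultaneous gradient descent-ascent with an "optimistic" update, the kind of extrapolation underlying provably convergent GAN training algorithms; the Adagrad-style per-coordinate scaling of the dissertation's adaptive algorithm is omitted for simplicity.

```python
# On the bilinear game f(x, y) = x*y, simultaneous gradient
# descent-ascent (GDA) spirals outward, while the optimistic update
#   z_{t+1} = z_t - eta * (2*F(z_t) - F(z_{t-1}))
# converges to the saddle point (0, 0).
def F(z):                     # gradient mapping of f(x, y) = x*y
    x, y = z
    return (y, -x)

eta = 0.1
steps = 2000

# Plain simultaneous GDA: diverges.
z = (1.0, 1.0)
for _ in range(steps):
    gx, gy = F(z)
    z = (z[0] - eta * gx, z[1] - eta * gy)
gda_norm = (z[0] ** 2 + z[1] ** 2) ** 0.5

# Optimistic GDA: extrapolates using the previous gradient.
z, z_prev = (1.0, 1.0), (1.0, 1.0)
for _ in range(steps):
    gx, gy = F(z)
    px, py = F(z_prev)
    z_prev, z = z, (z[0] - eta * (2 * gx - px),
                    z[1] - eta * (2 * gy - py))
ogda_norm = (z[0] ** 2 + z[1] ** 2) ** 0.5

print(gda_norm, ogda_norm)
```

Plain GDA multiplies the distance to the saddle point by sqrt(1 + eta**2) every step, while the optimistic correction anticipates the opponent's move and contracts; the dissertation combines such a correction with adaptive step sizes for GAN training.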

Fourth, we consider large-scale GAN training in a decentralized distributed manner. Decentralized parallel algorithms are more robust to limited network bandwidth and high latency than their centralized counterparts and have merits for protecting users' privacy, but decentralized algorithms for nonconvex-nonconcave min-max optimization have not been considered in the existing literature. We propose and analyze a decentralized algorithm for training GANs and show its provable convergence to a first-order stationary point in polynomial time.
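The decentralized setting can be illustrated with a minimal gossip-style sketch. This is a generic decentralized gradient method on a toy consensus problem, not the dissertation's GAN algorithm: each of n nodes on a ring holds a local copy of the parameters, averages with its neighbors via a doubly stochastic mixing matrix, and takes a local gradient step; no central server ever sees all the data.

```python
# Each node i minimizes a local quadratic f_i(x) = 0.5 * (x - c[i])^2;
# the global objective is the average of the f_i, minimized at mean(c).
# Decentralized gradient descent: mix with ring neighbors, then take a
# local gradient step. The weights (0.5 self, 0.25 per neighbor) form a
# doubly stochastic mixing matrix.
c = [1.0, 2.0, 3.0, 4.0]           # local data held by each node
n = len(c)
x = [0.0] * n                      # local parameter copies
eta = 0.01

for _ in range(2000):
    mixed = [0.5 * x[i] + 0.25 * x[(i - 1) % n] + 0.25 * x[(i + 1) % n]
             for i in range(n)]
    x = [mixed[i] - eta * (x[i] - c[i]) for i in range(n)]  # grad f_i = x - c_i

mean_c = sum(c) / n                # global minimizer: 2.5
print([round(xi, 2) for xi in x], mean_c)
```

All local copies reach consensus near the global minimizer using only neighbor-to-neighbor communication; the dissertation extends this style of analysis to nonconvex-nonconcave min-max objectives, where each node runs a local min-max update instead of plain gradient descent.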

Indexing (document details)
Advisor: Yang, Tianbao
Committee: Yang, Tianbao; Lin, Qihang; Varadarajan, Kasturi; Oliveira, Suely; Xu, Weiyu
School: The University of Iowa
Department: Computer Science
School Location: United States -- Iowa
Source: DAI-B 82/3(E), Dissertation Abstracts International
Subjects: Computer science, Mathematics, Operations research
Keywords: Algorithms, Applications, Deep learning, Min-max, Nonconvex, Optimization
Publication Number: 28031308
ISBN: 9798672127811
Copyright © 2021 ProQuest LLC. All rights reserved.