
Dissertation/Thesis Abstract

Order and Learning in Sequential Neural Structured Prediction
by Welleck, Sean, Ph.D., New York University, 2021, 150; 28257225
Abstract (Summary)

Structured objects such as sets, trees, and sequences appear in a variety of scientific and industrial domains. Developing machine learning methods that generate these objects is of interest for both scientific understanding and practical applications. One approach to generating structured objects, called sequential neural structured prediction, decomposes generation into a sequence of predictions, with each prediction made by a deep neural network. Choosing an appropriate sequential representation of each structured object and selecting an effective learning objective are key to adopting this approach. The standard method for learning specifies a canonical ordering of elements in the sequential representation and maximizes the likelihood of the resulting sequences. This thesis develops two streams of research that explore alternatives to this fixed-order, maximum likelihood approach for sequentially generating sets, trees, and sequences, with a focus on natural language processing applications.

In the first part of the thesis, we focus on text generation and study degenerate properties of fixed-order maximum-likelihood learning that surface in practice, motivating new learning methods. We characterize the degeneracy using three properties observed in generated text: non-termination, logical incoherence, and repetition. To study non-termination, we develop theory that allows us to formally prove that conventional text generation methods can generate infinite-length sequences with high probability. To study logical incoherence, we create a dataset for investigating the degree to which a model logically contradicts its preceding statements. To reduce these three types of degeneration, we develop unlikelihood training, a new learning method that penalizes task-specific textual properties.
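The core idea of unlikelihood training can be illustrated with a toy token-level loss: alongside the usual negative log-likelihood of the target token, a penalty term pushes down the probability of a set of negative candidates (for example, tokens already repeated in the context). The function and candidate set below are illustrative assumptions, not the thesis's exact formulation:

```python
import math

def unlikelihood_loss(probs, target, negative_candidates):
    """Token-level sketch: maximize likelihood of the target token while
    penalizing probability mass assigned to negative candidates."""
    likelihood_term = -math.log(probs[target])
    # log(1 - p(c)) shrinks as p(c) grows, so the penalty discourages
    # placing probability on the candidate tokens
    unlikelihood_term = -sum(math.log(1.0 - probs[c]) for c in negative_candidates)
    return likelihood_term + unlikelihood_term

# toy distribution over a 4-token vocabulary; token 2 is a repeat we penalize
probs = [0.1, 0.6, 0.2, 0.1]
loss = unlikelihood_loss(probs, target=1, negative_candidates=[2])
```

With an empty candidate set the loss reduces to ordinary maximum-likelihood training, which is what makes the method a drop-in modification of the standard objective.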

In the second part of the thesis, we remove the requirement of a fixed generation order by developing a learning framework, called non-monotonic generation, that yields models capable of selecting input-dependent generation orders. This flexibility is natural for set-structured objects, which lack an inherent order. For ordered objects, such as text, the selected orders induce an interpretable latent structure and allow us to study whether canonical orders such as left-to-right are optimal for learning. We use non-monotonic generation for generating multisets, parse trees, and text. The investigations and techniques presented in this thesis lead to promising directions for future work.
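One way to picture non-monotonic generation is as a sequence of insertion operations: instead of appending tokens left to right, the model chooses where each new token goes, and the chosen order is a latent structure. The minimal sketch below only shows the decoding mechanics under that assumption; it is not the thesis's actual model, which learns the orders with neural networks:

```python
def insertion_decode(operations):
    """Build a sequence from (index, token) insertion operations,
    so the generation order need not be left-to-right."""
    seq = []
    for index, token in operations:
        seq.insert(index, token)
    return seq

# the sentence "the cat sat down" produced in a non-left-to-right order:
ops = [(0, "sat"), (0, "cat"), (0, "the"), (3, "down")]
result = insertion_decode(ops)  # -> ["the", "cat", "sat", "down"]
```

Note that many different operation sequences yield the same final sequence, which is exactly why the order can be treated as a latent variable, and why for unordered objects such as multisets no single canonical order needs to be imposed.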

Indexing (document details)
Advisors: Cho, Kyunghyun; Zhang, Zheng
Committee: Ross, Keith; He, He; Weston, Jason
School: New York University
Department: Computer Science
School Location: United States -- New York
Source: DAI-B 82/9(E), Dissertation Abstracts International
Subjects: Computer science, Artificial intelligence
Keywords: Deep Learning, Machine Learning, Sequence modeling, Sequential Neural Structured Prediction
Publication Number: 28257225
ISBN: 9798582575702
Copyright © 2021 ProQuest LLC. All rights reserved.