Dissertation/Thesis Abstract

Improving Top-Down Generation of Natural Language with Hierarchy
by Donahue, David J., M.S., University of Massachusetts Lowell, 2020, 65; 28088276
Abstract (Summary)

Current autoregressive generative language models in the deep learning literature have achieved impressive results in a number of areas, such as machine translation, summarization, question answering, and dialogue. These “sequence-to-sequence” models operate by mapping an input sequence of tokens to an appropriate output sequence via a series of learned transformations on the input tokens. Understanding of the input sequence therefore arises from interactions between individual words, with no explicit location in which to store high-level knowledge about the sequence as a whole. Likewise, these autoregressive models produce each output sequence token by token, sampling from the next-token distribution produced by the model; as it samples each next word, the model does not know the remainder of the sequence it plans to generate. In this thesis, we explore two methods that attempt to add top-down reasoning to both the processing of the input sequence and the generation of the output sequence in these standard models. In the first set of experiments, we construct a hierarchical variant of the popular Transformer architecture to process the input sequence and show improved performance over competitive baselines. In the second set of experiments, we use the recent normalizing flow architecture to generate a latent summary of the output sequence before it is produced. We evaluate on multiple datasets with a variety of automatic evaluation metrics, including a novel GPT relevance score for measuring response coherence.
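For reference, the decoding procedure the abstract describes can be sketched as a generic autoregressive sampling loop. The snippet below is a minimal, purely illustrative Python sketch: the hand-written bigram table stands in for a learned model (it is not the hierarchical Transformer or normalizing-flow models studied in the thesis), and it only shows that each token is sampled from a next-token distribution conditioned on the prefix generated so far, with no view of the rest of the sequence.

    import random

    # Hypothetical next-token distributions keyed by the most recent token.
    # A real model would compute these probabilities from the full prefix.
    NEXT_TOKEN_PROBS = {
        "<bos>": {"the": 1.0},
        "the":   {"cat": 0.7, "down": 0.3},
        "cat":   {"sat": 0.9, "<eos>": 0.1},
        "sat":   {"down": 0.8, "<eos>": 0.2},
        "down":  {"<eos>": 1.0},
    }

    def sample_next(prefix):
        """Sample one token from the model's next-token distribution."""
        probs = NEXT_TOKEN_PROBS[prefix[-1]]
        tokens, weights = zip(*probs.items())
        return random.choices(tokens, weights=weights, k=1)[0]

    def generate(max_len=10):
        sequence = ["<bos>"]
        # Token-by-token generation: at each step the model sees only the prefix
        # and has no representation of the sequence it will eventually produce.
        while len(sequence) < max_len:
            token = sample_next(sequence)
            sequence.append(token)
            if token == "<eos>":
                break
        return sequence

    if __name__ == "__main__":
        print(" ".join(generate()))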

Indexing (document details)
Advisor: Rumshisky, Anna
Commitee: Amiri, Hadi, Yu, Hong
School: University of Massachusetts Lowell
Department: Computer Science
School Location: United States -- Massachusetts
Source: MAI 82/3(E), Masters Abstracts International
Source Type: DISSERTATION
Subjects: Artificial intelligence, Language, Computer science
Keywords: BERT, Deep learning, Hierarchy, Normalizing flow, Transformer
Publication Number: 28088276
ISBN: 9798664795714
Copyright © 2020 ProQuest LLC. All rights reserved.