Dissertation/Thesis Abstract

Towards Effective and Controllable Neural Text Generation
by Fang, Le, Ph.D., State University of New York at Buffalo, 2020, 88; 27994869
Abstract (Summary)

Deep neural networks have recently achieved remarkable empirical success in text generation tasks. Users demand generation that is both effective and controllable, meaning, respectively, producing high-quality, human-level sequences of words and controlling aspects of the output to accommodate different practical needs. Researchers have therefore adopted various neural architectures and techniques. Deep latent variable models, especially variational auto-encoders, have played an important role in representation learning for language and have shown promising performance in several tasks. In recent years, self-attention architectures have become the mainstream workhorses and have advanced state-of-the-art generation performance by large margins. In this context, this thesis presents several studies towards effective and controllable neural text generation.

The first part studies the main challenge in using traditional RNN- or LSTM-based deep latent variable models for text generation: the notorious “posterior collapse” issue, which leads to ineffective use of input representations. A sample-based representation for natural language is proposed to mitigate the issue and to learn better latent features for downstream generation applications. More effective deep latent variable models are trained with mutual information taken into account.
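
As an illustration only, and not the thesis's exact formulation, the following minimal PyTorch sketch shows a text VAE objective in which posterior collapse can be observed: when the KL term is driven to zero, the sampled latent code carries no information about the input. The class name TextVAE and all hyper-parameters are hypothetical.

    import torch
    import torch.nn as nn

    class TextVAE(nn.Module):  # hypothetical illustration, not the thesis model
        def __init__(self, vocab_size, hidden_dim=256, latent_dim=32):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden_dim)
            self.encoder = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)
            self.to_mu = nn.Linear(hidden_dim, latent_dim)
            self.to_logvar = nn.Linear(hidden_dim, latent_dim)
            self.decoder = nn.LSTM(hidden_dim + latent_dim, hidden_dim, batch_first=True)
            self.out = nn.Linear(hidden_dim, vocab_size)

        def forward(self, tokens):
            h = self.embed(tokens)
            _, (enc, _) = self.encoder(h)
            mu, logvar = self.to_mu(enc[-1]), self.to_logvar(enc[-1])
            # Sample-based representation: decode from a sampled z, not the posterior mean.
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
            z_seq = z.unsqueeze(1).expand(-1, tokens.size(1), -1)
            dec, _ = self.decoder(torch.cat([h, z_seq], dim=-1))
            logits = self.out(dec)
            # KL(q(z|x) || N(0, I)); posterior collapse shows up as this term going to zero.
            kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp(), dim=-1).mean()
            return logits, kl

A training loss would combine token-level cross entropy on the logits with a weight on the KL term (possibly annealed or mutual-information-aware); keeping that term from collapsing to zero is the issue the first part addresses.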

The second part deals with controllable generation of open-domain long text in the era of self-attention architectures. As large-scale pre-trained language models have shown surprising generation capabilities and the dominant “pre-training + fine-tuning” paradigm has come to shine in natural language understanding, the thesis presents long text generation based on pre-trained models with fine-grained control. A new task named “outline to story” is introduced as a test bed, and a simple yet powerful baseline model named “fine-tune with special tokens” is proposed.
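
Purely as a hedged sketch of what “fine-tune with special tokens” could look like in practice (the delimiter names below are illustrative, not necessarily those used in the thesis), an outline and its story can be packed into a single training sequence for a pre-trained causal language model, with special tokens marking the boundaries:

    from transformers import GPT2Tokenizer, GPT2LMHeadModel

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    # Illustrative delimiters; the actual token set is a design choice.
    special = {"additional_special_tokens": ["<outline>", "<story>"], "pad_token": "<pad>"}
    tokenizer.add_special_tokens(special)
    model.resize_token_embeddings(len(tokenizer))

    outline = "a lost dog ; a storm ; an unlikely rescue"
    story = "The dog had been missing since the first thunderclap."
    text = f"<outline> {outline} <story> {story}{tokenizer.eos_token}"

    enc = tokenizer(text, return_tensors="pt")
    # Standard causal LM fine-tuning: the labels are the input ids themselves.
    loss = model(**enc, labels=enc["input_ids"]).loss
    loss.backward()

At generation time the model is prompted with the outline segment followed by the story delimiter and decodes the continuation, so the outline tokens act as a fine-grained control signal.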

The third part explores deep latent variable models, specifically variational auto-encoders built on self-attention architectures. By integrating latent representation vectors with self-attention-based pre-trained components, the conditional variational auto-encoder shows the expected representation learning performance and provides a principled way to perform controlled and conditional generation. This part takes conditional story generation as the main test bed and presents satisfying empirical results.
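
Again only as a sketch under assumptions (the exact architecture in the thesis may differ), one common way to combine a latent vector with a pre-trained self-attention decoder is to project z into the embedding space and prepend it as a pseudo-token, so every decoding step can attend to the latent code; z_to_embed and decode_with_latent are hypothetical names:

    import torch
    import torch.nn as nn
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    decoder = GPT2LMHeadModel.from_pretrained("gpt2")
    latent_dim = 32
    # Hypothetical projection from the latent space into GPT-2's embedding space.
    z_to_embed = nn.Linear(latent_dim, decoder.config.n_embd)

    def decode_with_latent(z, input_ids):
        """Prepend a projected latent vector as a pseudo-token embedding."""
        tok_embeds = decoder.transformer.wte(input_ids)         # (B, T, n_embd)
        prefix = z_to_embed(z).unsqueeze(1)                     # (B, 1, n_embd)
        inputs_embeds = torch.cat([prefix, tok_embeds], dim=1)  # (B, T+1, n_embd)
        return decoder(inputs_embeds=inputs_embeds).logits

    z = torch.randn(1, latent_dim)  # stand-in for a sample from q(z | x, condition)
    ids = tokenizer("Once upon a time", return_tensors="pt").input_ids
    logits = decode_with_latent(z, ids)  # (1, T+1, vocab)

In a conditional VAE, z would be sampled from an approximate posterior q(z | x, condition) during training and from a (learned) conditional prior at generation time.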

Overall, this manuscript advocates reviving the power of representation learning in the era of large-scale pre-trained Transformer-based modeling, in order to achieve both effective and controllable text generation in a principled way.

Indexing (document details)
Advisor: Chen, Changyou; Dong, Wen
Committee: Gao, Jing
School: State University of New York at Buffalo
Department: Computer Science and Engineering
School Location: United States -- New York
Source: DAI-B 81/12(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer science, Artificial intelligence
Keywords: NLP, Text generation
Publication Number: 27994869
ISBN: 9798617018181
Copyright © 2020 ProQuest LLC. All rights reserved.