Dissertation/Thesis Abstract

Effects of Topic Structure on Automatic Summarization
by Schrimpf, Natalie Margaret, Ph.D., Yale University, 2018, 229; 10957338
Abstract (Summary)

Automatic summarization involves finding the most important information in a text in order to create a reduced version of that text that conveys the same meaning as the original. In this dissertation, I present a method for using topic information to influence which content is selected for a summary.

This dissertation addresses questions such as how to represent the meaning of a document for automatic tasks. For tasks such as automatic summarization, there is a tradeoff between using sophisticated linguistic methods and using methods that automatic systems can apply easily and efficiently. This research seeks a balance between these two goals by using linguistically motivated methods to improve automatic summarization performance. Another question addressed in this work is the balance between summary coverage and length. A summary must be long enough to convey the information from the original text but short enough to be useful in place of the original document. This dissertation explores the use of topics to increase coverage while reducing redundancy.

Several issues affect summary quality, including information coverage, redundancy, and coherence. This dissertation focuses on achieving coverage of all distinct concepts in a text by incorporating topic structure. During the summarization process, emphasis is placed on including information from all topics in order to produce summaries that cover the range of information present in the original documents. In this work, several notions of what constitutes a topic are explored, with particular focus on defining topics using information from Rhetorical Structure Theory (Mann and Thompson 1988). The results of incorporating topics into a summarization system show that topic structure improves automatic summarization performance.
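
The idea of drawing summary content from every topic can be illustrated with a minimal sketch. This is not the dissertation's system: it assumes the text has already been split into topics (here simply a list of sentence lists, with no Rhetorical Structure Theory analysis), and it uses a placeholder word-frequency score. The function names tokenize, score_sentences, and summarize_by_topic are hypothetical and chosen only for illustration.

```python
from collections import Counter
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def score_sentences(topics):
    """Score each sentence by the mean document-wide frequency of its words.

    Assumption: this frequency score is only a placeholder; any sentence
    scorer could be swapped in without changing the coverage step below.
    """
    counts = Counter(w for topic in topics for s in topic for w in tokenize(s))
    scores = {}
    for topic in topics:
        for s in topic:
            words = tokenize(s)
            scores[s] = sum(counts[w] for w in words) / len(words) if words else 0.0
    return scores


def summarize_by_topic(topics, max_sentences):
    """Pick sentences round-robin across topics, best-scoring first."""
    scores = score_sentences(topics)
    # Rank the sentences inside each topic from highest to lowest score.
    ranked = [sorted(t, key=lambda s: scores[s], reverse=True) for t in topics]
    chosen = []
    rank = 0
    # Take the rank-0 sentence of every topic, then rank 1, and so on,
    # so no topic is skipped while the length budget still has room.
    while len(chosen) < max_sentences and any(rank < len(r) for r in ranked):
        for r in ranked:
            if rank < len(r) and len(chosen) < max_sentences:
                chosen.append(r[rank])
        rank += 1
    # Restore original document order so the summary reads naturally.
    order = {s: i for i, s in enumerate(s for t in topics for s in t)}
    return sorted(chosen, key=lambda s: order[s])
```

Because the scorer and the topic segmentation are separate inputs to the selection step, different notions of topic or different scoring techniques can be compared by changing only one piece, which loosely mirrors the modular design the abstract describes.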

The contributions of this dissertation include demonstrating that focusing on coverage of the different topics in a text improves summaries, and that topic structure is an effective way to achieve this coverage. This research also shows the effectiveness of a simple modular method for incorporating topics into summarization that allows for comparison of different notions of topic and summarization techniques.

Indexing (document details)
Advisor: Frank, Robert
School: Yale University
School Location: United States -- Connecticut
Source: DAI-A 79/12(E), Dissertation Abstracts International
Subjects: Linguistics
Keywords: Automatic Summarization, Discourse Structure, Rhetorical Structure Theory, Summarization, Topic Structure
Publication Number: 10957338
ISBN: 978-0-438-27383-2
Copyright © 2019 ProQuest LLC. All rights reserved.