Automatic summarization involves finding the most important information in a text in order to create a reduced version of that text that conveys the same meaning as the original. In this dissertation, I present a method for using topic information to influence which content is selected for a summary.
This dissertation addresses questions such as how to represent the meaning of a document for automatic tasks. For tasks such as automatic summarization, there is a tradeoff between using sophisticated linguistic methods and using methods that can easily and efficiently be used by automatic systems. This research seeks to find a balance between these two goals by using linguistically motivated methods that can be used to improve automatic summarization performance. Another question addressed in this work is the balance between summary coverage and length. A summary must be long enough to convey the information from the original text but short enough to be useful in place of the original document. This dissertation explores the use of topics to increase coverage while reducing redundancy.
There are several issues that affect summary quality. These include information coverage, redundancy, and coherence. This dissertation focuses on achieving coverage of all distinct concepts in a text by incorporating topic structure. During the summarization process, emphasis is placed on including information from all topics in order to produce summaries that cover the range of information present in the original documents. In this work, several notions of what constitutes a topic are explored, with particular focus on defining topics using information from Rhetorical Structure Theory (Mann and Thompson 1988). The results of incorporating topics into a summarization system show that topic structure improves automatic summarization performance.
The contributions of this dissertation include demonstrating that focusing on coverage of the different topics in a text improves summaries and that topic structure is an effective way to achieve this coverage. This research also shows the effectiveness of a simple modular method for incorporating topics into summarization, one that allows for comparison of different notions of topic and of different summarization techniques.
School Location: United States -- Connecticut
Source: DAI-A 79/12(E), Dissertation Abstracts International
Keywords: Automatic Summarization, Discourse Structure, Rhetorical Structure Theory, Summarization, Topic Structure
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved