Dissertation/Thesis Abstract

Citation handling: Processing citation texts in scientific documents
by Whidby, Michael Alan, M.S., University of Maryland, College Park, 2012, 77; 1529438
Abstract (Summary)

Citation sentences (sentences that cite other papers) play a key role in the summarization of scientific articles. However, a citation-based summarization system that depends on generic natural language processing components, such as parsers or sentence compressors, will perform poorly if those components cannot handle citations correctly.

In this thesis, I examine the effect of citation handling on parsing, sentence compression, and multi-document summarization. Two types of citations occur in citation sentences: constituent citations and parenthetical citations. I propose an automatic citation classifier trained on data created through Mechanical Turk tasks. I demonstrate that type-specific citation handling, applied as a pre-processing step, improves the performance of a state-of-the-art generic parser in both parse-tree quality and running time. Extrinsic evaluations demonstrate that improving parser performance on citation sentences in turn improves the performance of a sentence compressor, Trimmer (Zajic et al., 2007), and a multi-document summarization system, MASCS, according to several summarization measures.
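
As a rough illustration of what type-specific citation handling before parsing might look like, the sketch below treats the two citation types differently. It is not the thesis's method: the thesis uses a trained classifier, whereas this sketch substitutes simple regular-expression rules, and it assumes that a constituent citation is replaced by a placeholder token while a parenthetical citation is dropped. The patterns, the preprocess function, and the CITATION placeholder are all hypothetical names introduced here.

```python
import re

# Hypothetical author pattern: "Zajic", "Zajic et al.", or "Dorr and Zajic".
AUTHORS = r"[A-Z][A-Za-z-]+(?: et al\.| and [A-Z][A-Za-z-]+)?"

# Assumed shape of a constituent citation: "Zajic et al. (2007)".
# It fills a grammatical slot, so the sentence needs something in its place.
TEXTUAL = re.compile(rf"{AUTHORS} \(\d{{4}}[a-z]?\)")

# Assumed shape of a parenthetical citation: "(Zajic et al., 2007)",
# possibly listing several works separated by semicolons.
BRACKETED = re.compile(
    rf" ?\({AUTHORS},? \d{{4}}[a-z]?(?:; {AUTHORS},? \d{{4}}[a-z]?)*\)"
)


def preprocess(sentence, placeholder="CITATION"):
    """Rule-based stand-in for classifier-driven citation handling."""
    # Constituent citations: substitute a single token so the parser
    # still sees a noun-like element in that position.
    sentence = TEXTUAL.sub(placeholder, sentence)
    # Parenthetical citations: remove them, since the sentence is
    # grammatical without them.
    sentence = BRACKETED.sub("", sentence)
    return sentence


print(preprocess("Trimmer (Zajic et al., 2007) compresses sentences."))
print(preprocess("Zajic et al. (2007) propose Trimmer."))
```

Under these assumptions, the first sentence loses its bracketed citation and the second keeps a CITATION token in subject position, so a generic parser receives ordinary-looking input in both cases.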

Indexing (document details)
Advisor: Dorr, Bonnie; Zajic, David
Committee: Daume, Hal, III
School: University of Maryland, College Park
Department: Computer Science
School Location: United States -- Maryland
Source: MAI 51/03M(E), Masters Abstracts International
Source Type: DISSERTATION
Subjects: Computer science
Keywords: Citation, Multi-document summarization, Parsing, Sentence compression
Publication Number: 1529438
ISBN: 9781267726278
Copyright © 2019 ProQuest LLC. All rights reserved.