The ability to extract general, reusable semantic representations of sentences is a longstanding goal in natural language processing. Semantic role labeling (SRL) is an approach to the extraction of such representations in which predicate-argument relations (semantic roles) are identified and classified. Lexicons such as PropBank and VerbNet define predicate senses and corresponding roles, affording ontological grounding and facilitating a broad range of applications such as question answering and dialog state tracking. Despite recent advances in neural network-based approaches to SRL, generalization performance degrades on out-of-domain test data and rare predicates. To address these problems, we investigate improvements to SRL systems through the integration of three distinct but related sources of linguistic knowledge: polysemy and predicate representations, syntactic structure, and role granularity.
Because predicates often have multiple senses, determining the correct sense of a predicate in a given context, a process known as word sense disambiguation (WSD), is a critical step towards ontological grounding. Despite this, SRL is often performed independently of WSD. We find that joint learning of VerbNet predicate senses and SRL improves WSD accuracy, and that features from VerbNet senses further improve VerbNet role labeling, with the largest gains on rare predicates and out-of-domain data.
Recent advances using language model pre-training and neural networks have challenged the need for explicit syntactic representations in SRL. To investigate this further, we apply shallow syntactic structure to SRL by learning with, and constraining inference to, syntactic chunks instead of words, finding that this approach improves performance most when large amounts of training data are unavailable. We also investigate auxiliary supervision from syntax through multitask learning of syntactic dependency parsing and SRL, finding that this improves SRL, particularly on low-frequency predicates.
Ontological choices affect not only the utility of the resulting representations but also how easily they can be extracted, and so involve tradeoffs between informativeness and generalizability. We investigate the impact of role annotation schemes on SRL generalization performance, comparing PropBank and VerbNet, and find that learning from grouped VerbNet roles improves generalization. Combining insights from these investigations, we find that the three sources of prior linguistic knowledge are complementary, providing cumulative improvements in VerbNet semantic role labeling. Finally, we describe and release a tool for VerbNet semantic parsing intended to encourage further research in this area.
|Committee:||Martin, James; Hulden, Mans; Tan, Chenhao; Zettlemoyer, Luke|
|School:||University of Colorado at Boulder|
|School Location:||United States -- Colorado|
|Source:||DAI-A 81/11(E), Dissertation Abstracts International|
|Subjects:||Computer science, Linguistics|
|Keywords:||Computational semantics, Natural language processing, PropBank, Semantic roles, VerbNet, Word sense disambiguation|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved