In this paper, a variety of lexical expansion approaches are evaluated using the Medpedia corpus and MiPACQ queries in order to improve the MiPACQ system's retrieval performance. The heart of the MiPACQ system is a document reranking component, which operates on the results returned by a baseline information retrieval (IR) system. However, the baseline IR system used in MiPACQ has poor paragraph-level recall, which limits the reranker's overall performance. To address this issue, three broad term expansion approaches are outlined in this paper with the purpose of increasing recall over the baseline Lucene retrieval system without introducing a significant amount of noise. Two of the three approaches rely only on the corpus being indexed, while the other requires a domain-specific ontology to expand query terms. First, automatic thesaurus generation based on co-occurrences is evaluated as an expansion methodology alongside other co-occurrence-based expansion methods. Next, a resource-based approach that uses the UMLS Metathesaurus for expansion is used to evaluate knowledge-rich expansion methods. Finally, latent semantic indexing is evaluated as an alternative to the baseline vector space retrieval model. These methods are compared and tuned, and the best-performing method is recommended to the MiPACQ authors to improve Q&A results.
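As a rough illustration of the co-occurrence idea mentioned in the abstract, the following Python sketch expands a query with terms that frequently share a paragraph with the query terms. It is a minimal sketch under stated assumptions, not the method evaluated in the thesis: it uses raw paragraph-level co-occurrence counts, whitespace tokenization, and no stop-word filtering, and the function names, the `per_term` and `min_count` parameters, and the toy corpus are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(paragraphs):
    """Count how often each pair of distinct terms shares a paragraph."""
    cooc = defaultdict(Counter)
    for text in paragraphs:
        # Assumption: lowercase whitespace tokens stand in for real analysis.
        terms = sorted(set(text.lower().split()))
        for a, b in combinations(terms, 2):
            cooc[a][b] += 1
            cooc[b][a] += 1
    return cooc


def expand_query(query, cooc, per_term=1, min_count=2):
    """Append up to per_term frequently co-occurring terms per query term."""
    original = query.lower().split()
    expanded = list(original)
    for term in original:
        added = 0
        # most_common() yields candidates in descending count order,
        # so we can stop at the first one below the threshold.
        for candidate, count in cooc.get(term, Counter()).most_common():
            if added == per_term or count < min_count:
                break
            if candidate not in expanded:
                expanded.append(candidate)
                added += 1
    return expanded


# Hypothetical toy corpus; each string stands in for one indexed paragraph.
paragraphs = [
    "aspirin reduces fever",
    "aspirin treats fever and pain",
    "ibuprofen treats pain",
]
cooc = build_cooccurrence(paragraphs)
print(expand_query("aspirin", cooc))
```

The `min_count` threshold is one simple way to limit the noise that expansion can introduce: a term seen together with the query term only once is not added. A real system would more likely use a normalized association score than raw counts.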
|Committee:||Nielsen, Rodney D., Ward, Wayne H.|
|School:||University of Colorado at Boulder|
|School Location:||United States -- Colorado|
|Source:||MAI 50/06M, Masters Abstracts International|
|Subjects:||Linguistics, Information Technology, Computer science|
|Keywords:||Information retrieval, Latent semantic indexing, Lexical expansion, Query expansion, Question and answering, UMLS Metathesaurus|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved