We are all aware of the rise of information age: heterogeneous sources of information and the ability to publish rapidly and indiscriminately are responsible for information chaos. In this work, we are interested in a system which can separate the "wheat" of vital information from the chaff within this information chaos. An efficient filtering system can accelerate meaningful utilization of knowledge. Consider Wikipedia, an example of community-driven knowledge synthesis. Facts about topics on Wikipedia are continuously being updated by users interested in a particular topic. Consider an automatic system (or an invisible robot) to which a topic such as "President of the United States" can be fed. This system will work ceaselessly, filtering new information created on the web in order to provide the small set of documents about the "President of the United States" that are vital to keeping the Wikipedia page relevant and up-to-date. In this work, we present an automatic information filtering system for this task. While building such a system, we have encountered issues related to scalability, retrieval algorithms, and system evaluation; we describe our efforts to understand and overcome these issues.
|Advisor:||Carterette, Benjamin A.|
|School:||University of Delaware|
|Department:||Department of Computer and Information Sciences|
|School Location:||United States -- Delaware|
|Source:||MAI 54/01M(E), Masters Abstracts International|
|Keywords:||Information filtering, Information retrieval, Modeling, Natural language processing, Search, Statistical information retrieval|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be