Dissertation/Thesis Abstract

Topic modeling for Wikipedia link disambiguation
by Skaggs, Bradley Alan, M.S., University of Maryland, College Park, 2011, 58; 1506659
Abstract (Summary)

Many articles in the online encyclopedia Wikipedia have hyperlinks to ambiguous article titles. To improve the reader experience, any link to an ambiguous title should be replaced with a link to one of the unambiguous meanings. We propose a novel statistical topic model, which we refer to as the Link Text Topic Model (LTTM), that can suggest new link targets for existing ambiguous links in Wikipedia articles. For evaluation, we develop a method for extracting ground truth from snapshots of Wikipedia at different points in time. We evaluate LTTM on this ground truth, and demonstrate its superiority over existing link- and content-based approaches. Finally, we build a web service that uses LTTM to suggest unambiguous articles for human editors wanting to fix ambiguous links.
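
The abstract does not spell out how ground truth is extracted from snapshots, so the following is only a minimal sketch of one plausible reading: a link that pointed to an ambiguous title in an earlier snapshot and points to an unambiguous article in a later snapshot is treated as a human-made correction. The function name `extract_ground_truth`, the snapshot representation (a mapping from source article and link text to target title), and the toy data are all assumptions made for illustration and are not taken from the thesis.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical snapshot format: (source article, link text) -> target title.
Snapshot = Dict[Tuple[str, str], str]


def extract_ground_truth(
    earlier: Snapshot,
    later: Snapshot,
    ambiguous_titles: Set[str],
) -> List[Tuple[str, str, str, str]]:
    """Return (source, link text, old target, new target) for links that
    were ambiguous in the earlier snapshot and unambiguous in the later one."""
    truth = []
    for (source, link_text), old_target in earlier.items():
        if old_target not in ambiguous_titles:
            continue  # the link was never ambiguous
        new_target = later.get((source, link_text))
        if new_target is None or new_target in ambiguous_titles:
            continue  # link was removed or is still ambiguous
        truth.append((source, link_text, old_target, new_target))
    return sorted(truth)


# Toy data, invented for this example.
ambiguous = {"Mercury"}
earlier = {
    ("Solar System", "Mercury"): "Mercury",
    ("Periodic table", "Mercury"): "Mercury",
    ("Roman mythology", "Mercury"): "Mercury",
    ("Solar System", "Venus"): "Venus",
}
later = {
    ("Solar System", "Mercury"): "Mercury (planet)",
    ("Periodic table", "Mercury"): "Mercury (element)",
    ("Roman mythology", "Mercury"): "Mercury",  # not yet fixed by an editor
    ("Solar System", "Venus"): "Venus",
}

for row in extract_ground_truth(earlier, later, ambiguous):
    print(row)
```

Under these assumptions, each returned row pairs an ambiguous link with the target an editor later chose, which is the kind of labeled example a disambiguation model could be scored against.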

Indexing (document details)
Advisor: Getoor, Lise C.
Committee: Boyd-Graber, Jordan; Daume, Hal, III
School: University of Maryland, College Park
Department: Computer Science
School Location: United States -- Maryland
Source: MAI 50/04M, Masters Abstracts International
Subjects: Web Studies, Information science, Computer science
Keywords: Disambiguation, Link prediction, Topic modeling, Wikipedia
Publication Number: 1506659
ISBN: 978-1-267-19295-0
Copyright © 2021 ProQuest LLC. All rights reserved.