In recent years, link prediction has been applied to a wide range of real-world applications, many of which generate massive dynamic networks and therefore require an effective real-time approach to predicting future links. Traditionally, link prediction approaches use a single snapshot of a network to predict future links. However, real-world networks often evolve rapidly as links are added and removed. Therefore, there is a need for a dynamic and online link prediction framework. This dissertation focuses on the challenges and solutions involved in advancing a link prediction framework for use in real-time analytics.
For real-time link prediction, the framework should 1) be reliable and accurate, 2) maintain learning models, and 3) calculate node similarities in real time. In a real-world application that deals with time-varying networks, it is important to understand predictive models in a time-varying context. In this work, we develop several guidelines for using prediction models in a dynamic network. We also propose an incremental support vector machine method for link prediction, which updates the model using the latest data available as well as historical information.
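To make the idea of incremental model maintenance concrete, the following is a minimal sketch of an online linear SVM that is updated batch by batch as new link data arrives. It is not the dissertation's method, whose details are not given in this abstract: it substitutes plain stochastic sub-gradient descent on the regularized hinge loss, and the class name, the fixed learning rate `eta`, the regularization weight `lam`, and the two-feature node-pair vectors are all assumptions made for illustration.

```python
class IncrementalLinearSVM:
    """Hypothetical online linear SVM (hinge loss + L2 penalty), for illustration only."""

    def __init__(self, n_features, eta=0.1, lam=0.01):
        self.w = [0.0] * n_features  # weight vector
        self.b = 0.0                 # bias term
        self.eta = eta               # fixed learning rate (assumed)
        self.lam = lam               # L2 regularization strength (assumed)
        self.t = 0                   # number of samples seen so far

    def decision(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x)) + self.b

    def partial_fit(self, X, y):
        """Update the existing model with a new batch; labels are +1 (link) or -1 (no link)."""
        for x, label in zip(X, y):
            self.t += 1
            margin = label * self.decision(x)
            # Shrink the weights (gradient of the L2 penalty).
            self.w = [wi * (1.0 - self.eta * self.lam) for wi in self.w]
            # Hinge-loss sub-gradient step only when the margin is violated.
            if margin < 1:
                self.w = [wi + self.eta * label * xi for wi, xi in zip(self.w, x)]
                self.b += self.eta * label
        return self

    def predict(self, x):
        return 1 if self.decision(x) >= 0 else -1


# Each row is an assumed feature vector for a node pair:
# [number of common neighbors, Jaccard coefficient].
model = IncrementalLinearSVM(n_features=2)
model.partial_fit([[3, 0.6], [0, 0.0]], [1, -1])   # earlier snapshot
model.partial_fit([[4, 0.8], [1, 0.1]], [1, -1])   # latest snapshot
```

Because `partial_fit` starts from the current weights rather than retraining from scratch, each new snapshot adjusts a model that already reflects historical data, which is the general property the paragraph above describes.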
While being able to forecast future links accurately is vital, another equally important problem is to identify the most important and relevant links among large numbers of future links. To address this problem, we propose a domain-independent, supervised method that predicts the rank of future links using objective interestingness measures.
We also propose an iterative link classification method, which updates the network using only predicted links with a high confidence level at each iteration. Using this method, we observed a significant improvement in accuracy and recall over the baseline link prediction method.
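The iterative scheme described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the dissertation's implementation: the Jaccard coefficient stands in for the confidence of a trained link classifier, and the function names, the `threshold` parameter, and the toy edge list are hypothetical.

```python
from itertools import combinations


def jaccard(adj, u, v):
    """Jaccard coefficient of two nodes' neighbor sets (stand-in for classifier confidence)."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0


def iterative_link_prediction(edges, threshold=0.5, max_iter=5, score=jaccard):
    """Repeatedly add only high-confidence predicted links, then re-score on the updated network."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    predicted = []
    for _ in range(max_iter):
        # Candidate links are all node pairs that are not yet connected.
        candidates = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
        # Score every candidate on the current network before changing it.
        confident = [(u, v) for u, v in candidates if score(adj, u, v) >= threshold]
        if not confident:
            break  # nothing passes the confidence threshold; stop iterating
        for u, v in confident:
            adj[u].add(v)
            adj[v].add(u)
        predicted.extend(confident)
    return predicted


edges = [("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"), ("d", "e")]
strict = iterative_link_prediction(edges, threshold=0.6)
loose = iterative_link_prediction(edges, threshold=0.3)
```

In this toy network the pair `("a", "e")` has no common neighbors at the start, so it scores zero in the first pass. With the lower threshold, links accepted in the first iteration give the pair shared neighbors, and it is predicted in the second iteration; with the stricter threshold only `("a", "d")` is added.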
Our proposed solutions address two of the three requirements defined above by focusing on maintaining the learning models and increasing the reliability and accuracy of link prediction in a dynamic network. In future work, we plan to address the final requirement by developing approximation algorithms that compute similarity measures in large dynamic and streaming networks in real time, using distributed computing frameworks.
|Advisor:||Raghavan, Vijay V.|
|Committee:||Benton, Ryan G., Chu, Chee-Hung Henry, Farmer-Kaiser, Mary, Gottumukkala, Raju N., Salehi, Mohsen Amini|
|School:||University of Louisiana at Lafayette|
|School Location:||United States -- Louisiana|
|Source:||DAI-B 80/08(E), Dissertation Abstracts International|
|Keywords:||Data mining, Graph mining, Incremental learning, Interestingness, Link prediction|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved