Markov random walk models are powerful analytical tools for multiple areas in machine learning, numerical optimizations and data mining tasks. The key assumption of a first-order Markov chain is memorylessness, which restricts the dependence of the transition distribution to the current state only. However in many applications, this assumption is not appropriate. We propose a set of higher-order random walk techniques and discuss their applications to tensor co-clustering, user trails modeling, and solving linear systems. First, we develop a new random walk model that we call the super-spacey random surfer, which simultaneously clusters the rows, columns, and slices of a nonnegative three-mode tensor. This algorithm generalizes to tensors with any number of modes. We partition the tensor by minimizing the exit probability between clusters when the super-spacey random walk is at stationary. The second application is user trails modeling, where user trails record sequences of activities when individuals interact with the Internet and the world. We propose the retrospective higher-order Markov process as a two-step process by first choosing a state from the history and then transitioning as a first-order chain conditional on that state. This way the total number of parameters is restricted and thus the model is protected from overfitting. Lastly we propose to use a time-inhomogeneous Markov chain to approximate the solution of a linear system. Multiple simulations of the random walk are conducted to approximate the solution. By allowing the random walk to transition based on multiple matrices, we decrease the variance of the simulations, and thus increase the speed of the solver.
|Advisor:||Gleich, David F.|
|Commitee:||Grama, Ananth, Maji, Hemanta K., Ribeiro, Bruno|
|School Location:||United States -- Indiana|
|Source:||DAI-B 79/10(E), Dissertation Abstracts International|
|Subjects:||Statistics, Artificial intelligence, Computer science|
|Keywords:||Higher-order data analysis, Machine learning, Markov chain, Monte Carlo, Tensor clustering|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be