Dissertation/Thesis Abstract

On the identification of statistically significant network topology
by Michaelson, Gregory Vincent, Ph.D., The University of Alabama, 2010, 130; 3409121
Abstract (Summary)

Determining the structure of large and complex networks is a problem that has attracted great interest in many fields, including mathematics, computer science, sociology, biomedical research, and epidemiology. Despite this interest, there is still no formal hypothesis testing procedure for measuring the significance of community structure detected in an observed network. First, this work proposes three alternatives to modularity, the most common measure of community structure, which allow for the detection of more general structure in networks. An approach based upon the likelihood ratio test is shown not only to be as effective as modularity in detecting modular structure but also to be able to detect a wide variety of other network topologies. Second, this work proposes a general and novel test, the Likelihood Ratio Cluster (LRC) test, for assessing the statistical significance of the output of clustering algorithms. This technique is demonstrated by applying it to the sample partitions generated by both network and conventional clustering algorithms. Finally, a method is developed for evaluating the capability of heuristic clustering techniques to detect the optimal sample partition. This method is used to evaluate several common community detection algorithms. Surprisingly, the most popular community detection algorithm is found to be largely ineffective at detecting the optimal partition of a random network. Also surprisingly, Clauset’s fast algorithm (Clauset et al., 2004), which is commonly thought to be fast but inaccurate, is found to be the most effective of the algorithms examined at detecting the optimal partition in random networks.
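
For readers unfamiliar with modularity, the baseline measure the abstract refers to, the following is a minimal sketch of the standard Newman-Girvan modularity score for a fixed partition of an undirected, unweighted graph, Q = sum over communities c of [e_c/m - (d_c/(2m))^2], where m is the number of edges, e_c the number of edges inside community c, and d_c the total degree of its nodes. It does not implement the dissertation's proposed alternatives or the LRC test; the function name `modularity` and the toy graph are illustrative assumptions, not taken from the dissertation.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition (illustrative helper).

    edges: iterable of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = 0                          # total number of edges
    degree = defaultdict(int)      # node -> degree
    internal = defaultdict(int)    # community -> edges with both ends inside it
    for u, v in edges:
        m += 1
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    total_degree = defaultdict(int)  # community -> sum of member degrees
    for node, k in degree.items():
        total_degree[community[node]] += k

    # Observed fraction of internal edges minus the fraction expected
    # if edges were placed at random with the same degree sequence.
    return sum(
        internal[c] / m - (total_degree[c] / (2 * m)) ** 2
        for c in total_degree
    )


# Toy graph (assumed for illustration): two triangles joined by one edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}

q_split = modularity(edges, split)    # the two triangles as separate communities
q_merged = modularity(edges, merged)  # every node in a single community
```

On this toy graph the two-triangle partition scores higher than the single-community partition, which is the sense in which modularity rewards modular structure. A score of this kind says nothing about statistical significance, which is the gap the abstract's hypothesis test is meant to address.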

Indexing (document details)
Advisor: Perry, Marcus B.
Committee: Conerly, Michael; Gray, Brian; McManus, Denise; Visscher, Pieter
School: The University of Alabama
Department: Applied Statistics
School Location: United States -- Alabama
Source: DAI-B 71/07, Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Statistics
Keywords: Clustering, Hypothesis testing, Likelihood ratio test, Modularity, Networks
Publication Number: 3409121
ISBN: 9781124061580
Copyright © 2019 ProQuest LLC. All rights reserved.