A key issue in software evolution analysis is being able to compute evolutionary change accurately and with rich semantics. This dissertation describes a mathematical framework for enabling accurate computation of semantic evolutionary change. It is based on graphs for representing software semantics, graph transformations for modeling evolution, and effects of graph transformations for capturing evolutionary change.
We formulate the notions of evolution sets and the evolution distance to measure evolutionary change. Then, we define an appropriate notion of optimal graph alignment to compute evolutionary change accurately. Establishing a rigorous foundation for computing evolutionary change is important for developing powerful automated tools for software evolution analysis. Cost estimation, software merging, reliability analysis, clone detection, incremental testing, validation and other software applications can benefit from precise computation of evolutionary change. A rigorous foundation also allows leveraging the extensive research on graph alignments to advance software engineering.
We have created a framework for experimental evaluation of graph alignment algorithms. The framework includes a graph testbed, an accuracy metric, and a graph alignment visualization (GAV) mechanism. The framework is targeted at applications where a precise computation of evolutionary change from one system to the next is needed to reveal valuable knowledge about the system and its evolution. The accuracy metric is based on a new measure for graph difference and a new notion of optimality of graph alignment. The metric is designed to measure the degree to which an alignment is inaccurate, that is the degree to which it reports spurious differences. Such a metric is meaningful for estimating the efficiency of and resources necessary for many software evolution analyses.
The Minimal Signature Tree (MST) is introduced as a new data structure that can be constructed for a graph G and can be used to design efficient heuristic graph analysis algorithms. The graphs are assumed to be attributed, directed, and acyclic.
The MST is a rooted tree that stores the minimal signatures which are defined to be sequences of labels that describe special subgraphs in G. A minimal signature serves as an artifact to pinpoint a node of G. The MST generalizes the notion of suffix tree from strings to graphs. If the graph G is a string then the MST is homomorphic to the suffix tree. The MST exposes the internal structure of a graph and makes it possible to design efficient algorithms.
A MST-based algorithm for differencing graphs G1 and G2 is presented. This algorithm involves constructing a combined MST, called the Co-MST, that stores minimal signatures that are common to G1 and G2. For each common minimal signature s, the node in G1 and the node in G2 pinpointed by s are aligned. The nodes not covered by common minimal signatures may be further aligned using an existing graph alignment technique.
We present an experimental study which includes a comparison of a MST-based graph differencing algorithm with two other algorithms. The experiments involve large graphs extracted from Linux, synthetic graphs reaching ten thousand nodes, variations of connectivity and the number of distinct labels, and a priori known differences to verify the accuracy of differencing.
|Advisor:||Kothari, Suraj C.|
|Commitee:||Aluru, Srinivas, Basu, Samik, Nguyen, Tien, Tyagi, Akhilesh|
|School:||Iowa State University|
|Department:||Electrical and Computer Engineering|
|School Location:||United States -- Iowa|
|Source:||DAI-B 70/03, Dissertation Abstracts International|
|Keywords:||Cost estimation, Evolution distance, Evolutionary change, Graph difference, Minimal signature trees, Software evolution|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be