Dissertation/Thesis Abstract

Online Management and Mining of Heterogeneous and Dynamic Time Series
by Altiparmak, Fatih, Ph.D., The Ohio State University, 2008, 176; 10630894
Abstract (Summary)

In this Ph.D. dissertation, we propose database solutions for some of the major challenges in mining and managing time-series data. In particular, we propose a framework for mining heterogeneous time-series data, and a framework for online summarization and analysis of dynamic time-series data.

We propose a general framework, Information Mining, to acquire information from heterogeneous and potentially high-dimensional time-series data. The framework consists of two major steps. First, significant, clean, and homogeneous subsets of the data are identified and analyzed using a data mining algorithm. Second, the information gathered in the first step is refined by identifying common (or distinct) patterns across the mining results of the subsets. We extend our approach to a class of mining tasks over microarray and clinical-trial time-series applications and show that Information Mining is an effective method for mining these datasets.
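The two-step structure described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the function names, the toy "mining" step (`rising_steps`), and the use of set intersection and difference to find common and distinct patterns are all assumptions chosen for illustration.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

Series = List[float]


def rising_steps(subset: List[Series]) -> Set[int]:
    """Toy stand-in for a mining algorithm (hypothetical).

    Returns the time steps at which the mean of the subset increases
    relative to the previous time step.
    """
    n = min(len(s) for s in subset)
    mean = [sum(s[t] for s in subset) / len(subset) for t in range(n)]
    return {t for t in range(1, n) if mean[t] > mean[t - 1]}


def information_mining_sketch(
    subsets: Dict[Hashable, List[Series]],
    mine: Callable[[List[Series]], Set[int]] = rising_steps,
) -> Tuple[Dict[Hashable, Set[int]], Set[int], Dict[Hashable, Set[int]]]:
    """Step 1: mine each homogeneous subset separately.
    Step 2: compare the per-subset results for common and distinct patterns.
    """
    # Step 1: run the mining function on every subset independently.
    per_subset = {key: mine(group) for key, group in subsets.items()}

    # Step 2a: patterns found in every subset.
    common = set.intersection(*per_subset.values()) if per_subset else set()

    # Step 2b: patterns found in exactly one subset.
    distinct = {}
    for key, result in per_subset.items():
        others = set().union(*(r for k, r in per_subset.items() if k != key))
        distinct[key] = result - others
    return per_subset, common, distinct


# Two hypothetical, already-homogeneous subsets of time series.
subsets = {
    "a": [[1.0, 2.0, 3.0, 2.0], [1.0, 2.0, 3.0, 2.0]],
    "b": [[3.0, 2.0, 3.0, 4.0]],
}
per_subset, common, distinct = information_mining_sketch(subsets)
```

In this sketch the subsets are assumed to be given; identifying significant, clean, and homogeneous subsets is itself part of the first step in the framework and is not shown here.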

In a multiple data stream application, a new element for each data sequence, i.e., each time series, is periodically inserted into the database. The data in multiple streams is usually compressed because of storage limitations and is reconstructed at query time. The quality of this reconstruction should be good enough to support general query types, e.g., range and k-NN queries. We present an online technique, PQ-Stream, which provides a high-quality reconstruction. We show that PQ-Stream significantly outperforms current techniques for a wide variety of query types on both synthetic and real data sets. Query interest is not uniformly distributed over all time units; most queries involve the newest few time units. Storage can therefore be assigned to time units according to their rank in query interest. We propose the Ladder Approach to mitigate the stress on storage in multi-stream systems by adding the element of age to the sliding window. The Ladder Approach is shown to be effective in two real streaming applications: weather and stock data.
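The idea of assigning storage by age can be sketched as follows. This is an illustrative assumption, not the Ladder Approach as defined in the dissertation: the function name `ladder_summary`, the parameter `cells_per_rung`, and the choice to double the averaging block at each older rung are all hypothetical.

```python
from typing import List


def ladder_summary(series: List[float], cells_per_rung: int = 4) -> List[float]:
    """Summarize a series newest-first with resolution that decays with age.

    Hypothetical scheme: the newest rung stores `cells_per_rung` raw values,
    the next rung stores `cells_per_rung` averages over blocks of 2 values,
    the next over blocks of 4, and so on.
    """
    summary: List[float] = []
    end = len(series)  # exclusive index of the newest unsummarized value
    block = 1          # number of raw values averaged into one stored cell

    while end > 0:
        # Each rung covers cells_per_rung * block raw values.
        start = max(0, end - cells_per_rung * block)
        i = end
        # Walk the rung from newest to oldest, one block at a time.
        while i > start:
            j = max(start, i - block)
            chunk = series[j:i]
            summary.append(sum(chunk) / len(chunk))
            i = j
        end = start
        block *= 2  # older data is stored at half the resolution

    return summary


# Twelve values, oldest first; the newest four are kept exactly.
stream = [float(v) for v in range(1, 13)]
compact = ladder_summary(stream, cells_per_rung=4)
```

Here twelve raw values are reduced to eight stored cells: the four newest values are kept exactly, and the eight older values are stored as four pairwise averages.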

Indexing (document details)
Advisor: Ferhatosmanoglu, Hakan
Committee: Parthasarathy, Srinivasan; Pearl, Dennis
School: The Ohio State University
Department: Computer and Information Science
School Location: United States -- Ohio
Source: DAI-B 78/11(E), Dissertation Abstracts International
Subjects: Computer science
Keywords: Heterogeneous, Management, Mining, Online
Publication Number: 10630894
ISBN: 978-0-355-01220-0