In this Ph.D. dissertation, we propose database solutions for some of the major challenges in mining and managing time-series data. In particular, we propose a framework for mining heterogeneous time-series data, and a framework for online summarization and analysis of dynamic time-series data.
We propose a general framework, Information Mining, for acquiring information from heterogeneous and potentially high-dimensional time-series data. The framework consists of two major steps. First, significant, clean, and homogeneous subsets of the data are identified and analyzed using a data mining algorithm. Second, the information gathered in the first step is refined by identifying common (or distinct) patterns across the mining results of the subsets. We extend our approach to a class of mining tasks in microarray and clinical trial time-series applications and show that Information Mining is an effective method for mining these datasets.
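The two-step structure described above can be illustrated with a small sketch. This is not the dissertation's algorithm: the function names `mine_subset` and `information_mining`, the assumption that homogeneous subsets are already labeled, and the toy miner (direction of change of each subset's mean series) are all hypothetical stand-ins chosen only to show the shape of the pipeline.

```python
from collections import defaultdict


def mine_subset(series_list):
    """Step 1 stand-in miner (assumption): direction of change of the subset's mean series."""
    length = len(series_list[0])
    mean = [sum(s[t] for s in series_list) / len(series_list) for t in range(length)]
    patterns = set()
    for t in range(1, length):
        delta = mean[t] - mean[t - 1]
        direction = "up" if delta > 0 else "down" if delta < 0 else "flat"
        patterns.add((t, direction))
    return patterns


def information_mining(records):
    """records: iterable of (subset_label, series) pairs with equal-length series.

    Assumes the homogeneous subsets are already identified by their labels.
    """
    # Step 1: group the data into subsets and mine each subset separately.
    subsets = defaultdict(list)
    for label, series in records:
        subsets[label].append(series)
    per_subset = {label: mine_subset(group) for label, group in subsets.items()}

    # Step 2: refine by finding patterns common to all subsets and distinct to each.
    common = set.intersection(*per_subset.values())
    distinct = {label: pats - common for label, pats in per_subset.items()}
    return common, distinct


# Hypothetical toy data: two labeled subsets of short series.
records = [
    ("treated", [1.0, 2.0, 3.0, 2.0]),
    ("treated", [1.0, 3.0, 4.0, 1.0]),
    ("control", [2.0, 2.5, 2.0, 1.0]),
    ("control", [1.0, 1.5, 1.0, 0.5]),
]
common, distinct = information_mining(records)
```

In this toy data both subsets rise at step 1 and fall at step 3, so those patterns are reported as common, while the behavior at step 2 is what distinguishes the two subsets.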
In a multiple data stream application, a new element for each data sequence (i.e., each time series) is periodically inserted into the database. Because of storage limitations, the data in multiple streams is usually compressed and then reconstructed at query time. The reconstruction must be accurate enough to support general query types, e.g., range and k-NN queries. We present an online technique, PQ-Stream, which provides a high-quality reconstruction. We show that PQ-Stream significantly outperforms current techniques for a wide variety of query types on both synthetic and real data sets. Query interest is not uniformly distributed over all time units; most queries involve the newest few time units. Storage can therefore be assigned to time units according to their rank in query interest. We propose the Ladder Approach to reduce the storage burden in multi-stream systems by adding the element of age to the sliding window. The Ladder Approach is shown to be effective in two real streaming applications: weather and stock data.
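The idea of giving newer time units more storage than older ones can be sketched as follows. This is a minimal illustration under stated assumptions, not the Ladder Approach or PQ-Stream as defined in the dissertation: the functions `compress_by_age` and `reconstruct`, the parameters `recent` and `block`, and the use of simple block averaging for older values are all hypothetical.

```python
def compress_by_age(window, recent=4, block=2):
    """Keep the newest `recent` values raw; average older values in blocks of `block`.

    Assumes recent > 0. Block averaging is an illustrative stand-in compressor.
    """
    old, new = window[:-recent], window[-recent:]
    coarse = [
        sum(old[i:i + block]) / len(old[i:i + block])
        for i in range(0, len(old), block)
    ]
    return coarse, new


def reconstruct(coarse, new, old_len, block=2):
    """Approximate the original window by repeating each block average."""
    old = []
    for avg in coarse:
        old.extend([avg] * min(block, old_len - len(old)))
    return old + new


# Hypothetical sliding window of eight readings, oldest first.
window = [10, 12, 11, 13, 20, 21, 19, 22]
coarse, new = compress_by_age(window, recent=4, block=2)
approx = reconstruct(coarse, new, old_len=len(window) - len(new), block=2)
```

Here six stored values stand in for eight readings: the four newest are exact, and the four oldest are approximated by two averages. A query over the newest time units is therefore answered exactly, while older time units trade accuracy for storage.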
|Committee:||Parthasarathy, Srinivasan; Pearl, Dennis|
|School:||The Ohio State University|
|Department:||Computer and Information Science|
|School Location:||United States -- Ohio|
|Source:||DAI-B 78/11(E), Dissertation Abstracts International|
|Keywords:||Heterogeneous, Management, Mining, Online|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved