Hybrid systems for analyzing big data integrate an analytic tool and a dedicated data-management platform, storing and operating on data in both components. While hybrid systems have benefits over alternative architectures, they are effective only when data movement between the two components is minimized. Extant hybrid systems either fail to address performance problems stemming from inter-component data movement or require the user to explicitly reason about and manage data movement. My work presents the design, implementation, and evaluation of a hybrid analytic system for array-structured data that automatically minimizes data movement between the hybrid components.
The proposed research first motivates the need for automatic data-movement minimization in hybrid systems, demonstrating that under workloads whose inputs vary in size, shape, and location, automation is the only practical way to reduce data movement. I then present a prototype hybrid system that automatically minimizes data movement. The exposition highlights salient contributions to the research area: a partial semantic mapping between hybrid components, the adaptation of rewrite-based query transformation techniques to minimize data movement in array-modeled hybrid systems, and an empirical evaluation of the approach's utility. Experimental results not only illustrate the hybrid system's overall effectiveness in minimizing data movement, but also illuminate the contributions made by various elements of the design.
|Committee:||Jones, Mark P., Monsere, Christopher M., Tufte, Kristin|
|School:||Portland State University|
|School Location:||United States -- Oregon|
|Source:||DAI-B 76/05(E), Dissertation Abstracts International|
|Keywords:||Array analytics, Big data, Hybrid systems, Query optimization, R, SciDB|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved