Dissertation/Thesis Abstract

Hybrid database: Dynamic selection of database infrastructure to improve query performance
by Williams, Michael, M.S., California State University, Long Beach, 2016, 27; 10195971
Abstract (Summary)

Distributed file systems have enabled storage and parsing of arbitrarily large datasets, scaling linearly with hardware resources; however, the latency incurred by small queries against large datasets becomes untenable in a production environment. By storing data on both a distributed file system and a traditional relational database, this product will achieve low-latency data service to users while maintaining a complete archive.

The software stack will use the Apache Hadoop Distributed File System for distributed storage. Apache Hive will be used to query the distributed file system. A MySQL database backend will provide the traditional database service. A J2EE web application will serve as the user interface.

The data service expected to return the requested data with the lowest latency will be selected by evaluating each query.
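
The abstract does not state which properties of a query drive the selection, so the following is only a minimal sketch of what query-based routing could look like, not the thesis's actual method. It assumes two illustrative criteria: whether every referenced table is mirrored in the relational store, and whether the query is selective (has a WHERE or LIMIT clause). The function name choose_backend, the table names, and both heuristics are hypothetical.

```python
import re

# Hypothetical: tables assumed to be mirrored in the MySQL store.
MYSQL_TABLES = {"recent_events", "users"}


def choose_backend(sql, mysql_tables=MYSQL_TABLES):
    """Return "mysql" or "hive" for a SQL string (illustrative heuristics only)."""
    text = sql.lower()
    # Collect the table names that follow FROM or JOIN.
    tables = set(re.findall(r"\b(?:from|join)\s+([a-z_][a-z0-9_]*)", text))
    # Assumption: a query counts as "selective" if it has a WHERE or LIMIT clause.
    selective = re.search(r"\b(where|limit)\b", text) is not None
    # Use the relational store only when it holds every table and the query is small.
    if tables and tables <= mysql_tables and selective:
        return "mysql"
    # Everything else falls back to the complete archive, queried through Hive.
    return "hive"


if __name__ == "__main__":
    print(choose_backend("SELECT * FROM users WHERE id = 7"))
    print(choose_backend("SELECT COUNT(*) FROM archive_events"))
```

A production router would more plausibly inspect a parsed query plan or row-count estimates than match text with regular expressions; the string matching here only keeps the sketch self-contained.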

Indexing (document details)
Advisor: Hoffman, Michael
Committee: Aliasgari, Mehrdad; Maples, Tracy
School: California State University, Long Beach
Department: Computer Engineering and Computer Science
School Location: United States -- California
Source: MAI 56/02M(E), Masters Abstracts International
Subjects: Computer science
Keywords: Database infrastructure, Query performance
Publication Number: 10195971
ISBN: 9781369320312
Copyright © 2019 ProQuest LLC. All rights reserved.