Dissertation/Thesis Abstract

Database Streaming Compression on Memory-Limited Machines
by Bruccoleri, Damon, Ph.D., Nova Southeastern University, 2018, 208; 10751913
Abstract (Summary)

Dynamic Huffman compression algorithms operate on data-streams with a bounded symbol list. With these algorithms, the complete list of symbols must be contained in main memory or secondary storage. A horizontal format transaction database that is streaming can have a very large item list. Many nodes tax both the processing hardware primary memory size, and the processing time to dynamically maintain the tree.

This research investigated Huffman compression of a transaction-streaming database with a very large symbol list, where each item in the transaction database schema’s item list is a symbol to compress. The constraint of a large symbol list is, in this research, equivalent to the constraint of a memory-limited machine. A large symbol set will result if each item in a large database item list is a symbol to compress in a database stream. In addition, database streams may have some temporal component spanning months or years. Finally, the horizontal format is the format most suited to a streaming transaction database because the transaction IDs are not known beforehand. This research prototypes an algorithm that will compresses a transaction database stream.

There are several advantages to the memory limited dynamic Huffman algorithm. Dynamic Huffman algorithms are single pass algorithms. In many instances a second pass over the data is not possible, such as with streaming databases. Previous dynamic Huffman algorithms are not memory limited, they are asymptotic to O(n), where n is the number of distinct item IDs. Memory is required to grow to fit the n items. The improvement of the new memory limited Dynamic Huffman algorithm is that it would have an O(k) asymptotic memory requirement; where k is the maximum number of nodes in the Huffman tree, k < n, and k is a user chosen constant. The new memory limited Dynamic Huffman algorithm compresses horizontally encoded transaction databases that do not contain long runs of 0’s or 1’s.

Indexing (document details)
Advisor: Sun, Junping
Commitee: Li, Wei, Raigoza, Jamie
School: Nova Southeastern University
Department: Computer Science (CISC, CISD)
School Location: United States -- Florida
Source: DAI-B 79/08(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer science
Keywords: FPGA compression, High frequency trading, Huffman compression, Minimum delay, Streaming compression
Publication Number: 10751913
ISBN: 9780355803051
Copyright © 2019 ProQuest LLC. All rights reserved. Terms and Conditions Privacy Policy Cookie Policy
ProQuest