Dissertation/Thesis Abstract

High performance methods for frequent pattern mining
by Vu, Lan, Ph.D., University of Colorado at Denver, 2014, 190; 3667246
Abstract (Summary)

The current Big Data era is generating tremendous amounts of data in most fields, such as business, social media, engineering, and medicine. The demand to process and handle the resulting "big data" has led to the need for fast data mining methods with which to develop powerful and versatile analysis tools that can turn data into useful knowledge. Frequent pattern mining (FPM) is an important task in data mining with numerous applications, such as recommendation systems, consumer market analysis, web mining, and network intrusion detection. We develop efficient, high-performance FPM methods for large-scale databases on different computing platforms, including personal computers (PCs), multi-core multi-socket servers, clusters, and graphics processing units (GPUs). At the core of our research is a novel self-adaptive approach that performs efficiently on both sparse and dense databases and outperforms its sequential counterparts. This approach applies multiple mining strategies and dynamically switches among them based on the data characteristics detected at runtime. The research results include two sequential FPM methods (FEM and DFEM) and three parallel ones (ShaFEM, SDFEM, and CGMM). These methods can be used to develop powerful and scalable mining tools for big data analysis. We have tested, analysed, and demonstrated their efficacy on selected representative real databases publicly available at the Frequent Itemset Mining Implementations Repository.
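
The idea of choosing a mining strategy from the measured characteristics of the data can be illustrated with a small sketch. This is not the dissertation's FEM or DFEM algorithm. It is a toy frequent itemset miner, written under stated assumptions, that measures database density once and then picks either a level-wise (Apriori-style) scan or a vertical (Eclat-style) tid-set intersection. The function names, the density measure, and the `dense_threshold` value of 0.5 are all hypothetical, and the sketch decides once up front rather than switching dynamically during mining as the abstract describes.

```python
from collections import defaultdict
from itertools import combinations


def density(transactions):
    """Fraction of (transaction, item) cells that are filled (hypothetical measure)."""
    items = {i for t in transactions for i in t}
    if not transactions or not items:
        return 0.0
    return sum(len(t) for t in transactions) / (len(transactions) * len(items))


def mine_horizontal(transactions, min_support):
    """Apriori-style: level-wise candidates, each counted by scanning the database."""
    counts = defaultdict(int)
    for t in transactions:
        for item in t:
            counts[frozenset([item])] += 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(level)
    k = 2
    while level:
        items = sorted({i for s in level for i in s})
        # Keep a candidate only if all of its (k-1)-subsets are frequent.
        candidates = [
            frozenset(c) for c in combinations(items, k)
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        ]
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {s: c for s, c in counts.items() if c >= min_support}
        result.update(level)
        k += 1
    return result


def mine_vertical(transactions, min_support):
    """Eclat-style: depth-first search over intersections of transaction-id sets."""
    tidsets = defaultdict(set)
    for tid, t in enumerate(transactions):
        for item in t:
            tidsets[item].add(tid)
    result = {}

    def extend(prefix, candidates):
        for idx, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            result[frozenset(itemset)] = len(tids)
            suffix = []
            for other, other_tids in candidates[idx + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_support:
                    suffix.append((other, shared))
            extend(itemset, suffix)

    frequent = [(i, t) for i, t in tidsets.items() if len(t) >= min_support]
    extend((), sorted(frequent, key=lambda pair: pair[0]))
    return result


def mine_adaptive(transactions, min_support, dense_threshold=0.5):
    """Pick a strategy from the measured density; the threshold is an assumption."""
    transactions = [set(t) for t in transactions]
    if density(transactions) >= dense_threshold:
        return "vertical", mine_vertical(transactions, min_support)
    return "horizontal", mine_horizontal(transactions, min_support)
```

Both strategies return the same mapping from itemset to support count, so the choice between them affects only how the work is done, not the result. A call such as `mine_adaptive([{"a", "b"}, {"a", "c"}, {"a", "b", "c"}], min_support=2)` returns the chosen strategy name together with the frequent itemsets.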

Indexing (document details)
Advisor: Alaghband, Gita
Committee: Alaghband, Gita; Altman, Tom; Mannino, Michael; Ra, Ilkyeun; Vu, Tam
School: University of Colorado at Denver
Department: Computer Science
School Location: United States -- Colorado
Source: DAI-B 76/04(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer science
Keywords: Database, Frequent pattern mining, GPGPU, High performance computing, Multi-core cluster, Shared memory multi-core systems
Publication Number: 3667246
ISBN: 9781321412291
Copyright © 2019 ProQuest LLC. All rights reserved.