Dissertation/Thesis Abstract

Architecture and performance of runtime environments for data intensive scalable computing
by Ekanayake, Jaliya, Ph.D., Indiana University, 2010, 164; 3439561
Abstract (Summary)

In a world of data deluge, considerable computational power is necessary to derive knowledge from the mountains of raw data which surround us. This trend mandates the use of various parallelization techniques and runtimes to perform such analyses in a meaningful period of time. The information retrieval community has introduced a programming model and associated runtime architecture under the name of MapReduce, and it has demonstrated its applicability to several major operations performed by and within this community. Our initial research demonstrated that, although the applicability of MapReduce is limited to applications with fairly simple parallel topologies, with a careful set of extensions, the programming model can be extended to support more classes of parallel applications; in particular, this holds true for the class of Composable Applications.

This thesis presents our experiences in identifying a set of extensions for the MapReduce programming model, which expand its applicability to more classes of applications, including iterative MapReduce computations; we have also developed an efficient runtime architecture, named Twister, that supports this new programming model. The thesis also includes a detailed discussion about mapping applications and their algorithms to MapReduce and its extensions, as well as performance analyses of those applications which compare different MapReduce runtimes. The discussions of applications demonstrate the applicability of the Twister runtime for large scale data analyses, while the empirical evaluations prove the scalability and the performance advantages one can gain from using Twister.

Indexing (document details)
Advisor: Fox, Geoffrey
Committee: Gannon, Dennis; Leake, David; Lumsdaine, Andrew
School: Indiana University
Department: Computer Sciences
School Location: United States -- Indiana
Source: DAI-B 72/03, Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Computer science
Keywords: Composable, Data intensive, Distributed computing, MapReduce, Parallel computing, Programming models, Scalable computing
Publication Number: 3439561
ISBN: 9781124447711