In a world of data deluge, considerable computational power is necessary to derive knowledge from the mountains of raw data that surround us. This trend mandates the use of various parallelization techniques and runtimes to perform such analyses in a meaningful period of time. The information retrieval community has introduced a programming model and associated runtime architecture under the name of MapReduce, and it has demonstrated its applicability to several major operations performed by and within this community. Our initial research demonstrated that, although the applicability of MapReduce is limited to applications with fairly simple parallel topologies, a careful set of extensions allows the programming model to support more classes of parallel applications; in particular, this holds true for the class of Composable Applications.
This thesis presents our experiences in identifying a set of extensions for the MapReduce programming model that expand its applicability to more classes of applications, including iterative MapReduce computations; we have also developed an efficient runtime architecture, named Twister, that supports this new programming model. The thesis also includes a detailed discussion of mapping applications and their algorithms to MapReduce and its extensions, as well as performance analyses of those applications that compare different MapReduce runtimes. The discussions of applications demonstrate the applicability of the Twister runtime for large-scale data analyses, while the empirical evaluations show the scalability and the performance advantages one can gain from using Twister.
|Committee:||Gannon, Dennis; Leake, David; Lumsdaine, Andrew|
|School Location:||United States -- Indiana|
|Source:||DAI-B 72/03, Dissertation Abstracts International|
|Keywords:||Composable, Data intensive, Distributed computing, MapReduce, Parallel computing, Programming models, Scalable computing|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved