In the era of big data, many cluster platforms and resource management schemes have been created to meet the growing demand for processing large volumes of data. A big data processing job typically consists of multiple stages, each representing a generally defined data operation such as filtering or sorting. To parallelize job execution in a cluster, each stage comprises a number of identical tasks that can be launched concurrently on multiple servers. Practical clusters often involve hundreds or thousands of servers processing a large batch of jobs. Resource management, which governs cluster resource allocation and job execution, is therefore critical to system performance.
Generally speaking, resource management in these big data processing systems faces three main challenges. First, with pending tasks from many different jobs and stages, it is difficult to determine which tasks should receive resources first, given that tasks differ in characteristics such as resource demand and execution time. Second, dependencies exist among tasks that could otherwise run concurrently: for any two consecutive stages of a job, the output data of the former stage is the input data of the latter. Resource management has to comply with these dependencies. Third, the performance of cluster nodes is inconsistent. In practice, the run-time performance of every server varies over time, so resource management needs to adjust the resource allocation dynamically as each server's performance changes.
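The first two challenges can be made concrete with a minimal sketch. It is an illustration only, not the scheduling scheme developed in the dissertation: the `Task` and `Job` classes, the `pick_tasks` function, the abstract "resource units", and the shortest-estimated-time-first rule are all assumptions introduced here. The sketch shows that only tasks of a job's first unfinished stage are eligible to run, and that the scheduler must then choose among eligible tasks with different demands and execution times.

```python
from dataclasses import dataclass


@dataclass
class Task:
    job: str         # name of the owning job
    stage: int       # index of the stage within its job
    demand: int      # resource units needed (hypothetical unit)
    est_time: float  # estimated execution time in seconds


@dataclass
class Job:
    name: str
    stages: list     # stages[i] is the list of pending Task objects of stage i
    done: int = 0    # number of stages that have fully completed

    def runnable(self):
        """Tasks of the first unfinished stage; later stages wait for its output."""
        if self.done < len(self.stages):
            return list(self.stages[self.done])
        return []


def pick_tasks(jobs, free_units):
    """Greedy rule (an assumption): shortest estimated time first, while resources remain."""
    candidates = sorted(
        (t for j in jobs for t in j.runnable()),
        key=lambda t: (t.est_time, t.demand),
    )
    chosen = []
    for t in candidates:
        if t.demand <= free_units:
            chosen.append(t)
            free_units -= t.demand
    return chosen


jobs = [
    Job("A", [[Task("A", 0, 2, 5.0), Task("A", 0, 2, 5.0)], [Task("A", 1, 1, 1.0)]]),
    Job("B", [[Task("B", 0, 1, 3.0)]]),
]
for t in pick_tasks(jobs, free_units=4):
    print(t.job, t.stage, t.demand)
```

In this example the stage-1 task of job A has the shortest estimated time, yet it is never selected, because stage 0 of job A has not finished producing its input.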
Resource management in existing platforms and prior work often relies on fixed user-specific configurations and assumes consistent performance across nodes. The resulting performance, however, is not satisfactory under varied workloads. This dissertation explores new approaches to improving the efficiency of large-scale big data processing platforms. In particular, run-time dynamic factors are carefully considered when the system allocates resources. New algorithms are developed to collect run-time data and predict the characteristics of jobs and of the cluster. We further develop resource management schemes that dynamically tune the resource allocation for each stage of every running job in the cluster. The findings and techniques in this dissertation are expected to offer useful insights for similar problems in the research community.
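As a rough illustration of what "collecting run-time data and predicting characteristics" can look like, the sketch below tracks each node's observed processing speed with an exponential moving average and derives a proportional share of work from it. The abstract does not specify the prediction algorithms used in the dissertation, so the class `NodeSpeedEstimator`, the smoothing factor `alpha`, and the proportional-share rule are assumptions chosen for illustration.

```python
class NodeSpeedEstimator:
    """Smoothed per-node speed estimate built from completed-task observations."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # weight of the newest sample (assumed value)
        self.speed = {}     # node name -> smoothed speed in work units per second

    def observe(self, node, work, seconds):
        """Record that `node` finished `work` units in `seconds` seconds."""
        sample = work / seconds
        old = self.speed.get(node)
        if old is None:
            self.speed[node] = sample
        else:
            self.speed[node] = self.alpha * sample + (1 - self.alpha) * old

    def shares(self):
        """Fraction of new work each node would receive, proportional to its speed."""
        total = sum(self.speed.values())
        return {node: s / total for node, s in self.speed.items()}


est = NodeSpeedEstimator(alpha=0.5)
est.observe("n1", work=10, seconds=1)  # n1 starts fast
est.observe("n2", work=10, seconds=2)
est.observe("n1", work=10, seconds=2)  # n1 slows down; its estimate drops
print(est.shares())
```

A fixed configuration would keep giving both nodes the same share regardless of these observations; an estimate that is updated at run time lets the allocation follow each server's current performance.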
|Committee:||Ding, Wei; Sheng, Bo; Simovici, Dan; Zhang, Honggang|
|School:||University of Massachusetts Boston|
|School Location:||United States -- Massachusetts|
|Source:||DAI-B 78/10(E), Dissertation Abstracts International|
|Keywords:||Big data, Performance, Scheduling, System|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved