Many innovative scientific applications rely on high-performance computing to perform large computations. Tomosynthesis mammography, which performs high-resolution image reconstruction, is one such computation-intensive application. Currently, it is difficult to launch MPI applications on multiple distributed and heterogeneous computing platforms, or computing grids, because application developers and users must address problems such as application scheduling, resource allocation and co-allocation, and inter-process communication. Our objective is to provide location-, topology-, and administrative-transparent grid computing for MPI applications, while hiding the physical details of computing platforms and heterogeneous networks from application developers and users.
In this dissertation, we introduced a resource allocation model, workflow structures to specify MPI applications involving multiple tasks, and message relay to enable communication across different networks. We developed the SGR framework, which integrates workflow scheduling, task grouping, and message relay services, while hiding resource allocation, heterogeneous networks, and decentralized resource management systems from application developers and users. The SGR system has been implemented on a Globus-enabled computing grid.
To investigate the effectiveness of our resource allocation model and framework design, we created a simulation environment for a computing grid and task schedulers. The simulation results show that the new dynamic task duplication approach allows simple task scheduling algorithms to achieve performance similar to that of a more sophisticated scheduling algorithm that relies on accurate predictions of queuing times and on job preemption. Simple task schedulers using two duplicated requests obtain a performance improvement of over 40%.
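The idea behind request duplication can be illustrated with a minimal sketch. The abstract does not spell out the mechanism, so this assumes that duplication means submitting the same request to more than one cluster queue, running the task wherever a request starts first, and cancelling the rest. The `Cluster` type, the `queue_wait` field, and `run_with_duplicates` are hypothetical names chosen for illustration and are not part of SGR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    queue_wait: float  # hypothetical: seconds until a new request would start


def run_with_duplicates(runtime, clusters, copies):
    """Submit one request to each of the first `copies` clusters.

    Assumption: the request that leaves its queue first runs the task,
    and the remaining duplicates are cancelled. Returns the winning
    cluster name, the completion time, and the cancelled cluster names.
    """
    if copies < 1 or copies > len(clusters):
        raise ValueError("copies must be between 1 and len(clusters)")
    targets = clusters[:copies]
    # The scheduler does not need to predict queue_wait; the shortest
    # queue simply wins the race.
    winner = min(targets, key=lambda c: c.queue_wait)
    cancelled = [c.name for c in targets if c is not winner]
    return winner.name, winner.queue_wait + runtime, cancelled


clusters = [Cluster("A", 300.0), Cluster("B", 120.0)]
single = run_with_duplicates(60.0, clusters, copies=1)
double = run_with_duplicates(60.0, clusters, copies=2)
```

In this toy setting, a scheduler that blindly picks the first cluster waits in the longer queue, while the duplicated request finishes on whichever cluster becomes available first, without any queuing-time prediction.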
We tested our SGR framework by conducting detailed experiments on a two-cluster grid and observed that duplication can improve performance by more than 15%. These results validate our model. Moreover, we experimentally evaluated our new message relay service for cross-site message passing. The test results indicate that although SGR's message relay service incurs some communication overhead, the system is scalable with respect to the number of processes and the message size.
|Committee:||Cooperman, Gene; Kaeli, David|
|Department:||Electrical and Computer Engineering|
|School Location:||United States -- Massachusetts|
|Source:||DAI-B 72/04, Dissertation Abstracts International|
|Subjects:||Computer Engineering, Computer science|
|Keywords:||Adaptive computing, Distributed computing, Grid computing, Message passing, Parallel computing, Resource allocation, Workflow execution|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved