Scientific workflows are common and powerful tools for scaling small-scale analyses up to large-scale distributed computation. They are easy for domain scientists to use because they run applications as they are, partitioning the data for concurrency instead of the application. However, many of these workflows are written in a way that couples the scientific intent with the specifics of the execution environment. This coupling limits the flexibility and portability of the workflow, requiring it to be re-engineered for each new dataset or site.
I propose that workflows can be written for pure scientific intent, with the idiosyncrasies of execution resolved at runtime using workflow abstractions. These abstractions would allow workflows to be quickly transformed to handle new datasets, diverse sites, and different configurations. I examine three methods for developing workflow abstractions on static workflows, apply these methods to a dynamic workflow, and propose an approach that separates the user from the distributed environment.
In developing these methods for static workflows, I first explore Dynamic Work-Flow Expansion, which allows workflows to be quickly adapted for new and diverse datasets. I then describe an algorithm for statically determining a workflow's storage needs, which is used at runtime to prevent storage deadlocks. Finally, I develop an algebra for transforming workflows, which isolates site- and configuration-specific designs so they can be applied to workflows as needed. These methods are combined and applied to a dynamic workflow, adapting a site-bound MPI application to a dynamic cloud workflow.
I combine these methods and formulate the Continuously Divisible Jobs abstraction to separate the domain scientist's application from the distributed logic of a dynamic workflow. This abstraction defines an API that applications can implement to allow for dynamic distributed computation, showcasing the flexibility and portability provided by workflow abstractions.
|School:||University of Notre Dame|
|School Location:||United States -- Indiana|
|Source:||DAI-B 82/3(E), Dissertation Abstracts International|
|Keywords:||Scientific workflows, Static workflow systems, Distributed computation|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved