Abstract
Grid workflows are emerging as a practical programming model
for solving large e-science problems on the Grid. However, it is
typically assumed that workflow components either exchange data
through conventional files, which are copied from one
execution stage to the next, or are tightly coupled using IPC
libraries such as MPI or distributed streaming. More flexible
communication can be achieved by overloading conventional
READ and WRITE operations with advanced IO mechanisms
such as sockets, streams and pipes, as is done in the GriddLeS
environment. Such flexibility allows temporally dependent
components to be pipelined or, conversely, tightly coupled
computations to be delayed, depending on current resource availability and
network connectivity. However, it also makes the workflow harder
to schedule, because the communication mode may not be decided
until run time. In this paper, we propose a new scheduling model
that leverages such communication flexibility and allows us to
generate dynamic runtime schedules. The scheduler in this case
not only allocates components to distributed Grid resources, but
also specifies the inter-component communication mechanism
(socket, pipe, etc.). The current model is implemented as a dynamic
workflow scheduling tool called GridRod, which harnesses
Nimrod/G's [1] Grid services and GriddLeS [2] web services.
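
As an illustration of the idea that a schedule fixes both the placement of each component and the communication mechanism of each edge, the following minimal sketch may help. It is not the GridRod algorithm: the names Placement, choose_mechanism and schedule, the host names, and the rule used to pick a mechanism are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Hypothetical record: where a component runs and when it can start."""
    component: str
    host: str
    start: int  # assumed time slot at which the chosen host becomes free


def choose_mechanism(producer: Placement, consumer: Placement) -> str:
    """Pick a communication mode for one producer -> consumer edge.

    Assumed rule, for illustration only:
    - components that cannot run concurrently exchange data through a file;
    - concurrent components on the same host use a pipe;
    - concurrent components on different hosts use a socket.
    """
    if producer.start != consumer.start:
        return "file"
    if producer.host == consumer.host:
        return "pipe"
    return "socket"


def schedule(edges, free_at, assignment):
    """Return the communication mechanism chosen for every workflow edge."""
    placements = {
        comp: Placement(comp, host, free_at[host])
        for comp, host in assignment.items()
    }
    return {
        (src, dst): choose_mechanism(placements[src], placements[dst])
        for src, dst in edges
    }


# Hypothetical four-stage workflow and resource state.
edges = [("model", "postproc"), ("postproc", "viz"), ("viz", "archive")]
assignment = {"model": "hostA", "postproc": "hostA",
              "viz": "hostB", "archive": "hostC"}
free_at = {"hostA": 0, "hostB": 0, "hostC": 5}

plan = schedule(edges, free_at, assignment)
for edge, mechanism in plan.items():
    print(edge, mechanism)
```

In this toy setting the first two edges are pipelined, because their endpoints can run at the same time, while the last edge falls back to a file because hostC is not yet available. Re-running schedule with updated free_at values yields a different plan, which is the sense in which the communication mode is decided at run time.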