Data Intensive Workflows
STATUS
The content on this page is outdated. The page is archived for reference only. For more information about current work, please contact the Actor Workflow Design Group.
Overview
This document is intended for Kepler developers. It is a DRAFT DESIGN DOCUMENT and does not reflect functionality as it currently exists in Kepler. Comments and feedback are appreciated.
Notes
May 25, 2005 (Bertram, Anne, Shawn)
- Use cases: workflows whose author is not concerned with how data is transferred, versus workflows designed explicitly for data transfer
- "Smart" channels and receivers
- Mail analogy:
- Bertram puts a letter to Shawn in his mailbox
- It takes too long to get to Shawn
- So he places a special stamp on the letter/mailbox, saying "express" or "FedEx", etc.
- Input and Output of Actor Receiver
- No notion of "place" or location
- Combine execution of actor and transport, as a constraint, where constraint satisfaction
- Mail analogy:
>| A |> -----(channel c)----- >| B |>
- A has a parameter that defines the host or "environment" (an environment condition)
- Receiver is the implementation entity that does the communication for a port
- Port is more conceptual; receiver more implementation
- Associate a similar constraint or parameter with the port, e.g., say that "SRB Put" and "GridFTP" are okay, but don't use "XYZ"
- Why: the TSI workflow, for example, does a lot of explicit data movement, and you want to have some control over it. Ultimately, we want the system to take over planning and shuffling. Before we automate it, we want to let users specify it by hand: if we can't do it ourselves, a machine won't be able to either
- Put this in the port/receiver instead of in the actors
- Channels have to have a "state" -- e.g., for retry, and so on
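
The port constraint and the channel "state" noted above can be sketched roughly as follows. This is only an illustration in Python, not existing Kepler or Ptolemy II functionality: PortConstraint, Channel, the transport callables, and max_attempts are all hypothetical names invented here. The sketch assumes a port lists the transports it permits in preference order, and that the channel records its attempts so it can retry.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set


@dataclass
class PortConstraint:
    """Hypothetical per-port constraint: which transports may be used."""
    allowed: List[str]                       # permitted transports, in preference order
    forbidden: Set[str] = field(default_factory=set)

    def candidates(self, available) -> List[str]:
        # Keep only transports that are permitted, not forbidden, and actually available.
        return [t for t in self.allowed if t in available and t not in self.forbidden]


@dataclass
class Channel:
    """Hypothetical channel that keeps state (attempt count, history) for retries."""
    constraint: PortConstraint
    max_attempts: int = 3
    attempts: int = 0
    delivered_via: Optional[str] = None
    history: List[str] = field(default_factory=list)

    def send(self, token, transports: Dict[str, Callable[[object], bool]]) -> bool:
        """Cycle through the permitted transports until one succeeds or attempts run out."""
        self.attempts = 0
        names = self.constraint.candidates(transports)
        while names and self.attempts < self.max_attempts:
            name = names[self.attempts % len(names)]
            self.attempts += 1
            ok = transports[name](token)
            self.history.append(f"{name}:{'ok' if ok else 'failed'}")
            if ok:
                self.delivered_via = name
                return True
        return False


# Stand-in transports; real ones would move data between hosts.
def srb_put(token) -> bool:
    return False      # simulate a failed transfer


def gridftp(token) -> bool:
    return True


def xyz(token) -> bool:
    return True


constraint = PortConstraint(allowed=["SRB Put", "GridFTP"], forbidden={"XYZ"})
channel = Channel(constraint)
delivered = channel.send("data", {"XYZ": xyz, "SRB Put": srb_put, "GridFTP": gridftp})

# A channel whose only available transport is forbidden never attempts delivery.
blocked = Channel(constraint)
blocked_result = blocked.send("data", {"XYZ": xyz})
```

In this sketch the actors A and B never see the transport choice: the constraint lives with the port, and the retry bookkeeping lives with the channel, in line with the notes above.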