KAR Saving Process
This page explains how the KAR save process works.
The save process for a KAR:
1) The user initiates the save through a GUI action (preferably a subclass of ExportArchiveAction), or some code performs the save programmatically. Let's call this the "Save Context". For example:
- a right click on an actor on the canvas
- a user selecting "File->Save Archive" from the main menu
- a right click on a set of WorkflowRuns in the Workflow Run Manager
- a program on a server saving the results to a KAR file at the end of a workflow execution
2) The Save Context adds some NamedObjs to the SaveKAR object. Let's call the objects in this list the "Save Initiator Object List". Only ComponentEntities are allowed in the "Save Initiator Object List".
3) The SaveKAR object then calls the save method of every KAREntryHandler that the different modules have registered in the system, passing the "Save Initiator Object List" to each save method.
4) Each KAREntryHandler generates a list of KAREntries that should be saved in the KAR file, based on the "Save Initiator Object List" it received and on any information generated in the Save Context (e.g. a user-specified selection of WorkflowRuns).
5) The SaveKAR object then builds and saves the KAR. It includes all of the KAREntries that were returned by the KAREntryHandlers and nothing more. After the save has succeeded, it also adds all the KAREntries to the Cache (provided the KAR was saved in a local repository).
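Steps 2 through 5 can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the `Entry`, `EntryHandler`, and `SaveKar` types are simplified, hypothetical stand-ins, not the actual Kepler classes or method signatures, and the Cache update in step 5 is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KarSaveSketch {
    /** Hypothetical stand-in for a KAREntry: an entry name plus its content. */
    static final class Entry {
        final String name;
        final String content;

        Entry(String name, String content) {
            this.name = name;
            this.content = content;
        }
    }

    /** Hypothetical stand-in for a KAREntryHandler registered by a module. */
    interface EntryHandler {
        List<Entry> save(List<String> saveInitiatorObjects);
    }

    /** Hypothetical stand-in for the core SaveKAR object. */
    static final class SaveKar {
        private final List<String> saveInitiatorObjects = new ArrayList<>();
        private final List<EntryHandler> handlers;

        SaveKar(List<EntryHandler> handlers) {
            this.handlers = handlers;
        }

        // Step 2: the Save Context adds objects to the initiator list.
        void addSaveInitiator(String componentEntityName) {
            saveInitiatorObjects.add(componentEntityName);
        }

        // Steps 3-5: ask every handler for entries and keep exactly those entries.
        Map<String, String> buildKar() {
            Map<String, String> kar = new LinkedHashMap<>();
            for (EntryHandler handler : handlers) {
                // Each handler receives its own copy of the initiator list.
                for (Entry entry : handler.save(new ArrayList<>(saveInitiatorObjects))) {
                    kar.put(entry.name, entry.content);
                }
            }
            return kar;
        }
    }

    /** Two made-up handlers from two "modules" contribute to one KAR. */
    static Map<String, String> demo() {
        EntryHandler workflowHandler = objects -> {
            List<Entry> entries = new ArrayList<>();
            for (String name : objects) {
                entries.add(new Entry(name + ".xml", "workflow MoML"));
            }
            return entries;
        };
        EntryHandler reportHandler = objects -> {
            List<Entry> entries = new ArrayList<>();
            for (String name : objects) {
                entries.add(new Entry(name + "_ROML.xml", "report layout"));
            }
            return entries;
        };
        SaveKar saveKar = new SaveKar(Arrays.asList(workflowHandler, reportHandler));
        saveKar.addSaveInitiator("example");
        return saveKar.buildKar();
    }

    public static void main(String[] args) {
        System.out.println(demo().keySet());
    }
}
```

The point of the sketch is that the Save Context only names the initiator objects; what ends up in the KAR is decided entirely by the handlers.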
Extra Notes:
The "Save Context" and the "EntryHandler" exist in whatever module defines them; only the SaveKAR object is in the core. By going through all the handlers in this way, many modules can contribute objects to the KAR without knowledge of what the other modules are doing. This is why we do not simply add objects from a particular Save Context to a list for inclusion in the KAR: doing so would exclude any entries that other modules should contribute.
An example: The Save Initiator Object List has ComponentEntities in it. On the first pass, these are passed to the save methods of all the KAREntryHandlers. Call the KAREntries returned by this first pass the "Pass 1" KAREntries. On the second pass, the "Pass 1" KAREntries are passed into the KAREntryHandler save methods, which return another set of KAREntries that we'll call the "Pass 2" KAREntries. Here we are walking down the dependency chain: the first pass had ComponentEntities as the input and returned any objects that were dependent on those ComponentEntities, for example the WorkflowRuns; the second pass had the WorkflowRuns as the input and might return the ROML and RIOs associated with the WorkflowRuns. This iterative process would go on until the KAREntryHandlers were not returning any more KAREntries and all of the dependencies had been added to the KAR.
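The multi-pass walk described above amounts to a loop that feeds each pass's results into the next pass until nothing new comes back. A minimal Java sketch, assuming a hypothetical `DependencyHandler` interface and plain strings in place of ComponentEntities and KAREntries (none of these names come from Kepler itself):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class KarDependencyPassSketch {
    /** Hypothetical handler: given one pass's objects, return the objects that depend on them. */
    interface DependencyHandler {
        List<String> dependentsOf(List<String> inputs);
    }

    /** Returns the entries found on each pass: index 0 is "Pass 1", index 1 is "Pass 2", and so on. */
    static List<List<String>> collectPasses(List<String> saveInitiatorObjects,
                                            List<DependencyHandler> handlers) {
        List<List<String>> passes = new ArrayList<>();
        // Track everything already seen so a dependency cycle cannot loop forever.
        Set<String> seen = new HashSet<>(saveInitiatorObjects);
        List<String> current = new ArrayList<>(saveInitiatorObjects);
        while (!current.isEmpty()) {
            List<String> next = new ArrayList<>();
            for (DependencyHandler handler : handlers) {
                for (String entry : handler.dependentsOf(current)) {
                    if (seen.add(entry)) {
                        next.add(entry);
                    }
                }
            }
            if (!next.isEmpty()) {
                passes.add(next);
            }
            // The entries found on this pass are the input of the next pass.
            current = next;
        }
        return passes;
    }

    /** Builds a handler from a fixed lookup table; illustration data only. */
    static DependencyHandler tableHandler(Map<String, List<String>> table) {
        return inputs -> {
            List<String> out = new ArrayList<>();
            for (String input : inputs) {
                out.addAll(table.getOrDefault(input, Collections.<String>emptyList()));
            }
            return out;
        };
    }

    static List<List<String>> demo() {
        Map<String, List<String>> runs = new HashMap<>();
        runs.put("workflow", Arrays.asList("run-1", "run-2"));
        Map<String, List<String>> reports = new HashMap<>();
        reports.put("run-1", Arrays.asList("ROML-1", "RIO-1"));
        return collectPasses(Arrays.asList("workflow"),
                Arrays.asList(tableHandler(runs), tableHandler(reports)));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [[run-1, run-2], [ROML-1, RIO-1]]
    }
}
```

In the demo, pass 1 turns the workflow into its two runs, pass 2 turns the runs into a report layout and a report item, and pass 3 returns nothing, which ends the loop.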
For KAR uploading to Metacat:
A new metadata file will be generated for KARs to be uploaded to Metacat. This file will contain the KAR manifest information in wrapper XML around the metadata for each KAR entry.
Here is an example of what the schema might look like:
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.kepler-project.org/kar"
xmlns="http://www.kepler-project.org/kar">
<xs:element name="kar">
<xs:complexType>
<xs:sequence>
Here is an example of what a file might look like:
<kar> <mainAttributes> <Manifest-Version value="1.4.2" /> <KAR-Version value="" /> <module-dependencies value="" /> <lsid value="urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:14:1" /> </mainAttributes> <karEntry> <karEntryAttributes> <Name value="example.urn.lsid.gamma.msi.ucsb.edu.OpenAuth..3700.8.9.xml" /> <dependsOn value="" /> <type value="ptolemy.actor.TypedCompositeActor" /> <dependsOnModule value="provenance;core" /> <handler value="org.kepler.kar.handlers.ActorMetadataKAREntryHandler" /> <lsid value="urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:8:9" /> </karEntryAttributes> <karEntryXml> <entity name="example" class="org.kepler.moml.CompositeClassEntity"> <property name="entityId" value="urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:8:9" class="org.kepler.moml.NamedObjId" /> <property name="class" value="ptolemy.actor.TypedCompositeActor" class="ptolemy.kernel.util.StringAttribute"> <property name="id" value="null" class="ptolemy.kernel.util.StringAttribute" /> </property> <property name="derivedFrom" class="org.kepler.moml.NamedObjIdReferralList"> </property> <property name="TOP Provenance Recorder" class="org.kepler.provenance.ProvenanceRecorder"> </property> <property name="module-dependencies" class="ptolemy.kernel.util.StringAttribute" value="provenance;core"> </property> <property name="Reporting Listener" class="org.kepler.module.reporting.ReportingListener"> </property> <property name="SDFDirector" class="ptolemy.domains.sdf.kernel.SDFDirector"> <property name="iterations" class="ptolemy.data.expr.Parameter" value="2"> </property> <property name="vectorizationFactor" class="ptolemy.data.expr.Parameter" value="1"> </property> <property name="allowDisconnectedGraphs" class="ptolemy.data.expr.Parameter" value="false"> </property> <property name="allowRateChanges" class="ptolemy.data.expr.Parameter" value="false"> </property> <property name="constrainBufferSizes" class="ptolemy.data.expr.Parameter" value="true"> </property> <property name="period" 
class="ptolemy.data.expr.Parameter" value="0.0"> </property> <property name="synchronizeToRealTime" class="ptolemy.data.expr.Parameter" value="false"> </property> <property name="timeResolution" class="ptolemy.actor.parameters.SharedParameter" value="1E-10"> </property> <property name="Scheduler" class="ptolemy.domains.sdf.kernel.SDFScheduler"> <property name="constrainBufferSizes" class="ptolemy.data.expr.Parameter" value="constrainBufferSizes"> </property> </property> <property name="KeplerDocumentation" class="ptolemy.vergil.basic.KeplerDocumentationAttribute"> <property name="description" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>null</configure> </property> <property name="author" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Steve Neuendorffer</configure> </property> <property name="version" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>null</configure> </property> <property name="userLevelDocumentation" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure> <p>The SDF Director is often used to oversee fairly simple, sequential workflows in which the director can determine the order of actor invocation from the workflow. Types of workflows that would run well under an SDF Director include processing and reformatting tabular data, converting one data type to another, and reading and plotting a series of data points. A workflow in which an image is read, processed (rotated, scaled, clipped, filtered, etc.), and then displayed, is also an example of a sequential workflow that requires a director simply to ensure that each actor fires in the proper order (i.e., that each actor executes only after it receives its required inputs).</p> <p>The SDF Director is very efficient and will not tax system resources with overhead. However, this efficiency requires that certain conditions be met, namely that the data consumption and production rate of each actor in an SDF workflow be constant and declared. 
If an actor reads one piece of data and calculates and outputs a single result, it must always read and output a single token of data. This data rate cannot change during workflow execution and, in general, workflows that require dynamic scheduling and/or flow control cannot use this director. Additionally, the SDF Director has no understanding of passing time (at least by default), and actors that depend on a notion of time may not work as expected. For example, a TimedPlotter actor will plot all values at time zero when used in SDF. </p> <p>By default, the SDF Director requires that all actors in its workflow be connected. Otherwise, the director cannot account for concurrency between disconnected workflow parts. Usually, a PN Director should be used for workflows that contain disconnected actors; however, the SDF Director's allowDisconnectedGraphs parameter may also be set to true. The SDF Director will then schedule each disconnected "island" independently. The director cannot infer the sequential relationship between disconnected actors (i.e., nothing forces the director to finish executing all actors on one island before firing actors on another). However, the order of execution within each island should be correct. Usually, disconnected graphs in an SDF model indicate an error.</p> <p>Because SDF Directors schedule actors to fire only after they receive their inputs, workflows that require loops (feeding an actor's output back into its input port for further processing) can cause "deadlock" errors. The deadlock errors occur because the actor depends on its own output value as an initial input. To fix this problem, use a SampleDelay actor to generate and inject an initial input value into the workflow.</p> <p>The SDF Director determines the order in which actors execute and how many times each actor needs to be fired to complete a single iteration of the workflow. This schedule is calculated BEFORE the director begins to iterate the workflow. 
Because the SDF Director calculates a schedule in advance, it is quite efficient. However, SDF workflows must be static. In other words, the same number of tokens must be consumed/produced at every iteration of the workflow. Workflows that require dynamic control structures, such as a BooleanSwitch actor that sends output on one of two ports depending on the value of a 'control', cannot be used with an SDF Director because the number of tokens on each output can change for each execution.</p> <p>Unless otherwise specified, the SDF Director assumes that each actor consumes and produces exactly one token per channel on each firing. Actors that do not follow the one-token-per-channel firing convention (e.g., Repeat or Ramp) must declare the number of tokens they produce or consume via the appropriate parameters. </p> <p>The number of times a workflow is iterated is controlled by the director's iterations parameter. By default, this parameter is set to "0". Note that "0" does not mean "no iterations." Rather, "0" means that the workflow will iterate forever. Values greater than zero specify the actual number of times the director should execute the entire workflow. A value of 1, meaning that the director will run the workflow once, is often the best setting when building an SDF workflow. </p> <p>The amount of data processed by an SDF workflow is a function of both the number of times the workflow iterates and the value of the director's vectorizationFactor parameter. The vectorizationFactor is used to increase the efficiency of a workflow by increasing the number of times actors fire each time the workflow iterates. If the parameter is set to a positive integer (other than 1), the director will fire each actor the specified number of times more than normal. The default is 1, indicating that no vectorization should be performed. 
Keep in mind that changing the vectorizationFactor parameter changes the meaning of a nested SDF workflow and may cause deadlock in a workflow that uses it. </p> <p>The SDF Director has several advanced parameters that are generally only relevant when an SDF workflow contains composite components. In most cases the period, timeResolution, synchronizeToRealTime, allowRateChanges, timeResolution, and constrainBufferSizes parameters can be left at their default values.</p> <p>For more information about the SDF Director, see the Ptolemy documentation (http://ptolemy.eecs.berkeley.edu/papers/05/ptIIdesign3-domains/ptIIdesign3-domains.pdf).</p> </configure> </property> <property name="prop:allowDisconnectedGraphs" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Specify whether to allow disconnected actors in the workflow (by default, all actors are required to be connected). If disconnected actors are permitted, the SDF Director will schedule each disconnected 'island' independently. Nothing "forces" the director to finish executing all actors on one island before firing actors on another. However, the order of execution within each island should be correct. Usually, disconnected graphs in an SDF workflow indicate an error.</configure> </property> <property name="prop:allowRateChanges" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Specify whether dynamic rate changes are permitted or not. By default, rate changes are not permitted, and the director will perform a check to disallow such workflows. If the parameter is selected, then workflows that require rate parameters to be modified during execution are valid, and the SDF Director will dynamically compute a new schedule at runtime. This is an advanced parameter that can usually be left at its default value.</configure> </property> <property name="prop:timeResolution" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>The time precision used by this director. 
All time values are rounded to the nearest multiple of this number. The value is a double that defaults to "1E-10" (which is 10-10). This is an advanced parameter that can usually be left at its default value.</configure> </property> <property name="prop:constrainBufferSizes" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Specify whether buffer sizes are fixed. By default, buffers are fixed, and attempts to write to the buffer that cause the buffer to exceed its scheduled size result in an error. This is an advanced parameter that can usually be left at its default value.</configure> </property> <property name="prop:iterations" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Specify the number of times a workflow is iterated. By default, this parameter is set to "0". Note that "0" does not mean "no iterations." Rather, "0" means that the workflow will iterate forever. Values greater than zero specify the actual number of times the director should execute the entire workflow. A value of 1, meaning that the director will run the workflow once, is often the best setting when building an SDF workflow. </configure> </property> <property name="prop:vectorizationFactor" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>The vectorizationFactor is used to increase the efficiency of a workflow by increasing the number of times actors fire each time the workflow iterates. If the parameter is set to a positive integer (other than 1), the director will fire each actor the specified number of times more than normal. The default is 1, indicating that no vectorization should be performed. Keep in mind that changing the vectorizationFactor parameter changes the meaning of a nested SDF workflow and may cause deadlock in a workflow that uses it. </configure> </property> <property name="prop:synchronizeToRealTime" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Specify whether the execution should synchronize to real time or not. 
By default, the director does not synchronize to real time. If synchronize is selected, the director will only process the workflow when elapsed real time matches the product of the period parameter and the iteration count. Note: if the period parameter has a value of 0.0 (the default), then selecting this parameter has no effect. This is an advanced parameter that can usually be left at its default value. </configure> </property> <property name="prop:period" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>The time period of each iteration. The value is a double that defaults to 0.0, which means that the director does not increment workflow time. If the value greater than 0.0, the actor will increment workflow time each time it fires. This is an advanced parameter that can usually be left at its default value. </configure> </property> </property> <property name="entityId" class="org.kepler.moml.NamedObjId" value="urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:11:1"> </property> <property name="class" class="ptolemy.kernel.util.StringAttribute" value="ptolemy.domains.sdf.kernel.SDFDirector"> <property name="id" class="ptolemy.kernel.util.StringAttribute" value="urn:lsid:kepler-project.org:directorclass:1:1"> </property> </property> <property name="semanticType00" class="org.kepler.sms.SemanticType" value="urn:lsid:localhost:onto:1:1#Director"> </property> <property name="semanticType11" class="org.kepler.sms.SemanticType" value="urn:lsid:localhost:onto:2:1#Director"> </property> <property name="_location" class="ptolemy.kernel.util.Location" value="{135, 75}"> </property> <property name="derivedFrom" class="org.kepler.moml.NamedObjIdReferralList" value="urn:lsid:kepler-project.org:director:1:1"> </property> </property> <property name="semanticType" class="org.kepler.sms.SemanticType" value="urn:lsid:localhost:onto:2:1#FileSystem"> </property> <relation name="relation" class="ptolemy.actor.TypedIORelation"> </relation> <entity name="StringConstant" 
class="ptolemy.actor.lib.StringConst"> <property name="firingCountLimit" class="ptolemy.data.expr.Parameter" value="NONE"> </property> <property name="NONE" class="ptolemy.data.expr.Parameter" value="0"> </property> <property name="value" class="ptolemy.data.expr.Parameter" value="Hello"> </property> <property name="KeplerDocumentation" class="ptolemy.vergil.basic.KeplerDocumentationAttribute"> <property name="description" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>null</configure> </property> <property name="author" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Edward Lee</configure> </property> <property name="version" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>null</configure> </property> <property name="userLevelDocumentation" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure><p>The StringConstant actor outputs a string specified via the actor's value parameter.</p> <p>Specifying strings with the StringConstant actor is convenient, as the actor does not require that strings be surrounded by quotes. The actor is often used to specify file paths, which can be selected using the Browse button available in the actor's parameters.</p> <p>Specified string values can include references to parameters within scope (i.e., parameters defined at the same level of the hierarchy or higher). </p> <p>NOTE: If using a PN Director, the 'firingCountLimit' parameter is often set to a finite integer (e.g. '1') so that the workflow will terminate. </p> </configure> </property> <property name="port:output" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>An output port that broadcasts a string constant specified by the value parameter. </configure> </property> <property name="port:trigger" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>A multiport that has no declared type (in other words, the port can accept any data type: double, int, array, etc.) 
If the port is connected, the actor will not fire until the trigger port receives an input token. Connecting the port is optional, but useful when scheduling the actor to perform at a certain time. </configure> </property> <property name="prop:firingCountLimit" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>The limit on the number of times the actor will fire. The default value is 'NONE', meaning there is no limit on the number of time the constant will be provided to the output port. Any integer can be provided as a value for this parameter. </configure> </property> <property name="prop:value" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>The value produced by the actor. Specified strings do not require enclosing quotes. (To include a '$' sign in the string, enter '$$'.)</configure> </property> </property> <property name="entityId" class="org.kepler.moml.NamedObjId" value="urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:10:1"> </property> <property name="class" class="ptolemy.kernel.util.StringAttribute" value="ptolemy.actor.lib.StringConst"> <property name="id" class="ptolemy.kernel.util.StringAttribute" value="urn:lsid:kepler-project.org:class:1052:1"> </property> </property> <property name="semanticType00" class="org.kepler.sms.SemanticType" value="urn:lsid:localhost:onto:1:1#StringFunctionActor"> </property> <property name="semanticType11" class="org.kepler.sms.SemanticType" value="urn:lsid:localhost:onto:2:1#Constant"> </property> <property name="_icon" class="ptolemy.vergil.icon.BoxedValueIcon"> <property name="attributeName" class="ptolemy.kernel.util.StringAttribute" value="value"> </property> <property name="displayWidth" class="ptolemy.data.expr.Parameter" value="60"> </property> </property> <property name="_location" class="ptolemy.kernel.util.Location" value="{100, 225}"> </property> <property name="derivedFrom" class="org.kepler.moml.NamedObjIdReferralList" value="urn:lsid:kepler-project.org:actor:204:1"> </property> 
</entity> <entity name="Display" class="ptolemy.actor.lib.gui.Display"> <property name="_windowProperties" class="ptolemy.actor.gui.WindowPropertiesAttribute" value="{bounds={478, 357, 484, 186}, maximized=false}"> </property> <property name="_paneSize" class="ptolemy.actor.gui.SizeAttribute" value="[484, 164]"> </property> <property name="rowsDisplayed" class="ptolemy.data.expr.Parameter" value="10"> </property> <property name="columnsDisplayed" class="ptolemy.data.expr.Parameter" value="40"> </property> <property name="suppressBlankLines" class="ptolemy.data.expr.Parameter" value="false"> </property> <property name="title" class="ptolemy.data.expr.StringParameter" value=""> </property> <property name="KeplerDocumentation" class="ptolemy.vergil.basic.KeplerDocumentationAttribute"> <property name="description" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>null</configure> </property> <property name="author" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Yuhong Xiong, Edward A. Lee</configure> </property> <property name="version" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>null</configure> </property> <property name="userLevelDocumentation" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure><p>The Display actor reads tokens of any type via its input multiport, and displays each token on a separate line in a text display window.</p> <p>Specify the size of the text display window with the rowsDisplayed and columnsDisplayed parameters. Simply resizing the window onscreen does not persistently change the size when the workflow is saved, closed, and then re-opened. </p> <p>If the input is a string token, then the actor strips the surrounding quotation marks before displaying the value.</p> <p>Select the suppressBlankLines parameter to specify that the actor not add blank lines to the display. By default, the actor will add blank lines.</p> <p>Note: this actor can consume large amounts of memory. 
It is not advisable to use it to display large output streams.</p></configure> </property> <property name="port:input" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>A multiport that accepts tokens of any type. </configure> </property> <property name="prop:suppressBlankLines" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>Specify whether the actor should display blank lines (the default) or suppress them.</configure> </property> <property name="prop:rowsDisplayed" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>The vertical size of the display, in rows. The value is an integer that defaults to 10.</configure> </property> <property name="prop:columnsDisplayed" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>The horizontal size of the display, in columns. The value is an integer that defaults to 40.</configure> </property> <property name="prop:title" class="ptolemy.kernel.util.ConfigurableAttribute"> <configure>The title of the text display window. 
If specified, the value will appear in the title bar of the text display window.</configure> </property> </property> <property name="entityId" class="org.kepler.moml.NamedObjId" value="urn:lsid:kepler-project.org:actor:7:1"> </property> <property name="class" class="ptolemy.kernel.util.StringAttribute" value="ptolemy.actor.lib.gui.Display"> <property name="id" class="ptolemy.kernel.util.StringAttribute" value="urn:lsid:kepler-project.org:class:883:1"> </property> </property> <property name="semanticType00" class="org.kepler.sms.SemanticType" value="urn:lsid:localhost:onto:1:1#TextualOutputActor"> </property> <property name="semanticType11" class="org.kepler.sms.SemanticType" value="urn:lsid:localhost:onto:2:1#TextualOutput"> </property> <property name="_location" class="ptolemy.kernel.util.Location" value="[345.0, 145.0]"> </property> </entity> <link port="StringConstant.output" relation="relation" /> <link port="Display.input" relation="relation" /> </entity> </karEntryXml> </karEntry> <karEntry> <karEntryAttributes> <Name value="example_ROML.xml" /> <dependsOn value="urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:8:9" /> <type value="org.kepler.reporting.roml.ReportLayout" /> <dependsOnModule value="reporting" /> <handler value="org.kepler.kar.handlers.ReportLayoutKAREntryHandler" /> <lsid value="urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:9:1" /> </karEntryAttributes> <karEntryXml> <report> <title>A report</title> <headerGraphic /> <lsid>urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:9:1</lsid> <workflow>urn:lsid:gamma.msi.ucsb.edu/OpenAuth/:3700:8:9</workflow> <item name=".Unnamed1.Display.input" type="" class="org.kepler.reporting.rio.DynamicReportItem"> <properties> <property name="ALIGNMENT" /> <property name="SCALE" /> </properties> <label position="north">Fantastic</label> <label position="south" /> <label position="west" /> <label position="east" /> </item> </report> </karEntryXml> </karEntry> </kar>
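A consumer of such a metadata file could read the manifest information with the standard Java DOM API. The sketch below assumes the element layout of the example above (`karEntryAttributes` containing `Name` and `handler` elements with a `value` attribute); the class and method names are hypothetical, and the sample string is a heavily trimmed version of the example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class KarMetadataReaderSketch {
    /** Trimmed-down sample in the shape of the example file. */
    static final String SAMPLE =
        "<kar><karEntry><karEntryAttributes>"
        + "<Name value=\"example_ROML.xml\" />"
        + "<handler value=\"org.kepler.kar.handlers.ReportLayoutKAREntryHandler\" />"
        + "</karEntryAttributes><karEntryXml><report /></karEntryXml></karEntry></kar>";

    /** Maps each entry name to the handler class named in its karEntryAttributes. */
    static Map<String, String> handlersByEntryName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new LinkedHashMap<>();
            NodeList attributeBlocks = doc.getElementsByTagName("karEntryAttributes");
            for (int i = 0; i < attributeBlocks.getLength(); i++) {
                Element block = (Element) attributeBlocks.item(i);
                result.put(valueOf(block, "Name"), valueOf(block, "handler"));
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse KAR metadata XML", e);
        }
    }

    /** Reads the value attribute of the first descendant element with the given tag name. */
    private static String valueOf(Element parent, String tag) {
        NodeList nodes = parent.getElementsByTagName(tag);
        return nodes.getLength() == 0 ? "" : ((Element) nodes.item(0)).getAttribute("value");
    }

    public static void main(String[] args) {
        System.out.println(handlersByEntryName(SAMPLE));
    }
}
```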