FAQ
This page has been designed to answer the questions most commonly asked about Kepler. If you have additional questions that aren't answered here, please post them to the kepler-users mailing list with as much detail as possible.
General
- What is the Kepler Collaboration?
- How is Kepler funded?
- What are the driving applications for Kepler?
- What is the relationship between Kepler and Ptolemy?
- Is there a web-based execution system for Kepler?
- Are there any Kepler mailing lists?
- Are there any other FAQs on Kepler and/or Ptolemy?
- How can I report bugs?
Downloading and Installing Kepler
- How can I get Kepler?
- What platforms does Kepler support?
- What are the system requirements for Kepler?
- Should I install Kepler with R or without R?
- What happens to my saved components when I reinstall Kepler?
Getting Started
- How do I start Kepler?
- Are there any tutorials for Kepler?
- Is there any user documentation for Kepler?
Common Usage Questions
- Can I execute a Kepler workflow from a command line without a graphical display?
- What is a director?
- Which director should I use with my workflow?
- What is the difference between the PN and SDF directors?
- How do I branch a workflow?
- Can I import Ptolemy actors into Kepler?
- Is there a way to use multi-line expressions in the Expression actor?
- Is it possible to put double quotes in a Constant actor with a value that already begins and ends with quotes?
- Is it possible to create a simple (or complicated) workflow and use it later as an actor to build another workflow?
- Do you have any tips for using Kepler in a portal environment?
- The BooleanSwitch actor is not working. Why?
- Are there GIS actors in Kepler?
- What is the Kepler repository and how can I use it?
- Why do I have problems running some GARP workflows on the Macintosh?
- What about the Python Script actor?
General
What is Kepler?
Kepler is a software application for analyzing and modeling scientific data. Using Kepler's graphical interface and components, scientists with little background in computer science can create executable models, called "scientific workflows," for flexibly accessing scientific data (streaming sensor data, medical and satellite images, simulation output, observational data, etc.) and executing complex analyses on this data.
Kepler is developed by a cross-project collaboration led by the Kepler/CORE team. The software builds upon the mature Ptolemy II framework, developed at the University of California, Berkeley. Ptolemy II is a software framework designed for modeling, design, and simulation of concurrent, real-time, embedded systems.
What is the Kepler Collaboration?
The Kepler collaboration is an ongoing effort led by the NSF-funded Kepler/CORE team, and spanning several of the key institutions that started the Kepler project: UC Davis, UC Santa Barbara, and UC San Diego. The team is dedicated to refining, releasing, and supporting the Kepler software, as well as to enhancing the attributes and functions of the system most important for wide-scale adoption and long-term sustainability.
Current and past contributing members include collaborators from the following projects:
- SEEK: Science Environment for Ecological Knowledge
- SDM Center/SPA: SDM Center/Scientific Process Automation
- Ptolemy II: Heterogeneous Modeling and Design
- GEON: Cyberinfrastructure for the Geosciences
- ROADNet: Real-time Observatories, Applications, and Data Management Network
- EOL: Encyclopedia of Life
- Resurgence
- CIPRes: CyberInfrastructure for Phylogenetic Research
- More projects are listed on the Projects page...
How is Kepler funded?
Kepler is supported by the National Science Foundation under awards 0225676 for SEEK, 0225673 (AWSFL008-DS3) for GEON, 0619060 for REAP, 0722079 for Kepler/CORE, 1062565 for bioKepler, 0941692 for DISCOSci, and 1331615 for WIFIRE; by the National Institutes of Health P41 GM103426 for NBCR and R25 GM114821 for BBDTC; by the Department of Energy under Contract No. DE-FC02-01ER25486 for SciDAC/SDM and DE-SC0012630 for IPPD; and by the Gordon and Betty Moore Foundation award to Calit2 at UCSD for CAMERA.
The list of Ptolemy sponsors can be found here.
What are the driving applications for Kepler?
Kepler is designed to support research scientists in numerous domains, including bioinformatics, ecoinformatics, geoinformatics, and others. For examples of Kepler applied to various domains, please take a look at Chapter 9 of the Kepler User Manual.
What is the relationship between Kepler and Ptolemy?
Kepler builds upon the mature Ptolemy II framework, developed at the University of California, Berkeley. Ptolemy II is a Java-based component assembly framework with a graphical user interface called Vergil. The Ptolemy project focuses on modeling, designing, and simulating concurrent, real-time, embedded systems. For more information about Ptolemy, please see the Ptolemy II FAQ.
Kepler inherits modeling and design capabilities from Ptolemy, including the Vergil GUI and workflow scheduling and execution capabilities. Kepler also inherits from Ptolemy the actor-oriented modeling paradigm that separates workflow components ("actors") from the overall workflow orchestration (conducted by "directors"), making components more easily reusable. Through the actor-oriented and hierarchical modeling features built into Ptolemy, Kepler scientific workflows can operate at very different levels of granularity, from low-level "plumbing workflows" (that explicitly move data around, start and monitor remote jobs, for example) to high-level "conceptual workflows" that interlink complex, domain-specific data analysis steps.
Kepler extensions to Ptolemy include an ever increasing number of components aimed particularly at scientific applications, e.g., for remote data and metadata access, data transformations, data analysis, interfacing with legacy applications, Web service invocation and deployment, provenance tracking, etc. Target application areas include bioinformatics, cheminformatics, ecoinformatics, and geoinformatics workflows, among others.
Is there a web-based execution system for Kepler?
Yes, there are a few efforts in this area:
Hydrant provides the means for users to deploy their workflows to the web.
SciencePipes is a free service that lets users connect to real biodiversity data, use simple tools to create visualizations and feeds, and embed results on their own web sites or blogs.
Other web execution systems are also under development. For details, please visit the Web User Interface Interest Group page.
Are there any Kepler mailing lists?
Kepler has three mailing lists, one for users, one for developers, and one for announcements. Please see the Contact page for details.
Are there any other FAQs on Kepler and/or Ptolemy?
Yes. There is also a Developer FAQ in the Kepler Developer area of the website. You can find the Ptolemy FAQ here.
How can I report bugs?
The Kepler project maintains a bug base. If you come across a bug in the software, or have an idea for a new feature or enhancement, please report it to Kepler's Redmine bug base. To report a bug, you must first register yourself by requesting a Redmine account.
Downloading and Installing Kepler
How can I get Kepler?
You can download the latest installer from the downloads page of this website.
For the more adventurous who want to work with the very latest version, the Kepler nightly build has all the latest features and bug fixes, but is also essentially unchecked and may have assorted new problems. The nightly build is automatically built each night and is available as a zipped file in the Developers area of the website. We recommend that you check the status of the nightly build before downloading and running it.
Note: The nightly Kepler build comes as a large zip file (~100 MB). Microsoft's built-in zip extractor has trouble extracting large archives: the process takes a very long time, if it works at all. If you are using a Windows platform, try using WinZip or another archive application to extract the files.
What platforms does Kepler support?
Kepler can run on Mac, Windows, and Linux systems. See the downloads page for more information.
What are the system requirements for Kepler?
Kepler is a large application that has substantial hardware requirements. These include 512 MB of RAM (1 GB or more recommended), at least 300 MB of disk space, and at least a 2 GHz CPU. Kepler runs on modern Windows, Macintosh (OS X), and Linux systems using Java 1.8 or greater.
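As a rough way to check the Java requirement above, the following Python sketch parses the kind of version banner that `java -version` prints and reports whether it is 1.8 or later. The helper names (`java_major_version`, `meets_kepler_requirement`, `installed_java_version_text`) are hypothetical and not part of Kepler, and the parsing assumes the two common banner styles: legacy `1.x` numbering and the `x.y.z` numbering used since Java 9.

```python
import re
import subprocess


def java_major_version(version_text):
    """Return the major Java version found in `java -version` style text, or None.

    Legacy releases report "1.8.0_281" (major version 8); Java 9 and later
    report "11.0.2", "17", and so on.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_text)
    if match is None:
        return None
    first = int(match.group(1))
    second = match.group(2)
    if first == 1 and second is not None:
        return int(second)  # "1.8" means Java 8
    return first


def meets_kepler_requirement(version_text, minimum=8):
    """True if the reported Java version is at least `minimum` (8 == Java 1.8)."""
    major = java_major_version(version_text)
    return major is not None and major >= minimum


def installed_java_version_text():
    """Run `java -version`; the JVM writes its version banner to stderr."""
    result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    return result.stderr


if __name__ == "__main__":
    try:
        text = installed_java_version_text()
    except FileNotFoundError:
        print("java was not found on PATH")
    else:
        print("Java OK for Kepler:", meets_kepler_requirement(text))
```

This only checks the Java version; memory, disk, and CPU requirements still need to be checked separately.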
Under Windows, when I double click on the installer jar file, the installer does not start. What do I do?
Under Windows, double-clicking the Kepler installer jar file should start the Kepler installer using Java. It should not unzip the file.
If it unzips, then see http://www.wikihow.com/Run-a-.Jar-Java-File or
http://stackoverflow.com/questions/394616/running-jar-file-in-windows
See also Bug 5550 - Windows installer should be a .exe, not a .jar
Should I install Kepler with R or without R?
R, a language and environment for statistical computing, is required for some Kepler functionality, and we recommend that you install it. See: http://www.r-project.org
Please note that you will likely have to append the path to R's bin directory to your PATH environment variable. On Windows this is accessible from System Properties > Advanced > Environment Variables.
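To illustrate what "appending to PATH" means, here is a minimal Python sketch that adds a directory to a PATH-style string only if it is not already listed. The function name `append_to_path` and the R location used in the demo are hypothetical; substitute the bin directory of your own R installation.

```python
import os


def append_to_path(path_value, new_dir, sep=os.pathsep):
    """Return `path_value` with `new_dir` appended, unless it is already listed."""
    entries = [entry for entry in path_value.split(sep) if entry]
    if new_dir not in entries:
        entries.append(new_dir)
    return sep.join(entries)


if __name__ == "__main__":
    # Hypothetical R location; replace with the bin directory of your R install.
    r_bin = "/usr/local/lib/R/bin"
    os.environ["PATH"] = append_to_path(os.environ.get("PATH", ""), r_bin)
    print(os.environ["PATH"])
```

Setting `os.environ["PATH"]` affects only the current process and the programs it launches. For a permanent change, edit the variable through your operating system's settings as described above.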
What happens to my saved components when I reinstall Kepler?
Components and workflows stored in the Kepler cache (~/.kepler/) are not preserved when the application is reinstalled. Please back up your files before reinstalling Kepler.
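One way to make such a backup is sketched below in Python: it copies a directory into a timestamped folder. The function name `back_up_directory` and the `~/kepler-backups` target are hypothetical choices for illustration; only the `~/.kepler` location comes from the answer above.

```python
import shutil
import time
from pathlib import Path


def back_up_directory(source, backup_root):
    """Copy `source` into a timestamped folder under `backup_root`; return the copy's path."""
    source = Path(source)
    if not source.is_dir():
        raise FileNotFoundError(f"nothing to back up at {source}")
    stamp = time.strftime("%Y%m%d-%H%M%S")
    # Drop the leading dot so the backup folder is not hidden.
    destination = Path(backup_root) / f"{source.name.lstrip('.')}-backup-{stamp}"
    shutil.copytree(source, destination)
    return destination


if __name__ == "__main__":
    # ~/.kepler is the cache named above; ~/kepler-backups is a hypothetical target.
    copy = back_up_directory(Path.home() / ".kepler", Path.home() / "kepler-backups")
    print("Backed up to", copy)
```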
Getting Started
How do I start Kepler?
To start Kepler, follow the instructions for your platform:
Windows and Macintosh Platforms
To start Kepler on a PC, double-click the Kepler shortcut icon on the desktop. Kepler can also be started from the Start menu. Navigate to Start menu > All Programs, and select "Kepler" to start the application. On a Mac, the Kepler icon is created under Applications/Kepler. The icon can be dragged and dropped to the desktop or the dock if desired. For more information, see the Getting Started Guide.
Linux Platform
To start Kepler on a Linux machine, use the following steps:
1. Open a shell window.
2. Navigate to the directory in which Kepler is installed. To change the directory, use the cd command (e.g., cd directory_name).
3. Type ./kepler.sh to run the application.
On some Linux systems, a shell can be opened by right clicking anywhere on the desktop and selecting "Open Terminal". Speak to your system administrator if you need information about your system.
Once the main Kepler application window opens, you can access and run sample scientific workflows and create your own custom scientific workflows. Each time you open an existing workflow or create a new workflow, a new application window will open. Multiple windows allow you to work on several workflows simultaneously and compare, copy, and paste components between workflows. For more information, see the Getting Started Guide.
Are there any tutorials for Kepler?
Yes! The Getting Started Guide introduces the main components and functionality of Kepler, and contains step-by-step instructions for using, modifying, and creating your own scientific workflows. The Guide provides a brief introduction to the application interface as well as to application-specific terminology and concepts.
Is there any user documentation for Kepler?
The Kepler Getting Started Guide, Actor Documentation, and User Manual ship with the Kepler 2.1 release. These documents are also available online.
Common Usage Questions
Can I execute a Kepler workflow from a command line without a graphical display?
Yes. See Executing Kepler from the Command Line for more details.
What is a director?
Kepler uses a director/actor metaphor to visually represent the various components of a workflow. A director controls (or directs) the execution of a workflow, just as a film director oversees a cast and crew. The actors take their execution instructions from the director. In other words, actors specify what processing occurs while the director specifies when it occurs.
Every workflow must have a director that controls the execution of the workflow using a particular model of computation. For example, workflow execution can be synchronous, with processing occurring one component at a time in a pre-calculated sequence (SDF Director). Alternatively, workflow components can execute in parallel, with one or more components running simultaneously (which might be the case with a PN Director). For more information about directors in Kepler, please see section 3.2.1 of the Kepler User Manual.
Which director should I use with my workflow?
Each of the directors packaged with Kepler (Synchronous Dataflow (SDF), Process Networks (PN), Dynamic Dataflow (DDF), Continuous Time (CT), and Discrete Events (DE)) has a unique way of instructing the actors in a workflow. Which director to use under what circumstances is a question that should be answered during the initial stages of workflow design. As you sketch out the workflow steps and think about the types of processes the workflow will perform, keep the following questions in mind:
- Does the workflow depend on time? (If yes, you'll likely use a CT or DE director.)
- Does the workflow require multiple threads or distributed execution? (PN)
- Does it perform a simple data transformation with constant data production and consumption rates? (SDF)
- Is the model described by differential equations? (CT)
The answers to these questions will often indicate the best director to use. For more information about each director and when to use it, please see section 5.2 of the Kepler User Manual.
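The design questions above can be read as a simple checklist. The Python sketch below encodes them as a heuristic. It is an illustration only: `suggest_directors` is a hypothetical helper, not a Kepler feature, and real workflows may call for a director that this checklist does not suggest.

```python
def suggest_directors(time_dependent=False, differential_equations=False,
                      needs_parallelism=False, constant_rates=False):
    """Map the design questions to candidate directors (a heuristic, not a rule)."""
    candidates = []
    if differential_equations:
        candidates.append("CT")          # models described by differential equations
    if time_dependent:
        for name in ("CT", "DE"):        # time-dependent workflows: CT or DE
            if name not in candidates:
                candidates.append(name)
    if needs_parallelism:
        candidates.append("PN")          # multiple threads or distributed execution
    if constant_rates:
        candidates.append("SDF")         # constant production and consumption rates
    return candidates


if __name__ == "__main__":
    print(suggest_directors(constant_rates=True))
    print(suggest_directors(time_dependent=True, needs_parallelism=True))
```

An empty result means none of the four questions was answered "yes", in which case the User Manual section cited above is the place to look.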
What is the difference between the PN and SDF directors?
The SDF Director executes a single actor at a time with one thread of execution. The SDF Director is very efficient and will not tax system resources with overhead. It achieves this efficiency by precalculating the schedule for actor execution. However, this efficiency requires that certain conditions be met, namely that the data consumption and production rate of each actor in an SDF workflow be constant and declared.
Under a PN Director, every actor gets an execution thread and the director does not statically calculate firing schedules. The workflow is driven by data availability: tokens are created on output ports whenever input tokens are available and output can be calculated. Because PN workflows are very loosely coupled, they are natural candidates for managing workflows that require parallel processing on distributed computing systems.
For more information about these and other Kepler directors, please see section 5.2 of the Kepler User Manual.
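The contrast between the two models can be sketched in plain Python. This is not Kepler or Ptolemy code: the three "actors" (a source, a doubler, and a collector), the function names, and the `_DONE` end-of-stream marker are all invented for illustration. The first function fires actors one at a time from a schedule computed in advance, as an SDF-style director would. The second gives every actor its own thread and lets data availability drive execution, as in a PN-style workflow.

```python
import queue
import threading


def run_sdf_style(values):
    """Single thread; the firing order is fixed before execution starts."""
    results = []
    channel = []  # simple buffer between actors
    # Each actor produces/consumes exactly one token per firing, so the
    # schedule can be computed up front.
    schedule = ["source", "double", "collect"] * len(values)
    pending = iter(values)
    for actor in schedule:
        if actor == "source":
            channel.append(next(pending))        # produces one token
        elif actor == "double":
            channel.append(channel.pop() * 2)    # consumes one, produces one
        else:
            results.append(channel.pop())        # consumes one token
    return results


_DONE = object()  # hypothetical end-of-stream marker


def run_pn_style(values):
    """One thread per actor; each blocks on its input queue until a token arrives."""
    to_double, to_collect = queue.Queue(), queue.Queue()
    results = []

    def source():
        for value in values:
            to_double.put(value)
        to_double.put(_DONE)

    def double():
        for token in iter(to_double.get, _DONE):
            to_collect.put(token * 2)
        to_collect.put(_DONE)

    def collect():
        for token in iter(to_collect.get, _DONE):
            results.append(token)

    threads = [threading.Thread(target=actor) for actor in (source, double, collect)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    print(run_sdf_style([1, 2, 3]))
    print(run_pn_style([1, 2, 3]))
```

Both functions compute the same result. The difference is in how execution is organized: a precomputed sequence in one thread versus independent threads synchronized only by the data passing between them.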
How do I branch a workflow?
The DDF Director is often used for workflows that require looping or branching or other control structures, but that do not require parallel processing (in which case a PN Director should be used). Use a BooleanSwitch actor under a DDF Director to direct tokens to either an "if" or an "else" output port. For an example workflow and more information, please see section 5.2.5 of the Kepler User Manual.
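The routing behavior described above can be pictured with a short Python sketch. It is a conceptual model only, not the actor's implementation: `boolean_switch` is a hypothetical function, and the two returned lists stand in for the "if" and "else" output ports.

```python
def boolean_switch(tokens, controls):
    """Route each token to the "if" branch when its control value is true, else to "else"."""
    if_branch, else_branch = [], []
    for token, control in zip(tokens, controls):
        (if_branch if control else else_branch).append(token)
    return if_branch, else_branch


if __name__ == "__main__":
    taken, skipped = boolean_switch([10, 20, 30, 40], [True, False, False, True])
    print("if:", taken)
    print("else:", skipped)
```

Note that how many tokens reach each branch depends on the control values seen at run time, so the number of tokens produced on a given output is not fixed in advance.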
Can I import Ptolemy actors into Kepler?
Yes. Use Kepler's "Tools/Instantiate Component" menu item to import actors into Kepler. For more information, please see section 5.3.2 of the Kepler User Manual.
Is there a way to use multi-line expressions in the Expression actor?
In order to use a multi-line expression, you must change the type of widget that is used to set the expression. To do this, double click the Expression actor, which brings up the "Edit Parameters" dialog. Click the "Preferences" Button. Under "Edit preferences for Expression", go to the "expression" field and change the value from "line" to "text". Then hit OK to close the "Edit preferences for Expression" dialog and save your changes. The "Edit Parameters" dialog will now have a text area instead of a line.
Is it possible to put double quotes in a Constant actor with a value that already begins and ends with quotes?
Yes. Use backslashes to escape the internal quotes in the value of the Constant actor:
"my_binary --name \"toto and titi\" "
Note that the values of StringParameters, which are Parameters in 'String Mode', do not have to be surrounded by double quotes.
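If you generate such values programmatically, the escaping rule can be sketched in Python as follows. The helper `quote_for_constant` is hypothetical, and it handles only embedded double quotes, as in the example above; other special characters are not considered.

```python
def quote_for_constant(text):
    """Wrap `text` in double quotes, backslash-escaping any double quotes inside it."""
    return '"' + text.replace('"', '\\"') + '"'


if __name__ == "__main__":
    # Prints the same escaped value as the example above.
    print(quote_for_constant('my_binary --name "toto and titi" '))
```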
Is it possible to create a simple (or complicated) workflow and use it later as an actor to build another workflow?
Yes. Composite actors allow you to encapsulate a workflow and use it as a sub-workflow. For more information, please see section 3.2.3 of the Kepler User Manual.
Do you have any tips for using Kepler in a portal environment?
The Hydrant project has implemented a very good portal environment in which to run Kepler workflows. Actors that would normally display graphical results locally are replaced automatically by Hydrant before execution.
The BooleanSwitch actor is not working. Why?
The data consumption and production rates of the BooleanSwitch actor are not constant. The SDF Director requires constant, declared rates, so the actor will not work in an SDF workflow. To solve this problem, use a DDF Director instead. For an example of a workflow that uses the BooleanSwitch actor, please see section 5.2.5 of the Kepler User Manual.
Are there GIS actors in Kepler?
Yes. The standard Kepler library contains a number of GIS actors used to capture, manage, analyze, and display all forms of geographically referenced information. For more information, please see section 8.5 of the Kepler User Manual.
What is the Kepler repository and how can I use it?
The Kepler Repository contains all of the more than 350 actors shipped with Kepler, as well as additional components shared by users. The repository allows users to upload and download workflow components to and from a centralized server. Placing components in the repository allows them to be searched and re-used easily.
New components can be added to the repository and shared via the "Upload to Repository..." actor menu item. Once components are uploaded to the Kepler repository, they can be searched and located by checking the "Search repository" option in the components area. Components stored remotely in the repository can be dragged onto the Workflow canvas for use just like a local component. For more information about the repository, please see Section 4.5.3 of the Kepler User Manual.
Why do I have problems running some GARP workflows on the Macintosh?
If you build Kepler on a Macintosh directly from CVS using Ant, you may need to move one of the dynamic link libraries so that the Mac can find it. You need to create the ~/lib/ directory if it does not exist and copy the file 'libexpat.1.dylib' from $KEPLER/lib/ to this new '~/lib/' directory. Note that the installer should do this automatically.
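The manual step described above (create `~/lib/` if needed, then copy the library into it) could be scripted along these lines. This is a sketch under assumptions: `install_expat_library` is a hypothetical helper, and the demo assumes a `KEPLER` environment variable that points at your Kepler build directory.

```python
import os
import shutil
from pathlib import Path


def install_expat_library(kepler_home, lib_dir=None, name="libexpat.1.dylib"):
    """Copy `name` from <kepler_home>/lib into `lib_dir` (default ~/lib), creating it if needed."""
    source = Path(kepler_home) / "lib" / name
    target_dir = Path(lib_dir) if lib_dir is not None else Path.home() / "lib"
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target_dir / name))


if __name__ == "__main__":
    # Assumes KEPLER is set to the directory containing Kepler's lib/ folder.
    print(install_expat_library(os.environ["KEPLER"]))
```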
What about the Python Script actor?
See Python and Kepler for details.