bioKepler 1.2 released
We are pleased to announce the release of bioKepler 1.2.
The bioKepler suite facilitates rapid development and scalable distributed execution of bioinformatics workflows in Kepler while simplifying access to a wide range of bioinformatics tools, whether they run locally or in a distributed environment. bioKepler contains a set of Kepler actors, called “bioActors”, which are specialized for running bioinformatics tools, along with Kepler directors for distributed data-parallel (DDP) execution on Big Data platforms such as Hadoop and Spark.
bioKepler 1.2 includes the following new features:
- New bioActor and demo workflow for BLAST+
- New molecular dynamics demo workflows: AmberEquilibration, AmberHeating, AmberMinimization, and AmberProduction
- New Machine Learning actors and demos:
  - KMeans-All with support for R, MLlib, KNIME, and Mahout
  - MLlib actors: RandomForestModel, SVMApply, SVMModel, KMeansModel, RemoveNulls, CreatedLabeledPoint, CreateVectorRDD
- Update to Spark 1.5.0
bioKepler 1.2 can be installed using the Module Manager in Kepler 2.5. If you have an older version of Kepler, we recommend that you first download and install Kepler 2.5, and then use the Module Manager to install bioKepler 1.2. Kepler 2.5 can be downloaded from the Kepler project website. bioKepler is built on top of the DDP, provenance, reporting, and workflow run manager modules, and does not require a separate installation of these modules.
The bioKepler website (http://www.biokepler.org) provides more information about the project, including the bioKepler User Guide and descriptions of the bioActors and demo workflows.
We are building virtual machine images with bioKepler 1.2 for Amazon EC2 and OpenStack. These images are based on CloudBioLinux and include many bioinformatics tools. A separate announcement will be sent when the virtual machine images are completed.