[Still to write: What is an Ondex workflow? Mention it's open source.]
It is assumed that the reader of this document possesses appropriate prior knowledge about writing Java 1.6 programs, as well as the usage of the tools Subversion and Maven2.
To follow the steps described in this manual it is required to have an installation of Sun JDK1.6, the latest version of Subversion and Maven2 ready on a computer.
The following steps are only required until the Ondex workflow engine is deployed to the Maven repository at Rothamsted Research.
svn co https://ondex.svn.sourceforge.net/svnroot/ondex/trunk/ondex_parent ondex_parent
mvn install
Your local Maven repository now contains all required Ondex libraries.
pom file. Specify the following artifact as the pom's parent:
net.sourceforge.ondex:modules:0.0.1-SNAPSHOT (type pom)
:src/main/java)
src/main/java:
net.sourceforge.ondex
As a general rule one can make use of any external library by defining it as a dependency in the pom file. During compile time all dependencies will be automatically downloaded and merged into the resulting plug-in jar file.
If a required library is already used by the Ondex workflow engine, it is not necessary to merge it into the jar file. In this case, simply set its dependency scope to provided. Here is a list of all libraries that are already in use:
There are several different types of workflow components in Ondex:
For more information about the different workflow components, please see the plugin documentation, which starts on page 73 of the Ondex user guide.
[Still to do: Add class diagram.]
All workflow components have some important common technical aspects.
A component's type and ID are defined by its fully qualified class name. This is accomplished by obeying the following naming convention:
net.sourceforge.ondex.<lctype>.<id>.<Uctype>
where <lctype> represents the component's type name in lower case, <Uctype> the same type name with a capitalized first letter, and <id> the component's identifier.
For example: The class net.sourceforge.ondex.parser.foobar.Parser is a parser component identified as 'foobar'.
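The naming convention can be checked mechanically. The following is a minimal sketch, not part of the Ondex API: the class ComponentName and its method parse are hypothetical names introduced here for illustration. It splits a fully qualified class name into component type and ID, and rejects names whose simple class name is not the capitalized type.

```java
import java.util.Locale;

// Hypothetical helper (not an Ondex class): decodes the naming convention
// net.sourceforge.ondex.<lctype>.<id>.<Uctype>
final class ComponentName {
    private static final String PREFIX = "net.sourceforge.ondex.";

    final String type; // e.g. "parser"
    final String id;   // e.g. "foobar"

    private ComponentName(String type, String id) {
        this.type = type;
        this.id = id;
    }

    static ComponentName parse(String className) {
        if (!className.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not below " + PREFIX + ": " + className);
        }
        String[] parts = className.substring(PREFIX.length()).split("\\.");
        if (parts.length != 3 || parts[0].isEmpty() || parts[1].isEmpty()) {
            throw new IllegalArgumentException("Expected <lctype>.<id>.<Uctype>: " + className);
        }
        String lcType = parts[0];
        // The simple class name must be the type name with a capital first letter.
        String expected = lcType.substring(0, 1).toUpperCase(Locale.ROOT) + lcType.substring(1);
        if (!expected.equals(parts[2])) {
            throw new IllegalArgumentException("Class should be named " + expected + ": " + className);
        }
        return new ComponentName(lcType, parts[1]);
    }

    public static void main(String[] args) {
        ComponentName c = parse("net.sourceforge.ondex.parser.foobar.Parser");
        // prints: parser component with ID 'foobar'
        System.out.println(c.type + " component with ID '" + c.id + "'");
    }
}
```

A name such as net.sourceforge.ondex.parser.foobar.Mapping is rejected by this sketch, because the class name does not match the type package.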
In future these restrictive definitions will become obsolete, as it is planned to introduce OSGi technology to solve this issue.
Each workflow component features two methods called String getName() and String getVersion().
The getName() method allows you to state a full name for your component.
For example: 'PSI Molecular Interaction Format Parser' would be the name for the Parser with ID psimi.
In the getVersion() method you can specify a version tag, usually the date of the last update.
All workflow components can require arguments. It is up to the developer whether a component requires arguments and, if so, how many. Argument requirements are defined using objects of the type 'ArgumentDefinition', which will be explained in the next section.
Ondex's workflow API provides a set of many different argument definition types, all of which can be found in the package net.sourceforge.ondex.args in the artifact net.sourceforge.ondex:workflow-api.
If none of these argument definitions suits an argument you require, it is possible to write a new one. Create the package net.sourceforge.ondex.args in your project and create a new class that implements the interface net.sourceforge.ondex.args.ArgumentDefinition.
Each of these argument definitions can be configured with certain properties:
Example:
String argname = "Query";
String argdesc = "The query string";
ArgumentDefinition<?>[] args = new ArgumentDefinition<?>[] {
    // name, description, required, defaultValue, multipleInstancesAllowed
    new StringArgumentDefinition(argname, argdesc, true, "SELECT *", false)
};
Each workflow component features a method called ArgumentDefinition<?>[] getArgumentDefinitions()
which can be used to return a set of definitions like the one above. If
no arguments are required, simply let it return an empty array like
this:
return new ArgumentDefinition<?>[0];
To access the arguments that were specified by the user, each workflow component possesses a method getArguments(), which returns an object of the type net.sourceforge.ondex.AbstractArguments.
This object can be queried for the arguments' values. Unique arguments (multipleInstancesAllowed = false) can be accessed using the method Object getUniqueValue(String name). Lists of values for non-unique arguments are returned by the method List<Object> getObjectValueList(String name). The parameter 'name' refers to the name that was given to the corresponding ArgumentDefinition. Depending on the type of workflow component, the argument access field can have different names. These names are introduced in the respective sections below.
Example:
String query = (String) getArguments().getUniqueValue("Query");
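To illustrate the difference between unique and non-unique arguments, here is a minimal, self-contained sketch. SimpleArguments is a hypothetical stand-in written for this example; it only mimics the two accessor methods described above and is not the real net.sourceforge.ondex.AbstractArguments class, whose behaviour for missing arguments may differ. The argument name "Exclude" is likewise made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the arguments object (assumption, not Ondex code).
final class SimpleArguments {
    private final Map<String, List<Object>> values = new HashMap<String, List<Object>>();

    // Stores one more value under the given argument name.
    void addValue(String name, Object value) {
        List<Object> list = values.get(name);
        if (list == null) {
            list = new ArrayList<Object>();
            values.put(name, list);
        }
        list.add(value);
    }

    // Unique arguments: returns the first stored value, or null if none was set.
    Object getUniqueValue(String name) {
        List<Object> list = values.get(name);
        return (list == null || list.isEmpty()) ? null : list.get(0);
    }

    // Non-unique arguments: returns a copy of all values, possibly empty.
    List<Object> getObjectValueList(String name) {
        List<Object> list = values.get(name);
        return list == null ? new ArrayList<Object>() : new ArrayList<Object>(list);
    }

    public static void main(String[] args) {
        SimpleArguments arguments = new SimpleArguments();
        arguments.addValue("Query", "SELECT *");
        arguments.addValue("Exclude", "foo");
        arguments.addValue("Exclude", "bar");

        String query = (String) arguments.getUniqueValue("Query");
        System.out.println("Query = " + query);
        for (Object excluded : arguments.getObjectValueList("Exclude")) {
            System.out.println("Exclude = " + excluded);
        }
    }
}
```

In both cases the returned values are plain Objects, so the component has to cast them to the type declared in its argument definition.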
The Ondex graph that the workflow component is working on is referenced as a field called ONDEXGraph graph. To find out how to access and manipulate the Ondex graph, please refer to the Ondex JavaDoc and the Ondex graph API manual.
Important: To ensure that you use the Ondex graph in a way that is consistent with other Ondex applications, please follow the Ondex semantics guidelines.
Lookup functions (called validators in Ondex) are very similar to workflow components and share the same identifier conventions. You can therefore tell the workflow engine which validators your component needs by stating their identifiers. This is done by implementing the method String[] requiresValidators(), whose return value is an array of the IDs of the required validators. Every workflow component features this method. If you do not require any lookup functions, simply return an empty String array:
return new String[0];
All Ondex validators are accessible through the static field net.sourceforge.ondex.config.ValidatorRegistry.validators, which is of the type HashMap<String,AbstractONDEXValidator>. The workflow engine provides all validators requested in the requiresValidators() method (see above) in this hash map. The validator's id is used as the hash key:
AbstractONDEXValidator taxLookup = ValidatorRegistry.validators.get("taxonomy");
To use the validator simply call its validate(Object o) method. It will return the converted Object.
String ncbiTaxID = (String) taxLookup.validate("yeast");
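The interplay between requiresValidators() and the validator registry can be pictured with the following self-contained sketch. Everything in it is a simplified assumption made for illustration: LookupValidator, ToyTaxonomyValidator and provideValidators are hypothetical names, not Ondex classes, and the toy lookup table contains a single entry (4932 is the NCBI taxonomy ID of Saccharomyces cerevisiae). The real taxonomy validator may accept different inputs.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Simplified sketch of a validator registry (assumption, not Ondex code).
final class ValidatorSketch {

    // Hypothetical stand-in for the validator base type.
    interface LookupValidator {
        Object validate(Object o);
    }

    // Registry keyed by validator ID, analogous to a static validators map.
    static final Map<String, LookupValidator> VALIDATORS = new HashMap<String, LookupValidator>();

    // Toy validator with a one-entry lookup table, for illustration only.
    static final class ToyTaxonomyValidator implements LookupValidator {
        private final Map<String, String> taxIds = new HashMap<String, String>();

        ToyTaxonomyValidator() {
            taxIds.put("yeast", "4932");
        }

        public Object validate(Object o) {
            return o == null ? null : taxIds.get(o.toString().toLowerCase(Locale.ROOT));
        }
    }

    // What a component would declare.
    static String[] requiresValidators() {
        return new String[] { "taxonomy" };
    }

    // What an engine could do before starting the component.
    static void provideValidators(String[] requestedIds) {
        for (String id : requestedIds) {
            if ("taxonomy".equals(id)) {
                VALIDATORS.put(id, new ToyTaxonomyValidator());
            }
        }
    }

    public static void main(String[] args) {
        provideValidators(requiresValidators());
        LookupValidator taxLookup = VALIDATORS.get("taxonomy");
        String ncbiTaxID = (String) taxLookup.validate("yeast");
        System.out.println("yeast -> " + ncbiTaxID); // prints: yeast -> 4932
    }
}
```

The point of the sketch is the ordering: a validator is only present in the registry if its ID was requested beforehand, so a lookup for an undeclared ID returns null.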
The Ondex workflow engine can provide a Lucene based indexing environment for fast searches if needed.
You can indicate that your component makes use of the search environment feature. Every workflow component possesses a method boolean requiresIndexedGraph(). Simply set its return value to true if you want to activate the search index.
If you have instructed the workflow engine to create a search index over the current graph by returning true for requiresIndexedGraph(), an Apache Lucene environment will be provided for you. To access it, use the static field net.sourceforge.ondex.config.LuceneRegistry.sid2luceneEnv. It is a HashMap<Long,LuceneEnv> that uses the graph's super ID (SID) as key. Thus, you access it as follows:
LuceneEnv env = LuceneRegistry.sid2luceneEnv.get(graph.getSID());
The LuceneEnv object provides the following methods to search inside the graph:
scoredSearchInConcepts(Query q)
scoredSearchInRelations(Query q)
To learn more about how to use this feature, refer to the Ondex Javadoc and the Apache Lucene documentation.
Create the package net.sourceforge.ondex.parser.<id>.
Create a class Parser inside that package that extends the class net.sourceforge.ondex.parser.AbstractONDEXParser and override its abstract methods.
package net.sourceforge.ondex.parser.myparser;

import net.sourceforge.ondex.args.ArgumentDefinition;
import net.sourceforge.ondex.parser.AbstractONDEXParser;

public class Parser extends AbstractONDEXParser {

    @Override
    public boolean readsDirectory() {
        return false;
    }

    @Override
    public boolean readsFile() {
        return false;
    }

    @Override
    public ArgumentDefinition<?>[] getArgumentDefinitions() {
        return null;
    }

    @Override
    public String getName() {
        return null;
    }

    @Override
    public String getVersion() {
        return null;
    }

    @Override
    public String[] requiresValidators() {
        return null;
    }

    @Override
    public void start() throws Exception {
    }
}
You will want to specify whether your parser will read a single file, a directory or both. Simply set the return values for the methods boolean readsDirectory() and boolean readsFile() accordingly.
In addition to the general argument values, the getArguments() method also allows access to the input file and/or input directory. This is done using the methods getArguments().getInputFile() and getArguments().getInputDir(), respectively.
Create the package net.sourceforge.ondex.mapping.<id>.
Create a class Mapping inside that package that extends the class net.sourceforge.ondex.mapping.AbstractONDEXMapping and override its abstract methods.
package net.sourceforge.ondex.mapping.mymapping;

import net.sourceforge.ondex.args.ArgumentDefinition;
import net.sourceforge.ondex.mapping.AbstractONDEXMapping;

public class Mapping extends AbstractONDEXMapping {

    @Override
    public ArgumentDefinition<?>[] getArgumentDefinitions() {
        return null;
    }

    @Override
    public String getName() {
        return null;
    }

    @Override
    public String getVersion() {
        return null;
    }

    @Override
    public boolean requiresIndexedGraph() {
        return false;
    }

    @Override
    public String[] requiresValidators() {
        return null;
    }

    @Override
    public void start() throws Exception {
    }
}
Create the package net.sourceforge.ondex.filter.<id>.
Create a class Filter inside that package that extends the class net.sourceforge.ondex.filter.AbstractONDEXFilter and override its abstract methods.
package net.sourceforge.ondex.filter.myfilter;

import net.sourceforge.ondex.args.ArgumentDefinition;
import net.sourceforge.ondex.core.ONDEXConcept;
import net.sourceforge.ondex.core.ONDEXGraph;
import net.sourceforge.ondex.core.ONDEXRelation;
import net.sourceforge.ondex.core.ONDEXView;
import net.sourceforge.ondex.filter.AbstractONDEXFilter;

public class Filter extends AbstractONDEXFilter {

    @Override
    public void copyResultsToNewGraph(ONDEXGraph exportGraph) {
    }

    @Override
    public ONDEXView<ONDEXConcept> getVisibleConcepts() {
        return null;
    }

    @Override
    public ONDEXView<ONDEXRelation> getVisibleRelations() {
        return null;
    }

    @Override
    public ArgumentDefinition<?>[] getArgumentDefinitions() {
        return null;
    }

    @Override
    public String getName() {
        return null;
    }

    @Override
    public String getVersion() {
        return null;
    }

    @Override
    public boolean requiresIndexedGraph() {
        return false;
    }

    @Override
    public String[] requiresValidators() {
        return null;
    }

    @Override
    public void start() throws Exception {
    }
}
Filters do not manipulate the graph directly, but rather provide a view on the data. This is done using the class ONDEXView, which contains a bitset over the IDs of the elements concerned. This bitset can be any implementation of the interface ONDEXBitSet, such as DefaultBitSet or SparseBitSet. You can find detailed information about this in the graph API manual.
Here is an example snippet of how to construct a new ONDEXView over some concepts:
DefaultBitSet bitset = new DefaultBitSet();
for (ONDEXConcept concept : conceptList) {
    bitset.set(concept.getId());
}
ONDEXView<ONDEXConcept> conceptView = new ONDEXView<ONDEXConcept>(graph, ONDEXConcept.class, bitset);
The filter output is stated using three special methods:
ONDEXGraph copyResultsToNewGraph(ONDEXGraph)
ONDEXView<ONDEXConcept> getVisibleConcepts()
ONDEXView<ONDEXRelation> getVisibleRelations()
It is advisable to keep ONDEXBitSets of the
filtered concepts and relations as private fields in your Filter class.
This makes implementing the above methods much easier.
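The idea of keeping the filter result as bitsets in private fields can be sketched without any Ondex classes. In the example below, java.util.BitSet stands in for an ONDEXBitSet implementation, and FilterSketch with its methods keepConcept and keepRelation is a hypothetical class made up for illustration; a real filter would wrap the bitsets in ONDEXView objects before returning them.

```java
import java.util.BitSet;

// Self-contained sketch (assumption, not Ondex code): the filter records the
// integer IDs of the elements that passed, and the getters expose them.
final class FilterSketch {
    private final BitSet visibleConcepts = new BitSet();
    private final BitSet visibleRelations = new BitSet();

    // Called while filtering for every concept that should stay visible.
    void keepConcept(int conceptId) {
        visibleConcepts.set(conceptId);
    }

    // Called while filtering for every relation that should stay visible.
    void keepRelation(int relationId) {
        visibleRelations.set(relationId);
    }

    // Returns a copy so that callers cannot modify the filter's own state.
    BitSet getVisibleConcepts() {
        return (BitSet) visibleConcepts.clone();
    }

    BitSet getVisibleRelations() {
        return (BitSet) visibleRelations.clone();
    }

    public static void main(String[] args) {
        FilterSketch filter = new FilterSketch();
        filter.keepConcept(3);
        filter.keepConcept(7);
        filter.keepRelation(1);
        System.out.println("Visible concepts: " + filter.getVisibleConcepts());   // prints: Visible concepts: {3, 7}
        System.out.println("Visible relations: " + filter.getVisibleRelations()); // prints: Visible relations: {1}
    }
}
```

With the results stored this way, each output method only has to wrap or copy an existing bitset instead of recomputing the filter.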
Create the package net.sourceforge.ondex.transformer.<id>.
Create a class Transformer inside that package that extends the class net.sourceforge.ondex.transformer.AbstractONDEXTransformer and override its abstract methods.
package net.sourceforge.ondex.transformer.mytransformer;

import net.sourceforge.ondex.args.ArgumentDefinition;
import net.sourceforge.ondex.transformer.AbstractONDEXTransformer;

public class Transformer extends AbstractONDEXTransformer {

    @Override
    public ArgumentDefinition<?>[] getArgumentDefinitions() {
        return null;
    }

    @Override
    public String getName() {
        return null;
    }

    @Override
    public String getVersion() {
        return null;
    }

    @Override
    public boolean requiresIndexedGraph() {
        return false;
    }

    @Override
    public String[] requiresValidators() {
        return null;
    }

    @Override
    public void start() throws Exception {
    }
}
Create the package net.sourceforge.ondex.statistics.<id>.
Create a class Statistics inside that package that extends the class net.sourceforge.ondex.statistics.AbstractONDEXStatistics and override its abstract methods.
package net.sourceforge.ondex.statistics.mystatistics;

import net.sourceforge.ondex.args.ArgumentDefinition;
import net.sourceforge.ondex.statistics.AbstractONDEXStatistics;

public class Statistics extends AbstractONDEXStatistics {

    @Override
    public ArgumentDefinition<?>[] getArgumentDefinitions() {
        return null;
    }

    @Override
    public String getName() {
        return null;
    }

    @Override
    public String getVersion() {
        return null;
    }

    @Override
    public boolean requiresIndexedGraph() {
        return false;
    }

    @Override
    public String[] requiresValidators() {
        return null;
    }

    @Override
    public void start() throws Exception {
    }
}
Like parsers, statistics methods allow the definition of input files and/or directories. Hence, in addition to the general argument values, the getArguments() method also allows access to the input file and/or input directory. This is done using the methods getArguments().getInputFile() and getArguments().getInputDir(), respectively.
In order to build the new workflow module, proceed as follows:
pom.xml file).
mvn install
target/ directory under the name <artifactID>-<version>-jar-with-dependencies.jar (where <artifactID> and <version> are the ID and version you stated in your pom.xml file).
plugins/ directory of your local Ondex installation. You can now run a workflow using your new components.