
Hi Nils,

Currently, most structured parallelism in Galaxy is at the between-tool level; individual tool runs still usually execute on a single node, although certain tools manage their own parallelism. We're currently working on better support for within-job parallelism, and in particular on extending our tool configuration to support tools that use different models of parallelism, from loosely coupled, to map/reduce, to MPI. We'd definitely appreciate your suggestions.

Thanks,
James

On May 4, 2010, at 4:47 PM, Nils Homer wrote:
In our manual pipeline, we parallelize across our cluster by splitting the data (in the map/reduce model) for each step (file-conversion/alignment/merging/duplicate-removal/variant-calling/annotation etc.). There are many dependencies, merges, and forks. Does Galaxy handle this itself, or how would I do this with Galaxy?
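
For reference, the split/process/merge pattern described in the question can be sketched in plain Python as below. This is only an illustration of the general map/reduce idea, not Galaxy code or a Galaxy API: the function names (split, process_chunk, merge, run_step) are hypothetical, the per-chunk work is a placeholder, and a local thread pool stands in for a cluster scheduler.

```python
from concurrent.futures import ThreadPoolExecutor


def split(records, n_chunks):
    """Scatter: divide the records into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(records) // n_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def process_chunk(chunk):
    """Map: placeholder for one per-chunk step (e.g. alignment); hypothetical."""
    return [record.upper() for record in chunk]


def merge(results):
    """Reduce: concatenate the per-chunk outputs in their original order."""
    return [record for part in results for record in part]


def run_step(records, n_chunks=4):
    """Run one pipeline step as split -> parallel map -> merge."""
    chunks = split(records, n_chunks)
    with ThreadPoolExecutor(max_workers=n_chunks) as pool:
        # pool.map yields results in input order, so the merge is deterministic.
        results = list(pool.map(process_chunk, chunks))
    return merge(results)
```

A pipeline with several steps would call something like run_step once per step, feeding each merged output into the next split; dependencies and forks between steps would still need to be tracked separately.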