On Thu, Aug 22, 2013 at 4:14 PM, Ketan Maheshwari <ketancmaheshwari@gmail.com> wrote:
Hi All,
I am trying to develop a generic Galaxy tool which could encapsulate any other Galaxy tool and run it. The motivation behind this development is to enable running ordinary Galaxy tools in parallel with multiple datasets and/or running ordinary tools on large-scale compute resources.
I was wondering about the Galaxy way of doing this. Is there a natural pattern I could adapt for this work? Are there any existing examples that do this?
Hi Ketan.

This is a little broader than your initial question:
http://lists.bx.psu.edu/pipermail/galaxy-user/2013-August/006511.html

There has been a lot of work done already on running Galaxy on multiple datasets at once. Some of it is already built into Galaxy, while other work, like John Chilton's multiple-file support, is only available as an experimental branch, e.g.
http://lists.bx.psu.edu/pipermail/galaxy-dev/2012-December/012265.html
https://bitbucket.org/msiappdev/galaxy-extras/src/extras/README_EXTRAS.txt

Galaxy also has the capability to split large jobs into many parts to take advantage of a cluster. I use this in the BLAST+ wrapper to break up searches into batches of 1000 queries, for example. This isn't enabled by default, but we're using it on our instance.

Regards,
Peter