On Mon, Mar 19, 2012 at 8:41 PM, Mark Johnson <mjohnson@ncbi.nlm.nih.gov> wrote:
I'm writing some tools to integrate NCBI data resources with Galaxy. I have two questions.
The first is simple. I want to write a tool for a long-running process that is handled by some other scheduler, and that produces its own job ids. Some web services, like BLAST, for example, receive a request, and take a while to complete processing. The job id can be used to fetch either job status or results from the server, depending on whether it has completed. How do you make a Galaxy tool that polls the server, and produces an output set only when the process is complete?
Why do you need to do anything special at all for Galaxy here? I'd just write it as a single command line call which blocks. As far as Galaxy will know it is just a slow tool.
The second question is, besides this mailing list, and the Galaxy wiki, is there a good online video or text resource that explains the Galaxy architecture and how to use it? The docs are good as far as they go, but most of what's in the <command> scripts in the tool files isn't documented.
There are quite a few Galaxy videos... not sure if there are any aimed at potential developers. Are you asking about the Cheetah template language used inside the XML for the <command> which is almost a scripting language in itself, or the actual wrapper scripts used in some tools (which can be written in Python, Perl, etc.)?
Peter
On 03/19/2012 10:19 PM, Peter Cock wrote:
On Mon, Mar 19, 2012 at 8:41 PM, Mark Johnson <mjohnson@ncbi.nlm.nih.gov> wrote:
I'm writing some tools to integrate NCBI data resources with Galaxy. I have two questions.
The first is simple. I want to write a tool for a long-running process that is handled by some other scheduler, and that produces its own job ids. Some web services, like BLAST, for example, receive a request, and take a while to complete processing. The job id can be used to fetch either job status or results from the server, depending on whether it has completed. How do you make a Galaxy tool that polls the server, and produces an output set only when the process is complete?
Why do you need to do anything special at all for Galaxy here? I'd just write it as a single command line call which blocks. As far as Galaxy will know it is just a slow tool.
Yes, Galaxy is pretty good at handling 'slow' tools (i.e., you can close the browser and come back the next morning). However, we have one tool where we just use Galaxy to execute a job which manipulates data outside of the Galaxy data directory. We do something similar to what you asked for initially: we have a Perl wrapper which first submits the job using "IPC::Open3", then runs a 'while' loop checking the status of the job, and finally produces a 'log' file which ends up as the new Galaxy history item.
Regards, Hans
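The submit-then-poll wrapper described above might look roughly like this in Python rather than Perl. This is only a sketch: `wait_for_job`, `run_tool`, the `get_status` and `fetch_results` callables, and the "done"/"failed" status strings are all made up for illustration and are not part of Galaxy or of any NCBI service.

```python
import time


def wait_for_job(job_id, get_status, fetch_results,
                 poll_seconds=30.0, max_polls=None, sleep=time.sleep):
    """Block until the remote job is finished, then return its results.

    get_status and fetch_results are hypothetical callables supplied by the
    caller (for example, thin wrappers around the remote service's HTTP API).
    The status strings "done" and "failed" are assumptions, not a real API.
    """
    polls = 0
    while True:
        status = get_status(job_id)
        if status == "done":
            return fetch_results(job_id)
        if status == "failed":
            raise RuntimeError("remote job %s failed" % job_id)
        polls += 1
        # Optional upper bound so a stuck job does not block forever.
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError("gave up on job %s after %d polls" % (job_id, polls))
        sleep(poll_seconds)


def run_tool(job_id, output_path, get_status, fetch_results, **poll_options):
    """Write the results to the output file only once the job has completed."""
    results = wait_for_job(job_id, get_status, fetch_results, **poll_options)
    with open(output_path, "w") as handle:
        handle.write(results)
```

A script built around this simply blocks until the results are written, so from Galaxy's side it is just one slow tool whose output file appears when the command exits.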
The second question is, besides this mailing list, and the Galaxy wiki, is there a good online video or text resource that explains the Galaxy architecture and how to use it? The docs are good as far as they go, but most of what's in the <command> scripts in the tool files isn't documented.
There are quite a few Galaxy videos... not sure if there are any aimed at potential developers. Are you asking about the Cheetah template language used inside the XML for the <command> which is almost a scripting language in itself, or the actual wrapper scripts used in some tools (which can be written in Python, Perl, etc.)?
Peter
On 3/19/12 5:19 PM, Peter Cock wrote:
Why do you need to do anything special at all for Galaxy here? I'd just write it as a single command line call which blocks. As far as Galaxy will know it is just a slow tool.
I suppose the tool could just poll the server, and only produce results when the process is complete. I could swear I read something somewhere in the Galaxy documentation that described two kinds of tools: one kind that finishes quickly, and another that runs for a while (or a long while) and produces results when it's done. Maybe it was among the cloud-related documentation, about using the scheduler. Thanks for the suggestion. I'll try polling.
The second question is, besides this mailing list, and the Galaxy wiki, is there a good online video or text resource that explains the Galaxy architecture and how to use it? The docs are good as far as they go, but most of what's in the <command> scripts in the tool files isn't documented.
There are quite a few Galaxy videos... not sure if there are any aimed at potential developers. Are you asking about the Cheetah template language used inside the XML for the <command> which is almost a scripting language in itself, or the actual wrapper scripts used in some tools (which can be written in Python, Perl, etc.)?
Cheetah documentation is findable. I'm asking more about understanding how the inputs relate to what's available in Python in the Cheetah template in the <command> section. I guess there's not much for developers to get an overview: how inputs, parameters, outputs, the command, the template, and Python all work together. Seems the only way is trial, error, experiment, and trying to understand the existing tools. Thanks --Mark
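As a rough illustration of how these pieces relate: each `name` declared under `<inputs>` and `<outputs>` becomes a `$name` variable in the `<command>` template, and for datasets the value is a file path. The sketch below uses Python's `string.Template` as a stand-in for Cheetah (it handles only plain `$name` substitution, none of Cheetah's directives). The tool definition, `remote_blast.py`, and `render_command` are hypothetical and only meant to show the naming relationship.

```python
import xml.etree.ElementTree as ET
from string import Template

# Hypothetical tool definition; ids, names, and the script are made up.
TOOL_XML = """
<tool id="remote_blast_sketch" name="Remote BLAST (sketch)">
  <command interpreter="python">remote_blast.py --query $query --evalue $evalue --out $hits</command>
  <inputs>
    <param name="query" type="data" format="fasta" label="Query sequences"/>
    <param name="evalue" type="float" value="0.001" label="E-value cutoff"/>
  </inputs>
  <outputs>
    <data name="hits" format="tabular"/>
  </outputs>
</tool>
"""


def render_command(tool_xml, values):
    """Fill the <command> template from values keyed by param/output name."""
    tool = ET.fromstring(tool_xml)
    # Names declared as inputs and outputs are what the template may reference.
    declared = [p.get("name") for p in tool.findall("./inputs/param")]
    declared += [d.get("name") for d in tool.findall("./outputs/data")]
    missing = [name for name in declared if name not in values]
    if missing:
        raise KeyError("no value supplied for: " + ", ".join(missing))
    # string.Template stands in for Cheetah's simple $name placeholders.
    return Template(tool.find("command").text.strip()).substitute(values)
```

Calling `render_command(TOOL_XML, {"query": "/data/dataset_1.dat", "evalue": 0.001, "hits": "/data/dataset_2.dat"})` yields the command line that would be executed, with dataset parameters replaced by file paths.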
participants (3)
- Hans-Rudolf Hotz
- Mark Johnson
- Peter Cock