Running a workflow programmatically
Hello folks,
I am using a workflow that needs to be run many times on many different inputs. I have hacked around and figured out how to pull multiple inputs from a history, but I am a little baffled about how to run a Galaxy workflow programmatically. I have searched around fairly exhaustively and am wondering whether anyone else has come across this and accomplished it. Any links or pointers?
Take care,
Darren
Darren,
While this is not yet possible, I'm finishing up a first pass on a workflow API that will allow this sort of interaction, and I hope to have an early version available by the end of this week. I can update you when that has been committed.
-Dannon
(In reply to Darren Brown, Feb 24, 2011, at 9:29 PM)
Hello Dannon,
That is surprising and excellent! Can't wait to hear more.
Take care,
Darren
(In reply to Dannon Baker <dannonbaker@me.com>, Wed, Mar 2, 2011 at 11:38 AM)
Darren,
A first pass at adding workflows to the Galaxy API is now available. For an example of how to execute a basic workflow, see scripts/api/execute_workflow.py. An example of how this might integrate into a slightly more complex script using library uploads is available at scripts/api/example_watch_folder.py. This preliminary version doesn't allow runtime modification of tool parameters, so the workflow must not use the 'set at runtime' option, but that feature will be available soon.
The API itself is relatively undocumented, though some basic hints and usage examples can be found in scripts/api/README.
Let me know if you have questions or feedback!
-Dannon
(In reply to Darren Brown, Mar 2, 2011, at 2:40 PM)
Hello Dannon,
This looks exactly like how I would like to run a workflow. I'll be trying this out next week and will let you know how it goes. Thanks much!
Darren
(In reply to Dannon Baker, Sat, Mar 5, 2011 at 1:51 PM)
Hi Dannon,
I am finally getting around to the workflow API, and I am having trouble with a couple of things.
First, I am trying to get execute_workflow.py working for a workflow that I have here. I think I have everything down except how to define the dataset. Basically, I am trying to reverse engineer the long dataset, workflow, and history ids, and I have been doing so by grabbing their respective URLs and pulling out the ids. But as far as I can tell, the workflow execute hash is encoded like this for the script:
<hid>=<???>=<hashed dataset id>
But I am kinda stuck at what these guys actually mean.
Which brings me to my general question. While I appear to be close in selecting the correct history and workflow ids, it only works right now as a proof of concept, since I would need to generate these on my own for a user to run a workflow via the Galaxy interface. It seems you are hashing these history, workflow, and dataset ids, but I am not really sure what you are using to hash them. It does not look like a SHA1 sum. Given only access to the Galaxy database, I would like to execute a workflow, so I would need to be able to generate the hashed values to throw at the API. Does that make sense?
Finally, can I generate an API key programmatically as well? Not the end of the world, but it would be nice.
Thanks much for your help,
Darren
(In reply to Darren Brown <brown@centerspace.net>, Sat, Mar 5, 2011 at 2:04 PM)
On Mar 15, 2011, at 7:03 PM, Darren Brown wrote:
> But I am kinda stuck at what these guys actually mean.
The execute_workflow.py command line inputs are indeed a little clunky for all the information that has to go into each dataset mapping parameter. I didn't imagine that this script would actually be used directly very often, but rather would serve as an example of how to execute a single workflow from code with particular inputs. The three parts are workflow step, source type, and input id. For the source component, use 'hda' with the encoded id you're getting from a history, or 'ldda' for an id from a library dataset.
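As a rough illustration of the three-part mapping described above, here is a small Python sketch that splits step=src=id strings into a per-step dictionary. The function name parse_dataset_mappings, the dictionary keys 'src' and 'id', and the example ids are assumptions made for illustration; the real script and the API payload may be shaped differently.

```python
def parse_dataset_mappings(args):
    """Turn 'step=src=id' strings into {step: {'src': src, 'id': id}}."""
    valid_sources = ("hda", "ldda")  # history dataset / library dataset
    ds_map = {}
    for arg in args:
        parts = arg.split("=")
        if len(parts) != 3:
            raise ValueError("expected step=src=id, got %r" % arg)
        step, src, dataset_id = parts
        if src not in valid_sources:
            raise ValueError("source must be 'hda' or 'ldda', got %r" % src)
        # Key names here are hypothetical, not confirmed by the thread.
        ds_map[step] = {"src": src, "id": dataset_id}
    return ds_map


# The encoded ids below are made up for illustration.
example = parse_dataset_mappings(
    ["1=hda=a799d38679e985db", "2=ldda=33b43b4e7093c91f"]
)
print(example)
```

Rejecting anything other than 'hda' or 'ldda' up front keeps a mistyped source type from turning into a confusing error further along.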
> Which brings me to my general question. While I appear to be close in selecting the correct history and workflow ids, it only works right now as a proof of concept, since I would need to generate these on my own for a user to run a workflow via the Galaxy interface. It seems you are hashing these history, workflow, and dataset ids, but I am not really sure what you are using to hash them. It does not look like a SHA1 sum. Given only access to the Galaxy database, I would like to execute a workflow, so I would need to be able to generate the hashed values to throw at the API. Does that make sense?
You could definitely generate the hashed values on your own if you wanted. We use the Blowfish implementation in pycrypto, with the 'id_secret' in your universe_wsgi.ini as the key. Given an object_id and the id_secret from Galaxy, you should be able to do something like this (code directly from lib/galaxy/web/security/__init__.py):

    from Crypto.Cipher import Blowfish
    cipher = Blowfish.new( id_secret_from_galaxy )
    str_id = str(object_id)
    padded_id = ( "!" * ( 8 - len(str_id) % 8 ) ) + str_id
    encoded_id = cipher.encrypt(padded_id).encode('hex')

Ideally, however, this would all be done through the API rather than by reaching into the database directly. Dataset-level operations for pushing files into Galaxy and listing them (and retrieving lddas for use in things like the workflows API component) are supported for datasets in data libraries, but not yet for datasets in individual histories. I'd imagine this will be forthcoming soon. In the meantime, at least for this, you might want to consider using a data library as an initial import destination from which you can do further work.
The example_watch_folder.py in scripts/api has a more comprehensive example of programmatically executing a workflow on many datasets at once, as well as importing those files from the filesystem into Galaxy. You should also be able to use the same approach I used there for finding or creating a data library to grab a workflow by name, instead of having to figure out the id ahead of time.
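The padding step in the snippet above is there because Blowfish encrypts in 8-byte blocks, so the id string has to be stretched to a multiple of 8 characters first. A minimal sketch of just that step follows, with the encryption left out so that it runs without pycrypto. The helper names pad_object_id and unpad_object_id are hypothetical, and the idea that decoding strips the leading '!' characters is an assumption rather than something stated in this thread. Note also that the quoted snippet is Python 2 style; the .encode('hex') call on a string does not exist in Python 3.

```python
def pad_object_id(object_id, block_size=8):
    """Left-pad the decimal id with '!' to a multiple of block_size."""
    str_id = str(object_id)
    # Same arithmetic as the quoted snippet: between 1 and block_size
    # padding characters are always added, even if the length is
    # already a multiple of block_size.
    return "!" * (block_size - len(str_id) % block_size) + str_id


def unpad_object_id(padded_id):
    """Reverse the padding (assumed: decoding strips leading '!')."""
    return int(padded_id.lstrip("!"))


padded = pad_object_id(42)
print(padded, len(padded))
```

One consequence of the arithmetic is that an id that is already exactly 8 digits long gains a full extra block of padding, so its padded form is 16 characters.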
> Finally, can I generate an API key programmatically as well? Not the end of the world, but it would be nice.
No, though I suppose you could hack something together if you wanted, since you do have direct access to the database and don't seem to be opposed to poking around in there. All you'd need is a user's id and whatever you want the key to be; insert that into the api_keys table, making sure that the user doesn't already have one set.
Hope this helps, and thanks for exploring all this new ground!
Dannon
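The check-then-insert idea described above might look something like the following sketch. It uses an in-memory SQLite table as a stand-in for the real database, and the column names user_id and key, along with the function name add_api_key, are assumptions for illustration. The actual api_keys table in a Galaxy install may have additional required columns, so treat this as an outline only.

```python
import sqlite3


def add_api_key(conn, user_id, key):
    """Insert a key for user_id unless that user already has one."""
    existing = conn.execute(
        "SELECT 1 FROM api_keys WHERE user_id = ?", (user_id,)
    ).fetchone()
    if existing is not None:
        return False  # leave the existing key untouched
    conn.execute(
        'INSERT INTO api_keys (user_id, "key") VALUES (?, ?)',
        (user_id, key),
    )
    conn.commit()
    return True


# Stand-in table; the two columns here are assumed, not taken from Galaxy.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE api_keys (user_id INTEGER, "key" TEXT)')
print(add_api_key(conn, 7, "example-key"))
print(add_api_key(conn, 7, "another-key"))
```

Using query parameters rather than string formatting keeps the user id and key from being interpreted as SQL.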
participants (2)
- Dannon Baker
- Darren Brown