workflow API: step_order vs step_id in bioblend
On Thu, Nov 19, 2015 at 9:50 AM, Jorrit Boekel wrote:

Hi all,

Here's a heads-up, with a googleable error message, in case someone runs into a similar error and is left scratching their head.

Some time after the very nice API class we had at GCC2015, I am now trying my hand at running workflows using Bioblend. I had some frustration trying to invoke a simple workflow, and have now found out that the inputs parameter to invoke_workflow is not what I thought it would be. I'm on the latest galaxy-dist and using Bioblend 0.7 with Python 3.4.

So I thought I'd call e.g.:

    gi.workflows.invoke_workflow(workflow_id, inputs={'9678': {'src': 'hda', 'id': 'abcdef12345'}}, history_id='abc1234')

where the keys in the inputs dict are the ids from:

    gi.workflows.get_workflow(workflow_id)['inputs']

But apparently the new standard is to use the step_order (so 0 for the first step) instead of the step_id, as shown in the code at lib/galaxy/workflow/run_request.py. So this gave me the HTTP 400 error:

    "Workflow cannot be run because an expected input step '84' has no input dataset."

I have reverted to the legacy run_workflow with dataset_map, which circumvents the problem:

    gi.workflows.run_workflow(workflow_id, dataset_map={'9678': {'src': 'hda', 'id': 'abcdef12345'}}, history_id='abc1234')

Is there any way to specify inputs_by in the payload, or am I on the wrong bioblend version? Otherwise I can file a request on the Bioblend GitHub.

cheers,
--
Jorrit Boekel
Proteomics systems developer
BILS / Lehtiö lab
Scilifelab Stockholm, Sweden
On 19 Nov 2015, at 15:37, John Chilton wrote:

The workflow API is the only place where we expose unencoded IDs, and we really shouldn't be doing it. I would instead focus on adapting to using the step order index - it really should be more stable and usable. The order index has lots of advantages:

- You can build a request for a given workflow and apply it to multiple Galaxy instances.
- You can build a request by simply looking at the Galaxy workflow JSON you upload to create the workflow.
- If you don't add or remove inputs, you can modify a workflow and the request will likely still be valid.

If get_workflow doesn't have the information you need, it is in workflows.export_workflow_json. Just filter the steps on input types. We should probably deprecate get_workflows because it exposes information it shouldn't, and expose the same data using the order index instead of the unencoded ID. Aysam Guerler pointed out to me yesterday that parameter overrides also still use unencoded IDs; we should change that as well.

Today, I'll try to add parameters to the get and run APIs to have these use order_index instead of unencoded step ids. Someday that really should become the default behavior, but we have to think carefully about how to deprecate the existing functionality.

That said - I'd definitely merge a PR that exposed inputs_by for the workflow run endpoints in bioblend. It is useful, less for restoring the legacy behavior than for allowing inputs to be specified by label (functionality that bioblend objects implement on the client side but that Galaxy supports natively), and it needs to be documented better.

-John
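The advice to "filter the steps on input types" can be sketched roughly as follows. This is a minimal illustration, not bioblend code: it assumes the dict returned by export_workflow_json has a "steps" mapping keyed by order index, and that input steps carry the type "data_input" or "data_collection_input". The helper names (input_step_indices, build_inputs), the example workflow and the second dataset id are made up for the example.

```python
# Step types assumed to mark workflow inputs in the exported JSON.
# Newer Galaxy releases may define additional input types.
INPUT_STEP_TYPES = {"data_input", "data_collection_input"}


def input_step_indices(workflow_dict):
    """Return the sorted order indices of the input steps of an exported workflow."""
    steps = workflow_dict["steps"]
    indices = [
        int(key)
        for key, step in steps.items()
        if step.get("type") in INPUT_STEP_TYPES
    ]
    return sorted(indices)


def build_inputs(workflow_dict, dataset_ids, src="hda"):
    """Pair dataset ids with input steps, keyed by step order index."""
    indices = input_step_indices(workflow_dict)
    if len(indices) != len(dataset_ids):
        raise ValueError(
            "workflow has %d input steps but %d datasets were given"
            % (len(indices), len(dataset_ids))
        )
    return {
        str(index): {"src": src, "id": dataset_id}
        for index, dataset_id in zip(indices, dataset_ids)
    }


# Hypothetical stand-in for the result of export_workflow_json(workflow_id).
example_workflow = {
    "steps": {
        "0": {"id": 0, "type": "data_input", "name": "Input dataset"},
        "1": {"id": 1, "type": "tool", "name": "Cut"},
        "2": {"id": 2, "type": "data_input", "name": "Input dataset"},
    }
}

inputs = build_inputs(example_workflow, ["abcdef12345", "fedcba54321"])
print(inputs)
```

With a live connection, the resulting dict would be passed as inputs to gi.workflows.invoke_workflow(workflow_id, inputs=inputs, history_id='abc1234'), in place of the dict keyed by unencoded step ids from the original message.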
On Thu, Nov 19, 2015 at 2:52 PM, Jorrit Boekel wrote:

Thanks John,

I did indeed find the step order ids in the result from export_workflow_json. That helps a lot, and now I won't need to use soon-to-be-deprecated stuff.

cheers,
--
Jorrit Boekel
Proteomics systems developer
BILS / Lehtiö lab
Scilifelab Stockholm, Sweden
Just as a follow-up on this for everyone using the workflow API: I have opened a WIP PR to completely replace step ids with the step order index in the workflows API, so that this confusion won't occur anymore and the deprecated and modern endpoints work much more similarly:

https://github.com/galaxyproject/galaxy/pull/1137

By changing both the show and run endpoints together, I think backward compatibility should be maintained except in the limited scenarios mentioned in the PR.

-John
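For readers wondering what the inputs_by option discussed earlier in the thread might look like in a raw request, here is a rough sketch of a run payload. It is an assumption-laden illustration rather than the documented API: the accepted values listed below reflect a reading of lib/galaxy/workflow/run_request.py and may differ between Galaxy releases, and build_run_payload and the "Input reads" label are hypothetical names for this example only.

```python
import json

# Values that inputs_by is assumed to accept; check run_request.py in your
# Galaxy release before relying on them.
ASSUMED_INPUTS_BY = {"step_id", "step_index", "step_uuid", "name"}


def build_run_payload(history_id, inputs, inputs_by="step_index"):
    """Build a workflow run payload that states how the input keys should be read."""
    if inputs_by not in ASSUMED_INPUTS_BY:
        raise ValueError("unsupported inputs_by value: %r" % (inputs_by,))
    return {
        "history_id": history_id,
        "inputs": inputs,
        "inputs_by": inputs_by,
    }


# Keys are step order indices (0 for the first step).
by_index = build_run_payload(
    "abc1234",
    {"0": {"src": "hda", "id": "abcdef12345"}},
)

# Keys are input labels; "Input reads" is a made-up label.
by_label = build_run_payload(
    "abc1234",
    {"Input reads": {"src": "hda", "id": "abcdef12345"}},
    inputs_by="name",
)

print(json.dumps(by_index, indent=2, sort_keys=True))
print(json.dumps(by_label, indent=2, sort_keys=True))
```

Stating inputs_by explicitly keeps a request unambiguous while the default key style is changing, which is the source of the confusion described at the start of this thread.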
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
participants (2)
- John Chilton
- Jorrit Boekel