Using workflows with steps that have several inputs through the API
Hi,
Looking at workflow_execute.py, it seems to me there can be only one input per step. Is this correct? Or is there a way to map more than one input and match each one to a specific position in a particular step?
This is the workaround I found.
I created 'Input Dataset' boxes for all of my inputs, then linked each one to the step where it is needed. This is actually very convenient when the same input is used in more than one step, as when I run the workflow manually I only need to select each input once. I went with this approach since each 'Input Dataset' gets its own step.
This is an example workflow: http://main.g2.bx.psu.edu/u/cjav/w/fastq-joiner---input-datasets
Initially I ran into this issue:

$ ./workflow_execute.py my_key http://main.g2.bx.psu.edu/api/workflows fcfa9ae643d37c32 'Fastq joiner - Input Datasets' '1=ldda=b4c5be159c8057c2' '2=ldda=ea0e32961d454539'

HTTP Error 400: Bad Request
"Workflow cannot be run because an expected input step '517620' has no input dataset."
I then found you can do:

$ ./display.py my_key http://main.g2.bx.psu.edu/api/workflows/fcfa9ae643d37c32

Member Information
------------------
url: /api/workflows/fcfa9ae643d37c32
inputs: {'517620': {'value': '', 'label': 'Left-hand Reads'}, '517622': {'value': '', 'label': 'Right-hand Reads'}}
id: fcfa9ae643d37c32
name: Fastq joiner - Input Datasets
I see the inputs get different keys, and I can use them when executing the workflow:

$ ./workflow_execute.py my_key http://main.g2.bx.psu.edu/api/workflows fcfa9ae643d37c32 'hist_id=9fbfd3a1be8b7f8e' '517620=hda=8d89405199368bb0' '517622=hda=5c93f30078e5ac25'

Response
--------
{'outputs': ['f7c4b0900d8f3b1b'], 'history': '9fbfd3a1be8b7f8e'}
This worked perfectly. I still think it would be easier to be able to use the step number, but this works.
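In case it helps anyone scripting this without the sample scripts, here is a minimal sketch of the POST I believe workflow_execute.py ends up sending. It assumes the payload layout the script builds ('workflow_id', 'history', and a 'ds_map' from input id to dataset source and id); the ids are the same ones from the run above, so substitute your own:

import json
import urllib.request

API_KEY = 'my_key'  # your Galaxy API key
BASE = 'http://main.g2.bx.psu.edu/api/workflows'

# Map each workflow input id to a dataset. 'src' is 'hda' for a
# history dataset or 'ldda' for a library dataset.
payload = {
    'workflow_id': 'fcfa9ae643d37c32',
    'history': 'hist_id=9fbfd3a1be8b7f8e',  # reuse an existing history
    'ds_map': {
        '517620': {'src': 'hda', 'id': '8d89405199368bb0'},
        '517622': {'src': 'hda', 'id': '5c93f30078e5ac25'},
    },
}

req = urllib.request.Request(
    '%s?key=%s' % (BASE, API_KEY),
    data=json.dumps(payload).encode('utf-8'),
    headers={'Content-Type': 'application/json'},
)
print(json.load(urllib.request.urlopen(req)))  # e.g. {'outputs': [...], 'history': '...'}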
Lastly, I did run into one issue where I had to import the datasets first. When I tried to use ldda directly:

$ ./workflow_execute.py my_key http://main.g2.bx.psu.edu/api/workflows fcfa9ae643d37c32 'Fastq joiner - Input Datasets' '517620=ldda=b4c5be159c8057c2' '517622=ldda=ea0e32961d454539'

Response
--------
{'outputs': ['44c033fad737acc5'], 'history': '9fbfd3a1be8b7f8e'}

I got this error in the history:

input data 1 (file: /galaxy/main_database/files/002/780/dataset_2780871.dat) was deleted before the job started
This didn't happen in our local instance using galaxy-central.
Thanks,
Carlos
On Mon, Jan 9, 2012 at 9:12 AM, Dannon Baker <dannonbaker@me.com> wrote:
> Carlos,
>
> The method you describe is precisely the intended approach. One reason a long identifier is used instead of a simple step number is that the step number (the ordering you see in the usual run-workflow dialog) can change without it being obvious to the user. The ordering is based on two factors: a topological sort of the workflow graph, and the distance within the editor from the top left corner. In other words, it's possible to edit a workflow in display position only (drag the boxes around) and change the step numbers. One approach would be to use the step label ('Left-hand Reads' in your case) and programmatically fetch and use the input id.
Thanks for your response, it's nice to get confirmation that I'm on the right track. I'll look into using the label.
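Something like this is what I have in mind, assuming the workflow GET keeps the shape display.py showed above (an 'inputs' dict keyed by input id, each entry carrying a 'label'):

import json
import urllib.request

API_KEY = 'my_key'  # your Galaxy API key
WORKFLOW_URL = 'http://main.g2.bx.psu.edu/api/workflows/fcfa9ae643d37c32'

# Fetch the workflow description and invert 'inputs' into {label: input_id},
# e.g. {'Left-hand Reads': '517620', 'Right-hand Reads': '517622'}.
workflow = json.load(urllib.request.urlopen('%s?key=%s' % (WORKFLOW_URL, API_KEY)))
label_to_id = {v['label']: k for k, v in workflow['inputs'].items()}

# Build the ds_map from the stable labels instead of hard-coded input ids.
ds_map = {
    label_to_id['Left-hand Reads']: {'src': 'hda', 'id': '8d89405199368bb0'},
    label_to_id['Right-hand Reads']: {'src': 'hda', 'id': '5c93f30078e5ac25'},
}
print(ds_map)

That way a reordered workflow shouldn't silently break the mapping, as long as the labels stay unique.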
> For your last issue: that library dataset is actually marked as deleted on main, so the error is appropriate. I'm not yet sure why that is the case, but I'll look into it. If you retry the operation on a known non-deleted library dataset, everything should work just fine.
Oh, I always try to use data from the shared libraries when testing my workflows on the main or test server; this one is from "Illumina iDEA Datasets (sub-sampled)". I didn't realize it could have been deleted but still show up in the data library.

Thanks again,
Carlos