Correction. The above were not reliable methods to ensure the file was copied into the data library. Checking for file_size != 0 was also not effective for large files.

Dannon, can you tell me which field we should query, and which state or message in that field indicates completion, so that we can avoid race conditions?

The only solution I can see is that, once file_size != 0, you then check that file_size has not changed after a short delay.
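A minimal sketch of that check, assuming a hypothetical get_size callable that wraps display(...) and returns the dataset's file_size. The function name, the delay and the retry cap are illustrative and not part of the Galaxy API:

```python
import time


def wait_until_size_stable(get_size, delay=2.0, max_checks=30, sleep=time.sleep):
    """Poll get_size() until it returns the same non-zero value twice in a row.

    get_size is a hypothetical callable supplied by the caller; in this
    thread it would wrap display(...) and return the 'file_size' field.
    Returns the stable size, or None if max_checks polls pass without one.
    """
    previous = None
    for _ in range(max_checks):
        current = get_size()
        # Non-zero and unchanged since the previous poll: treat as complete.
        if current and current == previous:
            return current
        previous = current
        sleep(delay)
    return None
```

This is only a heuristic: a copy that stalls for longer than the delay would still look finished, so the delay needs to be generous for large files.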


Rob Leclerc, PhD
P: (US) +1-(917)-873-3037
P: (Shanghai) +86-1-(861)-612-5469
Personal Email:

On Mon, Apr 29, 2013 at 4:11 PM, Rob Leclerc <> wrote:
Hi Dannon,

I've written some code to (i) query a dataset to ensure that it has been uploaded after a submit and (ii) ensure that a resulting dataset has been written to the filesystem.

import time

#Block until all datasets have been uploaded
libset = submit(api_key, api_url + "libraries/%s/contents" % library_id, data, return_formatted = False)
for ds in libset:
    while True:
        uploaded_file = display(api_key, api_url + 'libraries/%s/contents/%s' %(library_id, ds['id']), return_formatted=False)
        if uploaded_file['misc_info'] is None:
            break  # assumed: stop polling once the condition holds
        time.sleep(1)  # assumed short pause between polls

#Block until all result datasets have been saved to the filesystem
result_ds_url = api_url + 'histories/' + history_id + '/contents/' + dsh['id']
while True:
    result_ds = display(api_key, result_ds_url, return_formatted=False)
    if result_ds["state"] == 'ok':
        break  # assumed: stop polling once the dataset is ready
    time.sleep(1)  # assumed short pause between polls


On Mon, Apr 29, 2013 at 11:18 AM, Dannon Baker <> wrote:
Yep, that example filesystem_paths you suggest should work fine. The sleep() bit was a complete hack from the start, for simplicity in demonstrating a very basic pipeline. For a real implementation, you probably want to query the dataset in question via the API, verify that the datatype etc. have been set, and only then execute the workflow, instead of relying on sleep.
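A rough sketch of polling instead of sleeping, under stated assumptions: fetch and is_ready are hypothetical callables supplied by the caller (fetch would wrap the API query for the dataset, is_ready would check whichever fields signal that the datatype has been set), and treating an 'error' state as fatal is an assumption rather than documented behaviour:

```python
import time


def wait_for_dataset(fetch, is_ready, timeout=600.0, interval=5.0,
                     clock=time.time, sleep=time.sleep):
    """Call fetch() until is_ready(dataset) is true or timeout seconds pass.

    fetch and is_ready are hypothetical callables; fetch returns the
    dataset as a dict, is_ready decides whether it is safe to proceed.
    """
    deadline = clock() + timeout
    while True:
        dataset = fetch()
        # Assumption: a dataset reporting state 'error' will never become ready.
        if dataset.get('state') == 'error':
            raise RuntimeError('dataset entered the error state')
        if is_ready(dataset):
            return dataset
        if clock() >= deadline:
            raise RuntimeError('timed out waiting for dataset')
        sleep(interval)
```

The workflow would then be submitted only after wait_for_dataset returns, and the timeout keeps a stuck upload from blocking the pipeline indefinitely.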

On Mon, Apr 29, 2013 at 9:24 AM, Rob Leclerc <> wrote:
Hi Dannon,

Thanks for the response. Sorry to be pedantic, but just to make sure that I understand the interpretation of this field on the other side of the API, I would need to have something like the following:

data['filesystem_paths'] = "/home/me/file1.vcf \n /home/me/file2.vcf \n /home/me/file3.vcf"
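Equivalently, the field can be built from a list so the separators are not typed by hand. This is just a sketch using the same three illustrative paths; the paths variable is a name introduced here:

```python
# Illustrative paths only; substitute the real files to upload.
paths = ["/home/me/file1.vcf", "/home/me/file2.vcf", "/home/me/file3.vcf"]

data = {}
# One path per line, with no stray spaces around the newline separators.
data['filesystem_paths'] = "\n".join(paths)
```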

I assume I should also increase the time.sleep() to reflect the uploading of extra files?




On Mon, Apr 29, 2013 at 9:15 AM, Dannon Baker <> wrote:
Hey Rob,

That does submit exactly one at a time, execute the workflow, and then do the next, all in separate transactions. If you wanted to upload multiple file paths at once, you'd just append more to the 'filesystem_paths' field (newline-separated paths).


On Fri, Apr 26, 2013 at 11:54 PM, Rob Leclerc <> wrote:
I'm looking at and it's not clear from the example how you submit multiple datasets to a library. In the example, the first submit returns a libset [] with only a single entry and then proceeds to iterate through each dataset in the libset in the following section:

data = {}
data['folder_id'] = library_folder_id
data['file_type'] = 'auto'
data['dbkey'] = ''
data['upload_option'] = 'upload_paths'
data['filesystem_paths'] = fullpath
data['create_type'] = 'file'
libset = submit(api_key, api_url + "libraries/%s/contents" % library_id, data, return_formatted = False)

for ds in libset:
    if 'id' in ds:
        wf_data = {}
        wf_data['workflow_id'] = workflow['id']
        wf_data['history'] = "%s - %s" % (fname, workflow['name'])
        wf_data['ds_map'] = {}
        for step_id, ds_in in workflow['inputs'].iteritems():
            wf_data['ds_map'][step_id] = {'src':'ld', 'id':ds['id']}
        res = submit( api_key, api_url + 'workflows', wf_data, return_formatted=False)

