Hi all,

I'm trying to understand how to write a tool that generates a dataset rather than a single output file. I've tried following all of the examples but I'm stuck, so I thought I would distil down the simplest example I could write and ask for help here.

So here's my example:

https://gist.github.com/stevecassidy/0fa45ad5853faacb5f55

It's a simple Python script that writes three files to a directory named for the single input parameter.

I think one of the problems I'm having is knowing where to write the output to. I've run this under planemo serve and the job runs, creating the output directory within the 'job_working_directory/000/1/SampleDataset' directory. However, my dataset doesn't contain anything, so clearly my outputs directive isn't working:

    <outputs>
        <collection type="list" label="$job_name" name="output1">
            <discover_datasets pattern="__name_and_ext__" directory="$job_name" />
        </collection>
    </outputs>

($job_name is the name of the directory that is being written to, SampleDataset in this case.)

Any help in getting this example working would be appreciated.

Thanks,

Steve

--
Department of Computing, Macquarie University
http://web.science.mq.edu.au/~cassidy/
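For readers who cannot open the gist, here is a minimal sketch of the kind of script described above: it creates a directory named after a single parameter and writes three files into it. This is not the code from the gist. The function name write_sample_files and the file names sample1.txt to sample3.txt are hypothetical, chosen only for illustration.

```python
import os


def write_sample_files(job_name, count=3):
    """Create a directory called job_name and write `count` small text files into it.

    Hypothetical helper: the file names and contents are illustrative only.
    """
    os.makedirs(job_name, exist_ok=True)
    written = []
    for i in range(1, count + 1):
        path = os.path.join(job_name, "sample%d.txt" % i)
        with open(path, "w") as handle:
            handle.write("This is sample file %d\n" % i)
        written.append(path)
    return written


# Example call (relative paths resolve against the current working directory):
# write_sample_files("SampleDataset")
```

Because the directory name is relative, the files land under whatever directory the job is started in, which is consistent with the 'job_working_directory/000/1/SampleDataset' location reported above.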
Just a quick check: did you refresh your history to confirm that the dataset *is* empty? We had the same thing at SANBI, but it turns out that Galaxy creates an empty output collection and only populates it some time after job completion (this is a known UI bug).

See: http://pvh.wp.sanbi.ac.za/2015/09/18/adventures-in-galaxy-output-collections...

(In reply to Steve Cassidy <steve.cassidy@mq.edu.au>, 20 October 2015 at 08:48.)
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Thanks Peter,

I did see that proviso somewhere, but no, refreshing doesn't help. That page was one of those I referred to in getting to this point.

Steve

(In reply to Peter van Heusden <pvh@sanbi.ac.za>, 20 October 2015 at 18:33.)
I suspect that the problem might be in the <discover_datasets> then. I'm not an expert on this, but "__name_and_ext__" turns into the regexp r"(?P<name>.*)\.(?P<ext>[^\.]+)?" in lib/galaxy/tools/parameters/output_collect.py, and is used by the DatasetCollector (line 358). This looks like it should match the filenames you're creating, but I'm not 100% sure how that code works.

One thing I notice is the "directory" argument. If you write the files to the current directory instead of "output_path", can you get it to work?

Peter

(In reply to Steve Cassidy <steve.cassidy@mq.edu.au>, 20 October 2015 at 09:52.)
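The regular expression quoted above for "__name_and_ext__" can be tried outside Galaxy with Python's re module. The sketch below only shows how the pattern splits a file name into its name and ext groups. It uses re.match, which is an assumption: Galaxy's collector may apply the pattern differently. The helper split_name_and_ext and the sample file names are hypothetical.

```python
import re

# Pattern as quoted in the thread for "__name_and_ext__".
NAME_AND_EXT = r"(?P<name>.*)\.(?P<ext>[^\.]+)?"


def split_name_and_ext(filename):
    """Return (name, ext) if the file name matches the pattern, else None."""
    match = re.match(NAME_AND_EXT, filename)
    if match is None:
        return None
    return match.group("name"), match.group("ext")


# Hypothetical file names, used only to exercise the pattern.
for filename in ["sample1.txt", "reads.fastq.gz", "README"]:
    print(filename, "->", split_name_and_ext(filename))
```

A name with several dots is split at the last one, and a name with no dot does not match at all. Ordinary files such as sample1.txt do match, which supports the suspicion that the pattern itself is not what is failing here.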
Yes, I'm sure that's where the problem lies. Writing out to the current directory doesn't work. The files get written to 'job_working_directory/000/1/', but if I run the Upload File tool the result is placed in 'files/000/'.

I think I need to work out where to write the files. I found some references to $__new_file_path__, but that doesn't seem to help.

Steve

(In reply to Peter van Heusden <pvh@sanbi.ac.za>, 20 October 2015 at 19:57.)
participants (2)
- Peter van Heusden
- Steve Cassidy