Possible to define a datatype corresponding to a directory?
Hi all,
I have just downloaded Galaxy and got up and running with a local installation. I'd like to have a crack at integrating some of our tools into the system.
To start off with, we use a data format that is actually a directory containing multiple (variable number of) files and subdirectories. Is it possible to define a datatype that treats this directory as a single object in terms of using it as tool inputs and outputs?
It seems like I could define a datatype corresponding to a zipped version of the directory (and this would be useful for uploading), but once it's uploaded I don't want the tools to be packing and unpacking zip files all the time.
Cheers, Len.
On Fri, Oct 22, 2010 at 8:21 PM, Len Trigg <len@realtimegenomics.com> wrote:
Is it possible to define a datatype that treats this directory as a single object in terms of using it as tool inputs and outputs?
Hi, Len.
Please take a look at https://bitbucket.org/galaxy/galaxy-central/wiki/CompositeDatatypes and see if that might help.
You can copy or create arbitrary files in the directory stored as the html datatype's extra_files_path. To make them visible and accessible to the user, write a list of href links in a valid html page of the html datatype; the display automagically looks in the extra_files_path to find them.
Composite objects are also great for organizing messy tool outputs for display and might be what you are looking for to pass bunches of files to tools. Note that since your tools will need to look in the html file's extra_files_path, that path needs to be passed to your tools on the command line.
Hope this helps get you going.
_______________________________________________ galaxy-dev mailing list galaxy-dev@lists.bx.psu.edu http://lists.bx.psu.edu/listinfo/galaxy-dev
-- Ross Lazarus MBBS MPH Associate Professor, Harvard Medical School Director of Bioinformatics, Channing Laboratory 181 Longwood Ave., Boston MA 02115, USA. Tel: +1 617 505 4850
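For anyone following along, here is a minimal sketch of the approach Ross describes: a tool script that receives the html output path and its extra_files_path on the command line, writes a file into that directory, and then writes an html page of href links to it. The script layout, the function names (write_index, main), the two positional arguments and the summary.txt stand-in output are all hypothetical, chosen for illustration; how the tool config passes the two paths is not shown here.

```python
import argparse
import html
import os


def write_index(html_path, extra_files_path, title="Tool outputs"):
    """Write an html page linking to each file directly inside extra_files_path."""
    names = sorted(
        name for name in os.listdir(extra_files_path)
        if os.path.isfile(os.path.join(extra_files_path, name))
    )
    lines = ["<html><head><title>%s</title></head><body><ul>" % html.escape(title)]
    for name in names:
        # Relative hrefs: the assumption is that the display resolves them
        # against the dataset's extra_files_path.
        lines.append('<li><a href="%s">%s</a></li>' % (html.escape(name), html.escape(name)))
    lines.append("</ul></body></html>")
    with open(html_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return names


def main(argv=None):
    # Hypothetical command line: <html output path> <extra files directory>
    parser = argparse.ArgumentParser(description="Illustrative composite-output tool")
    parser.add_argument("html_path")
    parser.add_argument("extra_files_path")
    args = parser.parse_args(argv)

    os.makedirs(args.extra_files_path, exist_ok=True)
    # Stand-in for the real tool's output files.
    with open(os.path.join(args.extra_files_path, "summary.txt"), "w") as handle:
        handle.write("example output\n")
    write_index(args.html_path, args.extra_files_path)
```

A real tool would call main() from its entry point and replace the summary.txt stand-in with its own outputs; only the index-writing step is specific to making the files visible in the display.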
Ross wrote:
Please take a look at https://bitbucket.org/galaxy/galaxy-central/wiki/CompositeDatatypes and see if that might help?
You can copy or create arbitrary files in the directory stored as the html datatype's extra_files_path. To make them visible and accessible to the user, write a list of href links in a valid html page of the html datatype - the display automagically deals with looking in the extra_files_path to find them.
OK, that looks like it might do the trick. Do you know if it supports directories being placed in the extra_files_path? (The "Composite Datasets" section of http://bitbucket.org/galaxy/galaxy-central/wiki/ToolsMultipleOutput seems to imply that the additional content can only be files, not directories.)
Cheers, Len.
On Tue, Oct 26, 2010 at 12:51 PM, Len Trigg <len@realtimegenomics.com> wrote:
OK, that looks like it might do the trick. Do you know if it supports directories being placed in the extra_files_path?
Never tried, but I'm going to guess that's possible. Any thoughts, Dan? You'll need to handle the subdirectories yourself, both in the tools that use the data and when writing the links so the display works. Let us know how you go.
(The "Composite Datasets" section of http://bitbucket.org/galaxy/galaxy-central/wiki/ToolsMultipleOutput seems to imply that the additional content can only be files, not directories)
I think it's probably written that way because the code that generates the index for an 'auto_primary_file' composite object was designed to create a single-level index of files. It could probably be made to create links for a recursive walk of the entire tree if that were ever needed, but I think you could use a 'basic' composite object and create your own more complex index.
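As a rough illustration of the "create your own complex index" idea, the sketch below walks a directory tree recursively and renders an html list with one relative link per file. The function names (collect_relative_paths, render_index) are hypothetical, and this is not the code Galaxy uses for 'auto_primary_file' indexes. It only builds the page; whether the display can actually serve linked files that live in subdirectories is a separate question this sketch does not settle.

```python
import html
import os


def collect_relative_paths(root):
    """Return sorted '/'-separated paths, relative to root, of every file under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            found.append(rel.replace(os.sep, "/"))
    return sorted(found)


def render_index(root, title="Directory contents"):
    """Render an html page with one relative link per file found under root."""
    items = "\n".join(
        '<li><a href="%s">%s</a></li>' % (html.escape(path), html.escape(path))
        for path in collect_relative_paths(root)
    )
    return (
        "<html><head><title>%s</title></head><body>\n<ul>\n%s\n</ul>\n</body></html>\n"
        % (html.escape(title), items)
    )
```

The tool would write the returned string to the composite dataset's primary html file after populating the directory.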
On Oct 26, 2010, at 1:21 PM, Ross wrote:
Do you know if it supports directories being placed in the extra_files_path?
Never tried, but I'm going to guess that's possible. Any thoughts, Dan?
You should be able to use subdirectories in extra_files_path for reading and writing within tools. However, due to current limitations in how the web route is defined, only the files found directly within the 'base' extra_files_path can be viewed (e.g. when using the eye icon); this is less than ideal and should be enhanced. Please let us know if you are unable to use subdirectories in tool inputs or outputs.
Thanks for using Galaxy, Dan
Daniel Blankenberg wrote:
You should be able to use subdirectories in extra_files_path for reading and writing within tools, but due to current limitations with how the web-route is defined, you are limited to only viewing the files found within the 'base' extra_files_path (e.g. when using the eye icon); unfortunately this is less than ideal and should be enhanced. Do please let us know if you are unable to use subdirectories in tools input/output.
The extra subdirectories seem to be working fine, thanks.
Another weirdness I noticed is that I have a parameter that is a text type where the percent symbol is allowed to be part of the text. When the tool is run, the percent is translated to an X symbol. How should I work around this?
Another question... I have a tool that produces multiple outputs determined at runtime, so I am following the procedure described in https://bitbucket.org/galaxy/galaxy-central/wiki/ToolsMultipleOutput
Two of the files are SAM files (essentially filtered based on SAM criteria), and I am giving them unique names when placing them in __new_file_path__. Both files appear as history datasets, but the name is not used to let the user know which one is which. Is there a way to name them differently in the history?
Cheers, Len.
Hi Len,
Another weirdness I noticed is that I have a parameter that is a text type where the percent symbol is allowed to be part of the text. When the tool is run, the percent is translated to an X symbol. How should I work around this?
This is due to the parameter sanitization applied during command-line substitution. You can change how tool parameters are sanitized on a per-parameter basis by using the <sanitizer> tag set. This wiki page has more information: http://bitbucket.org/galaxy/galaxy-central/wiki/ToolConfigSyntax (search for '<sanitizer> tag set'). It is also possible to turn sanitization off for an entire tool. Of course, caution is encouraged when modifying the sanitization actions.
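To make the behaviour concrete, here is a toy model of whitelist sanitization. It is not Galaxy's implementation: the sanitize function, the DEFAULT_VALID character set and the invalid_char parameter are assumptions made up for illustration, picked only so that the result matches the symptom reported (a percent sign turning into X). Extending the valid set in the sketch corresponds to the effect a per-parameter <sanitizer> configuration is meant to have.

```python
import string

# Assumed whitelist, for illustration only; the real default set may differ.
DEFAULT_VALID = frozenset(string.ascii_letters + string.digits + " -=_.()/+*^,:?!")


def sanitize(text, valid=DEFAULT_VALID, invalid_char="X"):
    """Replace every character that is not in the valid set with invalid_char."""
    return "".join(ch if ch in valid else invalid_char for ch in text)


# With the assumed default set, the percent sign is not allowed through.
print(sanitize("95%"))                                # prints 95X

# Adding '%' to the valid set leaves the value unchanged.
print(sanitize("95%", valid=DEFAULT_VALID | {"%"}))   # prints 95%
```

The design point is that anything allowed through ends up on a command line, so it is safer to add only the specific characters a parameter needs than to switch sanitization off entirely.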
On Oct 27, 2010, at 6:03 PM, Len Trigg wrote:
Another question... I have a tool that produces multiple outputs determined at runtime, so I am following the procedure described in https://bitbucket.org/galaxy/galaxy-central/wiki/ToolsMultipleOutput Two of the files are SAM files (essentially filtered based on SAM criteria), and I am giving them unique names when placing them in __new_file_path__. Both files appear as history datasets, but the name is not employed to let the user know which one is which. Is there a way to name them differently in the history?
Which revision are you using? Changeset 4465:78d2a72ee1d6 adds the provided name/designation in parentheses to the name of the 'base' output dataset, similar to: 'toolName on data x, y (newDatasetName)'. Let us know if this isn't the behavior you experience with 4465:78d2a72ee1d6 or greater, or if this doesn't address your issue.
Thanks for using Galaxy, Dan
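For the runtime-determined outputs discussed in this exchange, a small helper like the one below can keep the file names consistent. Treat the naming pattern as an assumption: it is my recollection of the convention on the ToolsMultipleOutput wiki page (primary_<id of the primary output>_<designation>_visible_<extension>), so please check that page before relying on it. The function name extra_output_path and the underscore check are hypothetical additions, not part of Galaxy.

```python
import os


def extra_output_path(new_file_path, primary_output_id, designation, ext):
    """Build the path of an additional output file inside new_file_path.

    Assumed pattern (verify against the wiki page):
        primary_<id>_<designation>_visible_<ext>
    """
    if "_" in designation:
        # The assumed pattern separates fields with underscores, so an
        # underscore inside the designation would make the name ambiguous.
        raise ValueError("designation must not contain underscores: %r" % designation)
    filename = "primary_%d_%s_visible_%s" % (primary_output_id, designation, ext)
    return os.path.join(new_file_path, filename)
```

With this, two filtered SAM outputs could be written to extra_output_path(new_file_path, output_id, "mapped", "sam") and extra_output_path(new_file_path, output_id, "unmapped", "sam"); the designations "mapped" and "unmapped" are example values, and the designation is the part that changeset 4465:78d2a72ee1d6 is described as showing in parentheses in the history.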
Hi, I'm running revision 4354:d681ef7538ed and trying to update my Galaxy instance. When I use hg pull it says there are no changes found, but obviously changesets such as the one below have been committed. What could be the problem?
Also, in response to this:
Which revision are you using? Changeset 4465:78d2a72ee1d6 adds the provided name/designation in parenthesis to the name of the 'base' output dataset, similar to: 'toolName on data x, y (newDatasetName)'. Let us know if this isn't the behavior you experience with 4465:78d2a72ee1d6 or greater, or if this doesn't address your issue.
Is it possible to rename the primary output dataset as well, or will this remain the default label given in the tool xml?
Thanks! Shaun Webb
-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Hi Shaun,
4354:d681ef7538ed is the latest revision in galaxy-dist. 4465:78d2a72ee1d6 is currently available in galaxy-central and will be available in galaxy-dist when it is next updated (I'm not sure of the time frame for this). I am sorry for the confusion.
On Oct 28, 2010, at 10:58 AM, SHAUN WEBB wrote:
Is it possible to rename the primary output dataset as well or will this remain the default label given in the tool xml.
You can rename the primary output dataset using the 'label' attribute for the output dataset in the tool definition, but this will also be the base name used before the parentheses in the names of the additional files.
Thanks for using Galaxy, Dan
participants (4)
- Daniel Blankenberg
- Len Trigg
- Ross
- SHAUN WEBB