environment variables and paths for toolshed tools
Are there any environment variables that are honored by toolshed-installed tools? I tried creating a tool that uses ${GALAXY_DATA_INDEX_DIR} or $GALAXY_HOME, then uploaded it to the test toolshed, then installed it automatically. Neither of these resolved to what I expected. I don't want to hard-code the path in the XML tool file, but rather have a default location for other executables and jar files. How should this best be done?

David Hoover
Helix Systems Staff
Hello David,

This is not currently possible, but will be available in the next Galaxy distribution release, currently scheduled for about 10 days from now. You will include a tool_dependencies.xml file in your tool shed repository that looks something like the following.

    <?xml version="1.0"?>
    <tool_dependency>
        <set_environment version="1.0">
            <environment_variable name="JAVA_JAR_PATH" action="set_to">$INSTALL_DIR</environment_variable>
        </set_environment>
    </tool_dependency>

Your tool will find the required files via the defined JAVA_JAR_PATH by using a <requirements> tag set in the tool config, something like this:

    <requirements>
        <requirement type="set_environment">JAVA_JAR_PATH</requirement>
    </requirements>

The <command> tag in the tool config would be something like this:

    <command interpreter="python">
        python_wrapper.py $JAVA_JAR_PATH/some_file.jar $param1 $param2...
    </command>

I'm close to having this working, so if you are interested in testing before the next Galaxy dist release, let me know and I'll tell you when the Galaxy central repository has the new feature.

Thanks!

Greg Von Kuster

On Sep 12, 2012, at 1:00 PM, David Hoover wrote:

> Are there any environment variables that are honored by toolshed installed tools?

___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
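The thread never shows python_wrapper.py itself. As a rough sketch of what such a wrapper might do with the jar path it receives as its first argument: the code below assumes the jar is runnable with `java -jar` and that the remaining arguments are passed through unchanged. Both are assumptions for illustration, and `build_java_command` and `main` are hypothetical names.

```python
import subprocess
import sys


def build_java_command(jar_path, params):
    """Build the argument list for running a jar (assumes `java -jar` is appropriate)."""
    return ["java", "-jar", jar_path] + list(params)


def main(argv):
    """argv[1] is the jar path resolved from the command line; the rest are tool parameters."""
    if len(argv) < 2:
        sys.stderr.write("usage: python_wrapper.py JAR [PARAM ...]\n")
        return 2
    # Hand the jar and parameters to Java and pass its exit code back to the caller.
    return subprocess.call(build_java_command(argv[1], argv[2:]))


# A real wrapper script would end with:
#     sys.exit(main(sys.argv))
```

Keeping the command construction in its own function makes it easy to check without Java installed.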
Hi guys,

Reading the rest of the thread from December, Greg's new code did get checked in and should be working (I've not tried this yet).

However, I can't help feeling that defining the two environment variables David suggested, $GALAXY_DATA_INDEX_DIR and/or $GALAXY_HOME, would cover the majority of use cases and be far simpler for tool authors to use. Am I overlooking something?

Use case: Tool comes with a simple data file (e.g. defining a search motif, say motif.dat) which is used by a script (say motif.py) via a normal Galaxy tool XML file (say motif.xml).

Perhaps I can just put my data file next to the script and XML file, in which case it is easy for the script to locate? But I assumed that Galaxy best practice would be to use the tool-data folder somehow...

Thanks,
Peter
Hello Peter, please see my response at the end.

On Feb 15, 2013, at 6:47 AM, Peter Cock wrote:

> Usecase: Tool comes with a simple datafile (e.g. defining a search motif, say motif.dat) which is used by a script (say motif.py) via a normal Galaxy tool XML file (say motif.xml).
>
> Perhaps I can just put my data file next to the script and XML file, in which case it is easy for the script to locate? But I assumed that Galaxy best practice would be to use the tool-data folder somehow...
The current implementation places your motif.dat data file in the repository installation directory, and tools that are properly configured to locate dependencies that are included in the repository contents will find it. This approach allows for tool dependency discovery and easy maintenance when uninstalling repositories, as all contents are kept where they were installed. Tools that are properly configured with <requirement> tags will find dependencies because the associated env.sh file is sourced, providing the location of all dependencies to the environment.

If your motif.dat file is one that would be manually altered by the Galaxy administrator over time, perhaps automatically moving it to the GALAXY_DATA_INDEX_DIR is justified. However, in this case, uninstalling the repository would probably not uninstall the motif.dat file. Does this fit with what you are thinking?

What kinds of files would require knowledge of the GALAXY_HOME directory?

Thanks Peter,

Greg Von Kuster
On Fri, Feb 15, 2013 at 4:05 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
> The current implementation places your motif.dat data file in the repository installation directory, and tools that are properly configured to locate dependencies that are included in the repository contents will find it. This approach allows for tool dependency discovery and easy maintenance when uninstalling repositories as all contents are kept where they were installed. Tools that are properly configured with <requirement> tags will find dependencies because the associated env.sh file is sourced, providing the location of all dependencies to the environment.

Is there a documented example I should be reading? In this use case (without any explicit XML markup), can I assume/have the motif.dat file be installed in the same folder next to the motif.py script and tool defining motif.xml file?

> If your motif.dat file is one that would be manually altered by the Galaxy administrator over time, perhaps automatically moving it to the GALAXY_DATA_INDEX_DIR is justified. However, in this case, uninstalling the repository would probably not uninstall the motif.dat file. Does this fit with what you are thinking?

In this case no, I would not expect the motif.dat file to be edited. Does this mean GALAXY_DATA_INDEX_DIR and the galaxy tool-data folder are intended for 'configuration' files the local admin may need to edit, rather than static tool specific data file?

> What kinds of files would require knowledge of the GALAXY_HOME directory?

Knowing GALAXY_DATA_INDEX_DIR would probably cover most cases, so knowing GALAXY_HOME is probably not needed.

Regards,
Peter
Hi Peter,

On Feb 15, 2013, at 11:14 AM, Peter Cock wrote:
> Is there a documented example I should be reading? In this use case (without any explicit XML markup), can I assume/have the motif.dat file be installed in the same folder next to the motif.py script and tool defining motif.xml file?

This scenario uses a tool dependency that is included in the repository, which is discussed in the following section of the tool shed wiki. Using this approach, a tool config (Cheetah template) would have to be configured with the proper <requirement> tag set and an associated <command> tag set that would ultimately have the path to the motif.py script and motif.dat file defined in the environment. Of course, this assumes the motif.py script is not the tool executable itself.

http://wiki.galaxyproject.org/ToolShedToolFeatures#Finding_dependencies_incl...

> In this case no, I would not expect the motif.dat file to be edited.
>
> Does this mean GALAXY_DATA_INDEX_DIR and the galaxy tool-data folder are intended for 'configuration' files the local admin may need to edit, rather than static tool specific data file?

Not necessarily, but with the exception of sample .loc files included in the repository, repository contents should not generally be required to be moved outside of their installation directory.

> Knowing GALAXY_DATA_INDEX_DIR would probably cover most cases, so knowing GALAXY_HOME is probably not needed.

I believe that the current implementation supports locating all tool dependencies, whether included in the repository or third-party, so I'm not sure introducing GALAXY_DATA_INDEX_DIR to the repository installation process will be beneficial. I can be swayed, however, if I see justification for supporting it.
On Fri, Feb 15, 2013 at 4:32 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
> Of course, this assumes the motif.py script is not the tool executable itself.

Actually for the example I am working on, the motif.py script would be the tool executable, called from motif.xml like this:

    <command interpreter="python">motif.py $fasta_file $tabular_file</command>

i.e. A completely self-contained Galaxy tool consisting of four files:

    motif.xml - Galaxy tool definition
    motif.txt - README file
    motif.py  - Python script which is the tool
    motif.dat - Data file used by the Python script

I am hoping that no further XML configuration files are needed in order to handle the data file.

Peter
On Feb 15, 2013, at 11:45 AM, Peter Cock wrote:

> I am hoping that no further XML configuration files are needed in order to handle the data file.
In this case, your motif.xml file can be configured with the following:

    <command interpreter="python">
        motif.py $fasta_file $tabular_file
    </command>
    <requirements>
        <requirement type="set_environment">MOTIF_DAT_PATH</requirement>
    </requirements>

and your repository will need to include the following tool_dependencies.xml file:

    <?xml version="1.0"?>
    <tool_dependency>
        <set_environment version="1.0">
            <environment_variable name="MOTIF_DAT_PATH" action="set_to">$REPOSITORY_INSTALL_DIR</environment_variable>
        </set_environment>
    </tool_dependency>
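The reply does not show how motif.py would then pick up the value. A minimal sketch, assuming MOTIF_DAT_PATH reaches the script's process environment (as the earlier description of env.sh being sourced suggests) and that it holds the directory containing motif.dat. `motif_dat_from_env` is a hypothetical helper name, not part of Galaxy.

```python
import os


def motif_dat_from_env(environ=None, filename="motif.dat"):
    """Return the path to the data file inside the directory named by MOTIF_DAT_PATH.

    Assumption: the variable is exported to the job environment and points at
    the repository installation directory.
    """
    environ = os.environ if environ is None else environ
    base_dir = environ.get("MOTIF_DAT_PATH")
    if not base_dir:
        # Fail loudly rather than guess at a location.
        raise RuntimeError("MOTIF_DAT_PATH is not set in the environment")
    return os.path.join(base_dir, filename)
```

Passing the environment in as a parameter keeps the lookup easy to exercise outside Galaxy, for example with a plain dictionary.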
On Fri, Feb 15, 2013 at 4:57 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:

> In this case, your motif.xml file can be configured with the following: [...] and your repository will need to include the following tool_dependencies.xml file: [...]
That seems overly complicated. Would I not be able to assume that the motif.xml, motif.py, motif.dat etc are all in the same folder (then I can just use a relative path to find the data file - nice and easy).

If not, how about simply defining $REPOSITORY_INSTALL_DIR within the XML language so I can do this:

    <command interpreter="python">motif.py $fasta_file $tabular_file $REPOSITORY_INSTALL_DIR/motif.dat</command>

And/or defining $REPOSITORY_INSTALL_DIR as an environment variable which I can then use in the script (be it Python, Perl, shell, etc)?

Thanks,
Peter
Hi Peter,

Thanks for your thoughts on this. I've created the following Trello card for this enhancement.

https://trello.com/card/galaxy-tool-enhancement-to-accommodate-repository-in...

The priority for the tool shed in the area of tool dependencies like this has been the implementation of features that enable sharing a single dependency across multiple tools in separate repositories, so your scenario is not yet supported.

This enhancement would actually be as much (or perhaps more) on the Galaxy end than the tool shed, I think, so there may be more resources available to get to it sooner than I can right now. Of course, someone will get to it as soon as possible.

Thanks again for all of your feedback and insight on this.

Greg Von Kuster

On Feb 15, 2013, at 12:07 PM, Peter Cock wrote:

> That seems overly complicated. Would I not be able to assume that the motif.xml, motif.py, motif.dat etc are all in the same folder (then I can just use a relative path to find the data file - nice and easy).
On Sat, Feb 16, 2013 at 9:41 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:

> The priority for the tool shed in the area of tool dependencies like this has been the implementation of features that enable sharing a single dependency across multiple tools in separate repositories, so your scenario is not yet supported.
Hi Greg,

I've read http://wiki.galaxyproject.org/InstallingRepositoriesToGalaxy which is targeted at Galaxy administrators rather than tool authors. It explains how a revision-specific folder is assigned to each installed tool, but what (if anything) of the tool's folder structure is preserved?

e.g. One of the paths given is:

    .../shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes/a89163f31369/emboss_datatypes/datatypes_conf.xml

The prefix is the revision-specific path for that tool for that Tool Shed,

    .../shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes/a89163f31369/

Within that, it seems to preserve the emboss_datatypes folder. i.e. go to http://toolshed.g2.bx.psu.edu/view/devteam/emboss_datatypes and click "browse repository tip files" from the top left actions menu; it has just one file:

    emboss_datatypes/datatypes_conf.xml

Based on a sample of one, it seems the installation process just unpacks the tool shed files (well, probably using hg rather than unpacking a tar-ball), and preserves their relative path structure. If so, I can just use relative paths to find a data file from the tool executable's own location on disk. (This is what I was hoping would be the case - I'm looking for explicit confirmation).

i.e. For the example use case, if motif.py and motif.dat are in the same folder in the Tool Shed upload, they will be in the same folder as each other once installed. That way motif.py can easily locate the data file motif.dat by looking in the same folder as it itself is located. This is actually very simple (provided I can assume the local folder structure is maintained).

(I think I was originally on the wrong track by assuming I should be using the tool-data folder, which is complicated by not knowing where that will exist on disk).

Thanks,
Peter
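To make the layout in that example path concrete, here is a small sketch that splits an installed path into its parts. It is based only on the single example path quoted above; the field names (`tool_shed`, `owner`, `name`, `changeset_revision`) are one reading of that path and `split_installed_path` is a hypothetical helper, not a Galaxy API.

```python
def split_installed_path(path):
    """Split an installed-repository path into its components.

    Assumes the layout <...>/<tool shed host>/repos/<owner>/<name>/<revision>/<files>,
    inferred from the one example path in the thread.
    """
    parts = path.split("/")
    i = parts.index("repos")  # raises ValueError if the marker folder is absent
    if i < 1 or len(parts) < i + 4:
        raise ValueError("path does not match the assumed layout: %r" % path)
    return {
        "tool_shed": parts[i - 1],
        "owner": parts[i + 1],
        "name": parts[i + 2],
        "changeset_revision": parts[i + 3],
        # Everything after the revision folder mirrors the repository's own structure.
        "relative_path": "/".join(parts[i + 4:]),
    }
```

For the example path, the `relative_path` component comes out as `emboss_datatypes/datatypes_conf.xml`, which matches the single file listed in the repository.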
Hi Peter, see below...

On Feb 19, 2013, at 5:10 AM, Peter Cock wrote:
> I've read http://wiki.galaxyproject.org/InstallingRepositoriesToGalaxy which is targeted at Galaxy administrators rather than tools authors. It explains how a revision specific folder is assigned to each installed tool, but what (if anything) of the Tool's folder structure is preserved?

The installation process uses the mercurial API, so the process to install a specific repository changeset_revision ultimately includes the following 2 hg commands.

    hg clone <repository_clone_url>
    hg update -r changeset_revision

This results in the structure of the installed repository being exactly the same as that of the specific revision from which the repository was cloned.

> Within that, it seems to preserve the emboss_datatypes folders.

Yes, the complete directory hierarchy and content of the original repository revision is preserved when it is installed into Galaxy.

> Based on a sample of one, it seems the installation process just unpacks the tool shed files (well, probably using hg rather than unpacking a tar-ball), and preserves their relative path structure. If so, I can just use relative paths to find a data file from the tool executable's own location on disk. (This is what I was hoping would be the case - I'm looking for explicit confirmation).

This is correct, hg clone provides this behavior.

> i.e. For the example use case, if motif.py and motif.dat are in the same folder in the Tool Shed upload, they will be in the same folder as each other once installed. That way motif.py can easily locate the data file motif.dat by looking in the same folder as it itself is located. This is actually very simple (provided I can assume the local folder structure is maintained).

Yes, this is correct.

> (I think I was originally on the wrong track by assuming I should be using the tool-data folder, which is complicated by not knowing where that will exist on disk).

I think your request is still valid though and we will add this enhancement for the Galaxy tool component to handle $REPOSITORY_INSTALL_DIR in tool configs. In the meantime, it looks like you can work around this missing feature. Please let me know if you bump into issues.
On Tue, Feb 19, 2013 at 11:39 AM, Greg Von Kuster <greg@bx.psu.edu> wrote:
> Yes, the complete directory hierarchy and content of the original repository revision is preserved when it is installed into Galaxy.
Excellent - that solves this use-case nicely.
> I think your request is still valid though and we will add this enhancement for the Galaxy tool component to handle $REPOSITORY_INSTALL_DIR in tool configs. In the meantime, it looks like you can work around this missing feature. Please let me know if you bump into issues.
I can still see potential uses for $REPOSITORY_INSTALL_DIR both in the tool XML language, and as an environment variable - but I don't need that for the time being. Thanks Greg, all sorted for now, Peter
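The workaround settled on here, finding motif.dat relative to motif.py's own location, might look like the following in Python. This is a minimal sketch: `sibling_path` is a hypothetical helper name, and since the thread does not describe the format of motif.dat, the file is only located, not parsed.

```python
import os


def sibling_path(script_path, filename):
    """Return the path of `filename` in the same folder as `script_path`."""
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return os.path.join(script_dir, filename)


# Inside motif.py this would be used as:
#     motif_dat = sibling_path(__file__, "motif.dat")
```

This relies only on the confirmed behavior that the installed repository keeps the same folder structure as the uploaded one.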
On Tue, Feb 19, 2013 at 12:06 PM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Excellent - that solves this use-case nicely.
The tool I was talking about is actually a re-implementation of predictNLS from the Rost Lab, which uses regular expressions to look for Nuclear Localization Signals (NLS):

https://rostlab.org/owiki/index.php/PredictNLS

My Galaxy version is now available here:

http://toolshed.g2.bx.psu.edu/view/peterjc/predictnls

Touch wood that will automatically install (and comes with a minimal test case too).

Thanks,
Peter
Hello David,

I've just committed change set 7656:6aadac8026cb to the Galaxy central repository that provides the ability to do what you need here. It would be great if you could try things out and let me know if you run into any problems.

I've been working with nikhil-joshi's deseq_and_sam2counts repository in the main Galaxy tool shed because his tools require this new feature as well. This example should provide you with the information you'll need to tweak your tool shed repository so that your tool dependencies are located where they get installed, rather than attempting to move them to ${GALAXY_DATA_INDEX_DIR} or some other location.

Here is the tool_dependencies.xml file entry for locating a directory referred to by an environment variable named R_SCRIPT_PATH.

    <?xml version="1.0"?>
    <tool_dependency>
        <set_environment version="1.0">
            <environment_variable name="R_SCRIPT_PATH" action="set_to">$REPOSITORY_INSTALL_DIR</environment_variable>
        </set_environment>
    </tool_dependency>

The <set_environment> tag is still supported inside <package> tag sets, but when defined at the XML root level, it will locate dependencies included in the installed tool shed repository.
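As an illustration only, the following Python sketch reads the root-level <set_environment> entries from a file like the one above and shows how $REPOSITORY_INSTALL_DIR might be expanded. It is not Galaxy's own code: the function name root_level_environment and the install path are hypothetical, and the plain string replacement is an assumption about how the substitution behaves.

```python
import xml.etree.ElementTree as ET

TOOL_DEPENDENCIES_XML = """<?xml version="1.0"?>
<tool_dependency>
    <set_environment version="1.0">
        <environment_variable name="R_SCRIPT_PATH" action="set_to">$REPOSITORY_INSTALL_DIR</environment_variable>
    </set_environment>
</tool_dependency>
"""


def root_level_environment(xml_text, repository_install_dir):
    """Return {name: (action, value)} for root-level <set_environment> entries."""
    root = ET.fromstring(xml_text)
    result = {}
    # The relative path below matches only <set_environment> elements that are
    # direct children of the root, so anything nested in <package> is skipped.
    for env in root.findall("set_environment/environment_variable"):
        raw_value = (env.text or "").strip()
        # Assumption: the placeholder is replaced by the directory in which
        # the repository was installed.
        value = raw_value.replace("$REPOSITORY_INSTALL_DIR", repository_install_dir)
        result[env.get("name")] = (env.get("action"), value)
    return result


if __name__ == "__main__":
    # "/hypothetical/install/dir" stands in for the real installation path.
    print(root_level_environment(TOOL_DEPENDENCIES_XML, "/hypothetical/install/dir"))
```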
Of course, the above tool_dependencies.xml file could also include entries for tool dependencies that are packages. For example:

    <?xml version="1.0"?>
    <tool_dependency>
        <set_environment version="1.0">
            <environment_variable name="R_SCRIPT_PATH" action="set_to">$REPOSITORY_INSTALL_DIR</environment_variable>
        </set_environment>
        <package name="R" version="2.15.1">
            <install version="1.0">
                <actions>
                    <action type="download_by_url">http://CRAN.R-project.org/src/base/R-2/R-2.15.1.tar.gz</action>
                    <action type="shell_command">./configure --prefix=$INSTALL_DIR</action>
                    <action type="shell_command">make</action>
                    <action type="set_environment">
                        <environment_variable name="PATH" action="prepend_to">$INSTALL_DIR/bin</environment_variable>
                    </action>
                </actions>
            </install>
            <readme>
    You need a FORTRAN compiler or perhaps f2c in addition to a C compiler to build R.
            </readme>
        </package>
    </tool_dependency>

These tool dependency definitions are handled as described in the following section of the tool shed wiki: http://wiki.g2.bx.psu.edu/Tool%20Shed#Automatic_third-party_tool_dependency_...

So in order for the dependencies to be handled when the repository is installed, they must be defined in at least one <requirement> tag in at least one tool config in the repository. The <requirements> tag set in the deseq.xml tool config file will look something like this (there are more requirement tags defined here than in the tool_dependencies.xml file above, but you should get the idea).
    <requirements>
        <requirement type="set_environment">R_SCRIPT_PATH</requirement>
        <requirement type="package" version="2.15.1">R</requirement>
        <requirement type="package" version="2.10">Bioconductor</requirement>
        <requirement type="package" version="1.8.3">DESeq</requirement>
        <requirement type="package" version="1.24.0">aroma.light</requirement>
        <requirement type="package" version="0.20-6">lattice</requirement>
    </requirements>

The <command> tag set in the deseq.xml tool config is also slightly altered. Here is the way it looks before using this new approach:

    <command interpreter="python">
        stderr_wrapper.py Rscript ${GALAXY_DATA_INDEX_DIR}/deseq.R $counts $column_types $comparison $top_table $diagnostic_html "$diagnostic_html.files_path" "$counts.name"
    </command>

To use this new feature, the command string now uses the R_SCRIPT_PATH environment variable (notice the required backslash to escape the $):

    <command interpreter="python">
        stderr_wrapper.py Rscript \$R_SCRIPT_PATH/deseq.R $counts $column_types $comparison $top_table $diagnostic_html "$diagnostic_html.files_path" "$counts.name"
    </command>

When installed from the tool shed, a tool dependency object named R_SCRIPT_PATH will be created and associated with the installed repository. The dependency will have a pointer to the env.sh file that is created to set the value of the R_SCRIPT_PATH environment variable.

Let me know if you bump into any issues in getting this working for your tools.

Thanks!

Greg Von Kuster

On Sep 12, 2012, at 1:00 PM, David Hoover wrote:
Are there any environment variables that are honored by toolshed installed tools? I tried creating a tool that uses ${GALAXY_DATA_INDEX_DIR} or $GALAXY_HOME, then uploaded it to the test toolshed, then installed it automatically. Neither of these resolved to what I expected. I don't want to hard-code the path in the xml tool file, but rather have a default location for other executables and jar files. How should this best be done?
David Hoover
Helix Systems Staff
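Greg's reply above notes that a dependency is only handled at install time if some tool config references it in a <requirement> tag. A rough way to sanity-check a repository before uploading it could look like the Python sketch below. It is an assumption-laden illustration, not part of Galaxy: the function names are hypothetical, and matching on type, name, and version is a guess at what counts as "referenced".

```python
import xml.etree.ElementTree as ET


def declared_dependencies(tool_dependencies_xml):
    """Collect (type, name, version) for each entry in tool_dependencies.xml."""
    root = ET.fromstring(tool_dependencies_xml)
    deps = set()
    # Root-level environment variables carry no version of their own.
    for env in root.findall("set_environment/environment_variable"):
        deps.add(("set_environment", env.get("name"), None))
    for pkg in root.findall("package"):
        deps.add(("package", pkg.get("name"), pkg.get("version")))
    return deps


def required_dependencies(tool_config_xml):
    """Collect (type, name, version) for each <requirement> in one tool config."""
    root = ET.fromstring(tool_config_xml)
    reqs = set()
    for req in root.iter("requirement"):
        name = (req.text or "").strip()
        reqs.add((req.get("type"), name, req.get("version")))
    return reqs


def unreferenced(tool_dependencies_xml, tool_config_xmls):
    """Return declared dependencies that no tool config lists as a requirement."""
    required = set()
    for config in tool_config_xmls:
        required |= required_dependencies(config)
    return declared_dependencies(tool_dependencies_xml) - required
```

Calling unreferenced() with the text of tool_dependencies.xml and a list of tool config texts returns an empty set when every declared dependency is referenced by at least one tool.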
participants (3)
- David Hoover
- Greg Von Kuster
- Peter Cock