Resolving requirement type "binary" and "python-module"
Hi all,
we run Galaxy on a cluster with real user job submission (via DRMAA). The problem is getting the Galaxy standard tools to work with environment-modules.
We use environment-modules to fit the environment to the needs of a job and want to do the same for Galaxy jobs. I fully expect that we will need to handle different versions of the same program on our Galaxy instance.
It is no problem to configure individual environments for new tools, as we can simply set the necessary <requirement type="package" version="xyz"> entries in the tool.xml and put a 'module load xyz' in the corresponding env.sh files.
The challenge is to get the Galaxy standard tools to work. Several of them do not define a requirement of type "package". Instead one finds the types "binary" or "python-module", but these are not handled by Managed Tool Dependencies (env.sh). In particular, tools that require python-module=rpy should also have a package=R, and for python-module=Gnuplot I would also need a package='gnuplot' entry to trigger the corresponding module load command.
I also found binary=gnuplot and binary=R; these also need to be loaded on our system.
The only way I can think of solving this is by editing these tool.xml files by hand. Or do you think there is a more general solution? Are there by chance any methods to resolve binaries and python-modules? (I could not find anything about it.)
Greetings, Phil
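To get an overview of which standard tools are affected, the tool XML files can be scanned for requirements that are not of type "package". The following is a minimal sketch, not part of Galaxy: the sample XML, the tool id and the function names are made up for illustration, and only the <requirement type="...">name</requirement> layout is taken from the message above.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical tool XML, trimmed to the parts that matter here.
SAMPLE_TOOL_XML = """
<tool id="example_plot" name="Example plot">
  <requirements>
    <requirement type="python-module">rpy</requirement>
    <requirement type="binary">gnuplot</requirement>
    <requirement type="package" version="1.0">xyz</requirement>
  </requirements>
</tool>
"""


def requirements_by_type(tool_xml):
    """Group the requirement names declared in one tool XML by their type."""
    root = ET.fromstring(tool_xml)
    grouped = defaultdict(list)
    for req in root.iter("requirement"):
        grouped[req.get("type")].append((req.text or "").strip())
    return dict(grouped)


def unmanaged_requirements(tool_xml, managed_types=("package",)):
    """Return (type, name) pairs whose type is not in managed_types."""
    return [
        (req_type, name)
        for req_type, names in requirements_by_type(tool_xml).items()
        for name in names
        if req_type not in managed_types
    ]


if __name__ == "__main__":
    for req_type, name in unmanaged_requirements(SAMPLE_TOOL_XML):
        print(f"{req_type}: {name}")
```

Run over the real tool directory, a listing like this would show which module names need an env.sh or a hand-edited requirement entry.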
Phil,
Galaxy uses a number of ways to resolve tool dependencies, some of which may be useful in your situation:
1. If a tool dependency entry exists in the database that matches the name, type, and version, it attempts to load tool_dependency_dir/package_name/version/repository_owner/repository_name/changeset_revision/env.sh
2. If not, it looks for tool_dependency_dir/package_name/version/env.sh or tool_dependency_dir/package_name/default/env.sh
3. Even if no env.sh is found in the above steps, it attempts to run the command defined by the tool, using executable files from $PATH, python modules from the python path, and so on.
If none of the above locations contains the package or module required by the tool, it will of course not run successfully.
For more details and step-by-step instructions on configuring your Galaxy installation with tool dependencies, see http://wiki.galaxyproject.org/Admin/Config/Tool%20Dependencies
Also, a list of software used by Galaxy standard tools can be found at http://wiki.galaxyproject.org/Admin/Tools/Tool%20Dependencies
--Dave B.
On 09/09/2013 10:09 AM, Hans-philipp Brachvogel wrote:
The only way I can think of solving this is by editing these tool.xml files by hand. Or do you think there is a more general solution? Are there by chance any methods to resolve binaries and python-modules? (I could not find anything about it.)
Greetings, Phil
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
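The lookup order in Dave's reply can be sketched roughly as follows. This is an illustration of the three steps as described in the message, not Galaxy's actual implementation: the function names are hypothetical, and the optional installed tuple merely stands in for a matching tool dependency entry in the database.

```python
import os


def candidate_env_scripts(dep_dir, name, version=None, installed=None):
    """List env.sh candidates in the order described in the reply.

    installed is an optional (owner, repository, changeset) tuple standing in
    for a matching tool dependency entry in the database (step 1).
    """
    candidates = []
    if installed is not None and version is not None:
        owner, repository, changeset = installed
        candidates.append(
            os.path.join(dep_dir, name, version, owner, repository, changeset, "env.sh")
        )
    if version is not None:
        # Step 2, first half: a versioned directory.
        candidates.append(os.path.join(dep_dir, name, version, "env.sh"))
    # Step 2, second half: the "default" directory.
    candidates.append(os.path.join(dep_dir, name, "default", "env.sh"))
    return candidates


def find_env_script(dep_dir, name, version=None, installed=None):
    """Return the first existing env.sh, or None to fall back to $PATH (step 3)."""
    for path in candidate_env_scripts(dep_dir, name, version, installed):
        if os.path.isfile(path):
            return path
    return None
```

A return value of None corresponds to step 3: nothing is sourced, and the tool command simply relies on whatever is already on $PATH and the Python path.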
Hey Dave,
thanks for the reply! I guess I wrote too much and explained badly what I meant. I had already tested the Managed Tool Dependencies, but my problem was that those only work for <requirement type="package">xyz</requirement>
They do not handle <requirement type="binary"... or <requirement type="python-module", which is what I am looking for (because of our special setting). Anyhow, I searched a bit and found in galaxy_dir/lib/galaxy/tools/__init__.py at line 2671:
if requirement.type in [ 'package', 'set_environment' ]:
    script_file, base_path, version = self.app.toolbox.dependency_manager.find_dep...
So I changed that line to:
if requirement.type in [ 'package', 'set_environment', 'binary', 'python-module' ]:
This works for me. If I have e.g. <requirement type="python-module">rpy</requirement>, I can now use dep_dir/rpy/default/env.sh to also load R into the environment.
Now I wonder whether that change will have any bad consequences. I could not find anything about the requirement types 'binary' and 'python-module'. What are they normally needed for? Could this be relevant when using the Tool Shed?
Phil
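With that change in place, each "binary" or "python-module" requirement needs an env.sh under dep_dir that loads the matching environment module. A small helper along these lines could create those files. It is only a sketch: write_module_env is a hypothetical name, and the module name "R" is an assumption about how modules are named on the cluster.

```python
import os
import tempfile


def write_module_env(dep_dir, requirement_name, modules, version="default"):
    """Create dep_dir/<requirement_name>/<version>/env.sh with module load lines."""
    target_dir = os.path.join(dep_dir, requirement_name, version)
    os.makedirs(target_dir, exist_ok=True)
    env_path = os.path.join(target_dir, "env.sh")
    with open(env_path, "w") as handle:
        for module in modules:
            handle.write(f"module load {module}\n")
    return env_path


if __name__ == "__main__":
    # Use a throwaway directory here; a real run would point at the
    # configured tool dependency directory instead.
    with tempfile.TemporaryDirectory() as dep_dir:
        path = write_module_env(dep_dir, "rpy", ["R"])
        with open(path) as handle:
            print(handle.read(), end="")
```

For the rpy example from the message, this writes a one-line env.sh containing the module load command for R.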
On Tue, Sep 10, 2013 at 9:16 AM, Hans-philipp Brachvogel <hans-philipp.brachvogel@student.uni-tuebingen.de> wrote:
Now I wonder whether that change will have any bad consequences. I could not find anything about the requirement types 'binary' and 'python-module'. What are they normally needed for? Could this be relevant when using the Tool Shed?
Phil
These dependency declarations pre-date the Tool Shed and are (as far as I know) ignored by the Tool Shed. They may currently be non-functional, but I think they could be useful with a little magic added to the Galaxy jobs to check the dependency as part of the shell script submitted for the job (to give nice clear failure errors about missing dependencies, rather than whatever cryptic error the tool may give).
Peter
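The kind of check Peter describes could look roughly like the sketch below, which generates shell lines to put at the top of a job script. Nothing here is existing Galaxy code: preflight_lines is a hypothetical helper, and it assumes that a python-module requirement name is also the importable module name, which may not hold for every tool.

```python
import shlex


def preflight_lines(requirements):
    """Build shell lines that fail early with a clear message.

    requirements is an iterable of (type, name) pairs; only 'binary' and
    'python-module' are checked, other types are skipped.
    """
    lines = []
    for req_type, name in requirements:
        quoted = shlex.quote(name)
        if req_type == "binary":
            # command -v succeeds only if the executable is found on $PATH.
            check = f"command -v {quoted} >/dev/null 2>&1"
        elif req_type == "python-module":
            check = f"python -c {shlex.quote('import ' + name)} >/dev/null 2>&1"
        else:
            continue
        message = shlex.quote(f"Missing {req_type} dependency: {name}")
        lines.append(f"{check} || {{ echo {message} >&2; exit 1; }}")
    return lines


if __name__ == "__main__":
    for line in preflight_lines([("binary", "gnuplot"), ("python-module", "rpy")]):
        print(line)
```

Prepending these lines to the submitted script would make a job stop with a "Missing ... dependency" message on stderr before the tool command itself is run.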
Okay. So 'binary' and 'python-module' are currently just treated differently from 'package' in this context because they are kind of not in use? But isn't it then a good idea to also include these type tags in the dependency management system by generally changing this line to:
if requirement.type in [ 'package', 'set_environment', 'binary', 'python-module' ]:
or to just drop the whole if clause?
Phil
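The two options raised in the last message differ only in which requirements reach the env.sh lookup. The sketch below models that difference in isolation. The Requirement tuple and select_for_lookup are hypothetical stand-ins for illustration and do not mirror Galaxy's classes.

```python
from collections import namedtuple

# Hypothetical stand-in for a parsed <requirement> entry.
Requirement = namedtuple("Requirement", ["name", "type", "version"])


def select_for_lookup(requirements, handled_types=("package", "set_environment")):
    """Return the requirements that would be passed on to the env.sh lookup.

    handled_types=None models dropping the if clause: nothing is filtered.
    """
    if handled_types is None:
        return list(requirements)
    return [req for req in requirements if req.type in handled_types]


if __name__ == "__main__":
    reqs = [
        Requirement("xyz", "package", "1.0"),
        Requirement("rpy", "python-module", None),
        Requirement("gnuplot", "binary", None),
    ]
    extended = ("package", "set_environment", "binary", "python-module")
    print([r.name for r in select_for_lookup(reqs)])
    print([r.name for r in select_for_lookup(reqs, extended)])
    print([r.name for r in select_for_lookup(reqs, None)])
```

With an explicit list, a requirement type that is not named is still skipped; without the check, any type string found in a tool XML would trigger a lookup.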
participants (3)
- Dave Bouvier
- Hans-philipp Brachvogel
- Peter Cock