We've run into a scenario lately where we need to run a very large workflow
(huge data in intermediate steps) many times. We can't do this because
Galaxy copies all intermediate step data to all nodes, which would bog down
the servers too much.
I asked about something similar before, and John mentioned that a feature to
automatically delete a workflow's intermediate step data once it completes
was coming soon. Is that a feature now? That would help.
Ultimately though we can't be copying all this data around to all nodes.
The network just isn't good enough, so I have an idea.
What if we had an option on the 'run workflow' screen to run on only one
node (unfortunately eliminating Galaxy's neat concurrency for that
workflow)? Then only the final step's data would be propagated.
Or maybe only copy to a couple other nodes, to keep concurrency.
If the job errors, in this case I think it should just throw out all the
data, or propagate whatever was produced up to the point where it stopped.
I've been trying to implement this myself, but it's taking me a long time. I
have only just started understanding the pyramid stack, and am putting the
checkbox in the run.mako template. I still need to learn the database
schema, message passing, how jobs are stored, and how to tell Condor to use
only one node (and more, I'm sure) in Galaxy. (I'm drowning.)
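For reference, the checkbox I'm adding might look roughly like the fragment below (a sketch only; the `single_node` parameter name is just a placeholder I invented, not an existing Galaxy option):

```html
<!-- hypothetical addition to templates/webapps/galaxy/workflow/run.mako -->
<div class="form-row">
    <input type="checkbox" id="single_node" name="single_node" value="true" />
    <label for="single_node">
        Run this workflow on a single node (skips Galaxy's concurrency,
        but avoids copying intermediate data between nodes)
    </label>
</div>
```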
This seems like a really important feature though as Galaxy gains more
traction as a research tool for bigger projects that demand working with
huge data, and running huge workflows many many times.
I've implemented some rather basic tool access control and am looking
for feedback on my implementation.
Our organisation wanted the ability to restrict tools to particular
users/roles. I've implemented this as an "execute" attribute which can be
applied to either <section> or <tool> elements in the tool configuration file.
# Example galaxy-admin changes
<section execute="a@a.co,b@b.co" id="EncodeTools" name="ENCODE Tools">
    <tool file="encode/gencode_partition.xml" />
    <tool execute="b@b.co" file="encode/random_intervals.xml" />
</section>
which would allow A and B to access gencode_partition, but only B would
be able to access random_intervals. To put it explicitly:
- by default, everyone can access all tools
- if section level permissions are set, then those are set as defaults
for all tools in that section
- if tool permissions are set, they will override the defaults.
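The three rules above can be sketched as a small resolution function (a hypothetical illustration of the intended semantics; the function names and list representation are mine, not the actual patch):

```python
def effective_execute(section_execute=None, tool_execute=None):
    """Resolve the effective 'execute' list for a tool.

    None means unrestricted: everyone can run the tool (rule 1).
    A section-level list becomes the default for its tools (rule 2),
    and a tool-level list overrides that default (rule 3).
    """
    if tool_execute is not None:
        return tool_execute
    return section_execute

def can_run(user_email, section_execute=None, tool_execute=None):
    allowed = effective_execute(section_execute, tool_execute)
    return allowed is None or user_email in allowed
```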
# Pros and Cons
There are some good features:
- non-accessible tools won't show up in the left-hand tool panel for that user
- non-accessible tools cannot be run or accessed directly.
There are some caveats however.
- existence of tools is not completely hidden.
- Labels are not hidden at all.
- workflows break completely if a shared workflow uses a tool that is
unavailable to the receiving user and that user copies and edits it. The
workflow can be copied and viewed (it reports "tool not found"), but the
copy cannot be edited. This is due to the call to
app.toolbox.tool_panel.items() in
templates/webapps/galaxy/workflow/editor.mako, as that returns the raw
tool list rather than one filtered on whether or not the user has access.
I'm yet to figure out a clean fix for this. Additionally, empty sections
are still shown even if none of their tools are visible.
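One direction for a fix might be to filter the panel before the template sees it; a rough sketch (all names here are my assumptions for illustration, not actual Galaxy API):

```python
def filter_tool_panel(tool_panel, user_email):
    """Return only the panel entries the given user may execute.

    Entries without an 'execute' attribute are treated as unrestricted.
    """
    filtered = {}
    for key, tool in tool_panel.items():
        allowed = getattr(tool, "execute", None)  # None => no restriction
        if allowed is None or user_email in allowed:
            filtered[key] = tool
    return filtered
```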
For a brief overview of my changes, please see the attached diff. (It's
missing one change because I wasn't being careful and started work on
multiple different features)
# Changeset overview
In brief, most of the changes consist of
- a new method in model.User to check whether an array of roles overlaps
at all with the user's roles
- modifications to the appropriate files for reading in the new "execute"
attribute
- modification to get_tool to pass user information, as whether or not a
tool exists is now dependent on who is asking.
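The first bullet might look something like this sketch (hypothetical; the real method name and role representation in the diff may differ):

```python
class User:
    """Minimal stand-in for model.User, showing only the new check."""

    def __init__(self, roles):
        self.roles = set(roles)

    def has_any_role(self, required_roles):
        """Return True if at least one of required_roles is held by the user."""
        return bool(self.roles.intersection(required_roles))
```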
Please let me know if you have input on this before I create a pull
request on this feature.
I believe this will fix a number of previously raised issues (at least to
my understanding of the issues listed).
+ (I saw a solution somewhere that added "_beta" to tool names to restrict
them to developers, but I cannot find it now.)
Center for Phage Technology
Texas A&M University
College Station, TX 77843
I've just updated my local instance to the latest version. With the migration of tools to the Tool Shed, I used the migrate command to keep my tools as they were in previous versions.
$ sh ./scripts/migrate_tools/0008_tools.sh install_dependencies
and I’ve updated the database to 117
$ sh manage_db.sh upgrade
However, when I'm using samtools I'm getting an error related to the data. I'm not sure what the issue is, and I've checked all the .loc and .sh files and everything is pointing to the proper locations. Any ideas what the issue might be?
An error occurred with this dataset:
Could not determine Samtools version
/bin/sh: line 1: 1616 Segmentation fault: 11 samtools 2>&1
Error extracting alignments from (/galaxy-dist/database/files/001/dataset_1585.dat),
While attempting a fresh Galaxy install from both galaxy-central and
galaxy-dist, I ran into a problem initialising the default SQLite database
$ hg clone https://bitbucket.org/galaxy/galaxy-dist
$ cd galaxy-dist
galaxy.model.migrate.check DEBUG 2013-10-31 10:28:26,143 pysqlite>=2
egg successfully loaded for sqlite dialect
Traceback (most recent call last):
OperationalError: (OperationalError) database is locked u'PRAGMA
After some puzzlement, I realised this was down to the file system -
I was trying this under my home directory mounted via a distributed
file system (gluster I think).
Repeating the experiment under /tmp on a local hard disk worked :)
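A minimal way to check this outside Galaxy (a sketch, not Galaxy code): SQLite relies on file locking, which some distributed file systems do not support, so the same PRAGMA that failed above succeeds on a local disk.

```python
import os
import sqlite3
import tempfile

# Create a database on local disk (tempfile.mkdtemp usually lands on a
# local file system); on a gluster/NFS mount without working locks, the
# connect/PRAGMA step is where "database is locked" would appear instead.
db_path = os.path.join(tempfile.mkdtemp(), "universe.sqlite")
conn = sqlite3.connect(db_path)
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
conn.execute("CREATE TABLE migrate_version (version INTEGER)")
conn.commit()
conn.close()
print(mode)
```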
(I'm posting this message for future reference; hopefully Google and/or
mailing list searches will help anyone else facing this error)
I'm at the Florida State University HPC, setting up a Galaxy server that will be used to submit jobs to our HPC cluster. I'm using Apache as a proxy for the Galaxy server, and I'm in the process of setting up LDAP authentication.
I had planned to mount our HPC file system (with user home directories) on the Galaxy server so that users would have access to their data, but I'm wondering if I have the wrong idea about how Galaxy works, and whether there is an easy way to map users to their home directories in Galaxy.
Thanks for any pointers.
I pulled down the latest changes today (it was long overdue), and am
running into the following error when I run the tool migration:
sh ./scripts/migrate_tools/0008_tools.sh install_dependencies
No handlers could be found for logger "galaxy.tools.data"
Repositories will be installed into configured tool_path location
[localhost] local: rm -rf ./database/tmp/tmp-toolshed-mtdWuQjBx
Skipping installation of tool dependency samtools version 0.1.18 since it
is installed in
Traceback (most recent call last):
File "./scripts/migrate_tools/migrate_tools.py", line 21, in <module>
app = MigrateToolsApplication( sys.argv[ 1 ] )
line 83, in __init__
line 121, in __init__
line 509, in install_repository
line 351, in handle_repository_contents
guid = self.get_guid( repository_clone_url, relative_install_dir,
line 259, in get_guid
full_path = str( os.path.abspath( os.path.join( root, name ) ) )
UnboundLocalError: local variable 'name' referenced before assignment
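A guess at the failure pattern, based only on the traceback (the real get_guid code may differ): if os.walk visits a cloned repository containing no files, the inner loop variable `name` is never bound, and the later reference raises exactly this UnboundLocalError.

```python
import os
import tempfile

def first_file_path(top):
    # Mirrors the shape of the failing code: walk a tree expecting to
    # find at least one file, then use the loop variables after the loop.
    for root, dirs, files in os.walk(top):
        for name in files:
            break  # pretend we matched the file we were looking for
    # If 'files' was empty in every directory, 'name' was never assigned:
    return str(os.path.abspath(os.path.join(root, name)))

empty_dir = tempfile.mkdtemp()  # stands in for a clone with no files
try:
    first_file_path(empty_dir)
except UnboundLocalError as exc:
    print("reproduced:", exc)
```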
This doesn't appear to be a known functional test failure (if I am
reading the logs right), but I am also not the best resource to
troubleshoot this particular type of problem. I am going to move this to
the dev list for more feedback. If you want to add more details about
your system (local? tracking galaxy-dist or central?) that might help.
On 12/18/13 2:42 AM, Nicolas Lapalu wrote:
> I tried to run the CCAT functional test, but I get an exception (same
> problem via web form)
> Data are well associated with the hg18 build, which is available in
> Error running CCAT.
> Traceback (most recent call last):
> File "/home/galaxy-dev/tools/peak_calling/ccat_wrapper.py", line 41,
> in <module>
> if __name__ == "__main__": main()
> File "/home/galaxy-dev/tools/peak_calling/ccat_wrapper.py", line 38,
> in main
> return stop_err( tmp_dir, e )
> File "/home/galaxy-dev/tools/peak_calling/ccat_wrapper.py", line 13,
> in stop_err
> raise exception
> AssertionError: The required chromosome length file does not exist.
> Any Idea ?
> Thanks, Nicolas
Hi Jeremy, Anna,
I'm having a similar problem here as well. I would like to move a history from one Galaxy server to another, and I followed the export and import steps. I even ran wget on the destination Galaxy server to check whether it could download the tar.gz file, and that worked. When using the web UI I also get these messages in my log, showing it is actually doing something when I trigger the import:
10.85.13.89 - - [18/Dec/2013:15:10:48 +0200] "POST /history/import_archive HTTP/1.1" 200 - "http://dev1.ab.wurnet.nl:8088/history/import_archive" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36"
galaxy.jobs DEBUG 2013-12-18 15:10:48,391 (3201) Working directory for job is: /home/lukas007/galaxy-dist/database/job_working_directory/003/3201
galaxy.jobs.handler DEBUG 2013-12-18 15:10:48,404 (3201) Dispatching to local runner
galaxy.jobs DEBUG 2013-12-18 15:10:48,478 (3201) Persisting job destination (destination id: local:///)
galaxy.jobs.handler INFO 2013-12-18 15:10:48,514 (3201) Job dispatched
galaxy.jobs.runners.local DEBUG 2013-12-18 15:10:48,742 (3201) executing: export GALAXY_SLOTS="1"; python /home/lukas007/galaxy-dist/lib/galaxy/tools/imp_exp/unpack_tar_gz_archive.py http://galaxy.wur.nl/galaxy_production/history/export_archive?id=b1f249e0... /home/lukas007/galaxy-dist/database/tmp/tmpPGL6le --url
galaxy.jobs DEBUG 2013-12-18 15:10:48,786 (3201) Persisting job destination (destination id: local:///)
galaxy.jobs.runners.local DEBUG 2013-12-18 15:10:48,875 execution finished: export GALAXY_SLOTS="1"; python /home/lukas007/galaxy-dist/lib/galaxy/tools/imp_exp/unpack_tar_gz_archive.py http://galaxy.wur.nl/galaxy_production/history/export_archive?id=b1f249e0... /home/lukas007/galaxy-dist/database/tmp/tmpPGL6le --url
galaxy.jobs DEBUG 2013-12-18 15:10:49,040 job 3201 ended
galaxy.datatypes.metadata DEBUG 2013-12-18 15:10:49,040 Cleaning up external metadata files
However, nothing is visible in my histories list. What could be wrong? I also don't see any error messages in the log above.
Thanks and regards,
From: galaxy-dev-bounces(a)lists.bx.psu.edu [mailto:firstname.lastname@example.org] On Behalf Of Jeremy Goecks
Sent: dinsdag 11 juni 2013 19:06
To: Edlund, Anna
Cc: galaxy-dev(a)bx.psu.edu; Nicola Segata
Subject: Re: [galaxy-dev] problems with exporting or importing data
Try following these steps:
(1) Make your history accessible (Share or Publish --> Make History Accessible by Link).
(2) Export your history again; make sure to wait until you can use the URL to download a .gz file of your history.
(3) Try importing it via URL to the Huttenhower Lab Galaxy.
Let us know if you have any problems.
On Jun 10, 2013, at 11:45 AM, Edlund, Anna wrote:
I have uploaded fasta files to the main Galaxy server through my FileZilla program. I want to transfer them to the Galaxy at the Huttenhower Lab (see http://huttenhower.org/galaxy/root). To do that, I go to the history list while logged in at both locations and select 'Import a History from an archive'. I paste the URL from the main Galaxy server (where my archive is located) and select submit. Then I waited for two days and nothing was transferred. I am clearly doing something wrong and would really like your input asap.
Thank you very much!!
My user name is aedlund(a)jcvi.org<mailto:email@example.com>