We've run into a scenario lately where we need to run a very large workflow
(huge data in intermediate steps) many times. We can't do this because
Galaxy copies all intermediate step data to all nodes, which would bog down
the servers too much.
I asked about something similar before, and John mentioned that a feature to
automatically delete a workflow's intermediate step data once it completes
was coming soon. Is that a feature now? That would help.
Ultimately though we can't be copying all this data around to all nodes.
The network just isn't good enough, so I have an idea.
What if we had an option on the 'run workflow' screen to run on only one
node (unfortunately eliminating Galaxy's neat concurrency for that
workflow)? Then only the final step's data would be propagated.
Or maybe only copy to a couple other nodes, to keep concurrency.
If the job errors, in this case I think it should just throw out all the
data, or propagate whatever was produced up to the point where it stopped.
I've been trying to implement this myself, but it's taking me a long time. I
have only just started understanding the pyramid stack, and am putting the
checkbox in the run.mako template. I still need to learn the database
schema, message passing, how jobs are stored, and how to tell Condor to use
only one node (and more, I'm sure) in Galaxy. (I'm drowning.)
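For reference, the checkbox I'm adding might look roughly like the fragment below (a sketch only; the `single_node` parameter name is just a placeholder I invented, not an existing Galaxy option):

```html
<!-- hypothetical addition to templates/webapps/galaxy/workflow/run.mako -->
<div class="form-row">
    <input type="checkbox" id="single_node" name="single_node" value="true" />
    <label for="single_node">
        Run this workflow on a single node (skips Galaxy's concurrency,
        but avoids copying intermediate data between nodes)
    </label>
</div>
```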
This seems like a really important feature though as Galaxy gains more
traction as a research tool for bigger projects that demand working with
huge data, and running huge workflows many many times.
I've implemented some rather basic tool access control and am looking
for feedback on my implementation.
Our organisation wanted the ability to restrict tools to particular
users/roles. I've implemented this as an "execute" attribute which can be
applied to either <section> or <tool> elements in the tool configuration file.
# Example galaxy-admin changes
<section execute="a@a.co,b@b.co" id="EncodeTools" name="ENCODE Tools">
    <tool file="encode/gencode_partition.xml" />
    <tool execute="b@b.co" file="encode/random_intervals.xml" />
</section>
which would allow A and B to access gencode_partition, but only B would
be able to access random_intervals. To put it explicitly:
- by default, everyone can access all tools
- if section level permissions are set, then those are set as defaults
for all tools in that section
- if tool permissions are set, they will override the defaults.
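The three rules above can be sketched as a small resolution function (a hypothetical illustration of the intended semantics; the function names and list representation are mine, not the actual patch):

```python
def effective_execute(section_execute=None, tool_execute=None):
    """Resolve the effective 'execute' list for a tool.

    None means unrestricted: everyone can run the tool (rule 1).
    A section-level list becomes the default for its tools (rule 2),
    and a tool-level list overrides that default (rule 3).
    """
    if tool_execute is not None:
        return tool_execute
    return section_execute

def can_run(user_email, section_execute=None, tool_execute=None):
    allowed = effective_execute(section_execute, tool_execute)
    return allowed is None or user_email in allowed
```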
# Pros and Cons
There are some good features:
- non-accessible tools won't show up in the left-hand tool panel for that user
- non-accessible tools cannot be run or accessed directly.
There are some caveats however.
- existence of tools is not completely hidden.
- Labels are not hidden at all.
- workflows break completely if a shared workflow uses a tool that is
unavailable to the receiving user and that user copies and edits it. The
workflow can be copied and viewed (it reports "tool not found"), but the
copy cannot be edited. This is due to the call to
app.toolbox.tool_panel.items() in
templates/webapps/galaxy/workflow/editor.mako, as that returns the raw
tool list rather than one filtered on whether or not the user has access.
I'm yet to figure out a clean fix for this. Additionally, empty sections
are still shown even if none of their tools are visible.
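One direction for a fix might be to filter the panel before the template sees it; a rough sketch (all names here are my assumptions for illustration, not actual Galaxy API):

```python
def filter_tool_panel(tool_panel, user_email):
    """Return only the panel entries the given user may execute.

    Entries without an 'execute' attribute are treated as unrestricted.
    """
    filtered = {}
    for key, tool in tool_panel.items():
        allowed = getattr(tool, "execute", None)  # None => no restriction
        if allowed is None or user_email in allowed:
            filtered[key] = tool
    return filtered
```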
For a brief overview of my changes, please see the attached diff. (It's
missing one change because I wasn't being careful and started work on
multiple different features)
# Changeset overview
In brief, most of the changes consist of
- a new method in model.User to check whether an array of roles overlaps
at all with the user's roles
- modifications to the appropriate files for reading in the new "execute"
attribute
- modification to get_tool to pass user information, as whether or not a
tool exists is now dependent on who is asking.
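The first bullet might look something like this sketch (hypothetical; the real method name and role representation in the diff may differ):

```python
class User:
    """Minimal stand-in for model.User, showing only the new check."""

    def __init__(self, roles):
        self.roles = set(roles)

    def has_any_role(self, required_roles):
        """Return True if at least one of required_roles is held by the user."""
        return bool(self.roles.intersection(required_roles))
```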
Please let me know if you have input on this before I create a pull
request on this feature.
I believe this will fix a number of previously raised issues (at least to
my understanding of the issues listed).
+ (I saw a solution somewhere that added "_beta" to tool names to restrict
them to developers, but I cannot find it now.)
Center for Phage Technology
Texas A&M University
College Station, TX 77843
I've just updated my local instance to the latest version. With the migration of tools to the Tool Shed, I used the migrate command to keep my tools as they were in previous versions.
$ sh ./scripts/migrate_tools/0008_tools.sh install_dependencies
and I’ve updated the database to 117
$ sh manage_db.sh upgrade
However, when I'm using samtools I'm getting an error related to the data. I'm not sure what the issue is, and I've checked all the .loc and .sh files and everything is pointing to the proper locations. Any ideas what the issue might be?
An error occurred with this dataset:
Could not determine Samtools version
/bin/sh: line 1: 1616 Segmentation fault: 11 samtools 2>&1
Error extracting alignments from (/galaxy-dist/database/files/001/dataset_1585.dat),
While attempting a fresh Galaxy install from both galaxy-central and
galaxy-dist, I ran into a problem initialising the default SQLite database
$ hg clone https://bitbucket.org/galaxy/galaxy-dist
$ cd galaxy-dist
galaxy.model.migrate.check DEBUG 2013-10-31 10:28:26,143 pysqlite>=2
egg successfully loaded for sqlite dialect
Traceback (most recent call last):
OperationalError: (OperationalError) database is locked u'PRAGMA
After some puzzlement, I realised this was down to the file system -
I was trying this under my home directory mounted via a distributed
file system (gluster I think).
Repeating the experiment under /tmp on a local hard disk worked :)
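A minimal way to check this outside Galaxy (a sketch, not Galaxy code): SQLite relies on file locking, which some distributed file systems do not support, so the same PRAGMA that failed above succeeds on a local disk.

```python
import os
import sqlite3
import tempfile

# Create a database on local disk (tempfile.mkdtemp usually lands on a
# local file system); on a gluster/NFS mount without working locks, the
# connect/PRAGMA step is where "database is locked" would appear instead.
db_path = os.path.join(tempfile.mkdtemp(), "universe.sqlite")
conn = sqlite3.connect(db_path)
mode = conn.execute("PRAGMA journal_mode").fetchone()[0]
conn.execute("CREATE TABLE migrate_version (version INTEGER)")
conn.commit()
conn.close()
print(mode)
```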
(I'm posting this message for future reference; hopefully Google and/or
mailing list searches will help anyone else facing this error)
I'm at the Florida State University HPC, setting up a Galaxy server that will be used to submit jobs to our HPC cluster. I'm using Apache as a proxy for the Galaxy server, and I'm in the process of setting up LDAP authentication.
I had planned to mount our HPC file system (with user home directories) on the Galaxy server so that users would have access to their data, but I'm wondering if I have the wrong idea about how Galaxy works, and whether there is an easy way to map users to their home directories in Galaxy.
Thanks for any pointers.
I pulled down the latest changes today (it was long overdue), and am
running into the following error when I run the tool migration:
sh ./scripts/migrate_tools/0008_tools.sh install_dependencies
No handlers could be found for logger "galaxy.tools.data"
Repositories will be installed into configured tool_path location
[localhost] local: rm -rf ./database/tmp/tmp-toolshed-mtdWuQjBx
Skipping installation of tool dependency samtools version 0.1.18 since it
is installed in
Traceback (most recent call last):
File "./scripts/migrate_tools/migrate_tools.py", line 21, in <module>
app = MigrateToolsApplication( sys.argv[ 1 ] )
line 83, in __init__
line 121, in __init__
line 509, in install_repository
line 351, in handle_repository_contents
guid = self.get_guid( repository_clone_url, relative_install_dir,
line 259, in get_guid
full_path = str( os.path.abspath( os.path.join( root, name ) ) )
UnboundLocalError: local variable 'name' referenced before assignment
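A guess at the failure pattern, based only on the traceback (the real get_guid code may differ): if os.walk visits a cloned repository containing no files, the inner loop variable `name` is never bound, and the later reference raises exactly this UnboundLocalError.

```python
import os
import tempfile

def first_file_path(top):
    # Mirrors the shape of the failing code: walk a tree expecting to
    # find at least one file, then use the loop variables after the loop.
    for root, dirs, files in os.walk(top):
        for name in files:
            break  # pretend we matched the file we were looking for
    # If 'files' was empty in every directory, 'name' was never assigned:
    return str(os.path.abspath(os.path.join(root, name)))

empty_dir = tempfile.mkdtemp()  # stands in for a clone with no files
try:
    first_file_path(empty_dir)
except UnboundLocalError as exc:
    print("reproduced:", exc)
```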
This doesn't appear to be a known functional test failure (if I am
reading the logs right), but I am also not the best resource to
troubleshoot this particular type of problem. I am going to move this to
the dev list for more feedback. If you want to add more details about
your system (local? tracking galaxy-dist or central?) that might help.
On 12/18/13 2:42 AM, Nicolas Lapalu wrote:
> I tried to run the CCAT functional test, but I get an exception (same
> problem via web form)
> Data are well associated with the hg18 build, which is available in
> Error running CCAT.
> Traceback (most recent call last):
> File "/home/galaxy-dev/tools/peak_calling/ccat_wrapper.py", line 41,
> in <module>
> if __name__ == "__main__": main()
> File "/home/galaxy-dev/tools/peak_calling/ccat_wrapper.py", line 38,
> in main
> return stop_err( tmp_dir, e )
> File "/home/galaxy-dev/tools/peak_calling/ccat_wrapper.py", line 13,
> in stop_err
> raise exception
> AssertionError: The required chromosome length file does not exist.
> Any Idea ?
> Thanks, Nicolas
Hi Jeremy, Anna,
I'm having a similar problem here as well. I would like to move a history from one Galaxy server to another, and I followed the export and import steps. I even ran wget on the destination Galaxy server to check whether it could download the tar.gz file, and that worked. When using the web UI I also get these messages in my log, showing it is actually doing something when I trigger the import:
10.85.13.89 - - [18/Dec/2013:15:10:48 +0200] "POST /history/import_archive HTTP/1.1" 200 - "http://dev1.ab.wurnet.nl:8088/history/import_archive" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.63 Safari/537.36"
galaxy.jobs DEBUG 2013-12-18 15:10:48,391 (3201) Working directory for job is: /home/lukas007/galaxy-dist/database/job_working_directory/003/3201
galaxy.jobs.handler DEBUG 2013-12-18 15:10:48,404 (3201) Dispatching to local runner
galaxy.jobs DEBUG 2013-12-18 15:10:48,478 (3201) Persisting job destination (destination id: local:///)
galaxy.jobs.handler INFO 2013-12-18 15:10:48,514 (3201) Job dispatched
galaxy.jobs.runners.local DEBUG 2013-12-18 15:10:48,742 (3201) executing: export GALAXY_SLOTS="1"; python /home/lukas007/galaxy-dist/lib/galaxy/tools/imp_exp/unpack_tar_gz_archive.py http://galaxy.wur.nl/galaxy_production/history/export_archive?id=b1f249e0... /home/lukas007/galaxy-dist/database/tmp/tmpPGL6le --url
galaxy.jobs DEBUG 2013-12-18 15:10:48,786 (3201) Persisting job destination (destination id: local:///)
galaxy.jobs.runners.local DEBUG 2013-12-18 15:10:48,875 execution finished: export GALAXY_SLOTS="1"; python /home/lukas007/galaxy-dist/lib/galaxy/tools/imp_exp/unpack_tar_gz_archive.py http://galaxy.wur.nl/galaxy_production/history/export_archive?id=b1f249e0... /home/lukas007/galaxy-dist/database/tmp/tmpPGL6le --url
galaxy.jobs DEBUG 2013-12-18 15:10:49,040 job 3201 ended
galaxy.datatypes.metadata DEBUG 2013-12-18 15:10:49,040 Cleaning up external metadata files
However, nothing is visible in my histories list. What could be wrong? I also don't see any error messages in the log above.
Thanks and regards,
From: galaxy-dev-bounces(a)lists.bx.psu.edu [mailto:firstname.lastname@example.org] On Behalf Of Jeremy Goecks
Sent: dinsdag 11 juni 2013 19:06
To: Edlund, Anna
Cc: galaxy-dev(a)bx.psu.edu; Nicola Segata
Subject: Re: [galaxy-dev] problems with exporting or importing data
Try following these steps:
(1) Make your history accessible (Share or Publish --> Make History Accessible by Link).
(2) Export your history again; make sure to wait until you can use the URL to download a .gz file of your history.
(3) Try importing it via URL to the Huttenhower Lab Galaxy.
Let us know if you have any problems.
On Jun 10, 2013, at 11:45 AM, Edlund, Anna wrote:
I have uploaded fasta files to the main Galaxy server through my FileZilla program. I want to transfer them to the Galaxy at the Huttenhower Lab (see http://huttenhower.org/galaxy/root). To do that, I go to the history list while logged in at both locations and select 'Import a History from an archive'. I paste the URL from the main Galaxy server (where my archive is located) and select submit. Then I waited for two days and nothing was transferred. I am clearly doing something wrong and would really like your input asap.
Thank you very much!!
My user name is aedlund(a)jcvi.org<mailto:email@example.com>