March 2013 - galaxy-dev - lists.galaxyproject.org

Knowing who is currently logged into your system
by Jennifer Jackson 20 Apr '13

20 Apr '13

Hi Akshay, I am going to post your question over to the galaxy-dev(a)bx.psu.edu mailing list, to give it better viability. You could probably run an sql query against a database table to find out this information, but there may be a better way. Let's see if someone has a method worked out they want to share. Another alternative is to look in our documentation, ReadTheDocs. I did a search on "login" and a few potentials popped up: http://galaxy-dist.readthedocs.org/en/latest/ Going forward, the galaxy-dev list is the list you will want to post to - and consider subscribing to - if you are running a local instance and want to join/discuss issues with the community of other users doing the same. http://wiki.galaxyproject.org/MailingLists http://wiki.galaxyproject.org/Support#Mailing_Lists Thanks! Jen Galaxy team On 3/12/13 5:46 PM, Akshay Vivek Choche wrote: > Hello All, > > I was wondering if there is a way to know all the users logged into your local galaxy server? > > Thanks, > -Akshay Choche > > ___________________________________________________________ > The Galaxy User list should be used for the discussion of > Galaxy analysis and other features on the public server > at usegalaxy.org. Please keep all replies on the list by > using "reply all" in your mail client. For discussion of > local Galaxy instances and the Galaxy source code, please > use the Galaxy Development list: > > http://lists.bx.psu.edu/listinfo/galaxy-dev > > To manage your subscriptions to this and other Galaxy lists, > please use the interface at: > > http://lists.bx.psu.edu/ -- Jennifer Hillman-Jackson Galaxy Support and Training http://galaxyproject.org

4 5

Improving Administrative Data Clean Up (pgcleanup.py vs cleanup_datasets.py)
by Lance Parsons 18 Apr '13

18 Apr '13

I have been running a Galaxy server for our sequencing researchers for a while now and it's become increasingly successful. The biggest resource challenge for us has been, and continues to be disk space. As such, I'd like to implement some additional cleanup scripts. I thought I run a few questions by this list before I got too far into things. In general, I'm wondering how to implement updates/additions to the cleanup system that will be in line with the direction that the Galaxy project is headed. The pgcleanup.py script is the newest piece of code in this area (and even adds cleanup of exported histories, which are absent from the older cleanup scripts). Also, the pgcleanup.py script uses a "cleanup_event" table that I don't believe is used by the older cleanup_datasets.py script. However, the new pgcleanup.py script only works for Postgres, and worse, only for version 9.1+. I run my system on RedHat (CentOS) and thus we use version 8.4 of Postgres. Are there plans to support other databases or older versions of Postgres? I'd like to implement a script to delete (set the deleted flag) for certain datasets (e.g. raw data imported from our archive, for old, inactive users, etc.). I'm wondering if it would make sense to try and extend pgcleanup.py or cleanup_datasets.py. Or perhaps it would be best to just implement a separate script, though that seems like I'd have to re-implement a lot of boilerplate code for configuration reading, connections, logging, etc. Any tips on generally acceptable (supported) procedures for marking a dataset as deleted? Of course, I'll make any of the enhancements available (and would be happy to submit pull requests if there is interest). -- Lance Parsons - Scientific Programmer 134 Carl C. Icahn Laboratory Lewis-Sigler Institute for Integrative Genomics Princeton University

2 4

negative user data usage
by Geert Vandeweyer 11 Apr '13

11 Apr '13

Hi, Today, I found out that one user in our local galaxy installation (the administrator user) has a negative disk usage. - Reports shows : -72780720701 bytes - Galaxy history shows: -1% Does anybody have suggestions on what might be causing this and how to solve it? There is about 660 Gb of data in the histories of that user, but it was more before. I believe it happened after some histories were deleted and there was a message of one of them being shared. Best, Geert -- Geert Vandeweyer, Ph.D. Department of Medical Genetics University of Antwerp Prins Boudewijnlaan 43 2650 Edegem Belgium Tel: +32 (0)3 275 97 56 E-mail: geert.vandeweyer(a)ua.ac.be http://ua.ac.be/cognitivegenetics http://www.linkedin.com/pub/geert-vandeweyer/26/457/726

3 2

resubmit a job if the node fails
by zhengqiu cai 10 Apr '13

10 Apr '13

Hi, Can Galaxy resubmit a job if the node where the job is running fails? I know sge can do that by using qsub -r. It should be very useful if Galaxy can do that. Thank you, Cai

3 2

How to test DRMAA configuration?
by Joshua Orvis 09 Apr '13

09 Apr '13

I have a working local Galaxy instance and wanted to enable DRMAA support to utilize our SGE (or LSF) grid. Following the guide here<http://wiki.galaxyproject.org/Admin/Config/Performance/Cluster> I set what I appeared to need to make this work. From the DRMAA_LIBRARY_PATH env variable to all the configuration settings in universe_wsgi.ini, reconfiguring the server hosting Galaxy as a submit host, etc. Some specific config file changes made: new_file_path = /seq/gscidA/www/gscid_devel/htdocs/galaxy-dist/database/tmp start_job_runners = drmaa default_cluster_job_runner = drmaa:/// set_metadata_externally = True outputs_to_working_directory = True I then killed and restarted the Galaxy instance and tried a simple FASTQ -> FASTA test execution, but it ran locally. I couldn't find any sort of errors or messages related to DRMAA in the server log, and the job ran to completion. I commented out the local tool runner overrides. What can I do to test my DRMAA configuration and where should I look for errors? Thanks - Joshua

3 4

History periodically disappears in AWS Cloudman installation
by Greg Edwards 08 Apr '13

08 Apr '13

Hi, We're running a private Cloudman Galaxy on AWS for small-scale proteomics work. Lately the whole History of the main user id we use has occasionally disappeared, ie. on login the History is empty. The datasets aren't hiding in "Deleted Datasets". They appear to still be there in /mnt/galaxyData/files/000. They're not in the Anonymous (not logged in ) id. They're not in another id. The data doesn't come back later. We reload the latest datasets in use and the numbering in the history restarts from 1. We're running the most basic config, with the simple single-threaded database. Nothing of interest seems to be in the various Cloudman logs. I've searched the archives for "lost/deleted/disappeared datasets/history" etc but nothing useful turned up. This is our rev status .. UBUNTU /mnt/galaxyTools/galaxy-central $ hg summary parent: 8116:ecd131b136d0 tip libraries: fix in query for 'datasets_are_public' branch: default commit: 2 modified, 268 unknown update: (current) This is a poor fault report but .. appreciate any pointers here. Many thanks ... -- Greg Edwards, Port Jackson Bioinformatics gedwards2(a)gmail.com

2 1

Galaxy dont work on my Bio-Linux
by BrozPetr＠email.cz 08 Apr '13

08 Apr '13

Hello, please could you help me, how to run galaxy directly from my computer? I am using Bio-Linux (ubuntu distribution) and there is GALAXY already install. When I try to open it, it is state that: "The Galaxy server doesn't seem to be running on your machine. You may need to start it with the command: sudo start galaxy Or else Galaxy may still be starting up (it takes a couple of minutes to get going on the first run)." When I open terminal and write sudo start galaxy it is state that galaxy start/running... But when I trying to open galaxy (just double clic on the icon) it still state problem which I write above. Could you help me please how to fix it?? Thank you very much for your reply! Have a nice day, Petr.

2 1

Cancel running jobs/workflows through API
by Richard Park 03 Apr '13

03 Apr '13

Hey guys, I know there's no current way of canceling jobs or workflows through the API, but does anyone have any idea on how to implement this in Galaxy? i.e. where in the code base to see how galaxy is currently killing jobs. My ideal solution would be to get the list of qeued jobs for a giving history and call something to kill all running jobs in a history or a way of iterating through all jobs. Any help would greatly appreciated, thanks, Richard

2 2

Galaxy - FTP Download Problem
by Rob Leclerc 03 Apr '13

03 Apr '13

We have a CloudMan Galaxy install on Amazon and I'm able to upload FTP files from FireZilla (or programatically through a Java service I wrote). However, I'm unable to download that same file once uploaded. The 550 response is a "Requested action not take. File unavailable (e.g., file not found, no access). I have rwx permissions on the directory and rw permissions on the file.* * * * *Status:* Starting download of /TAF1_ChIP-2.txt *Command: *TYPE A *Response:* 200 Type set to A *Command: *PASV *Response: *227 Entering Passive Mode (10,40,11,236,117,50). *Status: *Server sent passive reply with unroutable address. Using server address instead. *Command: *RETR TAF1_ChIP-2.txt *Response: *550 TAF1_ChIP-2.txt: Operation not permitted *Error: Critical file transfer error* Thoughts? Cheers, Rob

2 1

reducing overhead for quick jobs
by Robert Baertsch 30 Mar '13

30 Mar '13

Hi All, After changing the sleep in manager.py from 5 seconds to 0.2 seconds, the job start is very snappy. I did not touch the various sleep(10)s that don't seem to be relevant. I'm testing how long it takes to sort a short bed file in galaxy. After the fix, the problem seems to be at the end of the process. In the timeline below the gap between the job ending on the server and the client getting the update is 5 seconds. Is that a built in sleep or something else? Can I turn off metadata for tools that don't generate it? BTW: thousands of seconds are on in the manager and handler logs and should be turned on in the main.log. Rather than 2 round trips to the server. I think we need a special class of tools that run in a single round trip (like the eyeball viz). The max run time for this class of jobs should be no more than a few seconds. Is that is the roadmap somewhere? -Robert Baertsch UC Santa Cruz https://medbook.ucsc.edu RUN1 - 9 seconds (1 sec to start, 0.2 sec to run , 2 sec metadata, 5.5 secs to update client) 12:08:12.? main: log job on main thread 12:08:12.8 manager : galaxy.jobs.manager DEBUG 2013-03-30 12:08:12,867 (2308) Job assigned to handler 'handler2' 12:08:13.26 handler: start handler 12:08:13.35 handler: job dispatched 12:08:13,44 handler: job runner started 12:08:13,72 handler: python tool started 12:08:13,91 handler: python tool finished 12:08:13,98 handler: set_metadata.sh started 12:08:15,70 handler: metadata ended 12:08:15,85 handler: job Ended 12:08:21 main: /api/histories - update on client RUN2 - 9 seconds (1.8 sec to start, 0.2 sec to run, 2 sec metadata, 4.5 sec to update client) 12:23:00.? main: log job on main thread 12:23:00,60 manager : galaxy.jobs.manager DEBUG 2013-03-30 12:08:12,867 (2308) Job assigned to handler 'handler2' 12:23:01,63 handler: start handler 12:23:01,72 handler: job dispatched 12:23:01,81 handler: job runner started 12:23:02,44 handler: python tool started 12:23:02,64 handler: python tool finished 12:23:02,72 handler: set_metadata.sh started 12:23:04,64 handler: metadata ended 12:23:04,71 handler: job Ended 12:23:09 main: /api/histories - update on client

1 0