To make someone an administrator on a local Galaxy install, do I just
need to add their email (login) to the comma-separated setting
admin_users in universe_wsgi.ini?
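For reference, here is the line I have in universe_wsgi.ini on both servers (the addresses are placeholders):

    # comma-separated list of the login emails of admin users
    admin_users = admin1@example.org,admin2@example.org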
I have this working on one server, but it doesn't seem to have any
effect on a second server. Both are now running the current release.
Is there some other setting needed to enable the admin interface?
We experienced an issue where some of the Galaxy jobs were sitting in the 'new' state for quite a long time. They were not waiting for cluster resources to become available; they had not even been queued up through DRMAA. We are currently running in non-debug mode, and these were my observations:
* No indication of new jobs in paster.log file
* database/pbs script didn't contain any associated job scripts
* in backend database - job table contained their galaxy job id but no command_line input was recorded
Also, not all of the jobs are stuck in the 'new' state: many jobs submitted after the waiting jobs above completed successfully on the cluster. Is there any job submission logic within Galaxy that could explain this? Any clues on how to debug this issue would be really helpful.
I'm running into this error:
"Error sorting alignments from (/tmp/5800600.1.all.q/tmpXOc5mD/tmpAZCzt_), "
This happens when using the SAM-to-BAM tool on a locally installed
Galaxy that submits jobs to an SGE cluster. I'm using the latest
version of galaxy-dist. I'm guessing I have a problem with the
configuration of the tmp folder. I have this:
# Temporary files are stored in this directory.
new_file_path = /home/cborroto/galaxy_dist/database/tmp
But I don't see this directory being used; from the error, it looks
like /tmp on the node is used instead. I wonder if this is the
problem, as I don't know whether there is enough space in the local
/tmp directory on the nodes. I ran the same tool on a subset of the
same SAM file and it ran fine.
Also, I see this in the description of the tool:
"This tool uses the SAMTools toolkit to produce an indexed BAM file
based on a sorted input SAM file."
But what I actually need is to sort a SAM file output by bwa, and I
haven't found any other way than converting it to BAM. Looking at
"sam_to_bam.py" I see the BAM file will also be sorted. Would it be
wrong to feed an unsorted SAM file into this tool?
Finally, just to be sure there is nothing wrong with the initial SAM
file, I ran "samtools view ..." and "samtools sort ..." on this file
manually outside of Galaxy and it ran fine.
Thanks in advance,
I am trying to upload BAM files (by pasting a URL) to my history (or
Data Library) and get the following error. These are BAM files which I
had previously uploaded with no problems.
Traceback (most recent call last):
  line 126, in run_job
    job_wrapper.finish( stdout, stderr )
  File "/doolittle/Galaxy/galaxy_dist/lib/galaxy/jobs/__init__.py", line 618, in finish
    dataset.set_meta( overwrite = False )
  File "/doolittle/Galaxy/galaxy_dist/lib/galaxy/model/__init__.py", line 874, in set_meta
    return self.datatype.set_meta( self, **kwd )
  line 179, in set_meta
    raise Exception, "Error Setting BAM Metadata: %s" % stderr
Exception: Error Setting BAM Metadata: [bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file)
I ran bamtools on the Unix command line to see if there was anything wrong
with the file(s), but found nothing. I tried uploading different BAM files
from other projects and get the same error.
I did update to the latest release yesterday... if that helps.
Thanks in advance,
I just issued a pull request that augments Galaxy to allow defining
job runners dynamically at runtime.
Whether it makes the cut or not, I thought I would describe the
enhancements here in case anyone else finds them useful.
There are a couple of use cases we hope this will help us address at our
institution. One is dynamically switching queues based on the user (we
have a very nice shared memory resource that can only be used by
researchers with NIH funding), and the other is inspecting input sizes
to give more accurate max walltimes to PBS (a small number of cufflinks
jobs, for instance, take over three days on our cluster, but defining
max walltimes in excess of that for all jobs could result in our queue
sitting idle around our monthly downtimes). You might also imagine using
this to dynamically switch queues entirely based on input sizes or
parameters, or to alter queue priorities based on the submitting user or
input sizes.
There are two steps to use this: you must add a line to universe.ini,
and define a function to compute the true job runner string in the new
file rules.py.
The first step is similar to what you would do to statically assign
a tool to a particular job runner. If you would like to dynamically
assign a job runner for cufflinks, you would start by adding a line like
one of the following to universe.ini:
cufflinks = dynamic:///python
cufflinks = dynamic:///python/compute_runner
If you use the first form, a function called cufflinks must be defined
in rules.py; adding the extra argument after python/ lets you specify a
particular function by name (compute_runner in this example). The
second option lets you use the same function to assign job runners
for multiple tools.
The only other step is to define a Python function in rules.py that
produces a string corresponding to a valid job runner, such as
"local:///" or "pbs:///queue/-l walltime=48:00:00/".
If the functions defined in this file take arguments, those arguments
should have names from the following list: job_wrapper, user_email, app,
job, tool, tool_id, job_id, user. The plumbing will map each argument
to the implied Galaxy object. For instance, job_wrapper is the
JobWrapper instance for the job that gets passed to the job runner;
user_email is the user's email address (or None); app is the main
application configuration object used throughout the code base, which
can be used, for instance, to get values defined in universe.ini; job,
tool, and user are model objects; and job_id and tool_id are the
relevant ids.
If you are writing a function that routes a certain list of users to a
particular queue or increases their priority, you will probably only
need to take one argument, user_email. However, if you are going to
look at input file sizes, you may want to take an argument called job
and use the following piece of code to find the size of the input
named "input1" in the tool XML:
    # Map input names (as given in the tool XML) to their datasets;
    # os.path.getsize requires "import os" at the top of rules.py.
    inp_data = dict( [ ( da.name, da.dataset ) for da in job.input_datasets ] )
    inp_data.update( [ ( da.name, da.dataset ) for da in job.input_library_datasets ] )
    input1_file = inp_data[ "input1" ].file_name
    input1_size = os.path.getsize( input1_file )
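Putting it together, a complete function for the cufflinks walltime case
might look like the sketch below; the queue name, size threshold, and
walltimes are made-up examples, not recommendations:

    import os

    def cufflinks( job ):
        # Map input names (from the tool XML) to their datasets
        inp_data = dict( [ ( da.name, da.dataset ) for da in job.input_datasets ] )
        inp_data.update( [ ( da.name, da.dataset ) for da in job.input_library_datasets ] )
        input1_size = os.path.getsize( inp_data[ "input1" ].file_name )
        # Give unusually large inputs a longer max walltime
        if input1_size > 10 * 1024 ** 3:  # roughly 10 GB
            return "pbs:///batch/-l walltime=96:00:00/"
        return "pbs:///batch/-l walltime=24:00:00/"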
This whole concept works for a couple of small tests on my local
machine, but there are certain aspects of the job runner code that make
me feel there may be corner cases I am not seeing where this approach
may not work, so your mileage may vary.
University of Minnesota Supercomputing Institute
Hello all -
One of the Galaxy tools I've been developing generates HTML output which I'd styled using a <style>...</style> tag in the HTML header. After updating to the latest Galaxy release earlier today, the <html>, <head>...</head>, <style> and <body> tags started to get stripped from the output, rendering previously CSS-styled output rather unstylish.
Delving into things, I noticed a change committed in December that sanitizes the output for HTML files via a call to "sanitize_html".
The added lines 381 -> 383 in the new file appear to be causing this new behaviour.
Is there any option for making this optional? What was the rationale behind stripping these tags out of HTML output files?
Thanks for any help!
We want to move Galaxy's jobs from our small local TORQUE install to a
big cluster running PBS Pro.
In universe_wsgi.ini, I changed the cluster job runner from:
default_cluster_job_runner = pbs:///
to:
default_cluster_job_runner = pbs://sub-master/clng_new/
where sub-master is the name of the machine and clng_new is the queue.
However, I get an error when trying to run any job:
galaxy.jobs.runners.pbs ERROR 2012-01-16 11:10:00,894 Connection to PBS
server for submit failed: 111: Could not find a text for this error, uhhh
This corresponds to qsub error 111 (Cannot connect to specified
server host), which is, for some reason, caught by pbs_python as an error
of its own (111 does not correspond to any pbs_python error code, hence
the "Could not find a text for this error" message).
Our guess is that we might need to re-scramble the pbs_python egg against
PBS Pro's libraries; is that correct?
If it's the case, what do we have to set as LIBTORQUE_DIR?
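In other words, I assume we would re-run the scramble script along the lines of the following, with LIBTORQUE_DIR pointing somewhere under the PBS Pro install (the path below is just a guess on our part):

    LIBTORQUE_DIR=/opt/pbspro/lib python scripts/scramble.py -e pbs_python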
When a tool outputs an unsorted BAM file, the indexing fails (quietly)
and its metadata variable "bam_index" points to a nonexistent file. This
causes a nasty bug when trying to import the dataset into a data library,
and it actually makes the library unusable unless you delete the broken
entry from the database (PostgreSQL, in my case) by hand. Are you
working on this?
I have a local instance of Galaxy and wanted to modify "upload file" so that I will be able to upload large files (> 2 GB).
The reason I am trying to do this in the browser is that external FTP tools do not really work in my environment because of all the constraints and firewalls.
I came across the jQuery File Upload tool (http://blueimp.github.com/jQuery-File-Upload/), and it seems like a good fit if I can integrate it into my Galaxy instance.
My questions are:
- Is it too cumbersome to achieve this goal with external tools?
- How deeply would I need to modify Galaxy (at the code level) to integrate this jQuery tool?
- Are there any alternatives for uploading large files in the browser without FTP?