Hi all,
Does anyone else see this problem with broken tool ids in the
functional test script? I found this because one of my own tools
wasn't being recognised; for some reason its id was
treated as _name= [sic].
This happens on the default branch:
$ ./run_functional_tests.sh -list | grep "_name="
_name= RNA/DNA
_name= Convert
_name= Compute quality statistics
_name= Draw quality score boxplot
_name= DAVID
$ hg branch
default
$ hg log -b default | head
changeset: 9349:199b62339f26
user: jeremy goecks <jeremy.goecks(a)emory.edu>
date: Thu Apr 11 08:44:42 2013 -0400
summary: Whitespace fixes for Vcf datatype.
changeset: 9348:dbfc964167ae
user: guerler
date: Wed Apr 10 17:30:27 2013 -0400
summary: Enhance button for trackster visualization in data display viewer
Have I found a recent regression, or is my system messed up
somehow?
Thanks,
Peter
Hi,
Our Galaxy instance runs jobs on an SGE cluster using 2 job handlers. The
SGE cluster uses a Job Submission Verifier (JSV) that rejects any job
submission that specifies core-binding strategies.
When Galaxy starts, the first job we submit works perfectly.
First job submission:
galaxy.jobs.manager DEBUG 2013-04-15 14:29:59,285 (194) Job assigned to
handler 'handler0' galaxy.jobs DEBUG 2013-04-15 14:29:59,934 (194)
Working directory for job is:
/scratch/nfs/galaxy.crg.es/job_working_directory/000/194
galaxy.jobs.handler DEBUG 2013-04-15 14:29:59,942 dispatching job 194 to
drmaa runner
galaxy.jobs.handler INFO 2013-04-15 14:30:00,166 (194) Job dispatched
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:00,468 (194) submitting
file /scratch/nfs/galaxy.crg.es/ogs/galaxy_194.sh
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:00,468 (194) command
is: python
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/tools/fastq/fastq_stats.py
'/data/www-bi/galaxy.crg.es/files/000/dataset_4.dat'
'/data/www-bi/galaxy.crg.es/files/000/dataset_238.dat' 'sanger'
galaxy.jobs.runners.drmaa INFO 2013-04-15 14:30:01,538 (194) queued as
458816
galaxy.jobs.runners.drmaa DEBUG 2013-04-15 14:30:02,115 (194/458816)
state change: job is queued and active
# qstat -cb -j 458816
==============================================================
job_number: 458816
exec_file: job_scripts/458816
submission_time: Mon Apr 15 14:30:01 2013
owner: www-bi
uid: 66401
group: www-bi
gid: 501
sge_o_home: /data/www-bi
sge_o_log_name: www-bi
sge_o_path:
/data/galaxy/apache/galaxy.crg.es/htdocs/scripts/galaxy-env/bin:/software/galaxy/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/data/www-bi/bin
sge_o_shell: /bin/bash
sge_o_workdir:
/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist
sge_o_host: galaxy
account: sge
stderr_path_list:
NONE:galaxy:/scratch/nfs/galaxy.crg.es/job_working_directory/000/194/194.drmerr
reserve: y
hard resource_list: virtual_free=12G,h_rt=21600
mail_list: www-bi(a)galaxy.crg.es
notify: FALSE
job_name: g194_fastq_stats_jtaly_crg_es
stdout_path_list:
NONE:galaxy:/scratch/nfs/galaxy.crg.es/job_working_directory/000/194/194.drmout
jobshare: 0
hard_queue_list: www-el6
env_list:
script_file: /scratch/nfs/galaxy.crg.es/ogs/galaxy_194.sh
parallel environment: smp range: 2
verify_suitable_queues: 2
binding: set linear:2:0,0
scheduling info: queue instance "pr-el6(a)fenn.linux.crg.es"
dropped because it is overloaded: np_load_avg=1.703333 (= 1.703333 +
0.50 * 0.000000 with nproc=12) >= 1.7
queue instance
"short(a)node-ib0209bi.linux.crg.es" dropped because it is overloaded:
np_load_avg=2.837500 (= 2.837500 + 0.50 * 0.000000 with nproc=8) >= 1.3
queue instance
"long(a)node-ib0209bi.linux.crg.es" dropped because it is overloaded:
np_load_avg=2.837500 (= 2.837500 + 0.50 * 0.000000 with nproc=8) >= 1.3
The core binding has been added by our JSV script; this is correct.
But our second submission fails:
galaxy.jobs.runners.drmaa ERROR 2013-04-15 14:30:56,263 Uncaught
exception queueing job
Traceback (most recent call last):
File
"/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py",
line 144, in run_next
self.queue_job( obj )
File
"/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/lib/galaxy/jobs/runners/drmaa.py",
line 232, in queue_job
job_id = self.ds.runJob(jt)
File
"/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/__init__.py",
line 331, in runJob
_h.c(_w.drmaa_run_job, jid, _ct.sizeof(jid), jobTemplate)
File
"/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/helpers.py",
line 213, in c
return f(*(args + (error_buffer, sizeof(error_buffer))))
File
"/data/www-bi/apache/galaxy.crg.es/htdocs/galaxy-dist/eggs/drmaa-0.4b3-py2.6.egg/drmaa/errors.py",
line 90, in error_check
raise _ERRORS[code-1]("code %s: %s" % (code, error_buffer.value))
DeniedByDrmException: code 17: contact us: XXX(a)XXX.es
If we look at the submitted params:
# cat /tmp/qsub_err.txt
$VAR1 = {
'w' => 'e',
'N' => 'g195_fastq_stats_jtaly_crg_es',
'binding_amount' => '2',
'CMDNAME' => '/scratch/nfs/galaxy.crg.es/ogs/galaxy_195.sh',
'binding_type' => 'set',
'M' => {
'www-bi(a)galaxy.crg.es' => undef
},
'binding_strategy' => 'linear',
'l_hard' => {
'virtual_free' => '12G',
'h_rt' => '6:00:00'
},
'shell' => 'n',
'pe_min' => '2',
'USER' => 'www-bi',
'binding_socket' => '0',
'e' => {
'/scratch/nfs/galaxy.crg.es/job_working_directory/000/195/195.drmerr' =>
undef
},
'GROUP' => 'www-bi',
'binding_core' => '0',
'pe_max' => '2',
'CMDARGS' => '0',
'q_hard' => {
'www-el6' => undef
},
'pe_name' => 'smp',
'CLIENT' => 'drmaa',
'b' => 'y',
'R' => 'y',
'VERSION' => '1.0',
'CONTEXT' => 'client',
'o' => {
'/scratch/nfs/galaxy.crg.es/job_working_directory/000/195/195.drmout' =>
undef
}
};
There's a core-binding strategy.
The problem is that the second job submission is inheriting submission
parameters from the first job, and, as the JSV script does not allow users
to specify a core-binding strategy themselves, the job is rejected.
If you wait some time (600 seconds), a new submission works again...
We are wondering if anyone can help us understand why the submission
parameters are being inherited by each job. Maybe the DRMAA session is not
properly closed, or the environment not cleaned?
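For what it's worth, the leak can be modelled with a small sketch. This is a toy model of the suspected mechanism only, not the real drmaa API: if one job template object is reused across submissions and options get appended to it (as a JSV or defaults file might) instead of being reset, the second job "inherits" the first job's settings. All class and function names below are made up for illustration.

```python
# Toy model of the suspected parameter leak (illustrative names only,
# not the real drmaa bindings).

class JobTemplate:
    def __init__(self):
        # accumulates native submission options across submissions
        self.native_specification = []

def jsv_adds_binding(template):
    # Simulates the site JSV adding a core-binding option on submission.
    template.native_specification.append("-binding linear:2")

def submit(template, script):
    jsv_adds_binding(template)
    return (script, list(template.native_specification))

shared = JobTemplate()                  # one template reused for every job
job1 = submit(shared, "galaxy_194.sh")
job2 = submit(shared, "galaxy_195.sh")  # inherits job1's binding option

assert job1[1] == ["-binding linear:2"]
assert job2[1] == ["-binding linear:2", "-binding linear:2"]

# Building (or resetting) the template per submission avoids the leak:
fresh = JobTemplate()
job3 = submit(fresh, "galaxy_195.sh")
assert job3[1] == ["-binding linear:2"]
```

If something like this is happening, creating and destroying the job template per job on the Galaxy side would be the fix; whether the real drmaa runner does that is exactly the question.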
Thank you for your help
Best
Jean-François
$ hg summary
parent: 8795:9fd7fe0c5712
merge from stable
branch: default
commit: 1 modified, 59 unknown
update: (current)
--
#####################################
Jean-François Taly
Bioinformatician
Bioinformatics Core Facility
http://biocore.crg.cat
CRG - Centre de Regulació Genòmica (Room 439)
Parc de Recerca Biomèdica de Barcelona (PRBB)
Doctor Aiguader, 88
08003 Barcelona
Spain
email: jean-francois.taly(a)crg.eu
phone: +34 93 316 0202
fax: +34 93 316 0099
#####################################
Hello, all
First, I apologise if I am posting in the wrong forum!
I would like to learn how to conduct research in population genomics, and
am looking for recommendations for good material such as tutorials, books,
and course outlines.
thank you
huu
Hi,
I have tried to use tasks with tophat2, but I got the following bug.
> multi.py +153
msg = 'nothing to merge for %s (expected %i files)' \
% (output_file_name, len(task_dirs))
This occurs because the tophat2 tool uses the from_work_dir feature to
collect its output files.
But the line:
output_files = [ f for f in output_files if os.path.exists(f) ]
returns an empty list, because no real output files exist at the expected paths.
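The failure mode can be reproduced outside Galaxy with a small sketch (all directory and file names below are made up for illustration): the real outputs land inside each task's working directory, while the merge step tests the expected dataset paths with os.path.exists(), so the filtered list comes back empty and the "nothing to merge" message is raised.

```python
import os
import tempfile

# Sketch of the failing check, simplified from the merge logic the
# traceback points at; file names here are illustrative only.
with tempfile.TemporaryDirectory() as workdir:
    task_dirs = [os.path.join(workdir, "task_%d" % i) for i in range(2)]
    for d in task_dirs:
        os.makedirs(d)
        # tophat2 writes its real output inside the task directory...
        open(os.path.join(d, "accepted_hits.bam"), "w").close()

    # ...but the merge step looks for the expected dataset paths instead:
    expected = [os.path.join(d, "dataset_1.dat") for d in task_dirs]
    output_files = [f for f in expected if os.path.exists(f)]

    # Empty list -> "nothing to merge for ... (expected 2 files)"
    assert output_files == []
```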
Has anyone else tried this before?
Hagai
To ease the maintenance of our Dutch National galaxy, we're trying to set
up our own toolshed on a system that happens to have a nicely large MySQL
database running. We'd like to make it use that database. The setup has
been ok, until we started running the toolshed on the MySQL server for the
first time, which resulted in the message in the community_webapp.log I
copied below. Is this an incompatibility?
Regards,
Rob Hooft
File
"lib/galaxy/webapps/community/model/migrate/versions/0001_initial_tables.py",
line 150, in upgrade
metadata.create_all()
...
self._handle_dbapi_exception(e, statement, parameters, cursor, context)
File
"/opt/toolshed/galaxy-dist/eggs/SQLAlchemy-0.5.6_dev_r6498-py2.6.egg/sqlalchemy/engine/base.py",
line 931, in _handle_dbapi_exception
raise exc.DBAPIError.instance(statement, parameters, e,
connection_invalidated=is_disconnect)
OperationalError: (OperationalError) (1170, "BLOB/TEXT column 'annotation'
used in key specification without a key length") u'CREATE INDEX
ix_tool_annotation_association_annotation ON tool_annotation_association
(annotation)' ()
Removing PID file community_webapp.pid
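MySQL error 1170 means a TEXT/BLOB column cannot be indexed without an explicit key (prefix) length; the migration's plain CREATE INDEX works on PostgreSQL but not on MySQL. A possible workaround would be to create the index with a prefix length (255 here is an arbitrary example, and patching the migration script may be preferable):

```sql
-- MySQL requires a prefix length when indexing TEXT/BLOB columns;
-- 255 is an example value, not something the migration specifies.
CREATE INDEX ix_tool_annotation_association_annotation
    ON tool_annotation_association (annotation(255));
```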
--
Rob W.W. Hooft
Chief Technology Officer BioAssist, Netherlands Bioinformatics Centre
http://www.nbic.nl/ Skype: robhooft GSM: +31 6 27034319
Hello,
I am running a local Galaxy server where I have restricted the amount of
data for each user to 100 GB. Users are able to delete their own histories
to regain disk space. However, for some users, deleting the histories (all
saved histories and datasets) only frees ~80% of their allowance. As an
admin I ran the clean-up scripts recommended on the Galaxy homepage, but
this still didn't lower their recorded disk usage. Does anyone know
what is going on?
Best wishes,
Alex
Hi,
Today, I found out that one user in our local galaxy installation (the
administrator user) has a negative disk usage.
- Reports shows: -72780720701 bytes
- Galaxy history shows: -1%
Does anybody have suggestions on what might be causing this and how to
solve it?
There is about 660 GB of data in the histories of that user, but it was
more before.
I believe it happened after some histories were deleted and there was a
message that one of them was shared.
Best,
Geert
--
Geert Vandeweyer, Ph.D.
Department of Medical Genetics
University of Antwerp
Prins Boudewijnlaan 43
2650 Edegem
Belgium
Tel: +32 (0)3 275 97 56
E-mail: geert.vandeweyer(a)ua.ac.be
http://ua.ac.be/cognitivegenetics
http://www.linkedin.com/pub/geert-vandeweyer/26/457/726
Hello,
When I run tophat ("Tophat for Illumina: Find splice junctions using
RNA-seq data"), the job fails with truncated files. However, index
files are available, and I get exactly the same error message using a
built-in index or one from my history.
Tool execution generated the following error message:
Error in tophat:
[2013-04-10 09:17:07] Beginning TopHat run (v2.0.5)
-----------------------------------------------
[2013-04-10 09:17:07] Checking for Bowtie
Bowtie version: 2.0.0.7
[2013-04-10 09:17:07] Checking for Samtools
Samtools version: 0.1.19.0
[2013-04-10 09:17:07] Checking for Bowtie index files
Error: Could not find Bowtie 2 index files
(/work/galaxy/Danio_rerio.Zv9.62.dna.chromosome.22.fa.*.bt2)
The tool produced the following additional output:
TopHat v2.0.5
tophat -p 4 /work/galaxy/Danio_rerio.Zv9.62.dna.chromosome.22.fa
/work/galaxy/database/files/006/dataset_6528.dat
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[bam_index_core] Invalid BAM header.[bam_index_build2] fail to index the
BAM file.
Epilog : job finished at mer. avril 10 09:17:12 CEST 2013
In this post
(http://dev.list.galaxyproject.org/tophat-for-illumina-looking-in-wrong-dire…),
no solution was found.
Do you have any ideas?
Sarah Maman
--
--*--
Sarah Maman
INRA - LGC - SIGENAE
http://www.sigenae.org/
Chemin de Borde-Rouge - Auzeville - BP 52627
31326 Castanet-Tolosan cedex - FRANCE
Tel: +33(0)5.61.28.57.08
Fax: +33(0)5.61.28.57.53
--*--
Dear all,
Is it possible to make Galaxy use a specific extension for a given datatype without a wrapper?
I.e., can I make Galaxy use the extensions .bam, .sam, .fasta, etc. internally instead of .dat?
Thanks,
Manuel