DRMAA error with latest update 26920e20157f
by Shantanu Pavgi
I am getting the following error after updating to the latest galaxy-dist revision '26920e20157f'. The Python version is 2.6.6.
{{{
galaxy.jobs.runners.drmaa ERROR 2012-01-29 21:00:28,577 Uncaught exception queueing job
Traceback (most recent call last):
File "/projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py", line 140, in run_next
self.queue_job( obj )
File "/projects/galaxy/galaxy-165/lib/galaxy/jobs/runners/drmaa.py", line 190, in queue_job
command_line )
TypeError: not all arguments converted during string formatting
}}}
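For reference, this TypeError is Python's way of saying a %-format string received more arguments than it has placeholders. A minimal reproduction (illustrative only, not the actual drmaa.py code):

```python
# A %-format string with one placeholder but two arguments raises the
# same TypeError seen in the traceback above.
template = "queued job with command line: %s"
try:
    template % ("job_42", "command_line")  # second argument has no placeholder
except TypeError as e:
    print(e)  # not all arguments converted during string formatting
```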
I was wondering if anyone else is experiencing this same issue. The system works fine when I roll back to revision 'b258de1e6cea'. Are there any additional configuration details required with the latest revision that I am missing?
--
Shantanu
Disk size of all users?
by Bossers, Alex
Reading this on the wiki: http://wiki.g2.bx.psu.edu/Admin/Disk%20Quotas
It shows that there is a record in the DB tracking each user's allocated disk space for histories.
Is there a convenient way to get this info using the Galaxy admin panels?
That way we can track heavy users and urge them to clean up or improve their data practices...
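In the meantime, one workaround is to query the database directly. A sketch, assuming the `galaxy_user` table carries a `disk_usage` column in bytes (check your schema; the demo below runs against an in-memory stand-in rather than a real Galaxy database):

```python
import sqlite3

# In-memory stand-in for the Galaxy database; on a real instance you would
# connect to the database configured in universe_wsgi.ini instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE galaxy_user (id INTEGER, email TEXT, disk_usage INTEGER)")
conn.executemany("INSERT INTO galaxy_user VALUES (?, ?, ?)",
                 [(1, "alice@example.org", 50000000000),
                  (2, "bob@example.org", 2000000000)])

# List the heaviest users first; disk_usage is assumed to be in bytes.
report = []
for email, usage in conn.execute(
        "SELECT email, disk_usage FROM galaxy_user ORDER BY disk_usage DESC"):
    report.append("%s\t%.1f GB" % (email, usage / 1e9))
print("\n".join(report))
```

The same SELECT, run against the live database, would give a quick heavy-user report without touching the admin UI.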
Thanks
Alex
How to allow anonymous users to run workflows?
by Tim te Beek
Hi all,
I was wondering how I can allow anonymous users to run workflows in my
local Galaxy instance, as currently users need to be logged in to run
workflows. I'd like to drop this requirement in light of the intended
publication of a workflow in a journal which demands that "Web
services must not require mandatory registration by the user." Could
any of you tell me how I can accomplish this?
I've seen the option to use an external authentication method, which
could be employed to artificially 'log in' anonymous users for a single
session, but it appears this would also disable the normal user
administration mechanisms in Galaxy, so I'm not sure this would be a
good fit. Any hints on how to proceed, either via this route or
otherwise, would be much appreciated.
Best regards,
Tim
January 27, 2012 Galaxy Distribution & News Brief
by Jennifer Jackson
January 27, 2012 Galaxy Distribution & News Brief
Complete News Brief
* http://wiki.g2.bx.psu.edu/DevNewsBriefs/2012_01_27
Highlights:
* Important metadata and Python 2.5 support corrections
* SAMtools upgraded to version 0.1.18; mpileup added.
* Dynamic filtering, easy color options, and quicker
indexing enhance Trackster
* Set up your Galaxy instance to run cluster jobs as
the real user, not the Galaxy owner
* Improvements to metadata handling and searching in
the Tool Shed
* Improved solutions for schema access, jobs management,
& workflow imports and inputs.
* New datatypes (Eland, XML), multiple tool enhancements,
and bug fixes.
Get Galaxy!
* http://getgalaxy.org
new: % hg clone http://www.bx.psu.edu/hg/galaxy galaxy-dist
upgrade: % hg pull -u -r 26920e20157f
Read the release announcement and see the prior release history
* http://wiki.g2.bx.psu.edu/DevNewsBriefs/
Need help with a local instance?
Search with our custom google tools!
* http://wiki.g2.bx.psu.edu/Mailing%20Lists#Searching
And consider subscribing to the galaxy-dev mailing list!
* http://wiki.g2.bx.psu.edu/Mailing%20Lists#Subscribing_and_Unsubscribing
--
Jennifer Jackson
Galaxy Team
http://usegalaxy.org
http://galaxyproject.org
http://galaxyproject.org/wiki/Support
input param with type="data" multiple="true" working?
by Leandro Hermida
Hello,
There have been previous requests/questions (some mine) about extending Galaxy
tool functionality to enable a multiple-select menu for input data from
the history with the following:
<param ... type="data" multiple="true" ... />
instead of using the cumbersome <repeat> tags and the resulting form. Is this
working in the latest Galaxy build?
kind regards,
Leandro
how to use projects for fair-share on compute-cluster
by Edward Kirton
Galaxy sites usually do all work on a compute cluster, with all jobs submitted
as a "galaxy" unix user, so there isn't any "fair-share" accounting between
users.
Other sysops have created a solution to run jobs as the actual unix user,
which may be feasible for an intranet site but is undesirable for a site
accessible via the internet for security reasons.
A simpler and more secure method to enable fair-share is by using projects.
Here's a simple scenario and straightforward solution: Multiple groups in
an organization use the same galaxy site and it is desirable to enable
fair-share accounting between the groups. All users in a group consume the
same fair-share, which is generally acceptable.
1) configure the scheduler with a project for each group, configure each user
to use their group's project by default, and grant the galaxy user access to
submit jobs to any project; all users should be associated with a project.
There's a good chance your grid is already configured this way.
2) create a database which maps Galaxy user id to a project; I use a cron
job to create a standalone sqlite3 db. Since this is site-specific, code
is not provided, but hints are given below. Rather than having a separate
database, the proj could have been added to the Galaxy db, but I sought to
minimize my changes.
3) add a snippet of code to drmaa.py's queue_job method to look up the proj
from job_wrapper.user_id and append it to jt.nativeSpecification; see below
Here are the changes required. It's small enough that I didn't do this as
a clone/patch.
(1) lib/galaxy/jobs/runners/drmaa.py (the import goes at the top of the
file, the rest in the queue_job method):

    import sqlite3
    ...
    native_spec = self.get_native_spec( runner_url )

    # BEGIN ADD USER'S PROJ
    if self.app.config.user_proj_map_db is not None:
        try:
            conn = sqlite3.connect( self.app.config.user_proj_map_db )
            c = conn.cursor()
            c.execute( 'SELECT PROJ FROM USER_PROJ WHERE GID=?', [ job_wrapper.user_id ] )
            row = c.fetchone()
            c.close()
            conn.close()
            native_spec += ' -P ' + row[0]
        except Exception:
            log.debug( "Cannot look up proj of user %s" % job_wrapper.user_id )
    # END ADD USER'S PROJ
(2) lib/galaxy/config.py: add support for the user_proj_map_db variable:
    self.user_proj_map_db = resolve_path( kwargs.get( "user_proj_map_db", None ), self.root )
(3) universe_wsgi.ini:
user_proj_map_db = /some/path/to/user_proj_map_db.sqlite
(4) here are some suggestions to help get you started on a script to make the
sqlite3 db:
a) parse the LDAP tree (to get uid:email):
    ldapsearch -LLL -x -b 'ou=aliases,dc=jgi,dc=gov'
b) parse the scheduler config (to get uid:proj):
    qconf -suserl | /usr/bin/xargs -I '{}' qconf -suser '{}' | egrep 'name|default_project'
c) query the galaxy db (to get gid:email):
    select id, email from galaxy_user;
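Putting those hints together, here is a sketch of such a cron script. The table and column names match the USER_PROJ lookup performed by the drmaa.py snippet; the LDAP/qconf/galaxy-db parsing is replaced with hard-coded stand-ins (all example uids, emails, and project names are made up), since that part is site-specific:

```python
import sqlite3

# Stand-ins for the three maps; in practice these would be parsed from
# ldapsearch, qconf, and a galaxy_user query as suggested above.
uid_to_email = {"ekirton": "ekirton@example.gov"}   # (a) LDAP: uid -> email
uid_to_proj  = {"ekirton": "genomics"}              # (b) qconf: uid -> proj
gid_to_email = {42: "ekirton@example.gov"}          # (c) galaxy db: gid -> email

conn = sqlite3.connect(":memory:")  # the cron job would use a real file path
conn.execute("CREATE TABLE USER_PROJ (GID INTEGER PRIMARY KEY, PROJ TEXT)")

# Join the three maps on email to get gid -> proj.
email_to_proj = dict((uid_to_email[uid], proj)
                     for uid, proj in uid_to_proj.items()
                     if uid in uid_to_email)
for gid, email in gid_to_email.items():
    if email in email_to_proj:
        conn.execute("INSERT INTO USER_PROJ VALUES (?, ?)",
                     (gid, email_to_proj[email]))
conn.commit()

# Same lookup the drmaa.py snippet performs at submit time:
row = conn.execute("SELECT PROJ FROM USER_PROJ WHERE GID=?", [42]).fetchone()
print(row[0])  # genomics
```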
The limitation of this method is that all jobs submitted by a user will
always be charged to the same project (which may be okay, depending on how
your organization uses projects). However, a user may have access to
several projects and may wish to associate some jobs with a particular
project. This could be accomplished by adding an option to the user
preferences; a user would choose a project from their available projects, and
any jobs submitted would record the currently chosen project.
Alternatively, histories could be associated with a particular project.
This solution would require significant changes to Galaxy, so I haven't
implemented it (and the simple solution works well enough for me).
Edward Kirton
US DOE JGI
MergeSamFiles.jar and TMPDIR
by Glen Beane
We recently updated to the latest galaxy-dist, and learned that the sam_merge.xml tool now uses Picard MergeSamFiles.jar to merge the files instead of the samtools merge wrapper sam_merge.py.
This is a problem for us because MergeSamFiles.jar does not honor $TMPDIR when creating temporary files (the JVM developers inexplicably hard-code the value of java.io.tmpdir to /tmp on Unix/Linux rather than doing the Right Thing). On our cluster, TMPDIR is set to something like /scratch/batch_job_id/. That location has plenty of free space, but /tmp does not, and now we can't successfully merge largeish BAM files.
In case anyone else is bitten by this, I think there are two options:
1) The Picard tools take an optional TMP_DIR= argument that lets us specify the location to use for temporary files. Initially we modified the .xml to add TMP_DIR=\$TMPDIR to the arguments to MergeSamFiles.jar. This works, but we could potentially need to do this with multiple Picard tools, not just MergeSamFiles.
2) Add something like "export _JAVA_OPTIONS=-Djava.io.tmpdir=$TMPDIR" to the .bashrc file for my Galaxy user.
I am now probably going to go with the second solution.
--
Glen L. Beane
Senior Software Engineer
The Jackson Laboratory
(207) 288-6153
problem with "Input dataset" workflow control feature and custom non-subclass datatypes
by Leandro Hermida
Hi,
There seems to be a weird bug with the "Input dataset" workflow
control feature, hard to explain clearly but I'll try my best.
If you define a custom datatype that is a simple subclass of an
existing galaxy datatype, e.g.:
<datatype extension="myext" type="galaxy.datatypes.data:Text"
subclass="True" display_in_upload="true"/>
and this datatype will be the input to a workflow where you want to use
the multiple input files feature, then you must put an "Input dataset" box
at the beginning in your workflow editor and connect it.
If you define a custom datatype that is its own custom class, e.g.:
<datatype extension="myext" type="galaxy.datatypes.data:MyExt"
display_in_upload="true"/>
with a simple class in lib/galaxy/datatypes/data.py e.g.:
class MyExt( Data ):
file_ext = "myext"
and this datatype will be the input data to a workflow, then if you have
an "Input dataset" box at the beginning, for some reason the drop-down
menu (or multi-select) won't have files of this type from your history;
it just ignores them. Now what is strange is that if I edit the workflow,
remove the beginning "Input dataset" box, and start the workflow with
just the first tool which has this custom datatype as an input parameter,
then when I try to run the workflow everything shows up properly :-/
Hope I explained this OK; it seems like something is broken with the
"Input dataset" workflow control feature.
best,
Leandro
Error Msg: Cluster could not complete job
by Dave Lin
Dear Galaxy Support,
I'm getting the following error message when trying to process larger SOLiD
files.
ERROR MESSAGE: "Cluster could not complete job"
- Compute Quality Statistic: first got the error message; it ran OK after
re-running the job.
- A subsequent job converting qual/csfasta -> fastq failed with the same error
message.
- Doesn't seem to happen on small SOLiD files.
Potentially relevant information:
1. Cloud Instance on Amazon/Large instance
2. Only one master node on cluster.
3. Has been updated using the update feature to a version as of late last
week.
4. Only 1 user right now on system, so there shouldn't be any competing
load.
5. Downloaded a bunch of data files, so volume was at 94%. Currently in
process of expanding volume.
Question: Is this expected behavior or have I misconfigured something (i.e.
some timeout value)? Any suggestions?
Thanks in advance,
Dave
P.S. I'm new to galaxy and impressed so far. Keep up the great work.
software installs: PATH vs env.sh
by Andrew Warren
Hello,
We recently transitioned from a CloudMan instance of Galaxy to our own
cluster and started having problems with calls to tools from within
other tools. For example, when Tophat calls bowtie-inspect, it's not
finding the executable. To fix this I listed bowtie in the
requirements section of the tophat wrapper like so:
<tool id="tophat" name="Tophat for Illumina" version="1.5.0">
<description>Find splice junctions using RNA-seq data</description>
<version_command>tophat --version</version_command>
<requirements>
<requirement type="package">tophat</requirement>
<requirement type="package">bowtie</requirement>
<requirement type="package">samtools</requirement>
</requirements>
Now I am wondering: is it generally expected that all tools used by
Galaxy will have their executables on the galaxy user's PATH? Is the
above a good solution? Or is there something else likely amiss with
our Galaxy setup? I think we recently pulled updates for some major
tool_shed release, but I haven't been able to determine if any of the
tools listed above were affected by that.
Wish I were in Český Krumlov asking this question. Missed the
registration deadline...doh.
Thanks,
Andrew Warren