Appropriate place to ask questions about your NGLIMS for Galaxy
by scott.coutts@monash.edu
Hi,
I'm a newbie when it comes to this kind of thing, so I was wondering whether you could tell me the most appropriate place to ask questions about making some modifications to your NGLIMS extension for Galaxy? Should they be posted to the normal Galaxy development list, or should I ask you directly?
Cheers,
Scott.
11 years, 8 months
library_upload_from_import_dir
by George, David
Hi,
I'm a Galaxy and Python newbie, and I'm working on a project that needs
to upload files from a directory to the Galaxy server. We'd like to
physically copy the files rather than maintain references to them.
I'm starting by following the examples in scripts/api/README. All the
display and library_create_library examples work. However, the
library_upload_from_import_dir fails. In my universe...ini,
library_import_dir is defined to be /home/dgeorge/galaxy/import and
there is a bed directory under that with a short file, bed1.bed in the
correct format. I'm not sure what to provide for the last parameter,
the dbkey - maybe that's my problem?
I ran:
./library_upload_from_import_dir.py 59d2fd4e020e178f8c48e61150e513c2
http://localhost:8080/api/libraries/f597429621d6eb2b/contents
c6ca0ddb55be603a10e891fff2e902c3 bed bed hg19
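For context, this is roughly the request that command corresponds to (a sketch only; the payload key names here are assumptions mirroring the script's positional arguments, not confirmed API fields):

```python
import json
from urllib.request import Request  # the original scripts used Python 2's urllib2

API_URL = "http://localhost:8080/api/libraries/f597429621d6eb2b/contents"
API_KEY = "59d2fd4e020e178f8c48e61150e513c2"

# Payload mirroring the script's positional arguments; the key names
# (folder_id, server_dir, etc.) are assumptions for illustration.
payload = {
    "key": API_KEY,
    "folder_id": "c6ca0ddb55be603a10e891fff2e902c3",
    "create_type": "file",
    "upload_option": "upload_directory",
    "server_dir": "bed",   # directory under library_import_dir
    "file_type": "bed",
    "dbkey": "hg19",       # genome build; "?" is the usual "unknown" value
}

def build_request(url, data):
    """Assemble the JSON POST the API script would send."""
    return Request(url, data=json.dumps(data).encode("utf-8"),
                   headers={"Content-Type": "application/json"})

req = build_request(API_URL, payload)
# urlopen(req) would perform the call; it needs a running Galaxy instance.
```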
And I get the error:
Exception happened during processing of request from ('127.0.0.1',
38857)
Traceback (most recent call last):
File
"/home/dgeorge/galaxy/galaxy-central/eggs/Paste-1.6-py2.6.egg/paste/http
server.py", line 1053, in process_request_in_thread
self.finish_request(request, client_address)
File "/usr/lib/python2.6/SocketServer.py", line 322, in finish_request
self.RequestHandlerClass(request, client_address, self)
File "/usr/lib/python2.6/SocketServer.py", line 618, in __init__
self.finish()
File "/usr/lib/python2.6/SocketServer.py", line 661, in finish
self.wfile.flush()
File "/usr/lib/python2.6/socket.py", line 297, in flush
self._sock.sendall(buffer(data, write_offset, buffer_size))
error: [Errno 32] Broken pipe
Any ideas or suggestions?
Thanks!
David George
Staff Software Engineer
Illumina, Inc.
25861 Industrial Blvd.
Hayward, CA 94545
Tel: 510-670-9326
Fax: 510-670-9302
Email: dgeorge(a)illumina.com
11 years, 8 months
Uploading Zip Files
by SHAUN WEBB
Hi,
I have a tool that takes a zipped archive as input, finds all the
sequence files (~90 of them), separates them by barcode, does a little
extra processing and outputs a single fasta file. I would like this to
run on Galaxy in a workflow with a few other tools.
I know there have been discussions about uploading zip files to Galaxy
before, and that currently Galaxy automatically unzips a file and
uploads only the first file in the archive. Are there any plans to
change this behaviour so that a zipped file can be uploaded without
being decompressed?
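In the meantime, a tool can read members straight out of the archive rather than relying on upload-time extraction. A minimal sketch (sequence_members is an illustrative helper, not Galaxy code):

```python
import io
import zipfile

SEQ_SUFFIXES = (".fa", ".fasta", ".fastq")

def sequence_members(zip_path, suffixes=SEQ_SUFFIXES):
    """Yield (member_name, text_handle) for each sequence file inside the
    archive, reading members directly without extracting them to disk."""
    with zipfile.ZipFile(zip_path) as zf:
        for name in zf.namelist():
            if name.endswith(suffixes):
                yield name, io.TextIOWrapper(zf.open(name))
```

A tool wrapper along these lines could accept the archive as a plain binary dataset and iterate the ~90 members itself, sidestepping the upload-time unzip entirely.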
Thanks
Shaun
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
11 years, 8 months
Re: [galaxy-dev] suggestion for multithreading
by Assaf Gordon
(moved to galaxy-dev)
Nate Coraor wrote, On 06/02/2011 01:31 PM:
> Peter Cock wrote:
>> On Thu, Jun 2, 2011 at 6:23 PM, Nate Coraor <nate(a)bx.psu.edu> wrote:
>>>
>>> pbs.py then knows to translate '<resource type="cores">8</resource>' to
>>> '-l nodes=1:ppn=8'.
>>>
>>> Your tool can access that value a bunch, like $__resources__.cores.
>>>
>>> The same should be possible for other consumables.
>>>
Just a thought here:
The actual parameters that are passed to the scheduler are not necessarily hard-coded.
Meaning, at least with SGE, specifying the number of cores can be:
qsub -pe threads=8
or
qsub -pe cores=8
or
qsub -pe jiffies=8
and the same goes for memory limits (e.g. "-l virtual_free=800M").
The reason is that those resources (e.g. "threads", "cores", "virtual_free") are just identifiers; they are created and configured by whoever installed SGE - they are not built-in or hard-coded.
So just be careful in your design/implementation when automatically translating XML resources to hard-coded parameters.
If you do hard-code them, just make sure to document it specifically (i.e. Galaxy expects the SGE threads parameter to be "-pe threads=8" and nothing else).
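One way to honour this in the design is to keep the translation in per-site configuration rather than in code. A minimal sketch (the names SGE_RESOURCE_FLAGS and native_options are illustrative, and the "-pe threads=8" syntax simply follows the examples above):

```python
# Site-configurable mapping from abstract resource names to scheduler flags.
# The PE name "threads" is a site-defined identifier; another site might call
# it "cores" or "jiffies", so it belongs in configuration, not in Galaxy code.
SGE_RESOURCE_FLAGS = {
    "cores": lambda n: "-pe threads=%d" % n,
    "memory_mb": lambda mb: "-l virtual_free=%dM" % mb,
}

def native_options(resources, flag_map=SGE_RESOURCE_FLAGS):
    """Translate e.g. {'cores': 8} into the qsub arguments for this site."""
    return " ".join(flag_map[name](value)
                    for name, value in sorted(resources.items()))
```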
-gordon
11 years, 8 months
Problem with Torque/Maui
by Marco Moretto
Hi all,
first of all, thanks to the Galaxy Team for this really useful software.
Actually I don't really know whether my problem is related to Galaxy or to
Torque/Maui, but I didn't find any solution looking in both the Torque and
Maui user lists, so I hope that some of you with more experience can give me
some good advice. I'm trying to set up Galaxy in a small local virtual
environment in order to test it. I started with 2 virtual Ubuntu servers
called galaxy1 and galaxy2. On galaxy1 I successfully installed Galaxy,
Apache, Torque and Maui. I'm using Postgres as the DBMS; it is installed on
another "real" DB server. The virtual server galaxy2 is used as a node.
Galaxy works like a charm locally, but when I try to use Torque, problems
arise. Torque alone works correctly: I can submit a job with qsub and
everything works. The 2 virtual servers (galaxy1 and galaxy2) share a
directory (through NFS) in which I installed Galaxy following the "unified
method" from the documentation.
Now, as I said, Galaxy alone works and Torque/Maui alone works; when I put
the two together, nothing works.
As a test I uploaded a GFF file (using the local runner). Then I tried to
apply a filter to it using "Filter and sort -> Extract features". When I run
this tool, the corresponding job in the Torque queue stays in Hold state
forever. I report some output from diagnostic programs:
The diagnose -j reports the following:
Name State Par Proc QOS WCLimit R Min User
Group Account QueuedTime Network Opsys Arch Mem Disk Procs
Class Features
29 Hold DEF 1 DEF 1:00:00 0 1 galaxy
galaxy - 00:02:36 [NONE] [NONE] [NONE] >=0 >=0 NC0
[batch:1] [NONE]
While the showq command reports
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING
STARTTIME
0 Active Jobs 0 of 1 Processors Active (0.00%)
IDLE JOBS----------------------
JOBNAME USERNAME STATE PROC WCLIMIT
QUEUETIME
0 Idle Jobs
BLOCKED JOBS----------------
JOBNAME USERNAME STATE PROC WCLIMIT
QUEUETIME
29 galaxy Hold 1 1:00:00 Wed May 4
03:56:40
The checkjob reports:
checking job 29
State: Hold
Creds: user:galaxy group:galaxy class:batch qos:DEFAULT
WallTime: 00:00:00 of 1:00:00
SubmitTime: Wed May 4 03:56:40
(Time Queued Total: 00:03:07 Eligible: 00:00:01)
The qstat -f reports
Job Id: 33.galaxy1.research.intra.ismaa.it
Job_Name = 27_Extract_features1_marco.moretto(a)iasma.it
Job_Owner = galaxy(a)galaxy1.research.intra.ismaa.it
job_state = W
queue = batch
server = galaxy1.research.intra.ismaa.it
ctime = Wed May 4 04:56:36 2011
Error_Path = galaxy1:/mnt/equallogic1/galaxy/galaxy-dist/database/pbs/27.e
exec_host = galaxy2/0
exec_port = 15003
Execution_Time = Wed May 4 05:26:41 2011
mtime = Wed May 4 04:56:37 2011
Output_Path = galaxy1:/mnt/equallogic1/galaxy/galaxy-dist/database/pbs/27.o
qtime = Wed May 4 04:56:36 2011
Resource_List.neednodes = 1
Resource_List.nodect = 1
Resource_List.nodes = 1
Resource_List.walltime = 01:00:00
stagein = /mnt/equallogic1/galaxy/tmp/dataset_18.dat@galaxy1:/mnt/equallogic1/galaxy/galaxy-dist/database/files/000/dataset_18.dat,/mnt/equallogic1/galaxy/tmp/dataset_30.dat@galaxy1:/mnt/equallogic1/galaxy/galaxy-dist/database/files/000/dataset_30.dat
stageout = /mnt/equallogic1/galaxy/galaxy-dist/database/files/000/dataset_30.dat@galaxy1:/mnt/equallogic1/galaxy/galaxy-dist/database/files/000/dataset_30.dat
substate = 37
Variable_List = PBS_O_QUEUE=batch,
PBS_O_HOST=galaxy1.research.intra.ismaa.it
euser = galaxy
egroup = galaxy
hashname = 33.galaxy1.research.intra.ismaa.it
queue_rank = 33
queue_type = E
StartDate: -00:03:06 Wed May 4 03:56:41
Total Tasks: 1
Req[0] TaskCount: 1 Partition: DEFAULT
Network: [NONE] Memory >= 0 Disk >= 0 Swap >= 0
Opsys: [NONE] Arch: [NONE] Features: [NONE]
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 1
PartitionMask: [ALL]
PE: 1.00 StartPriority: 1
cannot select job 29 for partition DEFAULT (non-idle state 'Hold')
and finally the tracejob reports
/var/spool/torque/server_priv/accounting/20110504: Permission denied
/var/spool/torque/mom_logs/20110504: No such file or directory
/var/spool/torque/sched_logs/20110504: No such file or directory
Job: 33.galaxy1.research.intra.ismaa.it
05/04/2011 04:56:36 S enqueuing into batch, state 1 hop 1
05/04/2011 04:56:36 S Job Queued at request of galaxy(a)galaxy1.research.intra.ismaa.it, owner = galaxy(a)galaxy1.research.intra.ismaa.it, job name = 27_Extract_features1_marco.moretto(a)iasma.it, queue = batch
05/04/2011 04:56:37 S Job Run at request of galaxy(a)galaxy1.research.intra.ismaa.it
05/04/2011 04:56:41 S Email 's' to galaxy(a)galaxy1.research.intra.ismaa.it failed: Child process 'sendmail -f adm galaxy(a)galaxy1.research.intra.ismaa.it' returned 127 (errno 10:No child processes)
The only thing that is clear to me is that after submission the scheduler
puts the job in a Hold state, but I cannot understand why. I also tried to
run the Galaxy-generated sh script with qsub directly. From the Galaxy log:
galaxy.jobs.runners.pbs DEBUG 2011-05-04 04:56:36,748 (27) submitting file
/mnt/equallogic1/galaxy/galaxy-dist/database/pbs/27.sh
I copied that script and ran it with:
qsub 27.sh
and the job ran correctly. So what is not clear to me is whether the problem
is related to Torque/Maui or to the way the job is submitted from Galaxy.
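One visible difference between the two submission paths shows up in the qstat -f output above: Galaxy's pbs runner adds stagein/stageout directives, while a plain `qsub 27.sh` does not, and a failed stage-in (Torque's pbs_mom stages files with scp/rcp between hosts) would match a job that sits waiting and retrying. A sketch of how to exercise only the staging step, with paths copied from the qstat output (the commands in comments need the actual cluster):

```python
# Reconstruct a minimal qsub submission whose only work is the stage-in
# itself (paths copied from the stagein line in the qstat -f output above).
src = "/mnt/equallogic1/galaxy/galaxy-dist/database/files/000/dataset_18.dat"
dest = "/mnt/equallogic1/galaxy/tmp/dataset_18.dat"
stagein_spec = "%s@galaxy1:%s" % (dest, src)
qsub_cmd = "echo true | qsub -W stagein=%s" % stagein_spec
# On the cluster, run qsub_cmd as the galaxy user. If that job also blocks
# in the W/Hold state, check that passwordless scp works for the galaxy
# user between galaxy1 and galaxy2, e.g.:
#   su - galaxy -c "scp galaxy1:%s /tmp/" % src
```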
Sorry for the very long e-mail and thank you very much for any help.
---
Marco
11 years, 8 months
TSS for list of genes
by shamsher jagat
I am not sure if this is the right place to post this message. I want to
download TSSs (transcription start sites) for a list of around 1000 human
genes. Is this something I can do in Galaxy?
Thanks.
11 years, 8 months
Error: unable to read Galaxy config
by Lewis, Brian Andrew
After following the instructions in the wiki for setting up scaling/load balancing, I was testing job submission and got this error when trying to pull data from UCSC Main:
Traceback (most recent call last):
File "/usr/local/galaxy-dist/tools/data_source/data_source.py", line 6, in
from galaxy.util import gzip_magic
File "/usr/local/galaxy-dist/lib/galaxy/util/__init__.py", line 20, in
pkg_resources.require( 'docutils' )
File "/usr/local/galaxy-dist/lib/galaxy/eggs/__init__.py", line 400, in require
c = Crate()
File "/usr/local/galaxy-dist/lib/galaxy/eggs/__init__.py", line 259, in __init__
self.galaxy_config = GalaxyConfig()
File "/usr/local/galaxy-dist/lib/galaxy/eggs/__init__.py", line 358, in __init__
raise Exception( "error: unable to read Galaxy config from %s" % GalaxyConfig.config_file )
Exception: error: unable to read Galaxy config from /usr/local/galaxy-dist/0
I'm running Galaxy on a RHEL6 server and I have one web server and one runner set up using Apache as a proxy server.
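The odd path in the last traceback line (the config name resolved to "0") suggests the egg loader is picking up one of the tool's own command-line arguments as the config filename. A defensive resolution order, sketched below; GALAXY_CONFIG_FILE and resolve_config_file are hypothetical names, not what this Galaxy version actually reads:

```python
import os

def resolve_config_file(argv, environ=os.environ, default="universe_wsgi.ini"):
    """Choose a Galaxy config path: an explicit environment variable first,
    then the first .ini-looking command-line argument, then the default."""
    if environ.get("GALAXY_CONFIG_FILE"):
        return environ["GALAXY_CONFIG_FILE"]
    ini_args = [a for a in argv[1:] if a.endswith(".ini")]
    return ini_args[0] if ini_args else default
```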
Thanks,
Brian
11 years, 8 months
Migration error: fields in MySQL
by John Eppley
I had an error upgrading my galaxy instance. I got the following exception while migrating the db (during step 64->65):
sqlalchemy.exc.ProgrammingError: (ProgrammingError) (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'fields FROM form_definition' at line 1") u'SELECT id, fields FROM form_definition' []
It seems my version of MySQL (4.1.22-log) did not like the bare 'fields' as a column name. If I alias form_definition as f and use f.fields, the error goes away. I also had to modify migration 76 for the same reason.
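For the record, another option that avoids aliasing is to backtick-quote the identifier, which is MySQL's general escape for column names that trip up the parser (a sketch; quote_mysql_identifier is just an illustrative helper, not Galaxy code):

```python
def quote_mysql_identifier(name):
    """Backtick-quote a MySQL identifier, escaping embedded backticks."""
    return "`%s`" % name.replace("`", "``")

# Quoting works whether or not the name is reserved, so it is safe to apply
# uniformly when building SQL for MySQL.
cmd = "SELECT id, %s FROM form_definition" % quote_mysql_identifier("fields")
```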
Here is my diff of the migrations dir:
diff -r 50e249442c5a lib/galaxy/model/migrate/versions/0065_add_name_to_form_fields_and_values.py
--- a/lib/galaxy/model/migrate/versions/0065_add_name_to_form_fields_and_values.py Thu Apr 07 08:39:07 2011 -0400
+++ b/lib/galaxy/model/migrate/versions/0065_add_name_to_form_fields_and_values.py Fri Apr 15 11:09:26 2011 -0400
@@ -39,7 +39,7 @@
return ''
# Go through the entire table and add a 'name' attribute for each field
# in the list of fields for each form definition
- cmd = "SELECT id, fields FROM form_definition"
+ cmd = "SELECT f.id, f.fields FROM form_definition f"
result = db_session.execute( cmd )
for row in result:
form_definition_id = row[0]
@@ -53,7 +53,7 @@
field[ 'helptext' ] = field[ 'helptext' ].replace("'", "''").replace('"', "")
field[ 'label' ] = field[ 'label' ].replace("'", "''")
fields_json = to_json_string( fields_list )
- cmd = "UPDATE form_definition SET fields='%s' WHERE id=%i" %( fields_json, form_definition_id )
+ cmd = "UPDATE form_definition f SET f.fields='%s' WHERE f.id=%i" %( fields_json, form_definition_id )
db_session.execute( cmd )
# replace the values list in the content field of the form_values table with a name:value dict
cmd = "SELECT form_values.id, form_values.content, form_definition.fields" \
@@ -112,7 +112,7 @@
cmd = "UPDATE form_values SET content='%s' WHERE id=%i" %( to_json_string( values_list ), form_values_id )
db_session.execute( cmd )
# remove name attribute from the field column of the form_definition table
- cmd = "SELECT id, fields FROM form_definition"
+ cmd = "SELECT f.id, f.fields FROM form_definition f"
result = db_session.execute( cmd )
for row in result:
form_definition_id = row[0]
@@ -124,5 +124,5 @@
for index, field in enumerate( fields_list ):
if field.has_key( 'name' ):
del field[ 'name' ]
- cmd = "UPDATE form_definition SET fields='%s' WHERE id=%i" %( to_json_string( fields_list ), form_definition_id )
+ cmd = "UPDATE form_definition f SET f.fields='%s' WHERE id=%i" %( to_json_string( fields_list ), form_definition_id )
db_session.execute( cmd )
diff -r 50e249442c5a lib/galaxy/model/migrate/versions/0076_fix_form_values_data_corruption.py
--- a/lib/galaxy/model/migrate/versions/0076_fix_form_values_data_corruption.py Thu Apr 07 08:39:07 2011 -0400
+++ b/lib/galaxy/model/migrate/versions/0076_fix_form_values_data_corruption.py Fri Apr 15 11:09:26 2011 -0400
@@ -32,7 +32,7 @@
def upgrade():
print __doc__
metadata.reflect()
- cmd = "SELECT form_values.id as id, form_values.content as field_values, form_definition.fields as fields " \
+ cmd = "SELECT form_values.id as id, form_values.content as field_values, form_definition.fields as fdfields " \
+ " FROM form_definition, form_values " \
+ " WHERE form_values.form_definition_id=form_definition.id " \
+ " ORDER BY form_values.id"
@@ -46,7 +46,7 @@
except Exception, e:
corrupted_rows = corrupted_rows + 1
# content field is corrupted
- fields_list = from_json_string( _sniffnfix_pg9_hex( str( row['fields'] ) ) )
+ fields_list = from_json_string( _sniffnfix_pg9_hex( str( row['fdfields'] ) ) )
field_values_str = _sniffnfix_pg9_hex( str( row['field_values'] ) )
try:
#Encoding errors? Just to be safe.
-j
11 years, 8 months
Support for subdirs in dataset extra_files_path
by Jim Johnson
The request is issue #494: https://bitbucket.org/galaxy/galaxy-central/issue/494/support-sub-dirs-in...
I'm finding that some QIIME metagenomics applications build HTML results with an inherent directory structure. For some other applications, e.g. FastQC, I've been able to flatten the hierarchy and edit the HTML, but that appears problematic for QIIME.
Galaxy hasn't supported a hierarchy under a dataset's extra_files_path, though the developers don't seem opposed to the idea: http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-October/003605.html
I added a route in lib/galaxy/web/buildapp.py and modified the dataset download code in lib/galaxy/web/controllers/dataset.py to traverse a hierarchy.
I don't think these add any security vulnerabilities (I tried the obvious ../../).
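On the security point, one defensive check worth pairing with such a route (a sketch; safe_join is a hypothetical helper, not part of the patch below): resolve the requested filename and refuse anything that escapes the dataset's extra_files_path.

```python
import os

def safe_join(base_dir, requested):
    """Join a user-supplied relative path onto base_dir, rejecting any path
    that escapes base_dir (e.g. via ../ components or an absolute path)."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, requested))
    if target != base and not target.startswith(base + os.sep):
        raise ValueError("path escapes extra_files_path: %r" % requested)
    return target
```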
$ hg diff lib/galaxy/web/buildapp.py
diff -r 6ae06d89fec7 lib/galaxy/web/buildapp.py
--- a/lib/galaxy/web/buildapp.py Wed Mar 16 09:01:57 2011 -0400
+++ b/lib/galaxy/web/buildapp.py Wed Mar 16 10:24:13 2011 -0500
@@ -94,6 +94,8 @@
webapp.add_route( '/async/:tool_id/:data_id/:data_secret', controller='async', action='index', tool_id=None, data_id=None, data_secret=None )
webapp.add_route( '/:controller/:action', action='index' )
webapp.add_route( '/:action', controller='root', action='index' )
+ # allow for subdirectories in extra_files_path
+ webapp.add_route( '/datasets/:dataset_id/display/{filename:.+?}', controller='dataset', action='display', dataset_id=None, filename=None)
webapp.add_route( '/datasets/:dataset_id/:action/:filename', controller='dataset', action='index', dataset_id=None, filename=None)
webapp.add_route( '/display_application/:dataset_id/:app_name/:link_name/:user_id/:app_action/:action_param', controller='dataset', action='display_application', dataset_id=None, user_id=None, app_name = None, link_name = None, app_action = None, action_param = None )
webapp.add_route( '/u/:username/d/:slug', controller='dataset', action='display_by_username_and_slug' )
$
$ hg diff lib/galaxy/web/controllers/dataset.py
diff -r 6ae06d89fec7 lib/galaxy/web/controllers/dataset.py
--- a/lib/galaxy/web/controllers/dataset.py Wed Mar 16 09:01:57 2011 -0400
+++ b/lib/galaxy/web/controllers/dataset.py Wed Mar 16 10:24:29 2011 -0500
@@ -266,17 +266,18 @@
log.exception( "Unable to add composite parent %s to temporary library download archive" % data.file_name)
msg = "Unable to create archive for download, please report this error"
messagetype = 'error'
- flist = glob.glob(os.path.join(efp,'*.*')) # glob returns full paths
- for fpath in flist:
- efp,fname = os.path.split(fpath)
- try:
- archive.add( fpath,fname )
- except IOError:
- error = True
- log.exception( "Unable to add %s to temporary library download archive" % fname)
- msg = "Unable to create archive for download, please report this error"
- messagetype = 'error'
- continue
+ for root, dirs, files in os.walk(efp):
+ for fname in files:
+ fpath = os.path.join(root,fname)
+ rpath = os.path.relpath(fpath,efp)
+ try:
+ archive.add( fpath,rpath )
+ except IOError:
+ error = True
+ log.exception( "Unable to add %s to temporary library download archive" % rpath)
+ msg = "Unable to create archive for download, please report this error"
+ messagetype = 'error'
+ continue
if not error:
if params.do_action == 'zip':
archive.close()
11 years, 8 months