Galaxy and PBS/Torque Cluster
Hello All,

We are using a Bio-Linux 8 Galaxy instance and are currently trying to integrate it into a PBS/Torque cluster. We are able to run jobs locally but not via PBS. Can anyone provide us with a step-by-step guide to configuring Galaxy and PBS?

I have included some of the errors we are receiving below:

    galaxy.datatypes.metadata DEBUG 2015-08-19 14:56:42,920 Failed to cleanup MetadataTempFile temp files from /usr/lib/galaxy-server/database/job_working_directory/000/96/metadata_out_HistoryDatasetAssociation_100_Kk1ajw: No JSON object could be decoded
    galaxy.jobs.runners.pbs ERROR 2015-08-19 14:56:42,960 Connection to PBS server for submit failed: 0: no error

We cannot figure out what is stopping our jobs from entering the PBS queue. Below is our job_conf.xml:

    <?xml version="1.0"?>
    <job_conf>
        <plugins workers="2">
            <!-- 'workers' is the number of threads for the runner's work queue. -->
            <!-- The default from <plugins> is used if not defined for a <plugin> -->
            <plugin id="local" type="runner" load="galaxy.jobs.runners.local:LocalJobRunner" workers="2"/>
            <plugin id="pbs" type="runner" load="galaxy.jobs.runners.pbs:PBSJobRunner"/>
        </plugins>
        <handlers>
            <handler id="main"/>
            <!-- Additional job handlers - the id should match the name of a [server:<id>] in universe_wsgi.ini -->
        </handlers>
        <destinations default="defaultq">
            <!-- Destinations define details about remote resources and how jobs should be executed on those remote resources -->
            <destination id="defaultq" runner="pbs" tags="cluster">
                <param id="-q">defaultq</param>
                <param id="-l">walltime=24:00:00</param>
            </destination>
        </destinations>
    </job_conf>

Thank you very much for all the assistance!

Regards,
Ryan

--
Ryan S. Johnson, PhD
Applications Scientist
Center for Advanced Research Computing
P: (650) 430-6194
E: rjohns03@carc.unm.edu
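[Note: the "Connection to PBS server for submit failed" message above is reported by the pbs_python bindings that galaxy.jobs.runners.pbs uses at submit time. A minimal sketch, assuming the pbs_python egg used by Galaxy is importable from Galaxy's Python environment and run as the same user Galaxy runs under, that checks outside of Galaxy whether those bindings can open a connection to the default Torque server:

    # Sketch: verify the pbs_python bindings can reach the Torque server.
    # Assumes the pbs_python egg Galaxy uses is on this interpreter's path.
    import pbs

    server = pbs.pbs_default()          # default server name from the Torque client config
    if not server:
        raise SystemExit("pbs_default() returned no server name - check the Torque client config")

    conn = pbs.pbs_connect(server)      # connection handle; <= 0 indicates failure
    if conn <= 0:
        errno, text = pbs.error()       # same call the Galaxy PBS runner uses to report errors
        raise SystemExit("pbs_connect to %s failed: %s (errno %s)" % (server, text, errno))

    print("Connected to Torque server %s" % server)
    pbs.pbs_disconnect(conn)

If this fails with a similar "no error" / connection message, the problem is likely in the Torque client setup or the pbs_python egg rather than in job_conf.xml.]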
Ryan,

Not sure what the problem is. I'm not sure the Galaxy that is distributed with Bio-Linux is really set up to be used with PBS/Torque. If you have PBS/Torque installed and running properly, and if you can submit job scripts via the qsub command as whatever user Galaxy runs under, the job_conf.xml file you have should be fine.

Have you scrambled a pbs_python egg for your box?

    LIBTORQUE_DIR=/path/to/libtorque python scripts/scramble.py -e pbs_python

(https://wiki.galaxyproject.org/Admin/Config/Performance/Cluster#PBS)

The dependencies for that Galaxy instance may be a little different since it is installed with a package manager - I am not sure it allows scrambling eggs dynamically like that. It might be worth downloading a fresh Galaxy into your home directory or somewhere similar, then configuring and testing it with PBS. This would allow you to narrow the problem down to something about Galaxy versus something about the PBS/Torque setup.

-John
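[Note: one way to narrow things down as suggested above is to drive a submission through pbs_python directly, bypassing Galaxy entirely; it mirrors what the PBS runner does at submit time. This is a sketch only - the script path, job name, queue, and walltime below are placeholders to substitute with your own test values:

    # Sketch: submit a trivial job through pbs_python, bypassing Galaxy.
    # "/tmp/test_job.sh", the job name, queue, and walltime are placeholders.
    import pbs

    conn = pbs.pbs_connect(pbs.pbs_default())
    if conn <= 0:
        raise SystemExit("could not connect to the Torque server: %s" % str(pbs.error()))

    attrs = pbs.new_attropl(2)
    attrs[0].name = pbs.ATTR_N              # job name
    attrs[0].value = "galaxy_pbs_test"
    attrs[1].name = pbs.ATTR_l              # resource list, as in job_conf.xml's -l param
    attrs[1].resource = "walltime"
    attrs[1].value = "00:05:00"

    job_id = pbs.pbs_submit(conn, attrs, "/tmp/test_job.sh", "defaultq", "NULL")
    if not job_id:
        raise SystemExit("submit failed: %s" % str(pbs.error()))

    print("submitted %s - it should now appear in qstat" % job_id)
    pbs.pbs_disconnect(conn)

If this submits cleanly when run as the Galaxy user but Galaxy itself still cannot, the problem points at the Galaxy/egg side; if it fails the same way, it points at the Torque client configuration.]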