On Nov 8, 2011, at 6:34 PM, Andrew Warren wrote:
Hi Nate,
I am running in daemon mode, so this is out of paster.log.
The job gets submitted to the PBS queue normally, but the error shows up in the history
panel right away (the working directory also shows up during the cufflinks run):
Hi Andrew,
The job should not even be able to be submitted to the PBS queue. The error you're
seeing in the history (Unable to run job due to a misconfiguration of the Galaxy job
running system. Please contact a site administrator.) is logged when the job manager
tries to place the job in the queue of a job runner which has been defined in the Galaxy
configuration but has not properly loaded. This is right after a statement which should
log the failure:
log.error( 'put(): (%s) Invalid job runner: %s' % ( job_wrapper.job_id,
runner_name ) )
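To illustrate the control flow described above, here is a minimal sketch. It is not the
actual Galaxy source: the Dispatcher and JobWrapper classes, the fail() method and the use
of a plain list as a runner queue are all assumptions made for illustration. Only the
log.error() call and the history message are taken from this thread.

```python
import logging

log = logging.getLogger("sketch.jobs")

# History message quoted in this thread.
MISCONFIGURED = ("Unable to run job due to a misconfiguration of the Galaxy job "
                 "running system. Please contact a site administrator.")


class JobWrapper:
    """Hypothetical stand-in for a job wrapper; only tracks id and failure text."""

    def __init__(self, job_id):
        self.job_id = job_id
        self.failure = None

    def fail(self, message):
        # In this sketch, failing a job just records the message shown in the history.
        self.failure = message


class Dispatcher:
    """Hypothetical dispatcher holding only the runners that loaded successfully."""

    def __init__(self, loaded_runners):
        # Maps runner name -> queue (a plain list here, purely as an assumption).
        self.job_runners = dict(loaded_runners)

    def put(self, job_wrapper, runner_name):
        queue = self.job_runners.get(runner_name)
        if queue is None:
            # Runner is configured but never loaded: log first, then fail the job.
            log.error('put(): (%s) Invalid job runner: %s' % (job_wrapper.job_id,
                                                              runner_name))
            job_wrapper.fail(MISCONFIGURED)
            return False
        queue.append(job_wrapper)
        return True
```

In a single process following this pattern, the history message and the log line are
produced together, which is why seeing one without the other is surprising.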
I'm not sure how that message could not appear in the log when the associated message
is appearing in the history item. Can you confirm that the output below was for a job
that immediately failed with the above error message? By any chance, are you running
multiple Galaxy servers simultaneously with job running enabled?
--nate
galaxy.jobs DEBUG 2011-10-24 17:51:25,567 dispatching job 595 to pbs runner
galaxy.jobs INFO 2011-10-24 17:51:25,772 job 595 dispatched
galaxy.jobs.runners.pbs DEBUG 2011-10-24 17:51:26,153 (595) submitting file
/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/database/pbs/595.sh
galaxy.jobs.runners.pbs DEBUG 2011-10-24 17:51:26,155 (595) command is: python
/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/tools/ngs_rna/cufflinks_wrapper.py
--input=/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/database/files/001/dataset_1866.dat
--assembled-isoforms-output=/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/database/files/002/dataset_2161.dat
--num-threads="6" -I 300000 -F 0.05
-j 0.05 -g
/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/database/files/001/dataset_1699.dat
-b
--ref_file=/opt/rnaseq_data/indices/bowtie/Salmonella/14028S/Salmonella_enterica_subsp_enterica_serovar_Typhimurium_str_14028S.fna
--dbkey=14028S
--index_dir=/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/tool-data
galaxy.jobs.runners.pbs DEBUG 2011-10-24 17:51:26,157 (595) queued in batch queue as
303.localhost
galaxy.jobs DEBUG 2011-10-24 17:51:26,599 dispatching job 596 to pbs runner
galaxy.jobs INFO 2011-10-24 17:51:26,738 job 596 dispatched
galaxy.jobs.runners.pbs DEBUG 2011-10-24 17:51:26,767 (595/303.localhost) PBS job state
changed from N to R
Then later, when the cufflinks job is finishing, there are no errors in paster.log
(notice that it doesn't try to copy the contents of the working directory for this job
like it normally does, because Galaxy thinks the job wasn't submitted to the queue even
though it was):
galaxy.jobs.runners.pbs DEBUG 2011-10-24 19:05:57,402 (595/303.localhost) PBS job has
left queue
galaxy.jobs.runners.pbs DEBUG 2011-10-24 19:05:57,402 (596/304.localhost) PBS job state
changed from Q to R
galaxy.jobs.runners.pbs DEBUG 2011-10-24 21:29:01,995 (596/304.localhost) PBS job has
left queue
galaxy.jobs.runners.pbs DEBUG 2011-10-24 21:29:01,995 (597/305.localhost) PBS job state
changed from Q to R
galaxy.jobs DEBUG 2011-10-24 21:29:02,999 finish(): Moved
/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/database/job_working_directory/596/isoforms.fpkm_tracking
to
/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/database/files/002/dataset_2163.dat
as directed by from_work_dir
galaxy.jobs DEBUG 2011-10-24 21:29:03,555 finish(): Moved
/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/database/job_working_directory/596/genes.fpkm_tracking
to
/opt/hts_software/galaxy-parent/galaxy_server/galaxy-dist/database/files/002/dataset_2162.dat
as directed by from_work_dir
Thanks,
Andrew
On Mon, Nov 7, 2011 at 4:00 PM, Nate Coraor <nate(a)bx.psu.edu> wrote:
On Oct 25, 2011, at 2:43 AM, Andrew Warren wrote:
> Hello,
> I recently encountered a problem when trying to run Cufflinks on eight BAM files on
> our galaxy instance (via the multi-input toggle) and received the error: "Unable to
> run job due to a misconfiguration of the Galaxy job running system" for some, but not
> all, of the cufflinks jobs that appear in the history. These particular BAM files were
> copied over from the history of a larger workflow where they were successfully run through
> cufflinks. In the case of the problem workflow run, the cufflinks jobs are all
> successfully submitted to Torque PBS and continue to run and finish but many have this
> error displayed in the History. The jobs with the error displayed fail to copy files from
> the working directory to the database directory despite running to completion. We recently
> updated to galaxy-central changeset 6131:be6c89c33639. I have seen this error multiple
> times for this workflow, even after restarting galaxy. Does anyone have any ideas of what
> might be going wrong?
Hi Andrew,
The error you're receiving indicates that there should also be a traceback logged to
the Galaxy server log file or output when this occurs. Could you check the log/output for
such a traceback?
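If it helps, a short script along these lines can scan a log file for the relevant
entries. It is only a sketch: the function names are made up for this example, and the
two search strings are the standard first line of a Python traceback and the "Invalid job
runner" text from the log.error() call mentioned in this thread. Pass it the path to your
own log (paster.log when running in daemon mode).

```python
def find_log_events(lines, needles=("Traceback (most recent call last):",
                                    "Invalid job runner")):
    """Return (line_number, text) pairs for lines containing any of the needles."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(needle in line for needle in needles):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_file(path):
    """Open the log at `path` and return the matching lines (hypothetical helper)."""
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(path, errors="replace") as handle:
        return find_log_events(handle)


# Example use (path is whatever your server writes to):
#     for number, text in scan_file("paster.log"):
#         print("%d: %s" % (number, text))
```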
Thanks,
--nate
>
> Thanks for any help,
> Andrew Warren