Re: [galaxy-dev] Possible bug in drmaa and Galaxy?

2 Aug 2016

Hi Nicola,

Ah!  I see!  I misunderstood how the logs worked.

I thought that whatever message Cuffdiff returns is passed to the
scheduler, and that the scheduler in turn passes it to Galaxy.  I was
confused as to why Cuffdiff and Galaxy recognized the mistake but the
scheduler did not.

Now I see!  When the scheduler and Galaxy report success or failure,
they are actually reporting on two different things.  Presumably, this
means it's impossible for the scheduler to report an error and Galaxy
to report a success?

Thanks a lot for taking the time to correct me!  I understand now!

Ray

On Tue, Aug 2, 2016 at 6:12 PM, Nicola Soranzo <nsoranzo@tiscali.it> wrote:
...
Hi Ray,
I don't see anything strange in the Galaxy logs. The job on the cluster
"finished normally" just means that it completed its execution without being
killed by the cluster scheduler or for other external reasons. The second
line in the logs is telling you that Galaxy has seen that the job returned
an exit code greater than 0, which under Unix means that it terminated with
some error. Therefore Galaxy set the state of the output datasets to Error
and you should see them in red in your history.
Obviously you may want to fix the cause of the job failure, but I don't
think that this is due to a bug in drmaa or Galaxy itself, probably just the
cufflinks dependency you mentioned.
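To illustrate the distinction above, here is a minimal Python sketch. It is not Galaxy's or drmaa's actual code; the helper names `classify` and `run` are made up for this example, and a small Python child process stands in for the tool. It only shows that "the process finished without being killed" and "the process succeeded" are two separate questions.

```python
import subprocess
import sys


def classify(returncode):
    """Return (finished_normally, succeeded) for a child's return code.

    Assumption for this sketch: on POSIX, subprocess reports a negative
    return code when the child was terminated by a signal.
    """
    finished_normally = returncode >= 0  # not killed by a signal
    succeeded = returncode == 0          # exit code 0 means success under Unix
    return finished_normally, succeeded


def run(exit_code):
    """Run a stand-in "tool" that simply exits with the given code."""
    proc = subprocess.run(
        [sys.executable, "-c", f"import sys; sys.exit({exit_code})"]
    )
    return classify(proc.returncode)


if __name__ == "__main__":
    print(run(0))  # tool ran to completion and succeeded
    print(run(1))  # tool ran to completion but reported an error
```

In the second call the process still "finished normally" in the scheduler's sense, yet its exit code of 1 marks the run as failed, which matches the two log lines quoted below.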
Cheers,
Nicola
On 02/08/16 10:42, Raymond Wan wrote:
...
Dear all,
I'm not sure but I think I found a bug somewhere.  However, I can't
tell where the problem is and, so, who to report it to.
This is what I see in paster.log:
-----
galaxy.jobs.runners.drmaa DEBUG 2016-08-01 13:02:08,895 (897/193391) state change: job finished normally
galaxy.jobs.output_checker INFO 2016-08-01 13:02:09,244 Job 897: Fatal error: Exit code 1 ()
galaxy.jobs DEBUG 2016-08-01 13:02:09,317 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,354 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,678 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,707 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,738 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,770 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,809 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,846 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,890 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,928 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,964 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:10,001 setting dataset state to ERROR
galaxy.jobs INFO 2016-08-01 13:02:11,560 Collecting metrics for Job 897
galaxy.jobs DEBUG 2016-08-01 13:02:11,579 job 897 ended (finish() executed in (2402.377 ms))
-----
As it turns out, there *is* a problem with the run.  So, the latter
error messages are correct.  So, I think the first line is wrong.  Or
is this expected behaviour?
About what I'm doing.  I'm running the IUC Cuffdiff tool and am using
it to generate an sqlite database for cummeRbund.  It seems that it is
running the system-installed version of R and some packages are
missing.  (I'm having problems installing the R packages, but I'm
still investigating this.)
So, there are error messages in the output.  Perhaps there is a
problem with the error message being returned by the tool to the drmaa
(SLURM, in our case)?
Ray