Dear all,

I'm not sure but I think I found a bug somewhere. However, I can't tell where the problem is and, so, who to report it to.

This is what I see in paster.log:

-----
galaxy.jobs.runners.drmaa DEBUG 2016-08-01 13:02:08,895 (897/193391) state change: job finished normally
galaxy.jobs.output_checker INFO 2016-08-01 13:02:09,244 Job 897: Fatal error: Exit code 1 ()
galaxy.jobs DEBUG 2016-08-01 13:02:09,317 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,354 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,678 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,707 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,738 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,770 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,809 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,846 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,890 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,928 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:09,964 setting dataset state to ERROR
galaxy.jobs DEBUG 2016-08-01 13:02:10,001 setting dataset state to ERROR
galaxy.jobs INFO 2016-08-01 13:02:11,560 Collecting metrics for Job 897
galaxy.jobs DEBUG 2016-08-01 13:02:11,579 job 897 ended (finish() executed in (2402.377 ms))
-----

As it turns out, there *is* a problem with the run. So, the latter error messages are correct. So, I think the first line is wrong. Or is this expected behaviour?

About what I'm doing. I'm running the IUC Cuffdiff tool and am using it to generate an sqlite database for cummeRbund. It seems that it is running the system-installed version of R and some packages are missing. (I'm having problems installing the R packages, but I'm still investigating this.)

So, there are error messages in the output. Perhaps there is a problem with the error message being returned by the tool to the drmaa (SLURM, in our case)?

Ray
Hi Ray,

I don't see anything strange in the Galaxy logs. The job on the cluster "finished normally" just means that it completed its execution without being killed by the cluster scheduler or for other external reasons. The second line in the logs is telling you that Galaxy has seen that the job returned an exit code greater than 0, which under Unix means that it terminated with some error. Therefore Galaxy set the state of the output datasets to Error and you should see them in red in your history.

Obviously you may want to fix the cause of the job failure, but I don't think that this is due to a bug in drmaa or Galaxy itself, probably just the cufflinks dependency you mentioned.

Cheers,
Nicola

On 02/08/16 10:42, Raymond Wan wrote:
As it turns out, there *is* a problem with the run. So, the latter error messages are correct. So, I think the first line is wrong. Or is this expected behaviour?
Ray
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
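The distinction drawn in Nicola's reply, between how a job ended and which exit code it returned, can be illustrated with a minimal Python sketch. This is not Galaxy's or the DRMAA runner's code: it uses only the standard `subprocess` module as a stand-in for the scheduler, and the helper names `classify` and `run` are hypothetical. It relies on the POSIX behaviour that `subprocess` reports a negative return code when a child was terminated by a signal.

```python
import subprocess
import sys


def classify(returncode):
    """Return (how_it_ended, tool_succeeded) for a finished process.

    A negative return code means the child was killed by a signal (POSIX).
    Anything else means the process ran to completion on its own, which is
    a separate question from whether the tool itself succeeded.
    """
    if returncode < 0:
        return ("killed by signal %d" % -returncode, False)
    return ("finished normally", returncode == 0)


def run(code):
    """Run a small Python snippet as a child process and classify the result."""
    proc = subprocess.run([sys.executable, "-c", code])
    return classify(proc.returncode)


# A tool that succeeds, and one that fails like Job 897 (exit code 1).
ok = run("import sys; sys.exit(0)")
failed = run("import sys; sys.exit(1)")

print(ok)
print(failed)
```

In this sketch the failing child still counts as "finished normally", because nothing external killed it, while its non-zero exit code marks the tool run as failed. That mirrors the two log lines in the original message.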
Hi Nicola,

Ah! I see! I misunderstood how the logs worked. I thought that whatever status Cuffdiff returned was passed to the scheduler, and that the scheduler in turn passed it to Galaxy. I was confused about why Cuffdiff and Galaxy recognized the mistake but the scheduler did not.

Now I see! When each of them reports success or failure, they are actually reporting on two different things. Presumably, this means it's impossible for the scheduler to report an error and Galaxy to report a success?

Thanks a lot for taking the time to correct me! I understand now!

Ray

On Tue, Aug 2, 2016 at 6:12 PM, Nicola Soranzo <nsoranzo@tiscali.it> wrote:
Hi Ray, I don't see anything strange in the Galaxy logs. The job on the cluster "finished normally" just means that it completed its execution without being killed by the cluster scheduler or for other external reasons. The second line in the logs is telling you that Galaxy has seen that the job returned an exit code greater than 0, which under Unix means that it terminated with some error. Therefore Galaxy set the state of the output datasets to Error and you should see them in red in your history.
participants (2)
- Nicola Soranzo
- Raymond Wan