I have not seen any reply to this question from last year, so I wanted to raise it again.

I also run into this issue quite often, and with the recent introduction of the (half-completed?) paused-jobs feature, it seems we are getting very close to having this work.

I know I can re-run a failed job, but I don't think a paused job that depends on that first job "knows" to wait for the new job to finish when it is restarted, does it?

But I suspect that is the intended functionality, since otherwise paused jobs are not very useful.

I am still using the Feb-8 version of Galaxy, but I don't think I saw anything in the April version that addresses this issue, right?

Maybe it would be useful to have one particular error state ("Job did not return any result from the cluster", or something of that sort, which I see when a cluster node fails) cause Galaxy to simply re-submit the job (with a fixed number of tries, of course; 3 seems a decent number) and keep going, rather than immediately putting the job into the error state.
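
To make the idea concrete, here is a rough sketch of the retry behaviour I have in mind. This is not Galaxy code: JobLostError and run_with_resubmit are names I made up for illustration, and submit_job stands in for whatever actually sends the job to the cluster and waits for it. The only point is that a "no result from the cluster" failure gets retried a fixed number of times, while any other error fails immediately as it does today.

```python
class JobLostError(Exception):
    """Hypothetical: the cluster returned no result (e.g. the node died)."""


def run_with_resubmit(submit_job, max_tries=3):
    """Call submit_job() until it succeeds or max_tries attempts are used.

    Only JobLostError triggers a resubmit. Any other exception (a genuine
    tool error) propagates at once, so real failures still go straight to
    the error state. Returns a (result, attempts_used) tuple.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            return submit_job(), attempt
        except JobLostError:
            # Out of tries: give up and let the job go into error state.
            if attempt == max_tries:
                raise
```

A job that is lost twice and then completes would come back as (result, 3); a job that is lost three times in a row would still end up in the error state, just as it does now after the first failure.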

Thanks,

Thon

On Apr 19, 2012, at 09:39 AM, zhengqiu cai <caizhq2005@yahoo.com.cn> wrote:

Hi,

Can Galaxy resubmit a job if the node where the job is running fails?
I know SGE can do that with qsub -r.

It would be very useful if Galaxy could do that.

Thank you,

Cai
