I have not seen any reply to this question from last year, so I wanted to raise it
again.
I also run into this issue quite often, and with the recent introduction of the
(half-completed?) Paused-Jobs feature, it seems we are getting very close to
this working.
I know I can re-run a failed job, but I don't think restarting a paused job that
depends on that first job "knows" to wait for the new job to finish, does it?
I suspect that is the intended functionality, though, since otherwise paused jobs are
not very useful.
I am still using the Feb-8 version of Galaxy, but I don't think I saw anything in the
April version that addresses this issue, right?
Maybe it would be useful to have one particular error state ("Job did not return any
result from the cluster", or something of that sort, which I see when a cluster node
fails) make Galaxy simply re-submit the job (with a fixed number of tries, of course; 3
seems a decent number) and keep going, rather than immediately putting the job into
the error state.
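To make the idea concrete, here is a rough sketch in Python of the retry logic I have in mind. It is only an illustration under my own assumptions, not how Galaxy works: JobFailed, is_node_failure and run_with_resubmit are made-up names, submit stands in for whatever actually sends the job to the cluster, and the error text it matches is my approximation of the real message.

```python
# Hypothetical sketch; none of these names belong to Galaxy's API.

# Approximate text of the error seen when a cluster node fails (assumed wording).
RETRYABLE_MARKER = "did not return any result from the cluster"


class JobFailed(Exception):
    """Raised by the (hypothetical) submit callable when a job ends in error."""


def is_node_failure(error):
    """Return True only for the one error state that is worth re-submitting."""
    return RETRYABLE_MARKER in str(error).lower()


def run_with_resubmit(submit, max_tries=3):
    """Call submit() up to max_tries times.

    Only node-failure errors trigger a re-submit; any other error, or a
    node failure on the last allowed try, is raised to the caller so the
    job goes into the error state as it does today.
    """
    if max_tries < 1:
        raise ValueError("max_tries must be at least 1")
    for attempt in range(1, max_tries + 1):
        try:
            return submit()
        except JobFailed as error:
            if not is_node_failure(error) or attempt == max_tries:
                raise
```

With something like this wrapped around the submission, a dependent job would only ever see one outcome for the first job (its final success or its final failure), so it would not need to know that a re-submit happened.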
Thanks,
Thon
On Apr 19, 2012, at 09:39 AM, zhengqiu cai <caizhq2005(a)yahoo.com.cn> wrote:
Hi,
Can Galaxy resubmit a job if the node where the job is running fails?
I know SGE can do that by using qsub -r.
It would be very useful if Galaxy could do that.
Thank you,
Cai
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client. To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/