On Feb 10, 2012, at 6:47 AM, Peter Cock wrote:
Hello all,
I've noticed we have about a dozen stalled upload jobs on our server from several users. e.g.
Job ID  User  Last Update   Tool     State   Command Line  Job Runner  PID/Cluster ID
2352    xxxx  21 hours ago  upload1  upload  None          None        None
...
2339    yyyy  19 hours ago  upload1  upload  None          None        None
The job numbers are consecutive (2339 to 2352) and reflect a problem for a couple of hours yesterday morning. I believe this was due to the underlying file system being unmounted (without restarting Galaxy), and at the time restarting Galaxy fixed uploading files. Test jobs since then have completed normally - but these zombie jobs remain.
Using the "Stop jobs" option does not clear these dead upload jobs.
Restarting the Galaxy server does not clear them either.
This is our production server, which was running galaxy-dist changeset 5743:720455407d1c. I have now updated it to the current release, 6621:26920e20157f, but that makes no difference to these stalled jobs.
Does anyone have any insight into what might be wrong, and how to get rid of these zombie tasks?
Hi Peter,

Are you using the nginx upload module? There's no way to fix these from within Galaxy, unfortunately. You'll have to update them in the database.

--nate
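As a rough illustration of the kind of database update suggested above, here is a minimal sketch. It assumes a table named `job` with `id`, `tool_id` and `state` columns and an `error` state to move the jobs into; none of that is confirmed in this thread, so check the actual schema first. The helper name `mark_stalled_uploads` is hypothetical, and the demo runs against a throwaway in-memory SQLite database rather than a real Galaxy database. Taking a backup before touching a production database would be prudent.

```python
import sqlite3


def mark_stalled_uploads(conn, first_id, last_id, new_state="error"):
    """Move stalled upload jobs in an ID range to new_state.

    Only rows that are still in the 'upload' state for the 'upload1' tool
    are touched. Returns the number of rows changed.
    Table and column names are assumptions, not a confirmed Galaxy schema.
    """
    cur = conn.execute(
        "UPDATE job SET state = ? "
        "WHERE id BETWEEN ? AND ? AND tool_id = 'upload1' AND state = 'upload'",
        (new_state, first_id, last_id),
    )
    conn.commit()
    return cur.rowcount


# Demo on an in-memory database with a simplified, hypothetical schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, tool_id TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO job (id, tool_id, state) VALUES (?, ?, ?)",
    [
        (2338, "upload1", "ok"),      # finished before the outage
        (2339, "upload1", "upload"),  # stalled
        (2352, "upload1", "upload"),  # stalled
        (2353, "upload1", "ok"),      # finished after the restart
    ],
)
changed = mark_stalled_uploads(conn, 2339, 2352)
states = dict(conn.execute("SELECT id, state FROM job"))
print(changed, states)
```

Restricting the update to the affected ID range, the upload tool and the `upload` state keeps it from touching jobs that completed normally. Against a real server the same statement would be run with that database's own client or driver.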
Thanks,
Peter