jobs stuck in new state
I have many jobs stuck in the 'new' state on our local Galaxy instance. The jobs can't be stopped using the Admin->Manage jobs tool. First, does anyone know why a job would get stuck in the 'new' state for weeks? I have cleaned things up by manually setting their states to 'error' in the MySQL database. Is there a better way of dealing with 'new' jobs? BTW, our Galaxy instance was updated about two weeks ago. Wondering, David Hoover Helix Systems Staff
Hi David, This is pretty common in the case of workflows. When a workflow step fails, the next job in the workflow will be set to the "paused" state and all jobs downstream of the paused job will remain in the "new" state until corrective action is taken. The current query for finding jobs-ready-to-run (if tracking jobs in the database, which is automatically enabled for multiprocess Galaxy configurations) ignores 'new' state jobs whose inputs are not ready, so these jobs sitting around should not cause any harm. --nate
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
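The behaviour Nate describes can be illustrated with a small toy model. This is only a sketch under assumptions: the `Job` class, its fields, and `ready_to_run` are hypothetical names invented for illustration, and this is not Galaxy's actual ready-to-run query. It shows why jobs downstream of a failed workflow step stay in 'new' indefinitely while unrelated jobs still get picked up.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical stand-in for a job record (not Galaxy's model class)."""
    id: int
    state: str  # e.g. 'new', 'paused', 'ok', 'error'
    input_job_ids: list = field(default_factory=list)  # upstream jobs producing this job's inputs


def ready_to_run(jobs):
    """Return ids of 'new' jobs whose upstream jobs have all finished 'ok'."""
    by_id = {job.id: job for job in jobs}
    ready = []
    for job in jobs:
        if job.state != "new":
            continue  # only 'new' jobs are candidates
        if all(by_id[i].state == "ok" for i in job.input_job_ids):
            ready.append(job.id)
    return ready


# A workflow whose first step failed: step 2 is paused, steps 3 and 4 stay 'new'.
workflow = [
    Job(1, "error"),
    Job(2, "paused", [1]),
    Job(3, "new", [2]),
    Job(4, "new", [3]),
    Job(5, "new"),  # unrelated job with no pending inputs
]

print(ready_to_run(workflow))  # [5]
```

Jobs 3 and 4 are never selected because their inputs never become ready, which matches the "stuck for weeks" symptom without the jobs actually blocking anything else.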
Hi Nate, I just wanted to share my experience with this issue. We sometimes have a large number of these jobs in limbo (new, waiting or paused). When we restart Galaxy, this has an impact on the job handlers: the job handler appears to add the new jobs to its queue, and when there are thousands of them, it slows down the restart. We also noticed yesterday that it breaks the admin Manage jobs page. The page seems to try to load all of those jobs at once and simply times out. Unless I change the state of those limbo jobs to, say, deleted, we cannot access the admin Manage jobs page. What are your thoughts on that? Thanks, Yves Gagnon
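The manual cleanup that David and Yves describe (changing the state of limbo jobs directly in the database) could be scripted roughly as follows. This is a sketch under stated assumptions, not a recommended procedure from the thread: it runs against an in-memory SQLite database with a simplified, hypothetical `job` table (`id`, `state`, `update_time`), whereas the posters use MySQL and the real Galaxy schema has more columns and should be checked before anything similar is attempted. The function name `mark_stale_jobs` and its parameters are invented for illustration.

```python
import sqlite3
from datetime import datetime, timedelta


def mark_stale_jobs(conn, older_than, from_states=("new", "waiting", "paused"),
                    to_state="deleted", batch_size=500):
    """Move jobs sitting in from_states since before older_than to to_state, in batches.

    Returns the number of rows changed. to_state must not be one of from_states,
    otherwise the loop would never run out of matching rows.
    """
    placeholders = ",".join("?" for _ in from_states)
    total = 0
    while True:
        rows = conn.execute(
            f"SELECT id FROM job WHERE state IN ({placeholders}) "
            "AND update_time < ? ORDER BY id LIMIT ?",
            (*from_states, older_than.isoformat(), batch_size),
        ).fetchall()
        if not rows:
            return total
        conn.executemany(
            "UPDATE job SET state = ? WHERE id = ?",
            [(to_state, row[0]) for row in rows],
        )
        conn.commit()  # commit per batch to keep transactions short
        total += len(rows)


# Demo on a throwaway in-memory database with the simplified, assumed schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT, update_time TEXT)")
now = datetime(2014, 3, 28, 12, 0, 0)
sample = [
    (1, "new", now - timedelta(days=21)),      # stale
    (2, "paused", now - timedelta(days=15)),   # stale
    (3, "new", now - timedelta(hours=1)),      # recent, left alone
    (4, "running", now - timedelta(days=30)),  # old but running, left alone
]
conn.executemany(
    "INSERT INTO job VALUES (?, ?, ?)",
    [(job_id, state, ts.isoformat()) for job_id, state, ts in sample],
)

changed = mark_stale_jobs(conn, older_than=now - timedelta(days=14), batch_size=1)
final_states = conn.execute("SELECT id, state FROM job ORDER BY id").fetchall()
print(changed, final_states)
```

Restricting the update by age and by state keeps recently submitted 'new' jobs and running jobs untouched; the batch size only matters when there are thousands of rows, as in Yves's case.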
In my case it was incomplete metadata in one of the input files (though maybe that was not the "new" state but something else?). HTH, ido
This turned out to be my own ignorance. After creating identical handlers in both universe_wsgi.ini and job_conf.xml, and restarting Galaxy in daemon mode, the jobs became resilient to Galaxy restarts. DOH! Thanks Nate for pointing this out, as well as the limits collection. David
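David's fix was to make the handlers in universe_wsgi.ini and job_conf.xml match. A quick consistency check along those lines might look like the sketch below. The file layout is an assumption, not something stated in the thread: it supposes handlers appear as `[server:<name>]` sections in the INI file and as `<handler id="<name>"/>` elements under `<handlers>` in job_conf.xml. The sample contents and the function names are made up for illustration.

```python
import configparser
import xml.etree.ElementTree as ET

# Illustrative file contents only; real configuration files will differ.
INI_TEXT = """
[server:web0]
use = egg:Paste#http
port = 8080

[server:handler0]
use = egg:Paste#http
port = 8090

[server:handler1]
use = egg:Paste#http
port = 8091
"""

JOB_CONF_TEXT = """
<job_conf>
    <handlers default="handlers">
        <handler id="handler0" tags="handlers"/>
        <handler id="handler2" tags="handlers"/>
    </handlers>
</job_conf>
"""


def ini_server_names(ini_text, prefix="server:"):
    """Collect the names of all [server:<name>] sections in the INI text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    return {s[len(prefix):] for s in parser.sections() if s.startswith(prefix)}


def job_conf_handler_ids(xml_text):
    """Collect the id attribute of every <handler> under <handlers>."""
    root = ET.fromstring(xml_text)
    return {h.get("id") for h in root.findall("./handlers/handler")}


def missing_handlers(ini_text, xml_text):
    """Handler ids declared in job_conf.xml with no matching [server:<id>] section."""
    return job_conf_handler_ids(xml_text) - ini_server_names(ini_text)


print(sorted(missing_handlers(INI_TEXT, JOB_CONF_TEXT)))  # ['handler2']
```

In this made-up example `handler2` is declared in job_conf.xml but has no server section, which is the kind of mismatch David's message suggests was behind his problem.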
participants (4)
- David Hoover
- Ido Tamir
- Nate Coraor
- Yves Gagnon