We experienced an issue where some of the Galaxy jobs were sitting in the 'new' state for quite a long time. They were not waiting for cluster resources to become available; they had not even been queued up through DRMAA. We are currently running in non-debug mode, and these were my observations:

* No indication of the new jobs in the paster.log file
* database/pbs didn't contain any associated job scripts
* In the backend database, the job table contained their Galaxy job ids, but no command_line input was recorded

Also, not all jobs are waiting in the 'new' state. Many jobs submitted after the waiting jobs completed successfully on the cluster. Is there any job submission logic within Galaxy that governs how jobs are submitted? Any clues on how to debug this issue would be really helpful.

--
Thanks,
Shantanu.
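The third observation above (jobs present in the job table, in state 'new', with no command_line recorded) could be checked with a query along these lines. This is only a rough sketch: it runs against an in-memory SQLite table so that it is self-contained, and the column names `state`, `tool_id` and `create_time`, the helper `find_stalled_jobs`, the sample tool ids and the 30-minute cutoff are all assumptions for illustration. Only the `job` table and the `command_line` field are mentioned in the message itself, so check the names against your own Galaxy schema before using anything like this.

```python
import sqlite3


def find_stalled_jobs(conn, min_age_minutes=30):
    """Return (id, tool_id, create_time) for jobs that are still in state
    'new', have no command line recorded, and are older than the cutoff.

    Hypothetical helper: column names are assumed, not taken from the thread.
    """
    query = """
        SELECT id, tool_id, create_time
        FROM job
        WHERE state = 'new'
          AND command_line IS NULL
          AND create_time < datetime('now', ?)
        ORDER BY create_time
    """
    modifier = f"-{int(min_age_minutes)} minutes"
    return conn.execute(query, (modifier,)).fetchall()


# Stand-in for the real backend database (assumed, simplified schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE job ("
    "id INTEGER PRIMARY KEY, tool_id TEXT, state TEXT, "
    "command_line TEXT, create_time TEXT)"
)
conn.executemany(
    "INSERT INTO job (id, tool_id, state, command_line, create_time) "
    "VALUES (?, ?, ?, ?, datetime('now', ?))",
    [
        (1, "toolA", "new", None, "-2 hours"),          # old and never prepared
        (2, "toolB", "ok", "cat in > out", "-1 hours"),  # finished normally
        (3, "toolC", "new", None, "-1 minutes"),         # new, but only just submitted
    ],
)

stalled = find_stalled_jobs(conn)
for job_id, tool_id, created in stalled:
    print(f"job {job_id} ({tool_id}) has been 'new' since {created}")
```

Filtering on age as well as state separates jobs that are genuinely stuck from jobs that were submitted moments ago and simply have not been picked up yet.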
Hi, you didn't specify which clustering method you are using. Are you using DRMAA or PBS?

With Regards,
----------------
Ambarish Biswas, University of Otago, Department of Biochemistry, Dunedin, New Zealand, Tel: +64(22)0855647 Fax: +64(0)3 479 7866

On Fri, Jul 29, 2011 at 10:03 AM, Shantanu Pavgi <pavgi@uab.edu> wrote:
> We experienced an issue where some of the galaxy jobs were sitting in the 'new' state for a quite long time. [...]
Thanks for the reply, Ambarish. We are using an SGE cluster, and job submission is done using DRMAA.

--
Shantanu.

On Jul 28, 2011, at 7:24 PM, ambarish biswas wrote:
> Hi, you didn't specify what clustering method u r using. Are u using drmaa or pbs?
Hi, can you paste the configuration from your *universe_wsgi.ini* file?

With Regards,
----------------
Ambarish Biswas, University of Otago, Department of Biochemistry, Dunedin, New Zealand, Tel: +64(22)0855647 Fax: +64(0)3 479 7866

On Fri, Jul 29, 2011 at 12:31 PM, Shantanu Pavgi <pavgi@uab.edu> wrote:
> Thanks for the reply Ambarish. We are using SGE cluster and job submission is done using drmaa.
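When sharing configuration from universe_wsgi.ini on a public list, it may be safer to extract only the job-runner settings rather than paste the whole file. The sketch below is one possible way to do that. The key names in `KEYS_OF_INTEREST`, the `[app:main]` section name, the helper `job_runner_settings` and the sample values are assumptions for illustration and should be checked against your own file.

```python
import configparser

# Settings assumed to be relevant to job submission; adjust to your own file.
KEYS_OF_INTEREST = ("start_job_runners", "default_cluster_job_runner", "debug")


def job_runner_settings(ini_text, section="app:main"):
    """Return only the selected keys from one section of an ini file.

    RawConfigParser is used so that '%(here)s'-style values are not
    interpolated (interpolation would fail outside of Paste).
    """
    parser = configparser.RawConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        return {}
    return {
        key: parser.get(section, key)
        for key in KEYS_OF_INTEREST
        if parser.has_option(section, key)
    }


# Hypothetical sample content standing in for a real universe_wsgi.ini.
SAMPLE = """
[app:main]
start_job_runners = drmaa
default_cluster_job_runner = drmaa:///
debug = False
database_connection = postgres://user:secret@localhost/galaxy
"""

settings = job_runner_settings(SAMPLE)
for key, value in settings.items():
    print(f"{key} = {value}")
```

Selecting keys explicitly means that values such as the database connection string, which can contain a password, are never included in what gets posted.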
On Thu, Jul 28, 2011 at 11:03 PM, Shantanu Pavgi <pavgi@uab.edu> wrote:
> We experienced an issue where some of the galaxy jobs were sitting in the 'new' state for a quite long time. [...]
I've just been searching the archives for any other cases of new jobs not getting queued (either with the local runner or via DRMAA) but sitting in state new - and I found your query. Did you ever solve your issue Shantanu?

We had something similar just happen, but it affected all new jobs, unlike what you described. Fortunately I could work out what the cause was - our Galaxy partition had run out of disk space. I did some cleanup, and then I could submit and run new jobs - but the existing stalled jobs remained stalled.

Peter
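Since a full partition turned out to be the cause in the case above, a quick free-space check on the Galaxy partition is a cheap first step when jobs stall in 'new'. The following is a minimal sketch using only the Python standard library. The helper `free_space_report` and the 5% threshold are assumptions chosen for illustration, and "." stands in for whatever path your Galaxy installation actually lives on.

```python
import shutil


def free_space_report(path, min_free_fraction=0.05):
    """Report free space on the filesystem containing `path`.

    `min_free_fraction` is an arbitrary illustrative threshold.
    """
    usage = shutil.disk_usage(path)
    # Guard against virtual filesystems that report a total size of zero.
    free_fraction = usage.free / usage.total if usage.total else 0.0
    return {
        "path": path,
        "free_bytes": usage.free,
        "free_fraction": free_fraction,
        "low": free_fraction < min_free_fraction,
    }


# Replace "." with the directory holding your Galaxy database/ files.
report = free_space_report(".")
status = "LOW" if report["low"] else "ok"
print(f"{report['path']}: {report['free_fraction']:.1%} free ({status})")
```

Note that, as described above, freeing space allowed new jobs to run but did not revive the jobs that had already stalled, so a check like this helps with diagnosis rather than recovery.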
participants (3)
- ambarish biswas
- Peter Cock
- Shantanu Pavgi