Hi,

We have had a problem on our production instances for one or two years (we never found the time to write about it, and we were still hoping to find the cause ourselves).

Some jobs stay in "waiting" status (grey) indefinitely.

Job ID | User | Last Update | Tool | State | Inputs | Command Line | Job Runner | PID/Cluster ID
63094 | foo@bar.fr | 10 hours ago | toolshed.g2.bx.psu.edu/repos/devteam/fastq_groomer/fastq_groomer/1.0.4 | new | 123250 setting_metadata | None | None | None
63093 | foo@bar.fr | 10 hours ago | toolshed.g2.bx.psu.edu/repos/devteam/fastq_groomer/fastq_groomer/1.0.4 | new | 123249 setting_metadata | None | None | None
63088 | foo@bar.fr | 14 hours ago | testtoolshed.g2.bx.psu.edu/repos/jjohnson/trinityrnaseq/trinityrnaseq_norm/0.0.2 | new | 123250 setting_metadata, 123249 setting_metadata | None | None | None
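
To make the pattern in this listing easier to see, here is a minimal Python sketch that groups the waiting jobs by the input dataset that is not ready yet. It is not Galaxy code: the `jobs` list is copied by hand from the rows above, and the helper name `blocked_by_input` and the "ok" ready state are assumptions made only for illustration.

```python
from collections import defaultdict

# (job_id, job_state, inputs), copied by hand from the job listing.
# Each input is a (dataset_id, dataset_state) pair.
jobs = [
    (63094, "new", [(123250, "setting_metadata")]),
    (63093, "new", [(123249, "setting_metadata")]),
    (63088, "new", [(123250, "setting_metadata"), (123249, "setting_metadata")]),
]


def blocked_by_input(jobs, ready_state="ok"):
    """Map each not-ready input dataset to the 'new' jobs that consume it.

    ready_state="ok" is an assumption about which dataset state lets a
    job leave the waiting queue.
    """
    blockers = defaultdict(list)
    for job_id, job_state, inputs in jobs:
        if job_state != "new":
            continue  # only jobs that have not been dispatched yet
        for dataset_id, dataset_state in inputs:
            if dataset_state != ready_state:
                blockers[dataset_id].append(job_id)
    return dict(blockers)


for dataset_id, job_ids in sorted(blocked_by_input(jobs).items()):
    print(dataset_id, "blocks jobs", job_ids)
```

In the listing, all three waiting jobs depend on datasets 123249 and 123250, and both of those are shown as setting_metadata.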

Our scheduler is SGE.

We tried to find the origin of this issue:
 - there are no relevant clues in the log files
 - sometimes a restart makes those jobs start
 - sometimes killing one job makes the others start
 - sometimes there is no choice but to resubmit

Can anyone suggest a way to fix this behaviour?

Thanks

Gildas