Hello Jean-François,

  Have you made any progress tracking down this error? It appears very serious, but to tell you the truth I have no clue what could cause it. The distribution you are using is pretty old at this point, and I feel that if this were a bug that showed up under relatively standard parameter combinations, someone else would have reported it by now.

  Can you tell me a few things: has this been reported with any other workflows? Is there anything special about this workflow? Can you rebuild the workflow and see if the error occurs again?

  Additional questions if the problem is not restricted to the workflow: are you running Galaxy as a single process or multiple processes? If multiple processes, how many web, handler, and manager processes do you have? Are they all on the same machine? Have you made any modifications to Galaxy that could result in this behavior? What is the value of track_jobs_in_database in your universe_wsgi.ini configuration file?
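
  If it helps, here is a minimal sketch for reading that last value from a script. It assumes the option sits in the [app:main] section and that the file is in the current directory; the function name read_job_tracking is only illustrative and is not part of Galaxy.

```python
import configparser


def read_job_tracking(path="universe_wsgi.ini", section="app:main"):
    """Return the raw track_jobs_in_database value, or None if it is not set.

    Assumption: the option lives in the [app:main] section of the ini file.
    """
    # interpolation=None keeps literal '%' characters from being expanded;
    # strict=False tolerates duplicate options in a hand-edited file.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(path)  # a missing file is silently ignored
    # fallback=None covers a missing section, a missing option,
    # and an option that is commented out.
    return parser.get(section, "track_jobs_in_database", fallback=None)


if __name__ == "__main__":
    print(read_job_tracking())
```

  A result of None would mean the option is commented out or absent, so Galaxy would be using whatever its default is.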

-John


On Thu, Nov 7, 2013 at 10:34 AM, Jean-Francois Payotte <jean-francois.payotte@dnalandmarks.ca> wrote:
Dear Galaxy mailing-list,

Once again I come seeking your help. I hope someone has already had this issue or will have an idea of where to look to solve it. :)

One of our users reported workflows failing because some steps were executed before all their inputs were ready.
You can find a screenshot attached, where we can see that step (42) "Sort on data 39" has been executed while step (39) is still waiting to run (gray box).

This behaviour has been reproduced with at least two different Galaxy tools (one custom, and the sort tool which comes standard with Galaxy).
This behaviour seems to be somewhat random: when we ran an affected workflow twice, steps were executed in the wrong order in only one of the two runs.

I could be wrong, but I don't think this issue is grid-related since, from my understanding, Galaxy is not using SGE's job dependency functionality.
I believe all jobs stay in internal queues (within Galaxy) until all their input files are ready, and only then are they submitted to the cluster.

Any help or hint on what to look at to solve this issue would be greatly appreciated.
We updated our Galaxy instance to the August 12th distribution on October 1st, and I believe we never experienced this issue before the update.

Many thanks for your help,
Jean-François


