On Thursday, January 17, 2013, Carlos Borroto wrote:
On Wed, Jan 16, 2013 at 7:28 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
>> Renaming the file to replace the colon with (say) an underscore allows
>> a manual qsub to work fine with UGE. I've edited Galaxy to avoid the
>> colons (patch below) but the submission still fails.
>>

Hi Peter,

After seeing your email, I now wonder whether the problem I described
here [1], which never got an answer, is related to your findings
while trying UGE.

[1] http://dev.list.galaxyproject.org/Issue-when-enabling-use-tasked-jobs-with-torque-and-nfs-td4657294.html

The only major difference I can see between job
submissions with and without the tasked option enabled is a colon in the
name.

Some overlap yes, and I do normally have BLAST running with
task splitting. That does probably explain the source of the colon.
I now suspect the colon is only a problem in SGE / UGE at the
command line using qsub - it may work via the Python API.
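
For what it's worth, the rename workaround could be sketched in Python
roughly like this. This is not the Galaxy patch itself: the function
name and the example file name are hypothetical, and it only assumes
the colon is the offending character.

```python
def sanitize_job_name(name, replacement="_"):
    """Return name with every colon replaced (hypothetical helper).

    Assumption: the colon is the only character that upsets qsub here;
    other characters are left untouched.
    """
    return name.replace(":", replacement)


# Hypothetical script name of the kind a split task might produce.
print(sanitize_job_name("galaxy_123:task_0.sh"))
```

Names without a colon pass through unchanged, so applying this to
non-split jobs should be harmless.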

I have also seen some similar problems with sub-jobs failing
yet Galaxy still had the task as a green success (not sure
if that has happened recently or not).

Yesterday while our UGE cluster was under load, often one
or more of my split task jobs would fail (Galaxy could not
collect the output files - presumably network related for
accessing the shared storage), but Galaxy was correctly
flagging these as red failed jobs.

Cluster problems are hard to debug - especially on
a live cluster where there are other users active :(

Peter