Merging files: Error merging filesglobal name 'shutil' is not defined
Hi all,

While testing a BLAST search using plain text output (the 'txt' datatype in Galaxy) or subclasses (like 'blastxml') using galaxy-dist with task splitting enabled, I found a new bug when the files are merged:

Error merging files global name 'shutil' is not defined

This is a side effect of this commit due to a missing import (perhaps the imports changed since Bjoern wrote the patch):

https://bitbucket.org/galaxy/galaxy-central/commits/510b063792de89332428ed12...

In addition to the missing import in data.py, there is a potential similar issue with binary.py but this is masked by an evil import * line ;)

My suggested patch for the import issue is below. However, right now, even though the merge is failing, Galaxy failed to treat the job as a failure - it was shown as a green successful job with "Error merging files" as stdout, and for stderr: "global name 'shutil' is not defined", but return code 0.

So there is a deeper problem with the merge error not marking the job as failed.

Also there is a minor bug in that the stdout and stderr are combined for the peep text without any white space - presumably most of the time the stdout will have a trailing new line, but not here - thus "filesglobal" [sic], not say "files global" or a newline. There's a second patch below for that minor issue too (although it seems my new lines get turned into spaces by the time they are shown in the history peep text).

Regards,

Peter

--

$ hg diff
diff -r 8b9ca63f9128 lib/galaxy/datatypes/binary.py
--- a/lib/galaxy/datatypes/binary.py Wed Apr 24 16:24:41 2013 -0400
+++ b/lib/galaxy/datatypes/binary.py Thu Apr 25 11:32:51 2013 +0100
@@ -12,7 +12,7 @@
 from bx.seq.twobit import TWOBIT_MAGIC_NUMBER, TWOBIT_MAGIC_NUMBER_SWAP, TWOBIT_MAGIC_SIZE
 from urllib import urlencode, quote_plus
 import zipfile, gzip
-import os, subprocess, tempfile
+import os, subprocess, tempfile, shutil
 import struct
 
 log = logging.getLogger(__name__)
diff -r 8b9ca63f9128 lib/galaxy/datatypes/data.py
--- a/lib/galaxy/datatypes/data.py Wed Apr 24 16:24:41 2013 -0400
+++ b/lib/galaxy/datatypes/data.py Thu Apr 25 11:32:51 2013 +0100
@@ -3,6 +3,7 @@
 import mimetypes
 import os
 import sys
+import shutil
 import tempfile
 import zipfile
 from cgi import escape

$ hg diff
diff -r 8b9ca63f9128 lib/galaxy/jobs/__init__.py
--- a/lib/galaxy/jobs/__init__.py Wed Apr 24 16:24:41 2013 -0400
+++ b/lib/galaxy/jobs/__init__.py Thu Apr 25 11:43:30 2013 +0100
@@ -924,7 +924,13 @@
                 for dataset in dataset_assoc.dataset.dataset.history_associations + dataset_assoc.dataset.dataset.library_associations: #need to update all associated output hdas, i.e. history was shared with job running
                     dataset.blurb = 'done'
                     dataset.peek = 'no peek'
-                    dataset.info = ( dataset.info or '' ) + context['stdout'] + context['stderr']
+                    dataset.info = (dataset.info or '')
+                    if context['stdout'].strip():
+                        #Ensure white space between entries
+                        dataset.info = dataset.info.rstrip() + "\n" + context['stdout'].strip()
+                    if context['stderr'].strip():
+                        #Ensure white space between entries
+                        dataset.info = dataset.info.rstrip() + "\n" + context['stderr'].strip()
                     dataset.tool_version = self.version_string
                     dataset.set_size()
                     if 'uuid' in context:
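The whitespace fix in the second patch can be tried on its own, outside Galaxy. The following is only a minimal sketch under stated assumptions: `join_info` is a hypothetical helper name, not a Galaxy function, and unlike the patch it drops empty pieces entirely instead of always prepending a newline.

```python
def join_info(info, stdout, stderr):
    """Join the non-empty pieces with a single newline between them.

    Hypothetical helper for illustration only; not part of Galaxy.
    """
    pieces = []
    for text in (info, stdout, stderr):
        # Treat None like an empty string, as `dataset.info or ''` does.
        cleaned = (text or "").strip()
        if cleaned:
            pieces.append(cleaned)
    return "\n".join(pieces)


stdout = "Error merging files"
stderr = "global name 'shutil' is not defined"

# Plain concatenation reproduces the "filesglobal" symptom from the report.
naive = "" + stdout + stderr

# Joining with an explicit separator keeps the two messages apart.
joined = join_info(None, stdout, stderr)

print(naive)
print(joined)
```

Stripping each piece first means it no longer matters whether stdout happened to end with a trailing newline.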
Hey Peter,
Thanks for the contributions, I've committed the changes to -central. You're definitely right about the merge failure not setting the job status to error, and I'm looking into that now.
-Dannon
On Thu, Apr 25, 2013 at 1:40 PM, Dannon Baker <dannon.baker@gmail.com> wrote:
Hey Peter,
Thanks for the contributions, I've committed the changes to -central. You're definitely right about the merge failure not setting the job status to error, and I'm looking into that now.
-Dannon
Thanks - that's likely to be a slightly trickier issue to fix.
Peter
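The deeper issue discussed here, a failed merge still being reported as a successful job with return code 0, can be illustrated generically. This is a hedged sketch and not Galaxy's actual job-handling code: `run_merge_step`, `broken_merge`, `working_merge` and the result dictionary layout are all hypothetical names invented for the example.

```python
def run_merge_step(merge_func, parts):
    """Run a merge and report failure explicitly instead of swallowing it.

    Hypothetical wrapper for illustration; the dictionary keys are assumptions.
    """
    try:
        merged = merge_func(parts)
    except Exception as exc:
        # Record the error text, and also flag the step as failed with a
        # non-zero exit code so the caller cannot mistake it for success.
        return {
            "state": "error",
            "exit_code": 1,
            "stdout": "Error merging files",
            "stderr": str(exc),
            "output": None,
        }
    return {
        "state": "ok",
        "exit_code": 0,
        "stdout": "",
        "stderr": "",
        "output": merged,
    }


def broken_merge(parts):
    # Stands in for a merge that uses a module which was never imported;
    # the NameError is only raised when the function is called.
    return missing_module.join(parts)


def working_merge(parts):
    return "".join(parts)


failed = run_merge_step(broken_merge, ["a", "b"])
ok = run_merge_step(working_merge, ["a", "b"])
print(failed["state"], failed["exit_code"], failed["stderr"])
print(ok["state"], ok["exit_code"], ok["output"])
```

The point of the sketch is only that the error state and exit code are set in the same place the exception is caught, so the message and the job status cannot disagree.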
participants (2)
- Dannon Baker
- Peter Cock