Re: [galaxy-dev] Looks like actual user breaks splitting

4 Nov 2011

Nate,

I get the following error when I try to kill a job.

User killed running job, but error encountered removing from DRM queue: 'DRMAAJobRunner' object has no attribute 'job_user_uid'.

Ilya

-----Original Message-----
From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Duddy, John
Sent: Thursday, November 03, 2011 3:55 PM
To: Nate Coraor (nate@bx.psu.edu)
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Looks like actual user breaks splitting

I just came to that conclusion myself. If you need me to do anything, let me know - but it sounds like you have a handle on it.

Muchas gracias.

John Duddy
Sr. Staff Software Engineer
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Tel: 858-736-3584
E-mail: jduddy@illumina.com

-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Thursday, November 03, 2011 3:53 PM
To: Duddy, John
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Looks like actual user breaks splitting

Nate Coraor (nate@bx.psu.edu) wrote:
...
Duddy, John wrote:
...
I'm not following you - it's been 6 months since I wrote that code ;-}
I know the feeling!
...
It looks to me like a DatasetPath() object is always placed in that array and, with one exception near there, the change I made generates those objects the same way.
It's creating a dict in self.output_dataset_paths, and that dict looks like this when outputs_to_working_directory = False:
    { output_param_name : [ HDA, DatasetPath ], ... }
And this when True:
    { output_param_name : [ Dataset, DatasetPath ], ... }
...
Do you have a stack trace for the merge problem I can look at?
If you put this in do_merge()'s except block:
log.exception( stdout )
You'll get:
Traceback (most recent call last):
  File "/space/nate/galaxy-central-ichorny/lib/galaxy/jobs/splitters/multi.py", line 128, in do_merge
    output_type = outputs[output][0].datatype
AttributeError: 'Dataset' object has no attribute 'datatype'
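For illustration, a minimal Python 3 sketch of that logging pattern. The Dataset class and this do_merge() are simplified, hypothetical stand-ins rather than Galaxy's actual code; only logging's exception() method is a real standard-library call.

```python
import logging

log = logging.getLogger(__name__)


class Dataset:
    """Hypothetical stand-in: like the object in the traceback, it has no 'datatype'."""


def do_merge(outputs):
    """Simplified, hypothetical stand-in for the splitter's do_merge()."""
    stdout = ""
    try:
        for output in outputs:
            # Fails when element 0 is a bare Dataset instead of an HDA.
            output_type = outputs[output][0].datatype
    except Exception:
        stdout = "Error merging files"
        # Logs the message at ERROR level and appends the active traceback.
        log.exception(stdout)
    return stdout
```

Calling do_merge({"out1": [Dataset(), None]}) should return the error string and write the AttributeError traceback to whatever handlers are attached to the logger.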
I could just change both methods to put an HDA in the list inside the 
dict there, but I haven't looked much into what output_dataset_paths 
is used for, so I wasn't sure what that might break.
Sorta answered it myself: it looks like you created this precisely for do_merge(), so changing it to contain the HDA fixes the problem (and shouldn't break anything else).
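
A rough Python 3 sketch of the idea, assuming the fix is simply to store the HDA in both modes. Every class here, the compute_outputs() signature, and the false-path naming scheme are hypothetical stand-ins for illustration, not the actual Galaxy code or the change that was committed.

```python
from collections import namedtuple

# Hypothetical stand-ins; the real Galaxy model classes carry far more state.
DatasetPath = namedtuple("DatasetPath", ["dataset_id", "real_path", "false_path"])


class Dataset:
    def __init__(self, dataset_id, file_name):
        self.id = dataset_id
        self.file_name = file_name


class HistoryDatasetAssociation:
    def __init__(self, dataset, datatype):
        self.dataset = dataset
        self.datatype = datatype


def compute_outputs(output_hdas, outputs_to_working_directory, working_directory):
    """Build { output_param_name: [ HDA, DatasetPath ] } in both modes."""
    output_dataset_paths = {}
    for name, hda in output_hdas.items():
        real_path = hda.dataset.file_name
        false_path = None
        if outputs_to_working_directory:
            # Assumed naming scheme, for illustration only.
            false_path = "%s/output_%d.dat" % (working_directory, hda.dataset.id)
        dataset_path = DatasetPath(hda.dataset.id, real_path, false_path)
        # Store the HDA (not the bare Dataset) regardless of the flag,
        # so callers can always read .datatype from element 0.
        output_dataset_paths[name] = [hda, dataset_path]
    return output_dataset_paths
```

With this shape, outputs[output][0].datatype resolves whether or not outputs_to_working_directory is set; only the DatasetPath differs between the two modes.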

--nate
...
Thanks,
--nate
...
John Duddy
-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Thursday, November 03, 2011 2:22 PM
To: Duddy, John
Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu
Subject: Re: Looks like actual user breaks splitting
Hi John,
It looks like the first issue is related to the change from
get_output_fnames() -> compute_outputs().  When 
outputs_to_working_directory = False (default) this method 
stores/returns a HistoryDatasetAssociation, but when True, 
stores/returns a Dataset (the original method's behavior).  Thus, 
accessing the object's .datatype attribute in the splitter's 
do_merge() fails.
Thanks,
--nate
Duddy, John wrote:
...
I'll submit a pull request shortly...
John Duddy
-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Wednesday, November 02, 2011 12:24 PM
To: Duddy, John
Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu
Subject: Re: Looks like actual user breaks splitting
John, Ilya,
I get further with sequence type inputs, but it looks like
JobWrapper.get_output_datasets_and_fnames() is not returning the
right thing when outputs_to_working_directory = True.
BTW, the base Data.split() method is broken after the updates to
Sequence.split() since it wasn't updated to expect 
HistoryDatasetAssociations rather than filenames.  Could you take 
a look at that when you get a chance?
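As a sketch of what accepting dataset objects rather than filenames could look like, here is a simplified Python 3 splitter. FakeHDA, split_by_lines(), and its parameters are hypothetical names for illustration, not the real Data.split() signature, and it assumes the path is exposed through a file_name attribute.

```python
import os
import tempfile


class FakeHDA:
    """Hypothetical stand-in for a HistoryDatasetAssociation; only file_name is modeled."""

    def __init__(self, file_name):
        self.file_name = file_name


def split_by_lines(input_datasets, new_dir_factory, chunk_size):
    """Split one line-oriented input into part files of chunk_size lines each.

    input_datasets holds dataset objects (not path strings), so the path
    is read from each object's file_name attribute.
    """
    if len(input_datasets) > 1:
        raise Exception("Text file splitting does not support multiple files")
    input_path = input_datasets[0].file_name
    base_name = os.path.basename(input_path)
    part_paths = []
    part_file = None
    with open(input_path) as source:
        for line_number, line in enumerate(source):
            if line_number % chunk_size == 0:
                # Start a new part in a fresh directory supplied by the caller.
                if part_file is not None:
                    part_file.close()
                part_path = os.path.join(new_dir_factory(), base_name)
                part_file = open(part_path, "w")
                part_paths.append(part_path)
            part_file.write(line)
    if part_file is not None:
        part_file.close()
    return part_paths
```

For a quick trial, tempfile.mkdtemp can serve as new_dir_factory, for example split_by_lines([FakeHDA(path)], tempfile.mkdtemp, 2).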
--nate
Duddy, John wrote:
...
The datatype you are using does not define a split method. Are you working with our in-progress gz type or fastqillumina?
John Duddy
From: Chorny, Ilya
Sent: Wednesday, November 02, 2011 11:50 AM
To: Duddy, John
Cc: Nate Coraor (nate@bx.psu.edu); galaxy-dev@lists.bx.psu.edu
Subject: Looks like actual user breaks splitting
Hey John,
Any thoughts?
Ilya
Traceback (most recent call last):
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/runners/tasks.py", line 73, in run_job
    tasks = splitter.do_split(job_wrapper)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/splitters/multi.py", line 73, in do_split
    input_type.split(input_datasets, get_new_working_directory_name, parallel_settings)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/datatypes/data.py", line 473, in split
    raise Exception("Text file splitting does not support multiple files")
Exception: Text file splitting does not support multiple files
Ilya Chorny Ph.D.
Bioinformatics Scientist I
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Work: 858.202.4582
Email: ichorny@illumina.com
Website: www.illumina.com
___________________________________________________________
Please keep all replies on the list by using "reply all"
in your mail client.  To manage your subscriptions to this and other 
Galaxy lists, please use the interface at:
http://lists.bx.psu.edu/