Looks like actual user breaks splitting
Hey John,

Any thoughts?

Ilya

Traceback (most recent call last):
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/runners/tasks.py", line 73, in run_job
    tasks = splitter.do_split(job_wrapper)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/splitters/multi.py", line 73, in do_split
    input_type.split(input_datasets, get_new_working_directory_name, parallel_settings)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/datatypes/data.py", line 473, in split
    raise Exception("Text file splitting does not support multiple files")
Exception: Text file splitting does not support multiple files

Ilya Chorny Ph.D.
Bioinformatics Scientist I
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Work: 858.202.4582
Email: ichorny@illumina.com
Website: www.illumina.com
The datatype you are using does not define a split method. Are you working with our in-progress gz type or fastqillumina?

John Duddy
Sr. Staff Software Engineer
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Tel: 858-736-3584
E-mail: jduddy@illumina.com

From: Chorny, Ilya
Sent: Wednesday, November 02, 2011 11:50 AM
To: Duddy, John
Cc: Nate Coraor (nate@bx.psu.edu); galaxy-dev@lists.bx.psu.edu
Subject: Looks like actual user breaks splitting

Hey John,

Any thoughts?

Ilya
John, Ilya,

I get further with sequence type inputs, but it looks like JobWrapper.get_output_datasets_and_fnames() is not returning the right thing when outputs_to_working_directory = True.

BTW, the base Data.split() method is broken after the updates to Sequence.split(), since it wasn't updated to expect HistoryDatasetAssociations rather than filenames. Could you take a look at that when you get a chance?

--nate
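To illustrate the point about Data.split() expecting HistoryDatasetAssociations rather than filenames, here is a minimal sketch of a line-based text splitter that reads the path from an attribute of the object it is handed. It is not Galaxy's implementation: the function name `split_text`, its parameters, and the assumption that the dataset object exposes `.file_name` are all hypothetical and chosen only for illustration.

```python
import os


def split_text(input_datasets, new_dir, parts=2):
    """Split a single text dataset into roughly equal line-based chunks.

    input_datasets: list of HDA-like objects exposing .file_name (assumed),
                    not bare path strings.
    new_dir:        callable returning a fresh directory for each chunk.
    """
    if len(input_datasets) > 1:
        raise Exception("Text file splitting does not support multiple files")

    # Take the path from the object rather than treating the object as a path.
    path = input_datasets[0].file_name
    with open(path) as handle:
        lines = handle.readlines()

    chunk_size = max(1, -(-len(lines) // parts))  # ceiling division
    written = []
    for start in range(0, len(lines), chunk_size):
        part_path = os.path.join(new_dir(), os.path.basename(path))
        with open(part_path, "w") as out:
            out.writelines(lines[start:start + chunk_size])
        written.append(part_path)
    return written
```

Passing a plain string to a function written this way fails at the `.file_name` access, which is the same kind of mismatch described above, only in the opposite direction.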
I'll submit a pull request shortly...

John Duddy

-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Wednesday, November 02, 2011 12:24 PM
To: Duddy, John
Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu
Subject: Re: Looks like actual user breaks splitting

John, Ilya,

I get further with sequence type inputs but it looks like JobWrapper.get_output_datasets_and_fnames() is not returning the right thing when outputs_to_working_directory = True

BTW, the base Data.split() method is broken after the updates to Sequence.split() since it wasn't updated to expect HistoryDatasetAssociations rather than filenames. Could you take a look at that when you get a chance?

--nate
Hi John,

It looks like the first issue is related to the change from get_output_fnames() -> compute_outputs(). When outputs_to_working_directory = False (default) this method stores/returns a HistoryDatasetAssociation, but when True, it stores/returns a Dataset (the original method's behavior). Thus, accessing the object's .datatype attribute in the splitter's do_merge() fails.

Thanks,
--nate
I'm not following you - it's been 6 months since I wrote that code ;-}

It looks to me like a DatasetPath() object is always placed in that array, and with one exception near then, it looks like the change I made generates those objects the same way.

Do you have a stack trace for the merge problem I can look at?

John Duddy

-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Thursday, November 03, 2011 2:22 PM
To: Duddy, John
Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu
Subject: Re: Looks like actual user breaks splitting

Hi John,

It looks like the first issue is related to the change from get_output_fnames() -> compute_outputs(). When outputs_to_working_directory = False (default) this method stores/returns a HistoryDatasetAssociation, but when True, stores/returns a Dataset (the original method's behavior). Thus, accessing the object's .datatype attribute in the splitter's do_merge() fails.

Thanks,
--nate
Duddy, John wrote:
I'm not following you - it's been 6 months since I wrote that code ;-}
I know the feeling!
It looks to me like a DatasetPath() object is always placed in that array, and with one exception near then, it looks like the change I made generates those objects the same way.
It's creating a dict in self.output_dataset_paths, and that dict looks like this when outputs_to_working_directory = False:

{ output_param_name : [ HDA, DatasetPath ], ... }

And this when True:

{ output_param_name : [ Dataset, DatasetPath ], ... }
Do you have a stack trace for the merge problem I can look at?
If you put this in do_merge()'s except block:

    log.exception( stdout )

You'll get:

Traceback (most recent call last):
  File "/space/nate/galaxy-central-ichorny/lib/galaxy/jobs/splitters/multi.py", line 128, in do_merge
    output_type = outputs[output][0].datatype
AttributeError: 'Dataset' object has no attribute 'datatype'

I could just change both methods to put an HDA in the list inside the dict there, but I haven't looked much into what output_dataset_paths is used for, so I wasn't sure what that might break.

Thanks,
--nate
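The failure described in this message can be reproduced in isolation. The sketch below uses simplified stand-in classes, not Galaxy's real `Dataset` and `HistoryDatasetAssociation` models, and `output_type_for` is a hypothetical helper that only mimics the attribute access shown in the traceback. The path strings stand in for DatasetPath objects.

```python
class Dataset:
    """Stand-in for the raw dataset record: it knows its file, not its datatype."""

    def __init__(self, file_name):
        self.file_name = file_name


class HistoryDatasetAssociation:
    """Stand-in for an HDA: wraps a Dataset and carries a datatype."""

    def __init__(self, dataset, datatype):
        self.dataset = dataset
        self.datatype = datatype


def output_type_for(outputs, name):
    # Same access pattern as the failing line: first list element, then .datatype
    return outputs[name][0].datatype


raw = Dataset("/tmp/dataset_1.dat")
hda = HistoryDatasetAssociation(raw, datatype="fastq")

# Shape of the dict when outputs_to_working_directory = False
outputs_false = {"out1": [hda, "/tmp/dataset_1.dat"]}
# Shape of the dict when outputs_to_working_directory = True
outputs_true = {"out1": [raw, "/tmp/work/dataset_1.dat"]}

print(output_type_for(outputs_false, "out1"))
try:
    output_type_for(outputs_true, "out1")
except AttributeError as err:
    print("merge would fail:", err)
```

The second lookup raises AttributeError because the stand-in Dataset has no `datatype` attribute, which matches the shape of the error in the traceback above.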
Nate Coraor (nate@bx.psu.edu) wrote:
I could just change both methods to put an HDA in the list inside the dict there, but I haven't looked much into what output_dataset_paths is used for, so I wasn't sure what that might break.
Sorta answered it myself: it looks like you created this precisely for do_merge(), so changing it to contain the HDA fixes the problem (and shouldn't break anything else).

--nate
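A minimal sketch of the fix described here, storing the HDA as the first list element regardless of the outputs_to_working_directory setting. Everything in it is an assumption made for illustration: the namedtuple stand-ins, the function name `build_output_dataset_paths`, the DatasetPath fields, and the working-directory file naming are hypothetical and are not taken from Galaxy's code.

```python
import os
from collections import namedtuple

# Simplified stand-ins; the real classes carry far more state.
Dataset = namedtuple("Dataset", "id file_name")
HDA = namedtuple("HDA", "dataset datatype")
DatasetPath = namedtuple("DatasetPath", "dataset_id real_path false_path")


def build_output_dataset_paths(named_hdas, to_working_directory, working_directory):
    """Return { output_param_name : [ HDA, DatasetPath ] } in both configurations."""
    paths = {}
    for name, hda in named_hdas.items():
        false_path = None
        if to_working_directory:
            # Only the path differs between the two modes (file name is illustrative).
            false_path = os.path.join(
                working_directory, "galaxy_dataset_%d.dat" % hda.dataset.id
            )
        # The HDA, not the underlying Dataset, is stored either way,
        # so a later .datatype lookup works in both modes.
        paths[name] = [hda, DatasetPath(hda.dataset.id, hda.dataset.file_name, false_path)]
    return paths
```

With this shape, a consumer that reads `outputs[name][0].datatype` behaves the same whether or not outputs are written to the working directory.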
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
I just came to that conclusion myself. If you need me to do anything, let me know - but it sounds like you have a handle on it.

Muchas gracias.

John Duddy

-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Thursday, November 03, 2011 3:53 PM
To: Duddy, John
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Looks like actual user breaks splitting

Nate Coraor (nate@bx.psu.edu) wrote:
I could just change both methods to put an HDA in the list inside the dict there, but I haven't looked much into what output_dataset_paths is used for, so I wasn't sure what that might break.
Sorta answered it myself: it looks like you created this precisely for do_merge(), so changing it to contain the HDA fixes the problem (and shouldn't break anything else).

--nate
Nate,

I get the following error when I try to kill a job:

User killed running job, but error encountered removing from DRM queue: 'DRMAAJobRunner' object has no attribute 'job_user_uid'.

Ilya

-----Original Message-----
From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Duddy, John
Sent: Thursday, November 03, 2011 3:55 PM
To: Nate Coraor (nate@bx.psu.edu)
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Looks like actual user breaks splitting

I just came to that conclusion myself. If you need me to do anything, let me know - but it sounds like you have a handle on it.

Muchas gracias.

John Duddy
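Regarding the 'job_user_uid' error reported above: the thread does not identify its cause. In Python generally, a "has no attribute" error on an instance means the attribute was never assigned on that object, for example because it is only set on one code path. The sketch below shows that pattern and one defensive remedy. `RunnerSketch` and its methods are hypothetical and are not Galaxy's DRMAAJobRunner.

```python
class RunnerSketch:
    """Hypothetical job runner used only to illustrate the attribute problem."""

    def __init__(self):
        # Initialising the attribute here guarantees it exists on every
        # instance, even if no job was ever queued on behalf of a user.
        self.job_user_uid = None

    def queue_job(self, uid=None):
        # In a version without the __init__ assignment, this would be the
        # only place the attribute gets created.
        self.job_user_uid = uid

    def stop_job(self):
        if self.job_user_uid is None:
            return "cancel as the server's own user"
        return "cancel as uid %d" % self.job_user_uid
```

Whether this matches the actual bug would need to be confirmed against the runner's source.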
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
I am also getting the following error in the log file, which makes no sense because the directory is empty. I will try to track it down. Thoughts?

Ilya

Traceback (most recent call last):
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/__init__.py", line 732, in cleanup
    shutil.rmtree( self.working_directory )
  File "/usr/lib64/python2.6/shutil.py", line 225, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "/usr/lib64/python2.6/shutil.py", line 223, in rmtree
    os.rmdir(path)
OSError: [Errno 39] Directory not empty: './database/job_working_directory/451'

-----Original Message-----
From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya
Sent: Friday, November 04, 2011 9:07 AM
To: Duddy, John; Nate Coraor (nate@bx.psu.edu)
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Looks like actual user breaks splitting

Nate,

I get the following error when I try to kill a job.

User killed running job, but error encountered removing from DRM queue: 'DRMAAJobRunner' object has no attribute 'job_user_uid'.

Ilya
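The cause of the "Directory not empty" error is not established in the thread. One way to investigate it is to record what is actually in the directory at the moment removal fails and then retry, since files may appear between the directory scan and the final rmdir. The helper below is a diagnostic sketch under that assumption. `remove_tree` and its parameters are hypothetical names and are not part of Galaxy.

```python
import errno
import os
import shutil
import time


def remove_tree(path, attempts=3, delay=0.5):
    """Remove path, retrying on ENOTEMPTY.

    Returns a list with the directory contents seen after each failed try,
    which can be logged to see what was left behind.
    """
    leftovers = []
    for _ in range(attempts):
        try:
            shutil.rmtree(path)
            return leftovers
        except OSError as err:
            if err.errno != errno.ENOTEMPTY:
                raise
            if os.path.isdir(path):
                # os.listdir also reports dotfiles, which a plain `ls` hides.
                leftovers.append(sorted(os.listdir(path)))
            time.sleep(delay)
    raise OSError(errno.ENOTEMPTY, "Directory not empty after retries", path)
```

If the recorded leftovers are consistently empty, the directory was probably being written to while it was being removed. If they contain hidden files, those entries are the place to look next.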
Duddy, John wrote:
I'm not following you - it's been 6 months since I wrote that code ;-}
I know the feeling!
It looks to me like a DatasetPath() object is always placed in that array, and with one exception near then, it looks like the change I made generates those objects the same way.
It's creating a dict in self.output_dataset_paths, and that dict looks like this when outputs_to_working_directory = False:
{ output_param_name : [ HDA, DatasetPath ], ... }
And this when True:
{ output_param_name : [ Dataset, DatasetPath ], ... }
Do you have a stack trace for the merge problem I can look at?
If you put this in do_merge()'s except block:
log.exception( stdout )
You'll get:
Traceback (most recent call last):
  File "/space/nate/galaxy-central-ichorny/lib/galaxy/jobs/splitters/multi.py", line 128, in do_merge
    output_type = outputs[output][0].datatype
AttributeError: 'Dataset' object has no attribute 'datatype'
I could just change both methods to put an HDA in the list inside the dict there, but I haven't looked much into what output_dataset_paths is used for, so I wasn't sure what that might break.
Sorta answered it myself, it looks like you created this precisely for do_merge(), so changing it to contain the HDA fixes the problem (and shouldn't break anything else). --nate
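To make the mismatch discussed above concrete, here is a minimal sketch. The classes are stand-ins written for illustration, not Galaxy's real model classes; only the two dict shapes, the `do_merge()` lookup `outputs[output][0].datatype`, and the resulting AttributeError come from this thread. The output name, datatype string, and path are made up.

```python
class Dataset(object):
    """Stand-in for the low-level dataset object: it has no .datatype."""


class HDA(object):
    """Stand-in for a HistoryDatasetAssociation: wraps a Dataset, has .datatype."""

    def __init__(self, dataset, datatype):
        self.dataset = dataset
        self.datatype = datatype


class DatasetPath(object):
    """Stand-in for the object recording where an output is written."""

    def __init__(self, path):
        self.path = path


def output_types(outputs):
    """Mimic the lookup that fails in do_merge(): outputs[name][0].datatype."""
    return dict((name, entry[0].datatype) for name, entry in outputs.items())


hda = HDA(Dataset(), "fastq")          # hypothetical datatype value
path = DatasetPath("/tmp/out.dat")     # hypothetical path

# Shape reported when outputs_to_working_directory = False
ok_outputs = {"out1": [hda, path]}
# Shape reported when outputs_to_working_directory = True
bad_outputs = {"out1": [hda.dataset, path]}

print(output_types(ok_outputs))
try:
    output_types(bad_outputs)
except AttributeError as exc:
    print("merge would fail: %s" % exc)
```

Storing the HDA in both cases, as proposed above, makes the first element uniform, so the lookup no longer depends on the outputs_to_working_directory setting.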
Thanks, --nate
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
-----Original Message----- From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu] Sent: Thursday, November 03, 2011 2:22 PM To: Duddy, John Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu Subject: Re: Looks like actual user breaks splitting
Hi John,
It looks like the first issue is related to the change from get_output_fnames() -> compute_outputs(). When outputs_to_working_directory = False (default) this method stores/returns a HistoryDatasetAssociation, but when True, stores/returns a Dataset (the original method's behavior). Thus, accessing the object's .datatype attribute in the splitter's do_merge() fails.
Thanks, --nate
Duddy, John wrote:
I'll submit a pull request shortly...
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
-----Original Message----- From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu] Sent: Wednesday, November 02, 2011 12:24 PM To: Duddy, John Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu Subject: Re: Looks like actual user breaks splitting
John, Ilya,
I get further with sequence type inputs but it looks like JobWrapper.get_output_datasets_and_fnames() is not returning the right thing when outputs_to_working_directory = True
BTW, the base Data.split() method is broken after the updates to Sequence.split() since it wasn't updated to expect HistoryDatasetAssociations rather than filenames. Could you take a look at that when you get a chance?
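One way to read this point: a split() that still treats its inputs as filenames breaks once callers hand it HistoryDatasetAssociations. The helper below is a hypothetical sketch, not Galaxy code. It assumes each HDA-like object exposes its path as `file_name`, and the "more than one input" condition for the guard is a guess based only on the exception message in the traceback.

```python
class FakeHDA(object):
    """Hypothetical stand-in for a HistoryDatasetAssociation."""

    def __init__(self, file_name):
        self.file_name = file_name


def input_file_names(input_datasets):
    """Return plain paths for a list of paths and/or HDA-like objects."""
    names = []
    for ds in input_datasets:
        # Objects carrying file_name are unwrapped; plain strings pass through.
        names.append(getattr(ds, "file_name", ds))
    return names


def check_single_text_input(input_datasets):
    """Apply a single-file guard after normalising the inputs to paths."""
    names = input_file_names(input_datasets)
    if not names:
        raise ValueError("no input datasets given")
    if len(names) > 1:
        raise Exception("Text file splitting does not support multiple files")
    return names[0]


print(check_single_text_input([FakeHDA("/data/reads.txt")]))
print(check_single_text_input(["/data/reads.txt"]))
```

Normalising at the top of the method keeps the rest of the splitting logic unchanged whichever form the caller passes.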
--nate
Duddy, John wrote:
The datatype you are using does not define a split method. Are you working with our in-progress gz type or fastqillumina?
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
From: Chorny, Ilya Sent: Wednesday, November 02, 2011 11:50 AM To: Duddy, John Cc: Nate Coraor (nate@bx.psu.edu); galaxy-dev@lists.bx.psu.edu Subject: Looks like actual user breaks splitting
Hey John,
Any thoughts?
Ilya
Traceback (most recent call last):
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/runners/tasks.py", line 73, in run_job
    tasks = splitter.do_split(job_wrapper)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/splitters/multi.py", line 73, in do_split
    input_type.split(input_datasets, get_new_working_directory_name, parallel_settings)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/datatypes/data.py", line 473, in split
    raise Exception("Text file splitting does not support multiple files")
Exception: Text file splitting does not support multiple files
Ilya Chorny Ph.D. Bioinformatics Scientist I Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Work: 858.202.4582 Email: ichorny@illumina.com Website: www.illumina.com
From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Friday, November 04, 2011 10:50 AM To: Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
I came up with a fix for "User killed running job, but error encountered removing from DRM queue: 'DRMAAJobRunner' object has no attribute 'job_user_uid'" and pushed it.
From: Chorny, Ilya Sent: Friday, November 04, 2011 11:11 AM To: Chorny, Ilya; Nate Coraor (nate@bx.psu.edu); Duddy, John Cc: galaxy-dev@lists.bx.psu.edu Subject: RE: [galaxy-dev] Looks like actual user breaks splitting
Well I think I figured out why we get OSError: [Errno 39] Directory not empty: './database/job_working_directory/451'. I changed it to an rm -rf and got the following error:
`./database/job_working_directory/455/.nfs0000000161e986fc00000068' ': Device or resource busy
It might have something to do with our mounts being Isilon. Any thought on how to deal with this?
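The `.nfs...` file name reported in this thread is consistent with NFS "silly rename" behaviour: when a file is deleted while some process still holds it open, the NFS client renames it to a hidden `.nfsXXXX` file until the last handle is closed, so the directory is not really empty yet. Assuming that is what is happening here, one possible workaround is to retry the removal a few times before giving up. The sketch below is not Galaxy code; `rmtree_with_retry`, `attempts`, and `delay` are names invented for illustration.

```python
import errno
import os
import shutil
import tempfile
import time


def rmtree_with_retry(path, attempts=5, delay=0.5):
    """Remove a directory tree, retrying while leftover files keep it busy.

    Returns True once the tree is gone, or False if it still exists after
    the last attempt, so the caller can log it and clean up later.
    """
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return True
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                return True  # already gone
            if exc.errno not in (errno.ENOTEMPTY, errno.EBUSY):
                raise  # unrelated failure: do not hide it
            if attempt < attempts - 1:
                time.sleep(delay)
    return not os.path.exists(path)


# Usage on a throwaway directory (a stand-in for a job working directory).
work_dir = tempfile.mkdtemp()
open(os.path.join(work_dir, "output.dat"), "w").close()
print(rmtree_with_retry(work_dir, attempts=2, delay=0.1))
```

Retrying only on "directory not empty" and "device or resource busy" keeps genuine problems, such as permission errors, visible instead of silently swallowing them.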
From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Friday, November 04, 2011 4:50 PM To: Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
I am getting a strange error when I am uploading a file:
mv: inter-device move failed: `/illumina/scratch/Swim/galaxy/tmpk2sbZl' to `/home/galaxy/ichorny/galaxy-central/database/tmp/upload_file_data_isAnwO'; unable to remove target: Permission denied
I figured out that the mv is coming from lib/galaxy/datatypes/sniff.py:
    if in_place:
        fnamemode = stat.S_IMODE( os.stat(fname).st_mode )
        os.system('mv %s %s' %(temp_name,fname))
        # Return number of lines in file.
        return ( i + 1, None )
    else:
        return ( i + 1, temp_name )
Any thoughts on what might be going on? If I comment out the move the upload works just fine.
Thanks,
Ilya
Looks like the mv is something I added and forgot about. I changed it to shutil.copyfile and it works fine. This also has something to do with the Isilon. I pushed it to the actual user repository so you can check it out.
Have a good weekend.
Thanks,
Ilya
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Friday, November 04, 2011 4:50 PM To: Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
I am getting a strange error when I am uploading a file
mv: inter-device move failed: `/illumina/scratch/Swim/galaxy/tmpk2sbZl' to `/home/galaxy/ichorny/galaxy-central/database/tmp/upload_file_data_isAnwO'; unable to remove target: Permission denied
I figured out that the mv is coming from
lib/galaxy/datatypes/sniff.py
    if in_place:
        fnamemode = stat.S_IMODE( os.stat(fname).st_mode )
        os.system('mv %s %s' %(temp_name,fname))
        # Return number of lines in file.
        return ( i + 1, None )
    else:
        return ( i + 1, temp_name )
Any thoughts on what might be going on? If I comment out the move the upload works just fine.
Thanks,
Ilya
-----Original Message----- From: Chorny, Ilya Sent: Friday, November 04, 2011 11:11 AM To: Chorny, Ilya; Nate Coraor (nate@bx.psu.edu); Duddy, John Cc: galaxy-dev@lists.bx.psu.edu Subject: RE: [galaxy-dev] Looks like actual user breaks splitting
Well I think I figured out why we get OSError: [Errno 39] Directory not empty: './database/job_working_directory/451'. I changed it to an rm -rf and got the following error:
`./database/job_working_directory/455/.nfs0000000161e986fc00000068' ': Device or resource busy
It might have something to do with our mounts being Isilon. Any thought on how to deal with this?
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Friday, November 04, 2011 10:50 AM To: Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
I came up with a fix for User killed running job, but error encountered removing from DRM queue: 'DRMAAJobRunner' object has no attribute 'job_user_uid' and pushed it.
-----Original Message----- From: Chorny, Ilya Sent: Friday, November 04, 2011 10:24 AM To: Chorny, Ilya; Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: RE: [galaxy-dev] Looks like actual user breaks splitting
I am also getting the following error in the log file which makes no sense because the directory is empty. I will try to track it down. Thoughts?
Ilya
Traceback (most recent call last):
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/__init__.py", line 732, in cleanup
    shutil.rmtree( self.working_directory )
  File "/usr/lib64/python2.6/shutil.py", line 225, in rmtree
    onerror(os.rmdir, path, sys.exc_info())
  File "/usr/lib64/python2.6/shutil.py", line 223, in rmtree
    os.rmdir(path)
OSError: [Errno 39] Directory not empty: './database/job_working_directory/451'
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Friday, November 04, 2011 9:07 AM To: Duddy, John; Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
Nate,
I get the following error when I try to kill a job.
User killed running job, but error encountered removing from DRM queue: 'DRMAAJobRunner' object has no attribute 'job_user_uid'.
Ilya
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Duddy, John Sent: Thursday, November 03, 2011 3:55 PM To: Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
I just came to that conclusion myself. If you need me to do anything, let me know - but it sounds like you have a handle on it.
Muchas gracias.
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
-----Original Message----- From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu] Sent: Thursday, November 03, 2011 3:53 PM To: Duddy, John Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
Nate Coraor (nate@bx.psu.edu) wrote:
Duddy, John wrote:
I'm not following you - it's been 6 months since I wrote that code ;-}
I know the feeling!
It looks to me like a DatasetPath() object is always placed in that array, and with one exception near then, it looks like the change I made generates those objects the same way.
It's creating a dict in self.output_dataset_paths, and that dict looks like this when outputs_to_working_directory = False:
{ output_param_name : [ HDA, DatasetPath ], ... }
And this when True:
{ output_param_name : [ Dataset, DatasetPath ], ... }
Do you have a stack trace for the merge problem I can look at?
If you put this in do_merge()'s except block:
log.exception( stdout )
You'll get:
Traceback (most recent call last):
  File "/space/nate/galaxy-central-ichorny/lib/galaxy/jobs/splitters/multi.py", line 128, in do_merge
    output_type = outputs[output][0].datatype
AttributeError: 'Dataset' object has no attribute 'datatype'
I could just change both methods to put an HDA in the list inside the dict there, but I haven't looked much into what output_dataset_paths is used for, so I wasn't sure what that might break.
Sorta answered it myself, it looks like you created this precisely for do_merge(), so changing it to contain the HDA fixes the problem (and shouldn't break anything else). --nate
Thanks, --nate
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
-----Original Message----- From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu] Sent: Thursday, November 03, 2011 2:22 PM To: Duddy, John Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu Subject: Re: Looks like actual user breaks splitting
Hi John,
It looks like the first issue is related to the change from get_output_fnames() -> compute_outputs(). When outputs_to_working_directory = False (default) this method stores/returns a HistoryDatasetAssociation, but when True, stores/returns a Dataset (the original method's behavior). Thus, accessing the object's .datatype attribute in the splitter's do_merge() fails.
Thanks, --nate
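The mismatch described in this message can be reproduced with a few stand-in classes. This is only an illustrative sketch: the two classes below are hypothetical simplifications, not Galaxy's real model classes, the plain strings stand in for DatasetPath objects, and output_datatype() is a made-up helper that performs the same `[0].datatype` lookup shown in the do_merge() traceback.

```python
class Dataset:
    """Hypothetical stand-in for the underlying file record; no datatype."""

    def __init__(self, file_name):
        self.file_name = file_name


class HistoryDatasetAssociation:
    """Hypothetical stand-in for an HDA: wraps a Dataset, carries a datatype."""

    def __init__(self, dataset, datatype):
        self.dataset = dataset
        self.datatype = datatype


def output_datatype(outputs, name):
    """Same lookup as in the do_merge() traceback: first list element's datatype."""
    return outputs[name][0].datatype


dataset = Dataset("dataset_1.dat")
hda = HistoryDatasetAssociation(dataset, "fastqsanger")

# Shape reported for outputs_to_working_directory = False: the lookup works.
print(output_datatype({"out1": [hda, "path-placeholder"]}, "out1"))

# Shape reported for outputs_to_working_directory = True: the lookup fails.
try:
    output_datatype({"out1": [dataset, "path-placeholder"]}, "out1")
except AttributeError as exc:
    print("merge would fail:", exc)
```

Storing the HDA in both cases, as proposed in the thread, makes the first element uniform so the lookup no longer depends on the setting.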
Duddy, John wrote:
I'll submit a pull request shortly...
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
-----Original Message----- From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu] Sent: Wednesday, November 02, 2011 12:24 PM To: Duddy, John Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu Subject: Re: Looks like actual user breaks splitting
John, Ilya,
I get further with sequence type inputs but it looks like JobWrapper.get_output_datasets_and_fnames() is not returning the right thing when outputs_to_working_directory = True
BTW, the base Data.split() method is broken after the updates to Sequence.split() since it wasn't updated to expect HistoryDatasetAssociations rather than filenames. Could you take a look at that when you get a chance?
--nate
Duddy, John wrote:
The datatype you are using does not define a split method. Are you working with our in-progress gz type or fastqillumina?
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com<mailto:jduddy@illumina.com>
From: Chorny, Ilya Sent: Wednesday, November 02, 2011 11:50 AM To: Duddy, John Cc: Nate Coraor (nate@bx.psu.edu); galaxy-dev@lists.bx.psu.edu Subject: Looks like actual user breaks splitting
Hey John,
Any thoughts?
Ilya
Traceback (most recent call last):
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/runners/tasks.py", line 73, in run_job
    tasks = splitter.do_split(job_wrapper)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/splitters/multi.py", line 73, in do_split
    input_type.split(input_datasets, get_new_working_directory_name, parallel_settings)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/datatypes/data.py", line 473, in split
    raise Exception("Text file splitting does not support multiple files")
Exception: Text file splitting does not support multiple files
Ilya Chorny Ph.D. Bioinformatics Scientist I Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Work: 858.202.4582 Email: ichorny@illumina.com<mailto:ichorny@illumina.com> Website: www.illumina.com<http://www.illumina.com>
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
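On the cleanup failure discussed in this message: files named `.nfsXXXX` are the placeholders an NFS client leaves behind when a file is deleted while some process still has it open, so the directory only becomes removable once that handle is closed. One possible way to cope, sketched below under that assumption, is to retry the removal a few times. The function name, the retry count, and the delay are all hypothetical choices, and this is not what Galaxy's cleanup code does.

```python
import errno
import shutil
import time


def rmtree_with_retry(path, attempts=5, delay=1.0):
    """Remove a directory tree, retrying while the filesystem reports it busy.

    Returns True if the tree is gone, False if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            shutil.rmtree(path)
            return True
        except FileNotFoundError:
            # Nothing left to remove (e.g. an earlier attempt finished the job).
            return True
        except OSError as exc:
            # ENOTEMPTY and EBUSY are the two errors reported in the thread.
            if exc.errno not in (errno.ENOTEMPTY, errno.EBUSY):
                raise
            if attempt < attempts - 1:
                time.sleep(delay)
    return False
```

A caller could log and move on when this returns False, leaving the directory for a later sweep instead of failing the job.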
On Nov 4, 2011, at 8:03 PM, Chorny, Ilya wrote:
Looks like the mv is something I added and forgot about. I changed it to shutil.copyfile and it works fine. This also has something to do with the isilon. I pushed it to the actual user repository so you can check it out.
Okay, I'll have a look. What's the copy for? We try to avoid moving bits as much as possible during upload since it can slow down the entire upload tool run pretty significantly if there are multiple copies taking place. --nate
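For reference, the copy-based replacement mentioned above might look roughly like the sketch below. It is an assumption about the shape of the change, not the code that was pushed: replace_contents() is a hypothetical name, and the variable names are borrowed from the sniff.py snippet quoted in the thread. The point is that shutil.copyfile rewrites the existing destination file instead of unlinking it first, which is the step the mv error message complains about.

```python
import os
import shutil


def replace_contents(temp_name, fname):
    """Overwrite fname with temp_name's bytes, then try to drop temp_name.

    Returns True if the temporary file was removed, False if it was left behind.
    """
    # copyfile opens the existing destination for writing, so it needs write
    # access to fname but does not have to unlink anything in its directory.
    shutil.copyfile(temp_name, fname)
    try:
        os.remove(temp_name)
        return True
    except OSError:
        # Not permitted to delete the temporary file; leave it in place.
        return False
```

The cost is the one raised in this reply: the data is always read and written once more, even when both paths sit on the same filesystem and a rename would have been enough.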
I am not sure what the copy is for. I did not add the move/copy; I just changed shutil.move to os.system(mv) because it has less baggage. In fact, if I comment it out the upload still works.
Ilya
-----Original Message----- From: Nate Coraor [mailto:nate@bx.psu.edu] Sent: Monday, November 07, 2011 7:57 AM To: Chorny, Ilya Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
On Nov 4, 2011, at 8:03 PM, Chorny, Ilya wrote:
Looks like the mv is something I added and forgot about. I changed it to shutil.copyfile and it works fine. This also has something to do with the isilon. I pushed it to the actual user repository so you can check it out.
Okay, I'll have a look. What's the copy for? We try to avoid moving bits as much as possible during upload since it can slow down the entire upload tool run pretty significantly if there are multiple copies taking place. --nate
This converts line endings to UNIX line endings, so in most cases skipping it wouldn't break anything. However, if the uploaded file had DOS line endings, the uploaded file content would be wrong. The best way to handle this is probably to change the ownership of the uploaded temporary file. The code needs to be very careful not to change ownership of anything not uploaded by an actual user (i.e. files for "linking" or even copied in from the library side).
--nate
On Nov 7, 2011, at 11:12 AM, Chorny, Ilya wrote:
I am not sure what the copy is for. I did not add the move/copy; I just changed shutil.move to os.system(mv) because it has less baggage. In fact, if I comment it out the upload still works.
Ilya
-----Original Message----- From: Nate Coraor [mailto:nate@bx.psu.edu] Sent: Monday, November 07, 2011 7:57 AM To: Chorny, Ilya Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
On Nov 4, 2011, at 8:03 PM, Chorny, Ilya wrote:
Looks like the mv is something I added and forgot about. I changed it to shutil.copyfile and it works fine. This also has something to do with the isilon. I pushed it to the actual user repository so you can check it out.
Okay, I'll have a look. What's the copy for? We try to avoid moving bits as much as possible during upload since it can slow down the entire upload tool run pretty significantly if there are multiple copies taking place.
--nate
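The step described at the top of this message (normalize line endings into a temporary file, then replace the upload in place) can be sketched as follows. This is a simplified illustration written for Python 3, not the contents of lib/galaxy/datatypes/sniff.py; the function name and the latin-1 trick for passing bytes through unchanged are assumptions made for the example. The return values mirror the `( i + 1, None )` and `( i + 1, temp_name )` pair in the snippet quoted earlier.

```python
import os
import shutil
import tempfile


def convert_newlines(fname, in_place=True):
    """Rewrite fname with UNIX line endings.

    Returns (line_count, None) when fname was replaced in place, or
    (line_count, temp_name) when the converted copy is left for the caller.
    """
    fd, temp_name = tempfile.mkstemp()
    count = 0
    # Reading with newline=None translates \r\n and \r to \n; writing with
    # newline="\n" emits \n unchanged. latin-1 maps every byte to one
    # character, so non-ASCII content passes through untouched.
    with open(fname, "r", newline=None, encoding="latin-1") as src, \
            os.fdopen(fd, "w", newline="\n", encoding="latin-1") as dst:
        for line in src:
            dst.write(line.rstrip("\n") + "\n")
            count += 1
    if in_place:
        # This is the move that failed across devices in the thread.
        shutil.move(temp_name, fname)
        return count, None
    return count, temp_name
```

Skipping the final move leaves a file with DOS line endings unchanged, which matches the observation that uploads still appear to work with the move commented out.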
Have a good weekend.
Thanks,
Ilya
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Friday, November 04, 2011 4:50 PM To: Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
I am getting a strange error when I am uploading a file
mv: inter-device move failed: `/illumina/scratch/Swim/galaxy/tmpk2sbZl' to `/home/galaxy/ichorny/galaxy-central/database/tmp/upload_file_data_isA nwO'; unable to remove target: Permission denied
I figured out that the mv is coming from
lib/galaxy/datatypes/sniff.py
if in_place: fnamemode = stat.S_IMODE( os.stat(fname).st_mode ) os.system('mv %s %s' %(temp_name,fname)) # Return number of lines in file. return ( i + 1, None ) else: return ( i + 1, temp_name )
Any thoughts on what might be going on? If I comment out the move the upload works just fine.
Thanks,
Ilya
-----Original Message----- From: Chorny, Ilya Sent: Friday, November 04, 2011 11:11 AM To: Chorny, Ilya; Nate Coraor (nate@bx.psu.edu); Duddy, John Cc: galaxy-dev@lists.bx.psu.edu Subject: RE: [galaxy-dev] Looks like actual user breaks splitting
Well I think I figured out why we get OSError: [Errno 39] Directory not empty: './database/job_working_directory/451'. I changed it to an rm -rf and got the following error:
`./database/job_working_directory/455/.nfs0000000161e986fc00000068' ': Device or resource busy
It might have something to do with our mounts being Isilon. Any thought on how to deal with this?
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Friday, November 04, 2011 10:50 AM To: Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
I came up with a fix for User killed running job, but error encountered removing from DRM queue: 'DRMAAJobRunner' object has no attribute 'job_user_uid' and pushed it.
-----Original Message----- From: Chorny, Ilya Sent: Friday, November 04, 2011 10:24 AM To: Chorny, Ilya; Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: RE: [galaxy-dev] Looks like actual user breaks splitting
I am also getting the following error in the log file which makes no sense because the directory is empty. I will try to track it down. Thoughts?
Ilya Traceback (most recent call last): File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/__init__.py", line 732, in cleanup shutil.rmtree( self.working_directory ) File "/usr/lib64/python2.6/shutil.py", line 225, in rmtree onerror(os.rmdir, path, sys.exc_info()) File "/usr/lib64/python2.6/shutil.py", line 223, in rmtree os.rmdir(path) OSError: [Errno 39] Directory not empty: './database/job_working_directory/451'
-----Original Message----- From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Chorny, Ilya Sent: Friday, November 04, 2011 9:07 AM To: Duddy, John; Nate Coraor (nate@bx.psu.edu) Cc: galaxy-dev@lists.bx.psu.edu Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
Nate,
I get the following error when I try to kill a job.
User killed running job, but error encountered removing from DRM queue: 'DRMAAJobRunner' object has no attribute 'job_user_uid'.
Ilya
-----Original Message-----
From: galaxy-dev-bounces@lists.bx.psu.edu [mailto:galaxy-dev-bounces@lists.bx.psu.edu] On Behalf Of Duddy, John
Sent: Thursday, November 03, 2011 3:55 PM
To: Nate Coraor (nate@bx.psu.edu)
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
I just came to that conclusion myself. If you need me to do anything, let me know - but it sounds like you have a handle on it.
Muchas gracias.
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Thursday, November 03, 2011 3:53 PM
To: Duddy, John
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] Looks like actual user breaks splitting
Nate Coraor (nate@bx.psu.edu) wrote:
Duddy, John wrote:
I'm not following you - it's been 6 months since I wrote that code ;-}
I know the feeling!
It looks to me like a DatasetPath() object is always placed in that array, and with one exception near then, it looks like the change I made generates those objects the same way.
It's creating a dict in self.output_dataset_paths, and that dict looks like this when outputs_to_working_directory = False:
{ output_param_name : [ HDA, DatasetPath ], ... }
And this when True:
{ output_param_name : [ Dataset, DatasetPath ], ... }
Do you have a stack trace for the merge problem I can look at?
If you put this in do_merge()'s except block:
log.exception( stdout )
You'll get:
Traceback (most recent call last):
  File "/space/nate/galaxy-central-ichorny/lib/galaxy/jobs/splitters/multi.py", line 128, in do_merge
    output_type = outputs[output][0].datatype
AttributeError: 'Dataset' object has no attribute 'datatype'
I could just change both methods to put an HDA in the list inside the dict there, but I haven't looked much into what output_dataset_paths is used for, so I wasn't sure what that might break.
Sorta answered it myself, it looks like you created this precisely for do_merge(), so changing it to contain the HDA fixes the problem (and shouldn't break anything else).
--nate
Thanks, --nate
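The mismatch described above can be reproduced with a few stand-in classes. This is only an illustrative sketch: the classes below are simplified placeholders, not Galaxy's real Dataset, HistoryDatasetAssociation, or DatasetPath, and output_datatypes is a hypothetical helper that mimics the outputs[output][0].datatype lookup from the do_merge() traceback.

```python
class Dataset:
    """Stand-in for the low-level dataset record; it has no `datatype`."""

    def __init__(self, file_name):
        self.file_name = file_name


class HistoryDatasetAssociation:
    """Stand-in for an HDA: refers to a Dataset and carries a datatype."""

    def __init__(self, dataset, datatype):
        self.dataset = dataset
        self.datatype = datatype


class DatasetPath:
    """Stand-in for the path object stored next to each output."""

    def __init__(self, real_path):
        self.real_path = real_path


def output_datatypes(outputs):
    """Mimic the failing lookup: read `.datatype` from the first list item."""
    return {name: entry[0].datatype for name, entry in outputs.items()}


dataset = Dataset("/tmp/output.dat")  # path is made up for the example
hda = HistoryDatasetAssociation(dataset, "fastqsanger")
path = DatasetPath(dataset.file_name)

# Shape used when outputs_to_working_directory = False: works.
print(output_datatypes({"out_file1": [hda, path]}))

# Shape used when outputs_to_working_directory = True: fails as in the thread.
try:
    output_datatypes({"out_file1": [dataset, path]})
except AttributeError as exc:
    print("merge would fail:", exc)
```

Storing the HDA in the first slot in both configurations, as proposed in the message, makes the lookup succeed either way.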
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Thursday, November 03, 2011 2:22 PM
To: Duddy, John
Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu
Subject: Re: Looks like actual user breaks splitting
Hi John,
It looks like the first issue is related to the change from get_output_fnames() -> compute_outputs(). When outputs_to_working_directory = False (default) this method stores/returns a HistoryDatasetAssociation, but when True, stores/returns a Dataset (the original method's behavior). Thus, accessing the object's .datatype attribute in the splitter's do_merge() fails.
Thanks, --nate
Duddy, John wrote:
I'll submit a pull request shortly...
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
-----Original Message-----
From: Nate Coraor (nate@bx.psu.edu) [mailto:nate@bx.psu.edu]
Sent: Wednesday, November 02, 2011 12:24 PM
To: Duddy, John
Cc: Chorny, Ilya; galaxy-dev@lists.bx.psu.edu
Subject: Re: Looks like actual user breaks splitting
John, Ilya,
I get further with sequence type inputs but it looks like JobWrapper.get_output_datasets_and_fnames() is not returning the right thing when outputs_to_working_directory = True
BTW, the base Data.split() method is broken after the updates to Sequence.split() since it wasn't updated to expect HistoryDatasetAssociations rather than filenames. Could you take a look at that when you get a chance?
--nate
Duddy, John wrote:
The datatype you are using does not define a split method. Are you working with our in-progress gz type or fastqillumina?
John Duddy Sr. Staff Software Engineer Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Tel: 858-736-3584 E-mail: jduddy@illumina.com
From: Chorny, Ilya
Sent: Wednesday, November 02, 2011 11:50 AM
To: Duddy, John
Cc: Nate Coraor (nate@bx.psu.edu); galaxy-dev@lists.bx.psu.edu
Subject: Looks like actual user breaks splitting
Hey John,
Any thoughts?
Ilya
Traceback (most recent call last):
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/runners/tasks.py", line 73, in run_job
    tasks = splitter.do_split(job_wrapper)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/jobs/splitters/multi.py", line 73, in do_split
    input_type.split(input_datasets, get_new_working_directory_name, parallel_settings)
  File "/home/galaxy/ichorny/galaxy-central/lib/galaxy/datatypes/data.py", line 473, in split
    raise Exception("Text file splitting does not support multiple files")
Exception: Text file splitting does not support multiple files
Ilya Chorny Ph.D. Bioinformatics Scientist I Illumina, Inc. 9885 Towne Centre Drive San Diego, CA 92121 Work: 858.202.4582 Email: ichorny@illumina.com Website: www.illumina.com
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
On Nov 4, 2011, at 2:10 PM, Chorny, Ilya wrote:
Well, I think I figured out why we get OSError: [Errno 39] Directory not empty: './database/job_working_directory/451'. I changed it to an rm -rf and got the following error:
`./database/job_working_directory/455/.nfs0000000161e986fc00000068': Device or resource busy
This is an open file on NFS whose name on the filesystem has been deleted. Something must be holding the file open; we'd need to figure out what the file is. Checking /proc/<pid>/fd at the right moment might reveal the answer.
--nate
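The /proc/<pid>/fd check could be scripted roughly as follows. This is a minimal sketch under a few assumptions: it runs on Linux, the caller has permission to read the fd directories of the processes of interest, and the function name find_open_files and its proc_root parameter are hypothetical names chosen for this example.

```python
import os


def find_open_files(directory, proc_root="/proc"):
    """Return (pid, fd, target) for open descriptors pointing into `directory`.

    Each entry in /proc/<pid>/fd is a symlink to whatever that descriptor
    has open, so reading the links shows which process holds a file.
    """
    directory = os.path.realpath(directory)
    hits = []
    for pid in os.listdir(proc_root):
        if not pid.isdigit():
            continue  # skip non-process entries such as "meminfo"
        fd_dir = os.path.join(proc_root, pid, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # descriptor was closed while we were looking
            if target == directory or target.startswith(directory + os.sep):
                hits.append((int(pid), int(fd), target))
    return hits


if __name__ == "__main__":
    for pid, fd, target in find_open_files("./database/job_working_directory/455"):
        print(pid, fd, target)
```

The scan has to run while the file is still held open, which is why the timing matters; calling it right before the cleanup step is one place to try.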
It might have something to do with our mounts being Isilon. Any thoughts on how to deal with this?
participants (4)
- Chorny, Ilya
- Duddy, John
- Nate Coraor
- Nate Coraor (nate@bx.psu.edu)