Metadata error in uploading files to data libraries
Hi all,

After an update to changeset 14859:7ba05957588a (stable, 05.12.14), our BAM files that are uploaded (linked) to a data library are no longer indexed. The metadata_xxx.dat file is created, but it stays empty. The following warnings appear in the log, although the state of the dataset is 'ok':

galaxy.jobs WARNING 2014-12-05 12:47:02,218 Error accessing /g/K/K27.bam, will retry: [Errno 1] Operation not permitted: '/g/K/K27.bam'
galaxy.jobs WARNING 2014-12-08 13:38:57,045 Error accessing /g/K/K2.bam, will retry: [Errno 1] Operation not permitted: '/g/K/K2.bam'

All file permissions are correct (i.e. galaxy owns them). Furthermore, running samtools index by hand on those files just works:

samtools index /g/K/K2.bam /g/galaxy/galaxy_data/files/_metadata_files/006/metadata_6598.dat

When uploading the file with "copy files into galaxy", the samtools index step just works.

==============

Now, on a clean local install (14874:885f940bff64, stable, 05.12.14), with samtools installed globally and the BAM file sorted, I get the following situation. When I try to upload this BAM to a data library by linking, the following error is shown on the dataset (note: here the dataset is set to the error state, which does not happen on our server):

Uploaded by: scholtal@embl.de
Date uploaded: Mon Dec 8 17:42:39 2014 (UTC)
File size: 2.6 GB
UUID: d23cf11a-0372-41cb-939a-7c8761d78b73
Data type: auto
Build: ?
Miscellaneous information: Traceback (most recent call last): File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 407, in <module> __main__() File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 396, in _

Job Standard Error

Traceback (most recent call last):
  File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 407, in <module>
    __main__()
  File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 396, in __main__
    add_file( dataset, registry, json_file, output_path )
  File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 294, in add_file
    if datatype.dataset_content_needs_grooming( dataset.path ):
  File "/Users/scholtalbers/workspace/galaxy-dist-new/lib/galaxy/datatypes/binary.py", line 147, in dataset_content_needs_grooming
    version = self._get_samtools_version()
  File "/Users/scholtalbers/workspace/galaxy-dist-new/lib/galaxy/datatypes/binary.py", line 129, in _get_samtools_version
    output = subprocess.Popen( [ 'samtools' ], stderr=subprocess.PIPE, stdout=subprocess.PIPE ).communicate()[1]
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1308, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

error
Database/Build: ?
Number of data lines: None
Disk file: /Users/scholtalbers/workspace/idr_data/WT1.sort.bam

===============================

When uploading the BAM without linking, I see the following sequence: Upload -> set meta -> samtools index -> 'error state'

Miscellaneous information: uploaded bam file Traceback (most recent call last): File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 407, in <module> __main__() File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.p

Job Standard Error

Traceback (most recent call last):
  File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 407, in <module>
    __main__()
  File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 396, in __main__
    add_file( dataset, registry, json_file, output_path )
  File "/Users/scholtalbers/workspace/galaxy-dist-new/tools/data_source/upload.py", line 324, in add_file
    if link_data_only == 'copy_files' and datatype.dataset_content_needs_grooming( output_path ):
  File "/Users/scholtalbers/workspace/galaxy-dist-new/lib/galaxy/datatypes/binary.py", line 147, in dataset_content_needs_grooming
    version = self._get_samtools_version()
  File "/Users/scholtalbers/workspace/galaxy-dist-new/lib/galaxy/datatypes/binary.py", line 129, in _get_samtools_version
    output = subprocess.Popen( [ 'samtools' ], stderr=subprocess.PIPE, stdout=subprocess.PIPE ).communicate()[1]
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1308, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

error
Database/Build: ?
Number of data lines: None
Disk file: /Users/scholtalbers/workspace/galaxy-dist-new/database/files/000/dataset_6.dat

=================================

Although the error messages are different, might they be related?

Cheers,
Jelle
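Both tracebacks above end inside subprocess.Popen( [ 'samtools' ] ), so the [Errno 2] most likely refers to the samtools program itself not being found by the PATH search, rather than to the BAM file. A minimal sketch that reproduces that condition, written for Python 3 (the tracebacks are from Python 2.7); the helper name probe and the made-up command name are illustrative assumptions, not part of Galaxy:

```python
import errno
import subprocess


def probe(command):
    """Try to start `command` (a list of arguments) without a shell.

    Returns None if the program could be started, or the errno value
    reported by the operating system when it could not.
    """
    try:
        process = subprocess.Popen(
            command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
        )
    except OSError as exc:  # FileNotFoundError is a subclass in Python 3
        return exc.errno
    process.communicate()  # drain the pipes and wait for the child to exit
    return None


if __name__ == "__main__":
    # Hypothetical name, assumed not to exist anywhere on PATH.
    result = probe(["no-such-program-for-this-sketch"])
    print("ENOENT raised:", result == errno.ENOENT)
```

Calling probe(["samtools"]) from the same environment that starts the server would show whether that process can find the executable.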
Hello,

Sorry we haven't made progress on this, and thanks for creating a Trello card (https://trello.com/c/tw75nq1U).

It looks like samtools is not on your PATH. Can you run it from the command line? If not, you should probably install it; I would suggest installing samtools 0.1.19 with Homebrew (it looks like you are on a Mac). If you can run samtools from the command line, can you run `which samtools` to see where it is coming from, and then add that directory explicitly to your Galaxy PATH, say at the top of run.sh in your Galaxy root (let me know if you need more details on that).

-John

(In reply to the message from Jelle Scholtalbers <j.scholtalbers@gmail.com> of Tue, Dec 9, 2014 at 3:02 AM.)
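The check suggested above can also be run from inside Python, which can matter because a server process may see a different PATH than an interactive terminal. A sketch using the standard-library shutil.which (available from Python 3.3, so not in the Python 2.7 shown in the traceback); the function name locate is a hypothetical helper:

```python
import os
import shutil


def locate(program, path=None):
    """Return the full path found for `program` by a PATH search, or None.

    When `path` is None, the PATH of the current process is searched.
    """
    return shutil.which(program, path=path)


if __name__ == "__main__":
    print("PATH entries seen by this process:")
    for entry in os.environ.get("PATH", "").split(os.pathsep):
        print("  " + entry)
    print("samtools resolves to:", locate("samtools"))
```

If this prints None for samtools when run the same way the server is started, adding the directory to PATH at the top of run.sh is the fix being proposed.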
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Hi John and others,

The "OSError: [Errno 2] No such file or directory" is solved by adding "shell=True" in the mentioned locations, as written in the Trello card.

However, even after fixing that, linked BAM files were not indexed, or at least "files/_metadata_files/000/metadata_XXX.dat" is empty on job completion. After spending too much time on this, I narrowed the problem down to this setting:

outputs_to_working_directory = True

With this option set, the BAM index file is never moved to the right location (or it is not being created). After setting this option to False, it now just works. (I think the setting was a leftover from a failed attempt to use "real user job submission".)

- Jelle

(In reply to the message from John Chilton <jmchilton@gmail.com> of Mon, Dec 15, 2014 at 8:31 PM.)
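The symptom described here, a metadata_XXX.dat file that exists but is empty, is easy to scan for. A minimal sketch, assuming only the metadata_*.dat naming seen in this thread; the function name and the directory you pass in are up to you and are not part of Galaxy:

```python
import os
import sys


def empty_metadata_files(root):
    """Return sorted paths of zero-byte metadata_*.dat files under `root`."""
    empty = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("metadata_") and name.endswith(".dat"):
                path = os.path.join(dirpath, name)
                if os.path.getsize(path) == 0:
                    empty.append(path)
    return sorted(empty)


if __name__ == "__main__":
    # Usage: python this_script.py /path/to/_metadata_files
    for path in empty_metadata_files(sys.argv[1]):
        print(path)
```

Running it before and after changing outputs_to_working_directory would show whether newly linked BAM files still end up with empty index files.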
So I have been able to reproduce this problem with outputs_to_working_directory. I am not sure how to address it yet, but I have marked the Trello card as a confirmed bug.

The problem with shell=True is still confusing me: is samtools not a normal executable on your PATH? I feel that modifying your PATH to make sure samtools is properly on it is a better fix than sticking shell=True in there. I'll think about it, though; certainly a lot of people get tripped up by the samtools requirement, and Galaxy isn't great at reporting the nature of the problem.

-John

(In reply to the message from Jelle Scholtalbers <j.scholtalbers@gmail.com> of Thu, Dec 18, 2014 at 7:51 AM.)
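One reason to be cautious about shell=True is that it changes how a missing program is reported. With a shell, Popen starts /bin/sh successfully even when the command itself cannot be found, so no OSError is raised; the failure only shows up as exit status 127 and a message on stderr. The sketch below illustrates that behaviour on a POSIX system with a made-up command name. It is not a claim about what happened on the machine discussed in this thread, and run_via_shell is a hypothetical helper:

```python
import subprocess


def run_via_shell(command_line):
    """Run `command_line` through /bin/sh; return (exit status, stderr text)."""
    process = subprocess.Popen(
        command_line,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    _stdout, stderr = process.communicate()
    return process.returncode, stderr.decode(errors="replace")


if __name__ == "__main__":
    # Hypothetical name, assumed not to exist anywhere on PATH.
    status, message = run_via_shell("no-such-program-for-this-sketch")
    print("exit status:", status)
    print("stderr:", message.strip())
```

Code that reads version information from stderr, as the Popen call in the traceback does with communicate()[1], would then be handed the shell's error message instead of an exception.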
Hi John,

With this in my run.sh:

echo $PATH
which samtools
python ./scripts/paster.py serve $GALAXY_CONFIG_FILE $@

I get the following output:

/Users/scholtal/python-envs/galaxy-dist-new/bin:/Users/scholtal/perl5/bin:/usr/local/opt/coreutils/libexec/gnubin:/usr/local/bin:~/bin:/Users/scholtal/workspace/installed_tools/meme/bin:/opt/local/bin:/opt/local/sbin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/opt/X11/bin:/usr/local/MacGPG2/bin
/Users/scholtal/bin/samtools
Entering daemon mode

I do not know what happens to the PATH after that.

- Jelle

(In reply to the message from John Chilton <jmchilton@gmail.com> of Thu, Dec 18, 2014 at 6:21 PM.)
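The PATH printed above contains a literal ~/bin entry, while which reports samtools under /Users/scholtal/bin. One possible explanation, not confirmed anywhere in this thread, is that the tilde gets expanded during lookup in one case, whereas the PATH search performed for subprocess.Popen without a shell treats ~/bin as a plain directory name and never looks in the home directory. A small sketch to flag such entries; both function names are hypothetical:

```python
import os


def unexpanded_tilde_entries(path_value):
    """Return the PATH entries that still start with a literal '~'."""
    return [
        entry
        for entry in path_value.split(os.pathsep)
        if entry.startswith("~")
    ]


def expand_entries(path_value):
    """Return `path_value` with a leading '~' expanded in every entry."""
    return os.pathsep.join(
        os.path.expanduser(entry) for entry in path_value.split(os.pathsep)
    )


if __name__ == "__main__":
    current = os.environ.get("PATH", "")
    for entry in unexpanded_tilde_entries(current):
        print("literal tilde in PATH:", entry)
    print("expanded PATH:", expand_entries(current))
```

If an entry is flagged, writing the full home directory in the PATH assignment (for example in run.sh) would be one way to test the hypothesis.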
participants (2)
- Jelle Scholtalbers
- John Chilton