BAM/BAI index file test problem on (Test) Tool Shed
Hi guys,

I have a new wrapper for samtools idxstats with a working unit test via run_functional_tests.sh, run locally or on TravisCI:

https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_idxstats
http://lists.bx.psu.edu/pipermail/galaxy-dev/2013-November/017406.html

However, this tool's test is failing on the Test Tool Shed:

http://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats/6564815949e...

Tool test results

Automated test environment
Time tested: ~ 11 hours ago
System: Linux 3.8.0-30-generic
Architecture: x86_64
Python version: 2.7.4
Galaxy revision: 11284:28469a503b56
Galaxy database version: 117
Tool shed revision:
Tool shed database version:
Tool shed mercurial version:

Tests that failed
Tool id: samtools_idxstats
Tool version: samtools_idxstats
Test: test_tool_000000 (functional.test_toolbox.TestForTool_testtoolshed.g2.bx.psu.edu/repos/peterjc/samtools_idxstats/samtools_idxstats/0.0.1)
Stderr:
Fatal error: Exit code 1 ()
Input BAI file not found: None
Traceback:
Traceback (most recent call last):
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 216, in test_tool
    self.do_it( td, shed_tool_id=shed_tool_id )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 28, in do_it
    self.__verify_outputs( testdef, shed_tool_id, data_list )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 134, in __verify_outputs
    self.__verify_output( output_tuple, shed_tool_id, elem, maxseconds=maxseconds )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/functional/test_toolbox.py", line 141, in __verify_output
    self.verify_dataset_correctness( outfile, hid=elem_hid, attributes=attributes, shed_tool_id=shed_tool_id )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/twilltestcase.py", line 782, in verify_dataset_correctness
    self._assert_dataset_state( elem, 'ok' )
  File "/var/opt/buildslaves/buildslave-ec2-2/buildbot-install-test-test-tool-shed-py27/build/test/base/twilltestcase.py", line 606, in _assert_dataset_state
    raise AssertionError( errmsg )
AssertionError: Expecting dataset state 'ok', but state is 'error'. Dataset blurb: error

It appears that the upload has not generated the *.bai index and assigned it to the variable input_bam.metadata.bam_index (but this works via run_functional_tests.sh for me):

$ ls test-data/ex1.*
test-data/ex1.bam  test-data/ex1.idxstats.tabular
$ ./run_functional_tests.sh -id samtools_idxstats
...
Ran 1 test in 32.400s

OK
...
(all fine)

(Note that the bai file does not seem to be needed)

Tested with this revision, which works:

$ hg branch
default
$ hg log | head
changeset: 12309:1df960b4892a
tag: tip
user: John Chilton <jmchilton@gmail.com>
date: Sun Nov 10 23:37:56 2013 -0600
summary: PEP-8 cleanups of lib/galaxy/security/__init__.py.

Updated to current tip, also works:

$ hg branch
default
[galaxy@ppserver galaxy-central]$ hg log | head
changeset: 12321:e12a10e5418d
tag: tip
user: guerler
date: Mon Nov 11 16:00:10 2013 -0500
summary: UI: Fix tooltip placement for masthead icons

Are there any known differences on the Test Tool Shed which could explain this failure?

Thanks,

Peter
Hello Peter,

Thanks for reporting this - I've added the following Trello card for this issue:

https://trello.com/c/sN2iLCCn/99-bug-in-install-and-test-framework-1

Greg Von Kuster

On Nov 12, 2013, at 7:07 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Hi guys,
...

___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
This sounds a lot like this: http://dev.list.galaxyproject.org/galaxy-cant-find-binaries-in-tp4662442p466....

When using the local job runner (as the test framework likely does), I believe samtools needs to be on Galaxy's path. I don't think it is enough to just have it available as a "tool dependency" (installed via tool shed or not).

As workarounds, Dave could either install the samtools OS package or place the tool shed install of it on Galaxy's path before starting? Alternatively, I guess the underlying problem could be solved, though the best path forward on that is not entirely clear, only that it is a real problem.

-John

On Tue, Nov 12, 2013 at 6:07 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
Hi guys,
...
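John's point above is that the binary has to be resolvable on the PATH of the process that runs the job. A minimal sketch of such a pre-flight check follows, written for current Python 3 rather than the Python 2.7 used in the thread. It is not part of Galaxy; the helper name find_missing_binaries is hypothetical.

```python
import shutil


def find_missing_binaries(names):
    """Return the names that cannot be resolved on the current PATH.

    Hypothetical helper for illustration only; Galaxy has its own
    dependency resolution, which this does not attempt to model.
    """
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "samtools" is the binary discussed in this thread.
    missing = find_missing_binaries(["samtools"])
    if missing:
        print("Not on PATH: " + ", ".join(missing))
    else:
        print("All required binaries found on PATH")
```

Running this as the same user and in the same environment as the test framework would show whether samtools is visible to locally run jobs.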
On Tue, Nov 12, 2013 at 2:37 PM, John Chilton <chilton@msi.umn.edu> wrote:
This sounds a lot like this; http://dev.list.galaxyproject.org/galaxy-cant-find-binaries-in-tp4662442p466....
When using the local job runner (as the test framework likely does), I believe samtools needs to be on Galaxy's path. I don't think it is enough to just have it available as a "tool dependency" (installed via tool shed or not).
As workarounds, Dave could either install the samtools OS package or place the tool shed install of it on Galaxy's path before starting? Alternatively, I guess the underlying problem could be solved, though the best path forward on that is not entirely clear, only that it is a real problem.
-John
Yes, that sounds like the same problem - and I agree a short term hack would be to put samtools on the test system.

On Tue, Nov 12, 2013 at 2:35 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
Hello Peter,
Thanks for reporting this - I've added the following Trello card for this issue.
https://trello.com/c/sN2iLCCn/99-bug-in-install-and-test-framework-1
Greg Von Kuster
That Trello link seems to be broken for me :(

Peter
Hi Peter,

sorry for coming late to this thread.

I may be wrong, but you are only specifying a repository_dependencies.xml file. You probably only need to rename it to tool_dependencies.xml, otherwise the tool cannot import it. I guess the env.sh file is not created if no tool_dependencies.xml file is present.

Cheers,
Bjoern
On Wed, Nov 13, 2013 at 10:43 AM, Björn Grüning <bjoern.gruening@pharmazie.uni-freiburg.de> wrote:
Hi Peter,
sorry for coming late to this thread.
I may be wrong, but you are only specifying a repository_dependencies.xml file. You probably only need to rename it to tool_dependencies.xml, otherwise the tool cannot import it. I guess the env.sh file is not created if no tool_dependencies.xml file is present.
Yes, I think you are right about repository_dependencies.xml versus tool_dependencies.xml but that would happen later - at this point my tool hasn't actually tried to use samtools itself.

Thanks,

Peter
Hello Peter,

Yesterday we discovered what Björn has communicated and added it to the Trello card for this issue:

https://trello.com/c/sN2iLCCn/99-bug-in-install-and-test-framework-1

It also seems that your tool is attempting to use samtools. This is in your config:

<requirements>
    <requirement type="binary">samtools</requirement>
    <requirement type="package" version="0.1.19">samtools</requirement>
</requirements>

and this is in your script:

#Run samtools idxstats:
cmd = "samtools idxstats %s > %s" % (bam_file, tabular_filename)
return_code = os.system(cmd)

I believe a fix for this would be to change the name of your repository_dependencies.xml file to tool_dependencies.xml, and change the current contents of the file to define a complex repository dependency for samtools. So the current contents of your repository_dependencies.xml file:

<?xml version="1.0"?>
<repositories description="This requires the samtools 0.1.19 binaries">
    <repository changeset_revision="54195f1d4b0f" name="package_samtools_0_1_19" owner="iuc" toolshed="http://testtoolshed.g2.bx.psu.edu" />
</repositories>

becomes the following in your renamed tool_dependencies.xml file:

<tool_dependency>
    <package name="samtools" version="0.1.19">
        <repository name="package_samtools_0_1_19" owner="iuc" />
    </package>
</tool_dependency>

This is all discussed in the following section of the Tool Shed wiki:

http://wiki.galaxyproject.org/ToolShedToolFeatures#Automatic_third-party_too...

Let me know if this doesn't correct this problem.

Thanks,

Greg Von Kuster

On Nov 13, 2013, at 6:47 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
...
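Greg's suggestion pairs a tool's package requirement with a package entry of the same name and version in tool_dependencies.xml. The sketch below checks that pairing so a mismatched name or version is caught before upload. It is an illustration under that assumption, not Tool Shed code; the function names and the sample XML strings are hypothetical, and the wrapping tool element is only there to make the sample well-formed.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample inputs modelled on the fragments quoted in the thread.
TOOL_XML = """<tool>
  <requirements>
    <requirement type="binary">samtools</requirement>
    <requirement type="package" version="0.1.19">samtools</requirement>
  </requirements>
</tool>"""

DEPS_MATCHING = """<tool_dependency>
  <package name="samtools" version="0.1.19">
    <repository name="package_samtools_0_1_19" owner="iuc" />
  </package>
</tool_dependency>"""

DEPS_MISMATCHED = DEPS_MATCHING.replace('version="0.1.19"', 'version="0.1.9"')


def required_packages(tool_xml):
    """Collect (name, version) for every type="package" requirement."""
    root = ET.fromstring(tool_xml)
    return {
        (req.text.strip(), req.get("version"))
        for req in root.iter("requirement")
        if req.get("type") == "package" and req.text
    }


def declared_packages(tool_dependencies_xml):
    """Collect (name, version) for every package element."""
    root = ET.fromstring(tool_dependencies_xml)
    return {(pkg.get("name"), pkg.get("version")) for pkg in root.iter("package")}


def unresolved_requirements(tool_xml, tool_dependencies_xml):
    """Return package requirements with no same-name, same-version package."""
    missing = required_packages(tool_xml) - declared_packages(tool_dependencies_xml)
    return sorted(missing)


if __name__ == "__main__":
    print(unresolved_requirements(TOOL_XML, DEPS_MATCHING))
    print(unresolved_requirements(TOOL_XML, DEPS_MISMATCHED))
```

The check only compares names and versions; it does not confirm that the referenced repository or owner exists on the Tool Shed.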
On Wed, Nov 13, 2013 at 12:03 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
Hello Peter,
Yesterday we discovered what Björn has communicated and added it to the Trello card for this issue:
https://trello.com/c/sN2iLCCn/99-bug-in-install-and-test-framework-1
...
becomes the following in your renamed tool_dependencies.xml file:
<tool_dependency>
    <package name="samtools" version="0.1.19">
        <repository name="package_samtools_0_1_19" owner="iuc" />
    </package>
</tool_dependency>
This is all discussed in the following section of the Tool Shed wiki:
http://wiki.galaxyproject.org/ToolShedToolFeatures#Automatic_third-party_too...
I believe that is what I did in response to Bjoern's email, shortly before seeing your email (but thank you for the detailed reply):

http://testtoolshed.g2.bx.psu.edu/view/peterjc/samtools_idxstats/93b8db68dde...

As an aside, I personally find the tool_dependencies.xml vs repository_dependencies.xml split confusing when defining a dependency on a third party tool which is provided by another Tool Shed repository. I don't understand why this isn't all done with repository_dependencies.xml alone.
Let me know if this doesn't correct this problem.
Last night's Test Tool Shed run confirms this does NOT fix the problem, which does indeed appear to be in the Galaxy upload tool which has an implicit dependency on samtools for indexing BAM files as John suggested.

Regards,

Peter
Hi Peter,

On Nov 14, 2013, at 11:27 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
...
As an aside, I personally find the tool_dependencies.xml vs repository_dependencies.xml split confusing when defining a dependency on a third party tool which is provided by another Tool Shed repository. I don't understand why this isn't all done with repository_dependencies.xml alone.
If tool dependencies were defined in a repository_dependencies.xml file, the definition would be more complex than the current approach we are using.

1) A "simple repository dependency" defines a relationship between 2 repositories and does not consider any of the contents of either repository. This relationship is defined in a repository_dependencies.xml file. A good example is the following repository_dependencies.xml file associated with the emboss5 repository at:

http://devteam@testtoolshed.g2.bx.psu.edu/repos/devteam/emboss_5

The relationship is between the emboss5 repository and its required repository named emboss_datatypes.

<?xml version="1.0"?>
<repositories description="Emboss 5 requires the Galaxy applicable data formats used by Emboss tools.">
    <repository toolshed="http://testtoolshed.g2.bx.psu.edu" name="emboss_datatypes" owner="devteam" changeset_revision="9f36ad2af086" />
</repositories>

2) A "complex repository dependency" defines a relationship between some of the contents of each of 2 repositories (the relationship is usually between a tool (e.g., xml config and script combination) in the dependent repository and a tool component (e.g., 3rd party binary) in the required repository). This relationship is defined in a tool_dependencies.xml file. A good example is the following tool_dependencies.xml file in the same emboss5 repository at:

http://devteam@testtoolshed.g2.bx.psu.edu/repos/devteam/emboss_5

<?xml version="1.0"?>
<tool_dependency>
    <package name="emboss" version="5.0.0">
        <repository changeset_revision="9fd501d0f295" name="package_emboss_5_0_0" owner="devteam" toolshed="http://testtoolshed.g2.bx.psu.edu" />
    </package>
</tool_dependency>

The above definition defines only a portion of the relationship to the contents of the emboss5 repository. The remaining portion is the contents of the <requirements> tag set in the contained tool(s):

<requirements><requirement type="package" version="5.0.0">emboss</requirement></requirements>

For each contained tool that has the above <requirement> tag entry, the tool will find the binary dependency installed with the required package_emboss_5_0_0 repository when the tool is executed. Since this relationship is defined at the "tool" level and not the "repository" level, it is defined in the tool_dependencies.xml file.

All of this information is explained in the Tool Shed wiki at the following pages.

http://wiki.galaxyproject.org/DefiningRepositoryDependencies#Simple_reposito...
http://wiki.galaxyproject.org/DefiningRepositoryDependencies#Complex_reposit...
Let me know if this doesn't correct this problem.
Last night's Test Tool Shed run confirms this does NOT fix the problem, which does indeed appear to be in the Galaxy upload tool which has an implicit dependency on samtools for indexing BAM files as John suggested.
Yes, we're working extensively on the Tool Shed's install and test framework, and we'll figure out the best way to handle this issue. Thanks!
Regards,
Peter
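Greg's distinction between the two dependency kinds can be read off the XML itself. As a rough sketch, assuming only what the two emboss examples in his message show (a repositories root for the simple case, a tool_dependency root with a repository nested in a package for the complex case), a file could be classified as below. The function name dependency_kind is hypothetical and this is not how the Tool Shed itself parses these files.

```python
import xml.etree.ElementTree as ET

# Sample inputs taken from the two emboss examples in the message above.
SIMPLE_XML = """<repositories description="Emboss 5 requires the Galaxy applicable data formats used by Emboss tools.">
  <repository toolshed="http://testtoolshed.g2.bx.psu.edu" name="emboss_datatypes" owner="devteam" changeset_revision="9f36ad2af086" />
</repositories>"""

COMPLEX_XML = """<tool_dependency>
  <package name="emboss" version="5.0.0">
    <repository changeset_revision="9fd501d0f295" name="package_emboss_5_0_0" owner="devteam" toolshed="http://testtoolshed.g2.bx.psu.edu" />
  </package>
</tool_dependency>"""


def dependency_kind(xml_text):
    """Classify a dependency definition as 'simple', 'complex' or 'unknown'."""
    root = ET.fromstring(xml_text)
    if root.tag == "repositories":
        # Repository-to-repository relationship, contents not considered.
        return "simple"
    if root.tag == "tool_dependency":
        # Complex case: a package that points at another repository.
        for package in root.iter("package"):
            if package.find("repository") is not None:
                return "complex"
    return "unknown"


if __name__ == "__main__":
    print(dependency_kind(SIMPLE_XML))
    print(dependency_kind(COMPLEX_XML))
```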
Peter,

It turns out there were two problems. First, the test environment was not resolving the upload tool's dependency on samtools, which I've now corrected. Second, the bam file detection on upload was broken due to the bug in python 2.7.4's gzip module, which I've also corrected.

I have re-run the test framework on samtools_idxstats, and it has now passed its test.

--Dave B.

On 11/14/2013 11:27 AM, Peter Cock wrote:
...
On Mon, Nov 18, 2013 at 2:24 PM, Dave Bouvier <dave@bx.psu.edu> wrote:
Peter,
It turns out there were two problems. First, the test environment was not resolving the upload tool's dependency on samtools, which I've now corrected.
Excellent.

On a closely related point, I understand Galaxy likes to store all BAM files co-ordinate sorted and indexed - when a tool produces a BAM file, where does this happen? i.e. is it the individual tool's responsibility, or the framework's (e.g. during setting metadata)? I assume the latter, in which case is there still an implicit samtools dependency there?
Second, the bam file detection on upload was broken due to the bug in python 2.7.4's gzip module, which I've also corrected.
You mean http://bugs.python.org/issue17666 fixed in 2.7.5? I reported that when Biopython's BGZF support broke (BGZF being the gzip flavour used for BAM and tabix style indexed files).
I have re-run the test framework on samtools_idxstats, and it has now passed its test.
--Dave B.
Thanks Dave :)

Peter
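The upload-time BAM detection discussed above depends on the gzip module, because BAM is stored as BGZF, a gzip-compatible container whose decompressed stream starts with the four magic bytes "BAM\1". A minimal sketch of that kind of check follows, for current Python 3. It is not Galaxy's sniffing code, and the helper name looks_like_bam is hypothetical.

```python
import gzip
import io
import zlib

# First four bytes of the decompressed stream of any BAM file.
BAM_MAGIC = b"BAM\x01"


def looks_like_bam(raw_bytes):
    """Return True if raw_bytes decompress as gzip and start with the BAM magic."""
    try:
        with gzip.GzipFile(fileobj=io.BytesIO(raw_bytes)) as handle:
            return handle.read(4) == BAM_MAGIC
    except (OSError, EOFError, zlib.error):
        # Not gzip data, or truncated/corrupt gzip data.
        return False


if __name__ == "__main__":
    fake_bam = gzip.compress(BAM_MAGIC + b"rest of the payload")
    print(looks_like_bam(fake_bam))
    print(looks_like_bam(b"plain text, not gzip"))
```

A check like this fails on every BAM file if the interpreter's gzip module cannot read the file's headers, which is the kind of breakage the Python issue Peter links to would cause.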
Hi Peter,

On Nov 18, 2013, at 10:33 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On Mon, Nov 18, 2013 at 2:24 PM, Dave Bouvier <dave@bx.psu.edu> wrote:
Peter,
It turns out there were two problems. First, the test environment was not resolving the upload tool's dependency on samtools, which I've now corrected.
Excellent.
On a closely related point, I understand Galaxy likes to store all BAM files co-ordinate sorted and indexed - when a tool produces a BAM file, where does this happen? i.e. is it the individual tool's responsibility, or the framework's (e.g. during setting metadata)? I assume the latter, in which case is there still an implicit samtools dependency there?
This is (unfortunately) performed in multiple methods of the Bam class in ~/galaxy/datatypes/binary.py. There are some comments (pasted here) that include an old "TODO" in the Bam class's dataset_content_needs_grooming() method that clarify some of the reasons for this:

# Samtools version 0.1.13 or newer produces an error condition when attempting to index an
# unsorted bam file - see http://biostar.stackexchange.com/questions/5273/is-my-bam-file-sorted.
# So when using a newer version of samtools, we'll first check if the input BAM file is sorted
# from the header information. If the header is present and sorted, we do nothing by returning False.
# If it's present and unsorted or if it's missing, we'll index the bam file to see if it produces the
# error. If it does, sorting is needed so we return True (otherwise False).
#
# TODO: we're creating an index file here and throwing it away. We then create it again when
# the set_meta() method below is called later in the job process. We need to enhance this overall
# process so we don't create an index twice. In order to make it worth the time to implement the
# upload tool / framework to allow setting metadata from directly within the tool itself, it should be
# done generically so that all tools will have the ability. In testing, a 6.6 gb BAM file took 128
# seconds to index with samtools, and 45 minutes to sort, so indexing is relatively inexpensive.
Second, the bam file detection on upload was broken due to the bug in python 2.7.4's gzip module, which I've also corrected.
You mean http://bugs.python.org/issue17666 fixed in 2.7.5?
Yes
I reported that when Biopython's BGZF support broke (BGZF being the gzip flavour used for BAM and tabix style indexed files).
Thanks!
I have re-run the test framework on samtools_idxstats, and it has now passed its test.
--Dave B.
Thanks Dave :)
Peter
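The decision described in the dataset_content_needs_grooming() comments quoted above can be sketched as follows. This is an illustration of the stated logic only, not the code in binary.py. It assumes the header text is already available as a string, treats the SO:coordinate tag on the @HD line as meaning "sorted", and stands in for the samtools indexing attempt with a caller-supplied callable; all names are hypothetical.

```python
def sort_order_from_header(header_text):
    """Return the SO value from the @HD header line, or None if absent."""
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
            return None
    return None


def needs_grooming(header_text, index_succeeds):
    """Decide whether a BAM file needs sorting.

    index_succeeds is a hypothetical callable that tries to index the file
    and returns True on success; it is only called when the header does not
    already say the file is coordinate sorted.
    """
    if sort_order_from_header(header_text) == "coordinate":
        return False
    # Header missing or not coordinate sorted: fall back to trying to index.
    return not index_succeeds()


if __name__ == "__main__":
    header = "@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:chr1\tLN:1000"
    # Pretend the indexing attempt failed, as it would for an unsorted file.
    print(needs_grooming(header, lambda: False))
```

Passing the indexing attempt in as a callable keeps the sketch runnable without samtools installed; in a real setting that callable is where the implicit samtools dependency would sit.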
Hi Greg,

On Mon, Nov 18, 2013 at 4:02 PM, Greg Von Kuster <greg@bx.psu.edu> wrote:
On Nov 18, 2013, at 10:33 AM, Peter Cock <p.j.a.cock@googlemail.com> wrote:
On a closely related point, I understand Galaxy likes to store all BAM files co-ordinate sorted and indexed - when a tool produces a BAM file, where does this happen? i.e. is it the individual tool's responsibility, or the framework's (e.g. during setting metadata)? I assume the latter, in which case is there still an implicit samtools dependency there?
This is (unfortunately) performed in multiple methods in the Bam class methods in ~/galaxy/datatypes/binary.py. There are some comments (pasted here) that include an old "TODO" in the Bam class's dataset_content_needs_grooming() method that clarifies some of the reasons for this: ...
Thanks for confirming that - I'll keep that in mind if I run into any BAM/BAI issues with my work on the MIRA4 and CLCbio wrappers which include BAM output.

Peter
participants (5)
- Björn Grüning
- Dave Bouvier
- Greg Von Kuster
- John Chilton
- Peter Cock