Re: [galaxy-dev] Problems with custom data types and sniffers in shed tools
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
Hi Greg,
As far as I know there is nothing in my local setup that affects this. As a test I did the following:
- Checked out a clean copy of galaxy-central
- Edited my universe_wsgi.ini as follows:
  - Changed the port to 8300
  - Added an admin user
  - Uncommented this line: tool_config_file = tool_conf.xml,shed_tool_conf.xml
- Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory)
- Started up galaxy: ./run.sh --reload
I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail.
My system is Mac OSX 10.7.2 and I'm running python 2.7.1
I then went to the relevant bit of code and inserted some print statements. I found that on line 191 in registry.py there is:
module = __import__( datatype_module )
and the value of datatype_module is galaxy.datatypes.gmap
This line seems to be causing the exception. I'm not a Python programmer, so I don't really know how module importing works. Could this be related to the version of Python I'm running? Alternatively, should the "imported_modules" variable be used here? Wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main Galaxy lib rather than in the shed_tools directory?
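The import question above can be illustrated with a small standalone sketch. It is not Galaxy's registry code: the package name galaxy_demo, the helpers load_module_from_path and find_class, and the temporary directory standing in for a shed_tools location are all hypothetical. It shows that a dotted name handed to __import__ must be resolvable as a package on the Python path, whereas a module file living elsewhere can be loaded by path and then looked up in a list of already imported modules.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(name, path):
    """Import a single .py file from an arbitrary location (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def find_class(type_string, imported_modules):
    """Resolve 'package.module:ClassName', preferring modules already imported by path."""
    module_name, class_name = type_string.split(":")
    short_name = module_name.split(".")[-1]
    for module in imported_modules:
        if module.__name__ == short_name and hasattr(module, class_name):
            return getattr(module, class_name)
    # Fall back to a normal import; fromlist makes __import__ return the leaf module.
    module = __import__(module_name, fromlist=[class_name])
    return getattr(module, class_name)


def demo():
    # The temporary directory plays the role of a shed_tools checkout (assumption).
    with tempfile.TemporaryDirectory() as shed_dir:
        path = os.path.join(shed_dir, "gmap.py")
        with open(path, "w") as handle:
            handle.write("class IntervalAnnotation(object):\n    pass\n")
        # The dotted name cannot be imported: gmap.py is not inside that package.
        try:
            __import__("galaxy_demo.datatypes.gmap")
            plain_import_worked = True
        except ImportError:
            plain_import_worked = False
        imported_modules = [load_module_from_path("gmap", path)]
        cls = find_class("galaxy_demo.datatypes.gmap:IntervalAnnotation", imported_modules)
        return plain_import_worked, cls.__name__
```

Calling demo() exercises both paths: the plain dotted import fails, while the lookup through the list of modules loaded by path succeeds. Whether the actual fix in Galaxy works this way is not stated in this thread.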
Strange that I'm the only one who gets these errors? Any ideas what it could be?
Ira
-- paster log --
Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap
destination directory: gmap
requesting all changes
adding changesets
adding manifests
adding file changes
added 11 changesets with 32 changes to 17 files
updating to branch default
resolving manifests
getting README
getting gmap.xml
getting gmap_build.xml
getting gsnap.xml
getting iit_store.xml
getting lib/galaxy/datatypes/gmap.py
getting snpindex.xml
getting tool-data/datatypes_conf.xml
getting tool-data/gmap_indices.loc.sample
9 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da"
resolving manifests
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap
galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0.
docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote:
Hello Ira,
I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this:
<?xml version="1.0"?>
<datatypes>
  <datatype_files>
    <datatype_file name="gmap.py"/>
  </datatype_files>
  <registration>
    <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/>
    <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/>
    <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/>
    <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/>
    <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/>
    <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/>
    <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/>
    <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/>
    <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/>
    <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/>
  </registration>
  <sniffers>
    <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/>
    <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/>
    <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/>
    <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/>
  </sniffers>
</datatypes>
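For readers who want to check what a datatypes_conf.xml like the one above declares, the following is a minimal sketch using Python's standard xml.etree.ElementTree. The function name summarize_conf and the shortened SAMPLE_CONF are illustrative assumptions, not part of Galaxy; only the element and attribute names come from the file shown in this message.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the structure shown above (two datatypes, two sniffers).
SAMPLE_CONF = """<datatypes>
  <registration>
    <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/>
    <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/>
  </registration>
  <sniffers>
    <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/>
    <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/>
  </sniffers>
</datatypes>"""


def summarize_conf(xml_text):
    """Return (extension -> type mapping, sniffer types in file order)."""
    root = ET.fromstring(xml_text)
    extensions = {}
    for elem in root.findall("registration/datatype"):
        extensions[elem.get("extension")] = elem.get("type")
    sniffers = [elem.get("type") for elem in root.findall("sniffers/sniffer")]
    return extensions, sniffers
```

The sniffer list is returned in file order, since the order of the sniffer entries is what the later discussion of sniff order in this thread is about.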
Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing.
galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap
galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap'
galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap
destination directory: gmap
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 10 changes to 10 files
updating to branch default
10 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd"
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap
galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0.
docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well:
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time.
On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote:
Hi,
We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool.
I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately, I could not get my sniffers to work. I think there are two reasons for this.
The first problem is that it looks like the sniffers are never loaded. I get errors like this when the tool is loaded:
galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
These errors aren't restricted to my tool. As you can see from the above, they also occur in the gmap example installed from the main Galaxy toolshed.
I tried hacking the code in registry.py to get this to work. It looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (e.g. when I click "Auto detect" when editing a dataset, the dataset is set to a generic type).
Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool, but if I look at the source for upload.py and upload.xml I see
-- Upload.xml
<command interpreter="python"> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile
-- Upload.py
def __main__():
    ....
    registry = Registry()
    registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] )
which looks like the upload tool is not configured to use the custom datatypes available in shed tools.
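The concern can be sketched with a toy registry. TinyRegistry below is a hypothetical stand-in, not Galaxy's Registry class, and the two config strings are invented for illustration. A registry that is handed only the distribution config never sees sniffers declared in a shed repository's config, while one that loads both configs (distribution first, skipping extensions that are already registered, as Greg describes elsewhere in this thread) does.

```python
import xml.etree.ElementTree as ET

# Invented example configs; only the element names follow datatypes_conf.xml.
DIST_CONF = """<datatypes>
  <registration>
    <datatype extension="txt" type="demo.data:Text"/>
    <datatype extension="xml" type="demo.xml:GenericXml"/>
  </registration>
  <sniffers><sniffer type="demo.xml:GenericXml"/></sniffers>
</datatypes>"""

SHED_CONF = """<datatypes>
  <registration>
    <datatype extension="pepxml" type="demo.proteomics:PepXml"/>
    <datatype extension="xml" type="demo.proteomics:OtherXml"/>
  </registration>
  <sniffers><sniffer type="demo.proteomics:PepXml"/></sniffers>
</datatypes>"""


class TinyRegistry(object):
    """Hypothetical registry used only to illustrate the loading order."""

    def __init__(self):
        self.datatypes_by_extension = {}
        self.sniff_order = []

    def load_datatypes(self, xml_text):
        root = ET.fromstring(xml_text)
        for elem in root.findall("registration/datatype"):
            extension = elem.get("extension")
            if extension in self.datatypes_by_extension:
                continue  # an earlier config already defined this extension
            self.datatypes_by_extension[extension] = elem.get("type")
        for elem in root.findall("sniffers/sniffer"):
            sniffer_type = elem.get("type")
            if sniffer_type not in self.sniff_order:
                self.sniff_order.append(sniffer_type)


# What a tool process sees when it is given a single config file.
upload_registry = TinyRegistry()
upload_registry.load_datatypes(DIST_CONF)

# What a server that also loads installed repositories would see.
server_registry = TinyRegistry()
server_registry.load_datatypes(DIST_CONF)
server_registry.load_datatypes(SHED_CONF)
```

In this sketch upload_registry.sniff_order contains only the distribution sniffer, which mirrors the behaviour suspected above. How Galaxy should pass the extra configs to the upload tool is left open here.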
Would it be possible to make the following changes?
(a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly, since custom datatypes are usually quite specific, I would suggest that these are loaded at the top of the sniff order. Alternatively, if a sniffer is for a datatype that descends from a superclass, it should have priority over the parent class (since by definition it is more specific).
(b) Change the upload tool so that it respects custom sniffers in shed tools.
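The alternative in (a), giving a subclass priority over its parent, could look roughly like the sketch below. It is only an illustration of the idea: the classes Text, Xml and PepXml, their crude sniff checks, and the functions order_sniffers and guess_extension are made up here and are not Galaxy code.

```python
class Text(object):
    ext = "txt"

    @staticmethod
    def sniff(data):
        return True  # a very generic sniffer matches anything


class Xml(Text):
    ext = "xml"

    @staticmethod
    def sniff(data):
        return data.lstrip().startswith("<")


class PepXml(Xml):
    ext = "pepxml"

    @staticmethod
    def sniff(data):
        return "<msms_pipeline_analysis" in data  # crude illustrative check


def order_sniffers(classes):
    """Place each class before the first already placed ancestor; otherwise keep listed order."""
    ordered = []
    for cls in classes:
        for index, placed in enumerate(ordered):
            if cls is not placed and issubclass(cls, placed):
                ordered.insert(index, cls)
                break
        else:
            ordered.append(cls)
    return ordered


def guess_extension(data, sniff_order):
    """Return the extension of the first sniffer that accepts the data."""
    for cls in sniff_order:
        if cls.sniff(data):
            return cls.ext
    return None
```

With the listed order [Text, Xml, PepXml], the generic Text sniffer would claim every file. After order_sniffers the most specific class is tried first, while classes with no inheritance relationship keep the order in which they were listed.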
I guess that our case is a bit unusual in that we are trying to co-opt Galaxy (a genomics tool) to do proteomics, so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us, as we could abandon our fork and have all our functionality included in a shed tool.
Regards
Ira
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that Galaxy's default sniffers include some extremely generic sniffers (e.g. text, xml) which will catch pretty much anything, so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with Galaxy's defaults is an issue, how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class? Then, among those subclasses, the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require the sniff order code to be changed so that all datatypes and sniffers are first loaded, then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent, and I think it would also avoid the potential for conflicts with Galaxy's defaults.
What do you think? Is that a system that would work?
Cheers
ira
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
Hi Greg,
As far as I know there is nothing in my local setup that affects this. As I test I did the following;
Checked out a clean copy of galaxy-central Edited my universe_wsgi.ini as follows; - Changed the port to 8300 - Added an admin user - Uncommented this line tool_config_file = tool_conf.xml,shed_tool_conf.xml
Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory)
Started up galaxy ./run.sh --reload
I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail.
My system is Mac OSX 10.7.2 and I'm running python 2.7.1
I then went to the relevant bit of code and inserted some print statements. I found that;
On line 191 in registry.py there is;
module = __import__( datatype_module )
and the value of datatype_module is galaxy.datatypes.gmap
this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools.
Strange that I'm the only one who gets these errors? Any ideas what it could be?
Ira
-- paster log --
Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap destination directory: gmap requesting all changes adding changesets adding manifests adding file changes added 11 changesets with 32 changes to 17 files updating to branch default resolving manifests getting README getting gmap.xml getting gmap_build.xml getting gsnap.xml getting iit_store.xml getting lib/galaxy/datatypes/gmap.py getting snpindex.xml getting tool-data/datatypes_conf.xml getting tool-data/gmap_indices.loc.sample 9 files updated, 0 files merged, 0 files removed, 0 files unresolved galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" resolving manifests 0 files updated, 0 files merged, 0 files removed, 0 files unresolved docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote:
Hello Ira,
I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this:
<?xml version="1.0"?> <datatypes> <datatype_files> <datatype_file name="gmap.py"/> </datatype_files> <registration> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> </registration> <sniffers> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> </sniffers> </datatypes>
Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing.
galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap destination directory: gmap requesting all changes adding changesets adding manifests adding file changes added 1 changesets with 10 changes to 10 files updating to branch default 10 files updated, 0 files merged, 0 files removed, 0 files unresolved galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" 0 files updated, 0 files merged, 0 files removed, 0 files unresolved docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well:
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time.
On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote:
Hi,
We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool.
I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this;
The first problem is that it looks like the sniffers are never loaded. I get errors like this when the tool is loaded;
galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed.
I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type).
Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see
-- Upload.xml <command interpreter="python"> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile
--Upload.py def __main__():
....
registry = Registry() registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] )
which looks like the upload tool is not configured to use the custom datatypes available in shed tools.
Would it be possible to make the following changes;
(a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific).
(b) Change the upload tool so that it respects custom sniffers in shed tools.
I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool.
Regards
Ira
Ira,
Thanks for the feedback - I'll come up with something that allows for proprietary sniffers to be loaded first and let you know when it's available. It shouldn't take me too long. I will probably continue to keep the rule of ignoring conflicts in proprietary datatypes, so sniffers for conflicting datatypes will probably be ignored as well. However, hopefully conflicts will not exist, or if one is found, it will get fixed.
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that Galaxy's default sniffers include some extremely generic ones (e.g. text, xml) which will catch pretty much anything, so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
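The reshuffling proposed above can be sketched roughly as follows. This is only an illustration and not Galaxy's sniff order code: the classes Data, Text, Xml, PepXml and ProtXml and the function order_sniffers are made-up stand-ins. It relies on the fact that a subclass always has a longer method resolution order than any of its parents, and on Python's sort being stable, so classes at the same depth keep the order in which they were listed.

```python
class Data(object):
    """Hypothetical base datatype (a stand-in, not Galaxy's class)."""


class Text(Data):
    pass


class Xml(Text):
    pass


class PepXml(Xml):
    pass


class ProtXml(Xml):
    pass


def order_sniffers(listed):
    """Return sniffer classes with every subclass ahead of its ancestors.

    A subclass's MRO holds the class itself plus all of its ancestors, so
    it is strictly longer than the MRO of any parent. Sorting by descending
    MRO length therefore puts more specific classes first. sorted() is
    stable, so classes of equal depth keep their listed order.
    """
    return sorted(listed, key=lambda cls: -len(cls.__mro__))


# Order in which the sniffers were (hypothetically) listed in config files.
listed = [Text, ProtXml, Xml, PepXml]
ordered = order_sniffers(listed)
print([cls.__name__ for cls in ordered])
```

One consequence of this simple rule is that a deeply nested class also moves ahead of shallow classes it is unrelated to. A stricter version would only reorder classes that share an ancestor.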
What do you think? Is that a system that would work?
Cheers ira
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
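The loading rule described above (distribution datatypes first, conflicting proprietary datatypes ignored) might look something like the sketch below. It is an assumption-laden illustration, not the registry code: load_in_order and all the extension and type strings are invented for the example.

```python
def load_in_order(distribution, proprietary):
    """Register distribution datatypes first, then non-conflicting extras.

    Both arguments are lists of (extension, type_spec) pairs. A proprietary
    entry whose extension is already registered is skipped and reported.
    """
    registry = {}
    ignored = []
    for extension, type_spec in distribution:
        registry[extension] = type_spec
    for extension, type_spec in proprietary:
        if extension in registry:
            # Conflict: the distribution's datatype wins.
            ignored.append(extension)
        else:
            registry[extension] = type_spec
    return registry, ignored


# Made-up entries, for illustration only.
distribution = [("xml", "dist.module:Xml"), ("txt", "dist.module:Text")]
proprietary = [("xml", "shed.module:MyXml"), ("pepxml", "shed.module:PepXml")]
registry, ignored = load_in_order(distribution, proprietary)
print(registry)
print(ignored)
```

If the proprietary entries were loaded first instead, the membership check would end up discarding the distribution's own datatype, which is why reversing the load order is not a simple swap.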
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
Hi Greg,
As far as I know there is nothing in my local setup that affects this. As a test I did the following:
Checked out a clean copy of galaxy-central
Edited my universe_wsgi.ini as follows:
- Changed the port to 8300
- Added an admin user
- Uncommented this line
tool_config_file = tool_conf.xml,shed_tool_conf.xml
Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory)
Started up galaxy
./run.sh --reload
I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail.
My system is Mac OSX 10.7.2 and I'm running python 2.7.1
I then went to the relevant bit of code and inserted some print statements. I found that;
On line 191 in registry.py there is;
module = __import__( datatype_module )
and the value of datatype_module is galaxy.datatypes.gmap
This line seems to be causing the exception. I'm not a Python programmer, so I don't really know how module importing works. Could this be related to the version of Python I'm running? Alternatively, should the "imported_modules" variable be used here? Wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main Galaxy lib rather than in the shed_tools directory?
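To make the question above concrete, here is a minimal sketch of resolving a "module:Class" string against a list of already imported modules before falling back to a normal import. It is not the code in registry.py: resolve_sniffer_class is a hypothetical helper, the way it uses an imported_modules list is an assumption, and the gmap module is faked with types.ModuleType so the example runs without Galaxy installed.

```python
import importlib
import types


def resolve_sniffer_class(type_spec, imported_modules):
    """Resolve 'package.module:ClassName' to a class object.

    Modules that were already imported from a tool shed repository are
    checked first, matched on the last component of the dotted name.
    Only if none matches is the full dotted path imported, which fails
    when the module is not on the Python path under that package.
    """
    module_name, class_name = type_spec.split(":")
    short_name = module_name.split(".")[-1]
    for module in imported_modules:
        if module.__name__ == short_name and hasattr(module, class_name):
            return getattr(module, class_name)
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


# Fake a gmap module such as one loaded from a shed repository directory.
gmap = types.ModuleType("gmap")


class IntervalAnnotation(object):
    pass


gmap.IntervalAnnotation = IntervalAnnotation

found = resolve_sniffer_class(
    "galaxy.datatypes.gmap:IntervalAnnotation", [gmap]
)
print(found is IntervalAnnotation)
```

With an empty imported_modules list the helper falls through to the plain import, which raises an ImportError when the module is not importable under that dotted path, much like the "No module named gmap" warnings in the log.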
Strange that I'm the only one who gets these errors? Any ideas what it could be?
Ira
-- paster log --
Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap
destination directory: gmap
requesting all changes
adding changesets
adding manifests
adding file changes
added 11 changesets with 32 changes to 17 files
updating to branch default
resolving manifests
getting README
getting gmap.xml
getting gmap_build.xml
getting gsnap.xml
getting iit_store.xml
getting lib/galaxy/datatypes/gmap.py
getting snpindex.xml
getting tool-data/datatypes_conf.xml
getting tool-data/gmap_indices.loc.sample
9 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da"
resolving manifests
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap
galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0.
docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote:
Hello Ira,
I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this:
<?xml version="1.0"?>
<datatypes>
    <datatype_files>
        <datatype_file name="gmap.py"/>
    </datatype_files>
    <registration>
        <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/>
        <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/>
        <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/>
        <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/>
        <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/>
        <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/>
        <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/>
    </registration>
    <sniffers>
        <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/>
    </sniffers>
</datatypes>
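For anyone who wants to check which sniffers a datatypes_conf.xml like the one above declares, a small standalone script along these lines could help. It uses only the standard library and is not how Galaxy itself reads the file; list_sniffers is a made-up helper, and CONF is a shortened copy of the gmap configuration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the gmap datatypes_conf.xml, for illustration.
CONF = """<?xml version="1.0"?>
<datatypes>
    <registration>
        <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/>
        <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/>
    </registration>
    <sniffers>
        <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/>
    </sniffers>
</datatypes>"""


def list_sniffers(conf_text):
    """Return (module_name, class_name) pairs in the order they are listed."""
    root = ET.fromstring(conf_text)
    pairs = []
    for elem in root.findall("sniffers/sniffer"):
        module_name, class_name = elem.get("type").split(":")
        pairs.append((module_name, class_name))
    return pairs


for module_name, class_name in list_sniffers(CONF):
    print(module_name, class_name)
```

The listed order matters here, since the order of the sniffer elements is what a sniff order would be built from.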
Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing.
galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap
galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap'
galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap
destination directory: gmap
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 10 changes to 10 files
updating to branch default
10 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd"
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap
galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0.
docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well:
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time.
On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote:
Hi,
We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool.
I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this;
The first problem is that it looks like the sniffers are never loaded. I get errors like this when the tool is loaded;
galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed.
I tried hacking the code in registry.py to get this to work. It looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (e.g. when I click "Auto detect" while editing a dataset, the dataset is set to a generic type).
Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see
-- upload.xml
<command interpreter="python">
    upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile
-- upload.py
def __main__():
    ....
    registry = Registry()
    registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] )
which looks like the upload tool is not configured to use the custom datatypes available in shed tools.
Would it be possible to make the following changes;
(a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific).
(b) Change the upload tool so that it respects custom sniffers in shed tools.
I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool.
Regards
Ira
Hi Greg, Thanks .. that sounds like a good solution. I'll be happy to test it when its ready. Ira On 08/02/2012, at 10:22 AM, Greg Von Kuster wrote:
Ira,
Thanks for the feedback - I'll come up with something that allows for proprietary sniffers to be loaded first and let you know when it's available. It shouldn't take me too long. I will probably continue to keep the rule of ignoring conflicts in proprietary datatypes, so sniffers for conflicting datatypes will probably be ignored as well. However, hopefully conflicts will not exist, or if one is found, it will get fixed.
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text,xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
What do you think? Is that a system that would work?
Cheers ira
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
Hi Greg,
As far as I know there is nothing in my local setup that affects this. As I test I did the following;
Checked out a clean copy of galaxy-central Edited my universe_wsgi.ini as follows; - Changed the port to 8300 - Added an admin user - Uncommented this line tool_config_file = tool_conf.xml,shed_tool_conf.xml
Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory)
Started up galaxy ./run.sh --reload
I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail.
My system is Mac OSX 10.7.2 and I'm running python 2.7.1
I then went to the relevant bit of code and inserted some print statements. I found that;
On line 191 in registry.py there is;
module = __import__( datatype_module )
and the value of datatype_module is galaxy.datatypes.gmap
this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools.
Strange that I'm the only one who gets these errors? Any ideas what it could be?
Ira
-- paster log --
Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap destination directory: gmap requesting all changes adding changesets adding manifests adding file changes added 11 changesets with 32 changes to 17 files updating to branch default resolving manifests getting README getting gmap.xml getting gmap_build.xml getting gsnap.xml getting iit_store.xml getting lib/galaxy/datatypes/gmap.py getting snpindex.xml getting tool-data/datatypes_conf.xml getting tool-data/gmap_indices.loc.sample 9 files updated, 0 files merged, 0 files removed, 0 files unresolved galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" resolving manifests 0 files updated, 0 files merged, 0 files removed, 0 files unresolved docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote:
Hello Ira,
I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this:
<?xml version="1.0"?> <datatypes> <datatype_files> <datatype_file name="gmap.py"/> </datatype_files> <registration> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> </registration> <sniffers> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> </sniffers> </datatypes>
Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing.
galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap destination directory: gmap requesting all changes adding changesets adding manifests adding file changes added 1 changesets with 10 changes to 10 files updating to branch default 10 files updated, 0 files merged, 0 files removed, 0 files unresolved galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" 0 files updated, 0 files merged, 0 files removed, 0 files unresolved docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well:
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time.
On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote:
Hi,
We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool.
I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this;
The first problem is that it looks like the sniffers are never loaded. I get errors like this when the tool is loaded;
galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed.
I tried hacking the code in registry.py to get this to work. It looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (e.g. when I click "Auto detect" while editing a dataset, the dataset is set to a generic type).
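A minimal sketch of the import problem, assuming a modern Python 3 interpreter (the Galaxy in this thread ran on Python 2, so this illustrates the idea and is not the registry code). A plain __import__ of a dotted name only searches packages already on the import path, so a gmap.py that lives under a shed_tools directory is not found. Loading the file by its path and keeping the result in a list (called imported_modules here, after the variable mentioned above) lets a later sniffer lookup find the class. The helper names load_module_from_path and find_class, and the temporary file, are hypothetical.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(module_name, path):
    """Load a Python source file from an arbitrary location on disk."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def find_class(imported_modules, class_name):
    """Return the first class with this name among already-loaded modules."""
    for module in imported_modules:
        if hasattr(module, class_name):
            return getattr(module, class_name)
    return None


# Hypothetical stand-in for a datatype file shipped in an installed repository.
tmp_dir = tempfile.mkdtemp()
gmap_path = os.path.join(tmp_dir, "gmap.py")
with open(gmap_path, "w") as handle:
    handle.write("class IntervalAnnotation(object):\n    pass\n")

# Modules loaded while registering datatypes are remembered ...
imported_modules = [load_module_from_path("gmap", gmap_path)]
# ... so the sniffer section can resolve the class without a fresh import.
sniffer_class = find_class(imported_modules, "IntervalAnnotation")
```

The point of the design is that the sniffer lookup reuses whatever the datatype registration step already loaded, instead of importing a second time by dotted name.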
Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see
-- upload.xml
<command interpreter="python"> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile
-- upload.py
def __main__():
    ....
    registry = Registry()
    registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] )
which looks like the upload tool is not configured to use the custom datatypes available in shed tools.
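To illustrate the gap, here is a minimal sketch of a loader that reads the sniffers section from more than one datatypes_conf.xml document, so that entries from installed repositories are seen alongside those of the distribution. It does not use Galaxy's Registry class. The function collect_sniffer_types and the two inline XML strings (including the type names in them) are hypothetical stand-ins.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the distribution's datatypes_conf.xml.
DIST_CONF = """<datatypes><sniffers>
  <sniffer type="example.dist.xml:GenericXml"/>
</sniffers></datatypes>"""

# Hypothetical stand-in for a datatypes_conf.xml from an installed repository.
SHED_CONF = """<datatypes><sniffers>
  <sniffer type="example.shed.gmap:IntervalAnnotation"/>
</sniffers></datatypes>"""


def collect_sniffer_types(conf_documents):
    """Gather sniffer type strings from several config documents, in order."""
    sniffer_types = []
    for document in conf_documents:
        root = ET.fromstring(document)
        for elem in root.findall("sniffers/sniffer"):
            sniffer_type = elem.get("type")
            # Skip entries without a type and avoid listing a sniffer twice.
            if sniffer_type and sniffer_type not in sniffer_types:
                sniffer_types.append(sniffer_type)
    return sniffer_types


# A loader that is handed a single config only sees the distribution's sniffers.
dist_only = collect_sniffer_types([DIST_CONF])
# Handing it the repository configs as well makes the custom sniffers visible.
all_sniffers = collect_sniffer_types([DIST_CONF, SHED_CONF])
```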
Would it be possible to make the following changes;
(a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific).
(b) Change the upload tool so that it respects custom sniffers in shed tools.
I guess that our case is a bit unusual in that we are trying to co-opt Galaxy (a genomics tool) to do proteomics, so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us, as we could abandon our fork and have all our functionality included in a shed tool.
Regards
Ira
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of installation, oldest first. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes conflict if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
Thanks!
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that Galaxy's default sniffers include some extremely generic ones (e.g. text, xml) which will catch pretty much anything, so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with Galaxy's defaults is an issue, how about changing sniffer behaviour so that datatypes which subclass another datatype always have priority over their parent class, with the order in which sniffers are listed in datatypes_conf used as the second priority among those subclasses? I imagine this would require changing the sniff order code so that all datatypes and sniffers are loaded first and then reshuffled into the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent, and I think it would also avoid the potential for conflicts with Galaxy's defaults.
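A rough sketch of what I mean, under some loud assumptions: the three classes below are toy stand-ins and not Galaxy's datatype classes, the sniff checks are deliberately crude, and sorting by the length of the method resolution order is just one simple way to guarantee that a subclass comes before all of its ancestors (it also reorders unrelated classes by depth, which may be more than is wanted).

```python
class Text(object):
    ext = "txt"

    @staticmethod
    def sniff(content):
        # Extremely generic: accepts anything.
        return True


class GenericXml(Text):
    ext = "xml"

    @staticmethod
    def sniff(content):
        return content.lstrip().startswith("<")


class PepXml(GenericXml):
    ext = "pepxml"

    @staticmethod
    def sniff(content):
        return "<msms_pipeline_analysis" in content


def guess_ext(content, sniff_order):
    """The first sniffer that accepts the content wins."""
    for datatype in sniff_order:
        if datatype.sniff(content):
            return datatype.ext
    return "data"


def prioritise_subclasses(sniff_order):
    """Put deeper classes first; sorted() is stable, so ties keep listed order."""
    return sorted(sniff_order, key=lambda cls: -len(cls.__mro__))


sample = '<?xml version="1.0"?><msms_pipeline_analysis></msms_pipeline_analysis>'

# Generic sniffers listed first: the specific sniffer is never reached.
default_order = [Text, GenericXml, PepXml]
before = guess_ext(sample, default_order)

# Reshuffled by class hierarchy: the most specific sniffer is tried first.
after = guess_ext(sample, prioritise_subclasses(default_order))
```

With the default order the sample is detected as the generic type, and after the reshuffle it is detected as the specific one.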
What do you think? Is that a system that would work?
Cheers ira
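For reference, the precedence rules described in this part of the thread (proprietary datatypes loaded first in installation order, a proprietary datatype ignored when an earlier proprietary one already claimed the extension, and distribution datatypes overriding on conflict) could be sketched roughly as below. This is only an illustration under stated assumptions: the registry is a plain dictionary, and the load_datatypes function, its override flag, and all repository and type names are hypothetical and not Galaxy's API.

```python
def load_datatypes(registry, datatypes, override):
    """Register (extension, type) pairs and return the entries that were ignored.

    override=False: an extension that is already registered is kept.
    override=True: the new entry replaces whatever was registered before.
    """
    ignored = []
    for extension, type_name in datatypes:
        if extension in registry and not override:
            ignored.append((extension, type_name))
            continue
        registry[extension] = type_name
    return ignored


registry = {}

# Hypothetical installed repositories, oldest installation first.
repo_a = [("iit", "repo_a.gmap:IntervalIndexTree")]
repo_b = [("iit", "repo_b.other:Iit"), ("xml", "repo_b.other:MyXml")]
# Hypothetical entry from the distribution's datatypes_conf.xml.
distribution = [("xml", "dist.xml:GenericXml")]

# Proprietary datatypes go in first and never displace one another.
ignored_a = load_datatypes(registry, repo_a, override=False)
ignored_b = load_datatypes(registry, repo_b, override=False)
# Distribution datatypes are loaded last and win any extension conflict.
load_datatypes(registry, distribution, override=True)
```

Here the second repository's "iit" entry is ignored because the older installation already registered that extension, and its "xml" entry is replaced once the distribution's datatypes are loaded.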
To clarify, a proprietary datatype that is being loaded will be ignored if it conflicts with a proprietary datatype that was already loaded. Only datatypes defined in the distribution's datatypes_conf.xml file take precedence and override conflicts.
On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by next olderst installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
Thanks!
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text,xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
What do you think? Is that a system that would work?
Cheers ira
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
Hi Greg,
As far as I know there is nothing in my local setup that affects this. As I test I did the following;
Checked out a clean copy of galaxy-central Edited my universe_wsgi.ini as follows; - Changed the port to 8300 - Added an admin user - Uncommented this line tool_config_file = tool_conf.xml,shed_tool_conf.xml
Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory)
Started up galaxy ./run.sh --reload
I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail.
My system is Mac OSX 10.7.2 and I'm running python 2.7.1
I then went to the relevant bit of code and inserted some print statements. I found that;
On line 191 in registry.py there is;
module = __import__( datatype_module )
and the value of datatype_module is galaxy.datatypes.gmap
this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools.
Strange that I'm the only one who gets these errors? Any ideas what it could be?
Ira
-- paster log --
Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap destination directory: gmap requesting all changes adding changesets adding manifests adding file changes added 11 changesets with 32 changes to 17 files updating to branch default resolving manifests getting README getting gmap.xml getting gmap_build.xml getting gsnap.xml getting iit_store.xml getting lib/galaxy/datatypes/gmap.py getting snpindex.xml getting tool-data/datatypes_conf.xml getting tool-data/gmap_indices.loc.sample 9 files updated, 0 files merged, 0 files removed, 0 files unresolved galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" resolving manifests 0 files updated, 0 files merged, 0 files removed, 0 files unresolved docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote:
Hello Ira,
I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this:
<?xml version="1.0"?> <datatypes> <datatype_files> <datatype_file name="gmap.py"/> </datatype_files> <registration> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> </registration> <sniffers> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> </sniffers> </datatypes>
Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing.
galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap
galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap'
galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap
destination directory: gmap
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 10 changes to 10 files
updating to branch default
10 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd"
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap
galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0.
docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well:
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time.
On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote:
Hi,
We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool.
I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this;
The first problem is that it looks like the sniffers are never loaded. I get errors like this when the tool is loaded;
galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed.
I tried hacking the code in registry.py to get this to work. It looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (e.g. when I click "Auto detect" when editing a dataset, the dataset is set to a generic type).
Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see
-- Upload.xml
<command interpreter="python">
upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile

--Upload.py
def __main__():

    ....

    registry = Registry()
    registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] )
which looks like the upload tool is not configured to use the custom datatypes available in shed tools.
Would it be possible to make the following changes;
(a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific).
(b) Change the upload tool so that it respects custom sniffers in shed tools.
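Request (b) amounts to the upload script seeing more than the single config file it is handed. As a rough illustration only (this is not Galaxy's code, and the thread leaves open how upload.py should receive extra paths), the sketch below gathers sniffer declarations from several datatypes_conf-style files in a chosen order. `collect_sniffer_types`, `write_conf`, the file names and the `example.dist:GenericXml` type are all hypothetical; it is written for current Python 3.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def collect_sniffer_types(conf_paths):
    """Gather <sniffer type="..."> values from each file, in the order given.

    A type that appears more than once keeps its first position.
    """
    ordered = []
    for path in conf_paths:
        root = ET.parse(path).getroot()
        for elem in root.findall("sniffers/sniffer"):
            sniffer_type = elem.get("type")
            if sniffer_type and sniffer_type not in ordered:
                ordered.append(sniffer_type)
    return ordered


def write_conf(path, sniffer_types):
    """Write a minimal datatypes_conf-style file (demo input only)."""
    lines = ["<datatypes>", "  <sniffers>"]
    lines += ['    <sniffer type="%s"/>' % t for t in sniffer_types]
    lines += ["  </sniffers>", "</datatypes>"]
    with open(path, "w") as handle:
        handle.write("\n".join(lines))


with tempfile.TemporaryDirectory() as tmp:
    shed_conf = os.path.join(tmp, "shed_datatypes_conf.xml")
    dist_conf = os.path.join(tmp, "datatypes_conf.xml")
    write_conf(shed_conf, ["galaxy.datatypes.proteomics:PepXml"])
    write_conf(dist_conf, ["example.dist:GenericXml", "galaxy.datatypes.proteomics:PepXml"])
    # Listing the shed config first puts its sniffers ahead of the generic one.
    merged = collect_sniffer_types([shed_conf, dist_conf])

print(merged)
```

Passing the shed config first is what places the specific sniffer at the top, which is the ordering asked for in (a).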
I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool.
Regards Ira ___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (sniffers loaded later will replace sniffers loaded previously), but a replacing sniffer will be appended to the sniff order, not placed in the same position as the replaced sniffer. This makes things cleaner and more easily understood.
On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
To clarify, proprietary datatypes currently being loaded will be ignored if they conflict with a proprietary datatype that was already loaded. Only datatypes defined in the datatypes_conf.xml file will take precedence, and override conflicts.
On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by the next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
Thanks!
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text,xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
What do you think? Is that a system that would work?
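The subclass-first idea can be sketched in a few lines. This is only an illustration of the proposal, not code from Galaxy: the classes, their toy `sniff` checks and the `order_sniffers` and `guess_type` helpers are hypothetical, and it is written for current Python 3. It relies on the fact that a subclass always has a longer method resolution order than any of its ancestors, so sorting by that length (with a stable sort) puts subclasses first and keeps the declared order among unrelated classes.

```python
class Text(object):
    """Very generic datatype: its sniffer accepts anything."""

    @staticmethod
    def sniff(data):
        return True


class Xml(Text):
    """Generic XML: accepts anything that looks like markup."""

    @staticmethod
    def sniff(data):
        return data.lstrip().startswith("<")


class PepXml(Xml):
    """Specific XML format: toy check for a pepXML root element name."""

    @staticmethod
    def sniff(data):
        return "<msms_pipeline_analysis" in data


def order_sniffers(declared):
    """Put subclasses ahead of their ancestors; keep declared order otherwise."""
    # sorted() is stable, so classes of equal depth stay in declared order.
    return sorted(declared, key=lambda cls: -len(cls.__mro__))


def guess_type(data, sniff_order):
    """Return the first class in sniff_order whose sniffer accepts the data."""
    for cls in sniff_order:
        if cls.sniff(data):
            return cls
    return None


declared = [Text, Xml, PepXml]  # generic sniffers listed first
sample = '<?xml version="1.0"?><msms_pipeline_analysis></msms_pipeline_analysis>'

naive_guess = guess_type(sample, declared)
reordered = order_sniffers(declared)
better_guess = guess_type(sample, reordered)
print(naive_guess.__name__, better_guess.__name__)
```

With the declared order the generic Text sniffer wins, while the reordered list lets PepXml be tried before Xml and Text.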
Cheers ira
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
Hi Greg,
As far as I know there is nothing in my local setup that affects this. As a test I did the following;

Checked out a clean copy of galaxy-central

Edited my universe_wsgi.ini as follows;
- Changed the port to 8300
- Added an admin user
- Uncommented this line
  tool_config_file = tool_conf.xml,shed_tool_conf.xml

Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory)

Started up galaxy
./run.sh --reload
I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail.
My system is Mac OSX 10.7.2 and I'm running python 2.7.1
I then went to the relevant bit of code and inserted some print statements. I found that;
On line 191 in registry.py there is;
module = __import__( datatype_module )
and the value of datatype_module is galaxy.datatypes.gmap
This line seems to be causing the exception. I'm not a Python programmer, so I don't really know how module importing works. Could this be related to the version of Python I'm running? Alternatively, should the "imported_modules" variable be used here? The name galaxy.datatypes.gmap suggests that the module needs to be located inside the main galaxy lib rather than in the shed_tools.
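Some background on the import question, as a hedged sketch rather than a description of the actual fix: `__import__` resolves a dotted name through `sys.path`, and for a dotted name it returns the top-level package rather than the leaf module. A file that only exists inside a repository directory is therefore not found under a `galaxy.datatypes.` name unless it has been loaded some other way. The example uses current Python 3 (the thread was on 2.7), `load_module_from_path` is a hypothetical helper, and the temporary gmap.py is a stand-in for the real file.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(module_name, path):
    """Load a module from an explicit file path, independent of sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# __import__ with a dotted name returns the top-level package, not the leaf.
top_level = __import__("os.path")

# A dotted name that is not reachable through sys.path fails to import.
try:
    __import__("shed_demo_not_installed.gmap")
    import_failed = False
except ImportError:
    import_failed = True

# Stand-in for a gmap.py that lives in a repository directory.
with tempfile.TemporaryDirectory() as repo_dir:
    source_path = os.path.join(repo_dir, "gmap.py")
    with open(source_path, "w") as handle:
        handle.write("class IntervalAnnotation(object):\n    pass\n")
    gmap = load_module_from_path("shed_demo_gmap", source_path)
    datatype_class = getattr(gmap, "IntervalAnnotation")

print(top_level.__name__, import_failed, datatype_class.__name__)
```

Once a module object has been loaded from its path, the class can be looked up on that object with `getattr`, with no second import by name needed.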
Strange that I'm the only one who gets these errors? Any ideas what it could be?
Ira
-- paster log --
Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap
destination directory: gmap
requesting all changes
adding changesets
adding manifests
adding file changes
added 11 changesets with 32 changes to 17 files
updating to branch default
resolving manifests
getting README
getting gmap.xml
getting gmap_build.xml
getting gsnap.xml
getting iit_store.xml
getting lib/galaxy/datatypes/gmap.py
getting snpindex.xml
getting tool-data/datatypes_conf.xml
getting tool-data/gmap_indices.loc.sample
9 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da"
resolving manifests
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap
galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0.
docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote:
Hello Ira,
I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this:
<?xml version="1.0"?>
<datatypes>
    <datatype_files>
        <datatype_file name="gmap.py"/>
    </datatype_files>
    <registration>
        <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/>
        <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/>
        <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/>
        <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/>
        <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/>
        <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/>
        <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/>
    </registration>
    <sniffers>
        <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/>
    </sniffers>
</datatypes>
Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing.
galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap destination directory: gmap requesting all changes adding changesets adding manifests adding file changes added 1 changesets with 10 changes to 10 files updating to branch default 10 files updated, 0 files merged, 0 files removed, 0 files unresolved galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" 0 files updated, 0 files merged, 0 files removed, 0 files unresolved docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well:
galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time.
On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote:
> Hi, > > We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool. > > I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this; > > The first problem is that it looks like the sniffers are never loaded. I get errors like this when the tool is loaded; > > galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml > galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap > galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap > galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap > galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap > > These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed. 
> > I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type). > > Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see > > -- Upload.xml > <command interpreter="python"> > upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile > > --Upload.py > def __main__(): > > .... > > registry = Registry() > registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] ) > > which looks like the upload tool is not configured to use the custom datatypes available in shed tools. > > > Would it be possible to make the following changes; > > (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific). > > (b) Change the upload tool so that it respects custom sniffers in shed tools. > > I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool. > > > Regards > Ira > ___________________________________________________________ > Please keep all replies on the list by using "reply all" > in your mail client. 
To manage your subscriptions to this > and other Galaxy lists, please use the interface at: > > http://lists.bx.psu.edu/
Dear Greg,

I've just tested this and all the sniffers seem to be loaded now .. thanks!

I'm still encountering some other problems though. Here's what I did and what happened.

1. Did a fresh clone and install from the latest galaxy-central (all databases and shed_tools directories cleaned).

2. Uploaded my tool to the toolshed (both toolshed and galaxy are running on the local host).

3. Attempted to install my tool ... when I did this I got a 503 Error ( though if you look in the log below it looks like the error was actually a 500 ... so I'm not really sure what was going on. Let me know if you can't reproduce this and I'll try to track it down more. It happens every time I try to install a tool for the first time. ). Even though I got the error it seems that the tool was actually installed properly. In my paster log I had

galaxy.util.shed_util DEBUG 2012-02-09 09:33:17,640 Adding new row (or updating an existing row) for repository 'protk' in the tool_shed_repository table.
galaxy.tools DEBUG 2012-02-09 09:33:17,879 Reloading section: Proteomics
galaxy.tools DEBUG 2012-02-09 09:33:17,905 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_convert_annotate_ids_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,942 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_interprophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,974 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_mascot_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,003 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mascot_to_pepxml_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,037 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mzml_to_mgf_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,150 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_omssa_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,171 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_peptide_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,209 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_protein_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,236 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_tandem_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,276 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/xls_to_table_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,288 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,303 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_protxml/1.0.0, version: 1.0.0.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,048 Loading datatypes from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-09 09:33:21,050 Overriding conflicting datatype with extension 'xls', using datatype from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,050 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:PepXml'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Mgf'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:ProtXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,052 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Xls'
127.0.0.1 - - [09/Feb/2012:09:33:16 +1100] "POST /admin_toolshed/install_repository?tool_shed_url=http%3A%2F%2F127.0.0.1%3A9009%2F&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26%3A7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True HTTP/1.1" 500 - "http://127.0.0.1:8300/admin_toolshed/install_repository?tool_shed_url=http://127.0.0.1:9009/&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26:7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"

4. I then tried uploading a file from one of my proprietary datatypes (mzml) ... but it looks like the upload tool doesn't use my proprietary sniffers yet

5. I then set the datatype manually and galaxy now prints the correct metadata for my proprietary datatype under the dataset info

6. Datatypes are recognised correctly when the data is created as the output of one of my tools. (Not sure if this uses sniffers though?). My tools also properly filter inputs based on the datatypes.

7. Unfortunately my display application setup doesn't seem to be working.

All the code for my tool is here; https://bitbucket.org/iracooke/protk-toolshed

As far as I can tell I've followed the recommendations in the wiki to create my files ... but something isn't right as I don't see the link to my display application when I view information about my dataset. When I view details about my tool in the galaxy admin section I see a list of all my tools plus two tools at the end called "view list of" ... it looks as if my display application files have somehow been interpreted as tools. Any advice on how I could fix this would be much appreciated.

Let me know if you want me to try something specific and report back, or if you need any more info.

Ira

On 09/02/2012, at 5:50 AM, Greg Von Kuster wrote:
In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (sniffers loaded later will replace sniffers loaded previously), except that the replacement sniffer is appended to the sniff order rather than placed in the position of the sniffer it replaced. This makes things cleaner and more easily understood.
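A minimal sketch of the rule described above, not Galaxy's actual registry code. It assumes that datatypes conflict when they share an extension and that sniffers conflict when they share a type string; the class `SketchRegistry` and the type string `example.dist:Xls` are hypothetical names used only for illustration.

```python
class SketchRegistry(object):
    """Toy stand-in for a datatypes registry (hypothetical, not Galaxy's class)."""

    def __init__(self):
        self.datatypes_by_extension = {}  # extension -> type string
        self.sniff_order = []             # sniffer type strings, first match wins

    def load_datatype(self, extension, type_name):
        # A datatype loaded later always replaces an earlier one
        # registered under the same extension.
        self.datatypes_by_extension[extension] = type_name

    def load_sniffer(self, type_name):
        # A sniffer loaded later replaces the earlier copy, but it is
        # appended to the end of the sniff order instead of reusing
        # the position of the sniffer it replaces.
        if type_name in self.sniff_order:
            self.sniff_order.remove(type_name)
        self.sniff_order.append(type_name)


registry = SketchRegistry()
registry.load_datatype("xls", "example.dist:Xls")  # hypothetical earlier datatype
registry.load_datatype("xls", "galaxy.datatypes.proteomics:Xls")
for sniffer in ("pkg:A", "pkg:B", "pkg:A"):         # "pkg:A" is loaded twice
    registry.load_sniffer(sniffer)
```

One consequence worth noting: because replacements go to the end of the list, reloading a sniffer lowers its priority relative to everything loaded in between.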
On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
To clarify, proprietary datatypes currently being loaded will be ignored if they conflict with a proprietary datatype that was already loaded. Only datatypes defined in the datatypes_conf.xml file will take precedence, and override conflicts.
On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by the next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
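The load order described for revision 6693 can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from that change set: the function name `build_extension_map` and its input shapes are hypothetical, and keeping the earlier-loaded proprietary datatype on a proprietary/proprietary conflict follows the clarification in Greg's follow-up message.

```python
def build_extension_map(installed_repositories, distribution_extensions):
    """Return {extension: source} after applying the described load order.

    installed_repositories: list of (install_time, repo_name, [extensions])
    distribution_extensions: extensions declared in the distribution's
    datatypes_conf.xml
    """
    extension_map = {}
    # Proprietary datatypes first, oldest installation first.
    for _, repo_name, extensions in sorted(installed_repositories,
                                           key=lambda repo: repo[0]):
        for extension in extensions:
            # A later proprietary datatype is ignored if the extension
            # is already taken by an earlier proprietary datatype.
            extension_map.setdefault(extension, repo_name)
    # Distribution datatypes are loaded last and override any conflict.
    for extension in distribution_extensions:
        extension_map[extension] = "distribution"
    return extension_map


repos = [(2, "repo_b", ["mzml", "xls"]), (1, "repo_a", ["mzml"])]
result = build_extension_map(repos, ["xls", "txt"])
```

Here `repo_a` keeps `mzml` because it was installed first, while `xls` ends up owned by the distribution.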
Thanks!
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text,xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
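The ordering proposed above can be sketched in a few lines. This is only an illustration: the class hierarchy below is hypothetical, and the sketch uses a simpler technique than a full reshuffle, namely a stable sort on the length of each class's method resolution order. A subclass always has a longer MRO than any of its ancestors, so it sorts ahead of them, and ties keep the order in which the sniffers were listed. A side effect is that deeper classes also move ahead of unrelated shallower ones.

```python
class Data(object):
    pass

class Text(Data):       # very generic, matches almost anything
    pass

class Xml(Text):        # generic XML
    pass

class PepXml(Xml):      # hypothetical specific subclass
    pass

class ProtXml(Xml):     # hypothetical specific subclass
    pass


def order_sniffers(listed_sniffers):
    """Put subclasses before their ancestors, keeping listed order on ties."""
    # sorted() is stable, so classes of equal depth stay in listed order.
    return sorted(listed_sniffers, key=lambda cls: -len(cls.__mro__))


listed = [Text, Xml, ProtXml, PepXml]
ordered = order_sniffers(listed)
ordered_names = [cls.__name__ for cls in ordered]
```

With this ordering the generic `Text` and `Xml` sniffers are tried only after the more specific XML subclasses have had their chance.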
What do you think? Is that a system that would work?
Cheers ira
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
Hi Greg,
As far as I know there is nothing in my local setup that affects this. As a test I did the following;
Checked out a clean copy of galaxy-central
Edited my universe_wsgi.ini as follows;
- Changed the port to 8300
- Added an admin user
- Uncommented this line
tool_config_file = tool_conf.xml,shed_tool_conf.xml
Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory)
Started up galaxy
./run.sh --reload
I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail.
My system is Mac OSX 10.7.2 and I'm running python 2.7.1
I then went to the relevant bit of code and inserted some print statements. I found that;
On line 191 in registry.py there is;
module = __import__( datatype_module )
and the value of datatype_module is galaxy.datatypes.gmap
this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools.
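For background on the import question: `__import__` with a dotted name only succeeds if every part of the name can be found as a package or module on the import path, and it returns the top-level package rather than the submodule. A file that lives in a shed tool directory is not part of the `galaxy.datatypes` package, which is consistent with the "No module named gmap" warning. The sketch below shows one generic way to load a module from an explicit file path in current Python; it is not the fix committed to Galaxy (the thread does not show that code), and the helper `load_module_from_path` and the file `gmap_example.py` are hypothetical. On Python 2.7, which is what the thread was using, `imp.load_source` played a similar role.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(module_name, path):
    """Load a Python source file as a module without it being on sys.path."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# __import__ with a dotted name returns the top-level package.
top_level = __import__("os.path")

# Write a tiny stand-in datatype file somewhere outside any package.
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "gmap_example.py")
with open(path, "w") as handle:
    handle.write("class IntervalAnnotation(object):\n"
                 "    file_ext = 'gmap_annotation'\n")

module = load_module_from_path("gmap_example", path)
sniffer_class = getattr(module, "IntervalAnnotation")
```

The point of the sketch is only that a class can be resolved from a file path directly, so the dotted type string in datatypes_conf.xml does not have to correspond to a module inside the main Galaxy lib directory.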
Strange that I'm the only one who gets these errors? Any ideas what it could be?
Ira
-- paster log --
Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap
destination directory: gmap
requesting all changes
adding changesets
adding manifests
adding file changes
added 11 changesets with 32 changes to 17 files
updating to branch default
resolving manifests
getting README
getting gmap.xml
getting gmap_build.xml
getting gsnap.xml
getting iit_store.xml
getting lib/galaxy/datatypes/gmap.py
getting snpindex.xml
getting tool-data/datatypes_conf.xml
getting tool-data/gmap_indices.loc.sample
9 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da"
resolving manifests
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap
galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0.
docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1.
galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote:
> Hello Ira,
>
> I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this:
>
> <?xml version="1.0"?>
> <datatypes>
>     <datatype_files>
>         <datatype_file name="gmap.py"/>
>     </datatype_files>
>     <registration>
>         <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/>
>         <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/>
>         <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/>
>         <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/>
>         <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/>
>         <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/>
>         <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/>
>         <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/>
>         <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/>
>         <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/>
>     </registration>
>     <sniffers>
>         <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/>
>         <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/>
>         <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/>
>         <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/>
>     </sniffers>
> </datatypes>
>
> Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing.
>
> galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap
> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap'
> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap
> destination directory: gmap
> requesting all changes
> adding changesets
> adding manifests
> adding file changes
> added 1 changesets with 10 changes to 10 files
> updating to branch default
> 10 files updated, 0 files merged, 0 files removed, 0 files unresolved
> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd"
> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved
> docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
>
> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
> docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
>
> docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
>
> galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap
> galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1.
> galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0.
> docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
>
> galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1.
> galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
> galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml
> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation
> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation
> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation
> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
>
> If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well:
>
> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml
> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation
> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation
> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation
> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation
>
> Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time.
>
> On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote:
>
>> Hi,
>>
>> We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool.
>>
>> I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this;
>>
>> The first problem is that it looks like the sniffers are never loaded.
>> I get errors like this when the tool is loaded;
>>
>> galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
>>
>> These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed.
>>
>> I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type).
>>
>> Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see
>>
>> -- Upload.xml
>> <command interpreter="python">
>> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile
>>
>> --Upload.py
>> def __main__():
>>
>> ....
>>
>> registry = Registry()
>> registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] )
>>
>> which looks like the upload tool is not configured to use the custom datatypes available in shed tools.
>>
>> Would it be possible to make the following changes;
>>
>> (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific).
>>
>> (b) Change the upload tool so that it respects custom sniffers in shed tools.
>>
>> I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool.
>>
>> Regards
>> Ira
>> ___________________________________________________________
>> Please keep all replies on the list by using "reply all"
>> in your mail client. To manage your subscriptions to this
>> and other Galaxy lists, please use the interface at:
>>
>> http://lists.bx.psu.edu/
Ira,
I'll download your tool source and do some testing with it. Thanks for the link to it.
Your 503 error could possibly be a timeout. Are you using nginx? If so, the default timeout is very short if I remember correctly. If the 503 is not due to a timeout, I'm not quite sure what could be causing it.
I'll let you know if I uncover something as I do some testing with your tools.
Greg
On Feb 8, 2012, at 7:50 PM, Ira Cooke wrote:
Dear Greg,
I've just tested this and all the sniffers seem to be loaded now .. thanks!
I'm still encountering some other problems though. Here's what I did and what happened.
1. Did a fresh clone and install from the latest galaxy-central (all databases and shed_tools directories cleaned).
2. Uploaded my tool to the toolshed (both toolshed and galaxy are running on the local host).
3. Attempted to install my tool ... when I did this I got a 503 Error ( though if you look in the log below it looks like the error was actually a 500 ... so I'm not really sure what was going on. Let me know if you can't reproduce this and I'll try to track it down more. It happens every time I try to install a tool for the first time. ). Even though I got the error it seems that the tool was actually installed properly. In my paster log I had
galaxy.util.shed_util DEBUG 2012-02-09 09:33:17,640 Adding new row (or updating an existing row) for repository 'protk' in the tool_shed_repository table.
galaxy.tools DEBUG 2012-02-09 09:33:17,879 Reloading section: Proteomics
galaxy.tools DEBUG 2012-02-09 09:33:17,905 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_convert_annotate_ids_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,942 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_interprophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,974 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_mascot_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,003 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mascot_to_pepxml_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,037 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mzml_to_mgf_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,150 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_omssa_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,171 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_peptide_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,209 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_protein_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,236 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_tandem_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,276 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/xls_to_table_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,288 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,303 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_protxml/1.0.0, version: 1.0.0.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,048 Loading datatypes from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-09 09:33:21,050 Overriding conflicting datatype with extension 'xls', using datatype from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,050 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:PepXml'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Mgf'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:ProtXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,052 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Xls'
127.0.0.1 - - [09/Feb/2012:09:33:16 +1100] "POST /admin_toolshed/install_repository?tool_shed_url=http%3A%2F%2F127.0.0.1%3A9009%2F&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26%3A7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True HTTP/1.1" 500 - "http://127.0.0.1:8300/admin_toolshed/install_repository?tool_shed_url=http://127.0.0.1:9009/&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26:7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"
4. I then tried uploading a file from one of my proprietary datatypes (mzml) ... but it looks like the upload tool doesn't use my proprietary sniffers yet
5. I then set the datatype manually and galaxy now prints the correct metadata for my proprietary datatype under the dataset info
6. Datatypes are recognised correctly when the data is created as the output of one of my tools. (Not sure if this uses sniffers though?). My tools also properly filter inputs based on the datatypes.
7. Unfortunately my display application setup doesn't seem to be working. All the code for my tool is here;
https://bitbucket.org/iracooke/protk-toolshed
As far as I can tell I've followed the recommendations in the wiki to create my files ... but something isn't right as I don't see the link to my display application when I view information about my dataset. When I view details about my tool in the galaxy admin section I see a list of all my tools plus two tools at the end called "view list of" ... it looks as if my display application files have somehow been interpreted as tools. Any advice on how I could fix this would be much appreciated.
Let me know if you want me to try something specific and report back, or if you need any more info.
Ira
On 09/02/2012, at 5:50 AM, Greg Von Kuster wrote:
In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (sniffers loaded later will replace sniffers loaded previously), except that the replacement sniffer is appended to the sniff order rather than placed in the position of the sniffer it replaced. This makes things cleaner and more easily understood.
On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
To clarify, proprietary datatypes currently being loaded will be ignored if they conflict with a proprietary datatype that was already loaded. Only datatypes defined in the datatypes_conf.xml file will take precedence, and override conflicts.
On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by the next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
Thanks!
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text,xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
What do you think? Is that a system that would work?
Cheers ira
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
> Hi Greg,
>
> As far as I know there is nothing in my local setup that affects this. As a test I did the following;
>
> Checked out a clean copy of galaxy-central
> Edited my universe_wsgi.ini as follows;
> - Changed the port to 8300
> - Added an admin user
> - Uncommented this line
> tool_config_file = tool_conf.xml,shed_tool_conf.xml
>
> Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory)
>
> Started up galaxy
> ./run.sh --reload
>
> I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed.
> I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail.
>
> My system is Mac OSX 10.7.2 and I'm running python 2.7.1
>
> I then went to the relevant bit of code and inserted some print statements. I found that;
>
> On line 191 in registry.py there is;
>
> module = __import__( datatype_module )
>
> and the value of datatype_module is galaxy.datatypes.gmap
>
> this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running?
> alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools.
>
> Strange that I'm the only one who gets these errors? Any ideas what it could be?
>
> Ira
>
> -- paster log --
>
> Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap
> destination directory: gmap
> requesting all changes
> adding changesets
> adding manifests
> adding file changes
> added 11 changesets with 32 changes to 17 files
> updating to branch default
> resolving manifests
> getting README
> getting gmap.xml
> getting gmap_build.xml
> getting gsnap.xml
> getting iit_store.xml
> getting lib/galaxy/datatypes/gmap.py
> getting snpindex.xml
> getting tool-data/datatypes_conf.xml
> getting tool-data/gmap_indices.loc.sample
> 9 files updated, 0 files merged, 0 files removed, 0 files unresolved
> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da"
> resolving manifests
> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved
> docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
>
> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table.
> docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
>
> docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
>
> galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap
> galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1.
> galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0.
> docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent.
>
> galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1.
> galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0.
> galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0.
> galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap
>
> On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote:
>
>> Hello Ira,
>>
>> I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this:
>>
>> <?xml version="1.0"?>
>> <datatypes>
>>     <datatype_files>
>>         <datatype_file name="gmap.py"/>
>>     </datatype_files>
>>     <registration>
>>         <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/>
>>         <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/>
>>         <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/>
>>         <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/>
>>         <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/>
>>         <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/>
>>         <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/>
>>         <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/>
>>         <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/>
>>         <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/>
>>     </registration>
>>     <sniffers>
>>         <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/>
>>         <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/>
>>         <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/>
>>         <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/>
>>     </sniffers>
>> </datatypes>
>>
>> Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded.
I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing. >> >> galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap >> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' >> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap >> destination directory: gmap >> requesting all changes >> adding changesets >> adding manifests >> adding file changes >> added 1 changesets with 10 changes to 10 files >> updating to branch default >> 10 files updated, 0 files merged, 0 files removed, 0 files unresolved >> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" >> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >> docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >> docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap >> galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. >> galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. >> docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. 
>> >> galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. >> galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >> galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >> >> >> >> >> If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well: >> >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >> >> >> Have you made any 
changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time. >> >> >> On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote: >> >>> Hi, >>> >>> We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool. >>> >>> I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this; >>> >>> The first problem is that it looks like the sniffers are never loaded. 
I get errors like this when the tool is loaded; >>> >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>> >>> These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed. >>> >>> I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type). >>> >>> Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see >>> >>> -- Upload.xml >>> <command interpreter="python"> >>> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile >>> >>> --Upload.py >>> def __main__(): >>> >>> .... 
>>> >>> registry = Registry() >>> registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] ) >>> >>> which looks like the upload tool is not configured to use the custom datatypes available in shed tools. >>> >>> >>> Would it be possible to make the following changes; >>> >>> (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific). >>> >>> (b) Change the upload tool so that it respects custom sniffers in shed tools. >>> >>> I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool. >>> >>> >>> Regards >>> Ira >>> ___________________________________________________________ >>> Please keep all replies on the list by using "reply all" >>> in your mail client. To manage your subscriptions to this >>> and other Galaxy lists, please use the interface at: >>> >>> http://lists.bx.psu.edu/ >> >
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hello Ira,
I've not been able to reproduce the 503 error you encountered, so it must be an issue in your environment. I have fixed the problems with the upload tool and with setting metadata externally, neither of which properly handled proprietary datatypes, in change set 6711:62fc9e053835, which is now available from our central repo. Your proprietary datatype display applications are loading into the registry as well, but I was not able to figure out how to get one to work.
Please give things a try when it's convenient and let me know if you run into additional problems.
Thanks very much for helping to get this working. Access to your tools and data was invaluable.
Greg
On Feb 8, 2012, at 7:50 PM, Ira Cooke wrote:
Dear Greg,
I've just tested this and all the sniffers seem to be loaded now .. thanks!
I'm still encountering some other problems though. Here's what I did and what happened.
1. Did a fresh clone and install from the latest galaxy-central (all databases and shed_tools directories cleaned).
2. Uploaded my tool to the toolshed (both toolshed and galaxy are running on the local host).
3. Attempted to install my tool. When I did this I got a 503 error (though the log below suggests the error was actually a 500, so I'm not really sure what was going on). It happens every time I try to install a tool for the first time. Let me know if you can't reproduce this and I'll try to track it down further. Even though I got the error, it seems that the tool was actually installed properly. In my paster log I had:
galaxy.util.shed_util DEBUG 2012-02-09 09:33:17,640 Adding new row (or updating an existing row) for repository 'protk' in the tool_shed_repository table.
galaxy.tools DEBUG 2012-02-09 09:33:17,879 Reloading section: Proteomics
galaxy.tools DEBUG 2012-02-09 09:33:17,905 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_convert_annotate_ids_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,942 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_interprophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,974 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_mascot_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,003 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mascot_to_pepxml_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,037 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mzml_to_mgf_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,150 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_omssa_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,171 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_peptide_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,209 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_protein_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,236 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_tandem_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,276 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/xls_to_table_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,288 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,303 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_protxml/1.0.0, version: 1.0.0.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,048 Loading datatypes from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-09 09:33:21,050 Overriding conflicting datatype with extension 'xls', using datatype from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,050 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:PepXml'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Mgf'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:ProtXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,052 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Xls'
127.0.0.1 - - [09/Feb/2012:09:33:16 +1100] "POST /admin_toolshed/install_repository?tool_shed_url=http%3A%2F%2F127.0.0.1%3A9009%2F&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26%3A7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True HTTP/1.1" 500 - "http://127.0.0.1:8300/admin_toolshed/install_repository?tool_shed_url=http://127.0.0.1:9009/&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26:7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"
4. I then tried uploading a file from one of my proprietary datatypes (mzml) ... but it looks like the upload tool doesn't use my proprietary sniffers yet
5. I then set the datatype manually and galaxy now prints the correct metadata for my proprietary datatype under the dataset info
6. Datatypes are recognised correctly when the data is created as the output of one of my tools. (Not sure if this uses sniffers though?). My tools also properly filter inputs based on the datatypes.
7. Unfortunately my display application setup doesn't seem to be working. All the code for my tool is here;
https://bitbucket.org/iracooke/protk-toolshed
As far as I can tell I've followed the recommendations in the wiki to create my files ... but something isn't right as I don't see the link to my display application when I view information about my dataset. When I view details about my tool in the galaxy admin section I see a list of all my tools plus two tools at the end called "view list of" ... it looks as if my display application files have somehow been interpreted as tools. Any advice on how I could fix this would be much appreciated.
Let me know if you want me to try something specific and report back, or if you need any more info.
Ira
On 09/02/2012, at 5:50 AM, Greg Von Kuster wrote:
In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (a sniffer loaded later replaces one loaded previously), except that the replacement sniffer is appended to the end of the sniff order rather than placed in the position of the sniffer it replaces. This makes things cleaner and more easily understood.
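The rule described above can be illustrated with a small toy registry. This is only a sketch under assumptions, not the code from that revision: the class name SketchRegistry, its attribute names, and the choice to treat sniffers as conflicting when they share a type string are all hypothetical.

```python
class SketchRegistry:
    """Toy model of the 'last loaded wins' rule; not Galaxy's Registry class."""

    def __init__(self):
        # Hypothetical attributes: extension -> type string, and an ordered sniffer list.
        self.datatypes_by_extension = {}
        self.sniff_order = []

    def load_datatype(self, extension, type_string):
        # A datatype being loaded always replaces a conflicting one
        # (same extension) that was loaded earlier.
        self.datatypes_by_extension[extension] = type_string

    def load_sniffer(self, type_string):
        # Assumption: two sniffers conflict when they have the same type string.
        # The earlier entry is dropped and the new one goes to the END of the
        # sniff order, not into the slot the old one occupied.
        if type_string in self.sniff_order:
            self.sniff_order.remove(type_string)
        self.sniff_order.append(type_string)


registry = SketchRegistry()
registry.load_datatype("xls", "galaxy.datatypes.tabular:Tabular")
registry.load_datatype("xls", "galaxy.datatypes.proteomics:Xls")  # replaces the first
registry.load_sniffer("galaxy.datatypes.proteomics:MzML")
registry.load_sniffer("galaxy.datatypes.proteomics:PepXml")
registry.load_sniffer("galaxy.datatypes.proteomics:MzML")  # moves to the end
```

One consequence worth noting: because a reloaded sniffer moves to the end, it is tried after every sniffer loaded before it, including very generic ones.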
On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
To clarify, a proprietary datatype currently being loaded will be ignored if it conflicts with a proprietary datatype that was already loaded. Only datatypes defined in the distribution's datatypes_conf.xml file will take precedence and override conflicts.
On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by the next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
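As a rough illustration of the load order in this change set, combined with the follow-up clarification that a later proprietary datatype is ignored when it conflicts with an earlier one, the resolution might look like the sketch below. The function name, the dictionary layout, and the installed_at field are assumptions made for the example and are not taken from the Galaxy source.

```python
def build_registry(installed_repositories, distribution_datatypes):
    """Resolve extension conflicts: oldest repository first, distribution last.

    installed_repositories: list of dicts with hypothetical keys
        'installed_at' (sortable) and 'datatypes' (extension -> type string).
    distribution_datatypes: extension -> type string from datatypes_conf.xml.
    """
    registry = {}
    # Proprietary datatypes: oldest installation first.
    for repo in sorted(installed_repositories, key=lambda r: r["installed_at"]):
        for extension, type_string in repo["datatypes"].items():
            # A proprietary datatype that conflicts with one already loaded
            # from an earlier repository is ignored.
            registry.setdefault(extension, type_string)
    # Distribution datatypes are loaded last and override any conflicts.
    registry.update(distribution_datatypes)
    return registry


repos = [
    {"installed_at": 2, "datatypes": {"iit": "newer:IIT", "xls": "newer:Xls"}},
    {"installed_at": 1, "datatypes": {"iit": "older:IIT"}},
]
resolved = build_registry(repos, {"xls": "galaxy.datatypes.tabular:Tabular"})
```

Here the older repository keeps the iit extension, while the distribution's definition wins for xls.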
Thanks!
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (e.g. text, xml) which will catch pretty much anything, so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
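A minimal sketch of the reordering proposed above, assuming sniffers are plain Python classes: sorting by the length of each class's method resolution order puts every subclass ahead of its ancestors, and because Python's sort is stable, classes at the same depth keep the order in which they were listed. The class names below are stand-ins and the function order_sniffers is hypothetical; none of this is Galaxy code.

```python
# Stand-in hierarchy for illustration only.
class Data(object):
    pass

class Text(Data):
    pass

class Xml(Text):
    pass

class Mgf(Text):
    pass

class PepXml(Xml):
    pass


def order_sniffers(sniffer_classes):
    """Return the sniffers with subclasses before their parent classes.

    A subclass always has a longer __mro__ than any of its ancestors, so a
    descending sort on MRO length guarantees child-before-parent. sorted()
    is stable, so ties keep their listed (datatypes_conf) order.
    """
    return sorted(sniffer_classes, key=lambda cls: -len(cls.__mro__))


listed = [Text, Xml, Mgf, PepXml]
ordered = order_sniffers(listed)
```

A side effect of this simple key is that deeply nested classes also move ahead of unrelated shallow ones; a stricter version would only reorder classes that are actually related.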
What do you think? Is that a system that would work?
Cheers ira
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
Hello Ira,
Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
Thanks for your patience and help on this issue.
Greg
On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote:
> Hi Greg, > > As far as I know there is nothing in my local setup that affects this. As I test I did the following; > > Checked out a clean copy of galaxy-central > Edited my universe_wsgi.ini as follows; > - Changed the port to 8300 > - Added an admin user > - Uncommented this line > tool_config_file = tool_conf.xml,shed_tool_conf.xml > > Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory) > > Started up galaxy > ./run.sh --reload > > I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. > I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail. > > My system is Mac OSX 10.7.2 and I'm running python 2.7.1 > > I then went to the relevant bit of code and inserted some print statements. I found that; > > On line 191 in registry.py there is; > > module = __import__( datatype_module ) > > and the value of datatype_module is galaxy.datatypes.gmap > > this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? > alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools. > > Strange that I'm the only one who gets these errors? Any ideas what it could be? 
> > Ira > > > -- paster log -- > > Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap > destination directory: gmap > requesting all changes > adding changesets > adding manifests > adding file changes > added 11 changesets with 32 changes to 17 files > updating to branch default > resolving manifests > getting README > getting gmap.xml > getting gmap_build.xml > getting gsnap.xml > getting iit_store.xml > getting lib/galaxy/datatypes/gmap.py > getting snpindex.xml > getting tool-data/datatypes_conf.xml > getting tool-data/gmap_indices.loc.sample > 9 files updated, 0 files merged, 0 files removed, 0 files unresolved > galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" > resolving manifests > 0 files updated, 0 files merged, 0 files removed, 0 files unresolved > docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. > > galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. > docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. > > docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. > > galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap > galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. > galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. > docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. > > galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. 
> galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. > galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. > galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml > galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap > galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap > galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap > galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap > > > > > On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote: > >> Hello Ira, >> >> I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. 
The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this: >> >> <?xml version="1.0"?> >> <datatypes> >> <datatype_files> >> <datatype_file name="gmap.py"/> >> </datatype_files> >> <registration> >> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> >> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> >> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> >> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> >> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> >> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> >> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> >> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> >> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> >> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> >> </registration> >> <sniffers> >> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> >> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> >> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> >> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> >> </sniffers> >> </datatypes> >> >> Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. 
I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing. >> >> galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap >> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' >> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap >> destination directory: gmap >> requesting all changes >> adding changesets >> adding manifests >> adding file changes >> added 1 changesets with 10 changes to 10 files >> updating to branch default >> 10 files updated, 0 files merged, 0 files removed, 0 files unresolved >> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" >> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >> docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >> docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap >> galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. >> galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. >> docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. 
>> >> galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. >> galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >> galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >> >> >> >> >> If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well: >> >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >> >> >> Have you made any 
changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time. >> >> >> On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote: >> >>> Hi, >>> >>> We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool. >>> >>> I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this; >>> >>> The first problem is that it looks like the sniffers are never loaded. 
I get errors like this when the tool is loaded; >>> >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>> >>> These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed. >>> >>> I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type). >>> >>> Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see >>> >>> -- Upload.xml >>> <command interpreter="python"> >>> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile >>> >>> --Upload.py >>> def __main__(): >>> >>> .... 
>>> >>> registry = Registry() >>> registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] ) >>> >>> which looks like the upload tool is not configured to use the custom datatypes available in shed tools. >>> >>> >>> Would it be possible to make the following changes; >>> >>> (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific). >>> >>> (b) Change the upload tool so that it respects custom sniffers in shed tools. >>> >>> I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool. >>> >>> >>> Regards >>> Ira >>> ___________________________________________________________ >>> Please keep all replies on the list by using "reply all" >>> in your mail client. To manage your subscriptions to this >>> and other Galaxy lists, please use the interface at: >>> >>> http://lists.bx.psu.edu/ >> >
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hello Greg,
Thanks very much for making these changes ... sorry it took a while for me to get back with a test. Here are the results.
- Upload.py and the sniffer for input data (mzML) seem to work just fine which is great. If I try uploading some of the other types (eg pepxml) they come out as xml, but this problem isn't restricted to the toolshed version so it's probably just an error in my sniffer. Since these datatypes are typically produced by tools in galaxy they get assigned to the correct type as outputs of tools, so this isn't such a big problem.
- It looks like the display application functionality is almost working. I can see a link to the display application when viewing displayable data in my history, but when I click this link I get an error;
Not Found
The resource could not be found. No route for /display_application/03501d7626bd192f/127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0/pepxml
In contrast .. for my working version (display applications not added via toolkit) I see the request is
/display_application/7335c1587be69e11/proteomics_pepxml/pepxml
Seems like it's almost working.
Ira
On 11/02/2012, at 6:57 AM, Greg Von Kuster wrote:
Hello Ira,
I've not been able to reproduce the 503 error you encountered, so it must be an issue in your environment. I have fixed the problems where the upload tool and external metadata setting did not properly handle proprietary datatypes. The fix is in change set 6711:62fc9e053835, which is now available from our central repo. Your proprietary datatype display applications are loading into the registry as well, but I was not able to figure out how to get one to work. Please give things a try when it's convenient and let me know if you run into additional problems.
Thanks very much for helping to get this working. Access to your tools and data was invaluable.
Greg
On Feb 8, 2012, at 7:50 PM, Ira Cooke wrote:
Dear Greg,
I've just tested this and all the sniffers seem to be loaded now .. thanks!
I'm still encountering some other problems though. Here's what I did and what happened.
1. Did a fresh clone and install from the latest galaxy-central (all databases and shed_tools directories cleaned).
2. Uploaded my tool to the toolshed (both toolshed and galaxy are running on the local host).
3. Attempted to install my tool ... when I did this I got a 503 Error ( though if you look in the log below it looks like the error was actually a 500 ... so I'm not really sure what was going on. Let me know if you can't reproduce this and I'll try to track it down more. It happens every time I try to install a tool for the first time. ). Even though I got the error it seems that the tool was actually installed properly. In my paster log I had
galaxy.util.shed_util DEBUG 2012-02-09 09:33:17,640 Adding new row (or updating an existing row) for repository 'protk' in the tool_shed_repository table.
galaxy.tools DEBUG 2012-02-09 09:33:17,879 Reloading section: Proteomics
galaxy.tools DEBUG 2012-02-09 09:33:17,905 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_convert_annotate_ids_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,942 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_interprophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,974 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_mascot_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,003 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mascot_to_pepxml_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,037 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mzml_to_mgf_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,150 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_omssa_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,171 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_peptide_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,209 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_protein_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,236 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_tandem_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,276 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/xls_to_table_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,288 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,303 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_protxml/1.0.0, version: 1.0.0.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,048 Loading datatypes from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml galaxy.datatypes.registry WARNING 2012-02-09 09:33:21,050 Overriding conflicting datatype with extension 'xls', using datatype from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml. galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,050 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzML' galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:PepXml' galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Mgf' galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:ProtXML' galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzXML' galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,052 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Xls' 127.0.0.1 - - [09/Feb/2012:09:33:16 +1100] "POST /admin_toolshed/install_repository?tool_shed_url=http%3A%2F%2F127.0.0.1%3A9009%2F&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26%3A7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True HTTP/1.1" 500 - "http://127.0.0.1:8300/admin_toolshed/install_repository?tool_shed_url=http://127.0.0.1:9009/&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26:7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True" "Mozilla/5.0 (Macintosh; 
Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"
4. I then tried uploading a file from one of my proprietary datatypes (mzml) ... but it looks like the upload tool doesn't use my proprietary sniffers yet
5. I then set the datatype manually and galaxy now prints the correct metadata for my proprietary datatype under the dataset info
6. Datatypes are recognised correctly when the data is created as the output of one of my tools. (Not sure if this uses sniffers though?). My tools also properly filter inputs based on the datatypes.
7. Unfortunately my display application setup doesn't seem to be working. All the code for my tool is here;
https://bitbucket.org/iracooke/protk-toolshed
As far as I can tell I've followed the recommendations in the wiki to create my files ... but something isn't right as I don't see the link to my display application when I view information about my dataset. When I view details about my tool in the galaxy admin section I see a list of all my tools plus two tools at the end called "view list of" ... it looks as if my display application files have somehow been interpreted as tools. Any advice on how I could fix this would be much appreciated.
Let me know if you want me to try something specific and report back, or if you need any more info.
Ira
On 09/02/2012, at 5:50 AM, Greg Von Kuster wrote:
In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (sniffers loaded later will replace sniffers loaded previously), except that the replacement sniffer is appended to the end of the sniff order rather than placed in the position of the sniffer it replaced. This makes things cleaner and more easily understood.
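As a rough illustration of the rules described for revision 6695, here is a minimal sketch. `SketchRegistry` and its method names are hypothetical and are not Galaxy's registry API; it also assumes that a sniffer is identified by its type string, which the message does not spell out.

```python
class SketchRegistry(object):
    """Toy stand-in for a datatypes registry (hypothetical, not Galaxy code)."""

    def __init__(self):
        self.datatypes_by_extension = {}
        self.sniff_order = []  # list of (type_name, sniffer) pairs

    def add_datatype(self, extension, datatype):
        # A later load always replaces an earlier datatype with the same extension.
        self.datatypes_by_extension[extension] = datatype

    def add_sniffer(self, type_name, sniffer):
        # Drop any sniffer previously registered under the same type name...
        self.sniff_order = [(n, s) for (n, s) in self.sniff_order if n != type_name]
        # ...and append the replacement at the end, not in the old position.
        self.sniff_order.append((type_name, sniffer))


registry = SketchRegistry()
registry.add_datatype("xls", "distribution Xls")
registry.add_datatype("xls", "protk Xls")        # replaces the earlier entry
registry.add_sniffer("a:A", "old A sniffer")
registry.add_sniffer("b:B", "B sniffer")
registry.add_sniffer("a:A", "new A sniffer")     # replaced and moved to the end
order = [name for name, _ in registry.sniff_order]
```

Under these assumptions a replaced sniffer loses its original place in the sniff order, which matters when an earlier, more generic sniffer would match the same data.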
On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
To clarify, proprietary datatypes currently being loaded will be ignored if they conflict with a proprietary datatype that was already loaded. Only datatypes defined in the datatypes_conf.xml file will take precedence, and override conflicts.
On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by the next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
Thanks!
Greg
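The load order described for revision 6693, together with the clarification that a proprietary datatype is ignored when an earlier proprietary datatype already uses its extension, can be sketched as follows. The function name, the tuple layout, and the use of plain strings for datatypes are assumptions made for illustration only.

```python
def build_extension_map(installed_repos, distribution_datatypes):
    """Sketch of the 6693 load order (hypothetical helper, not Galaxy code).

    installed_repos: list of (install_time, {extension: datatype}) tuples.
    distribution_datatypes: {extension: datatype} from datatypes_conf.xml.
    """
    registry = {}
    # Oldest installation first, then the next oldest, and so on.
    for _, datatypes in sorted(installed_repos, key=lambda repo: repo[0]):
        for extension, datatype in datatypes.items():
            # Ignore a proprietary datatype whose extension was already
            # claimed by an earlier-installed repository.
            registry.setdefault(extension, datatype)
    # Distribution datatypes load last and override any conflict.
    registry.update(distribution_datatypes)
    return registry


repos = [
    (2, {"xls": "repo_b Xls", "mzml": "repo_b MzML"}),
    (1, {"xls": "repo_a Xls"}),
]
with_distribution = build_extension_map(repos, {"xls": "dist Xls", "xml": "dist Xml"})
without_distribution = build_extension_map(repos, {})
```

Here `without_distribution["xls"]` comes from the older repository, while `with_distribution["xls"]` comes from the distribution.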
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that fix ... I'll check it out.
I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text,xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
What do you think? Is that a system that would work?
Cheers ira
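The ordering proposed above (subclasses before their parents, with configuration order as the tie-breaker) could look roughly like the sketch below. The classes, their `sniff` checks, and the helper names are all hypothetical and only illustrate the idea; they are not Galaxy's datatypes or its sniffing code.

```python
class Text(object):
    @staticmethod
    def sniff(content):
        return True  # generic: matches anything


class Xml(Text):
    @staticmethod
    def sniff(content):
        return content.lstrip().startswith("<?xml")


class PepXml(Xml):
    @staticmethod
    def sniff(content):
        # Illustrative check only.
        return Xml.sniff(content) and "msms_pipeline_analysis" in content


def order_sniffers(classes_in_config_order):
    """Return the classes so that every subclass precedes its ancestors.

    Among classes that are ready at the same time, config order wins.
    """
    ordered = []
    remaining = list(classes_in_config_order)
    while remaining:
        for candidate in remaining:
            has_pending_subclass = any(
                other is not candidate and issubclass(other, candidate)
                for other in remaining
            )
            if not has_pending_subclass:
                ordered.append(candidate)
                remaining.remove(candidate)
                break
    return ordered


def guess(content, sniff_order):
    # First sniffer that matches wins, as in a simple ordered sniff list.
    for cls in sniff_order:
        if cls.sniff(content):
            return cls.__name__
    return None


config_order = [Text, Xml, PepXml]
sample = '<?xml version="1.0"?><msms_pipeline_analysis>'
naive_guess = guess(sample, config_order)
reordered_guess = guess(sample, order_sniffers(config_order))
```

With the generic sniffer listed first, the naive order never reaches the specific one; after reordering by class hierarchy the most specific match is tried first.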
On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
> Hello Ira, > > Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo. > > I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule. > > I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution. > > Thanks for your patience and help on this issue. > > Greg > > On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote: > >> Hi Greg, >> >> As far as I know there is nothing in my local setup that affects this. As I test I did the following; >> >> Checked out a clean copy of galaxy-central >> Edited my universe_wsgi.ini as follows; >> - Changed the port to 8300 >> - Added an admin user >> - Uncommented this line >> tool_config_file = tool_conf.xml,shed_tool_conf.xml >> >> Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory) >> >> Started up galaxy >> ./run.sh --reload >> >> I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. >> I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail. >> >> My system is Mac OSX 10.7.2 and I'm running python 2.7.1 >> >> I then went to the relevant bit of code and inserted some print statements. 
I found that; >> >> On line 191 in registry.py there is; >> >> module = __import__( datatype_module ) >> >> and the value of datatype_module is galaxy.datatypes.gmap >> >> this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? >> alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools. >> >> Strange that I'm the only one who gets these errors? Any ideas what it could be? >> >> Ira >> >> >> -- paster log -- >> >> Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap >> destination directory: gmap >> requesting all changes >> adding changesets >> adding manifests >> adding file changes >> added 11 changesets with 32 changes to 17 files >> updating to branch default >> resolving manifests >> getting README >> getting gmap.xml >> getting gmap_build.xml >> getting gsnap.xml >> getting iit_store.xml >> getting lib/galaxy/datatypes/gmap.py >> getting snpindex.xml >> getting tool-data/datatypes_conf.xml >> getting tool-data/gmap_indices.loc.sample >> 9 files updated, 0 files merged, 0 files removed, 0 files unresolved >> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" >> resolving manifests >> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >> docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >> docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. 
>> >> docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap >> galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. >> galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. >> docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >> >> galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. >> galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >> galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. 
>> galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >> >> >> >> >> On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote: >> >>> Hello Ira, >>> >>> I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. 
The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this: >>> >>> <?xml version="1.0"?> >>> <datatypes> >>> <datatype_files> >>> <datatype_file name="gmap.py"/> >>> </datatype_files> >>> <registration> >>> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> >>> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> >>> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> >>> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> >>> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> >>> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> >>> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> >>> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> >>> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> >>> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> >>> </registration> >>> <sniffers> >>> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> >>> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> >>> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> >>> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> >>> </sniffers> >>> </datatypes> >>> >>> Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. 
I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing. >>> >>> galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap >>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' >>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap >>> destination directory: gmap >>> requesting all changes >>> adding changesets >>> adding manifests >>> adding file changes >>> added 1 changesets with 10 changes to 10 files >>> updating to branch default >>> 10 files updated, 0 files merged, 0 files removed, 0 files unresolved >>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" >>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>> docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>> >>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>> docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>> >>> docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>> >>> galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap >>> galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. >>> galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. >>> docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. 
>>> >>> galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. >>> galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>> galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>> >>> >>> >>> >>> If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well: >>> >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>> >>> >>> 
Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time. >>> >>> >>> On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote: >>> >>>> Hi, >>>> >>>> We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool. >>>> >>>> I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this; >>>> >>>> The first problem is that it looks like the sniffers are never loaded. 
I get errors like this when the tool is loaded; >>>> >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>>> >>>> These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed. >>>> >>>> I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type). >>>> >>>> Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see >>>> >>>> -- Upload.xml >>>> <command interpreter="python"> >>>> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile >>>> >>>> --Upload.py >>>> def __main__(): >>>> >>>> .... 
>>>> >>>> registry = Registry() >>>> registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] ) >>>> >>>> which looks like the upload tool is not configured to use the custom datatypes available in shed tools. >>>> >>>> >>>> Would it be possible to make the following changes; >>>> >>>> (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific). >>>> >>>> (b) Change the upload tool so that it respects custom sniffers in shed tools. >>>> >>>> I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool. >>>> >>>> >>>> Regards >>>> Ira >>>> ___________________________________________________________ >>>> Please keep all replies on the list by using "reply all" >>>> in your mail client. To manage your subscriptions to this >>>> and other Galaxy lists, please use the interface at: >>>> >>>> http://lists.bx.psu.edu/ >>> >> >
___________________________________________________________ Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Hello Ira,
This problem should be fixed in change set 6714:33c780b4c145 on our central repo. Thanks for reporting this!
Greg
On Feb 13, 2012, at 10:31 PM, Ira Cooke wrote:
Hello Greg,
Thanks very much for making these changes ... sorry it took a while for me to get back with a test. Here are the results.
- Upload.py and the sniffer for input data (mzML) seem to work just fine, which is great. If I try uploading some of the other types (eg pepxml) they come out as xml, but this problem isn't restricted to the toolshed version so it's probably just an error in my sniffer. Since these datatypes are typically produced by tools in galaxy they get assigned to the correct type as outputs of tools, so this isn't such a big problem.
- It looks like the display application functionality is almost working. I can see a link to the display application when viewing displayable data in my history, but when I click this link I get an error;
Not Found
The resource could not be found. No route for /display_application/03501d7626bd192f/127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0/pepxml
In contrast, for my working version (display applications not added via toolkit) I see the request is
/display_application/7335c1587be69e11/proteomics_pepxml/pepxml
Seems like it's almost working.
Ira
On 11/02/2012, at 6:57 AM, Greg Von Kuster wrote:
Hello Ira,
I've not been able to reproduce the 503 error you encountered, so it must be an issue in your environment. I have fixed the problems with the upload tool and setting metadata externally not properly handling proprietary datatypes in change set 6711:62fc9e053835, which is now available from our central repo. Your proprietary datatype display applications are loading into the registry as well, but I was not able to figure out how to get one to work. Please give things a try when it's convenient and let me know if you run into additional problems.
Thanks very much for helping to get this working. Access to your tools and data was invaluable.
Greg
On Feb 8, 2012, at 7:50 PM, Ira Cooke wrote:
Dear Greg,
I've just tested this and all the sniffers seem to be loaded now .. thanks!
I'm still encountering some other problems though. Here's what I did and what happened.
1. Did a fresh clone and install from the latest galaxy-central (all databases and shed_tools directories cleaned).
2. Uploaded my tool to the toolshed (both toolshed and galaxy are running on the local host).
3. Attempted to install my tool ... when I did this I got a 503 Error (though if you look in the log below it looks like the error was actually a 500 ... so I'm not really sure what was going on. Let me know if you can't reproduce this and I'll try to track it down more. It happens every time I try to install a tool for the first time). Even though I got the error it seems that the tool was actually installed properly. In my paster log I had
galaxy.util.shed_util DEBUG 2012-02-09 09:33:17,640 Adding new row (or updating an existing row) for repository 'protk' in the tool_shed_repository table.
galaxy.tools DEBUG 2012-02-09 09:33:17,879 Reloading section: Proteomics
galaxy.tools DEBUG 2012-02-09 09:33:17,905 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_convert_annotate_ids_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,942 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_interprophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,974 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_mascot_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,003 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mascot_to_pepxml_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,037 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mzml_to_mgf_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,150 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_omssa_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,171 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_peptide_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,209 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_protein_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,236 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_tandem_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,276 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/xls_to_table_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,288 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,303 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_protxml/1.0.0, version: 1.0.0.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,048 Loading datatypes from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-09 09:33:21,050 Overriding conflicting datatype with extension 'xls', using datatype from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,050 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:PepXml'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Mgf'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:ProtXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,052 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Xls'
127.0.0.1 - - [09/Feb/2012:09:33:16 +1100] "POST /admin_toolshed/install_repository?tool_shed_url=http%3A%2F%2F127.0.0.1%3A9009%2F&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26%3A7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True HTTP/1.1" 500 - "http://127.0.0.1:8300/admin_toolshed/install_repository?tool_shed_url=http://127.0.0.1:9009/&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26:7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"
4. I then tried uploading a file from one of my proprietary datatypes (mzml) ... but it looks like the upload tool doesn't use my proprietary sniffers yet
5. I then set the datatype manually and galaxy now prints the correct metadata for my proprietary datatype under the dataset info
6. Datatypes are recognised correctly when the data is created as the output of one of my tools. (Not sure if this uses sniffers though?). My tools also properly filter inputs based on the datatypes.
7. Unfortunately my display application setup doesn't seem to be working. All the code for my tool is here;
https://bitbucket.org/iracooke/protk-toolshed
As far as I can tell I've followed the recommendations in the wiki to create my files ... but something isn't right as I don't see the link to my display application when I view information about my dataset. When I view details about my tool in the galaxy admin section I see a list of all my tools plus two tools at the end called "view list of" ... it looks as if my display application files have somehow been interpreted as tools. Any advice on how I could fix this would be much appreciated.
Let me know if you want me to try something specific and report back, or if you need any more info.
Ira
On 09/02/2012, at 5:50 AM, Greg Von Kuster wrote:
In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (sniffers loaded later will replace sniffers loaded previously), but the replacement sniffer is appended to the sniff order rather than placed in the position of the sniffer it replaces. This makes things cleaner and more easily understood.
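The rule described above can be illustrated with a small sketch. This is not the Galaxy registry code: the class ToyRegistry, its method names, and the datatype strings are hypothetical, and the sketch assumes that two sniffers "conflict" when they have the same type string.

```python
class ToyRegistry:
    """Toy model of the 'last loaded wins' rule (hypothetical, not Galaxy's Registry)."""

    def __init__(self):
        self.datatypes_by_extension = {}  # extension -> datatype type string
        self.sniff_order = []             # sniffer type strings, tried first to last

    def load_datatype(self, extension, datatype):
        # A datatype being loaded replaces any earlier one with the same extension.
        self.datatypes_by_extension[extension] = datatype

    def load_sniffer(self, sniffer):
        # A re-loaded sniffer gives up its old slot and is appended to the end.
        if sniffer in self.sniff_order:
            self.sniff_order.remove(sniffer)
        self.sniff_order.append(sniffer)


registry = ToyRegistry()
registry.load_datatype("xls", "dist:Xls")    # hypothetical distribution datatype
registry.load_datatype("xls", "shed:Xls")    # hypothetical shed datatype, loaded later
for sniffer in ("shed:MzML", "shed:PepXml", "shed:MzML"):
    registry.load_sniffer(sniffer)

print(registry.datatypes_by_extension["xls"])
print(registry.sniff_order)
```

Under these assumptions the later "xls" datatype wins, and the re-loaded sniffer ends up last in the sniff order instead of keeping its original position.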
On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
To clarify, proprietary datatypes currently being loaded will be ignored if they conflict with a proprietary datatype that was already loaded. Only datatypes defined in the datatypes_conf.xml file will take precedence and override conflicts.
On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
Thanks!
Greg
On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
> Hi Greg, > > Thanks for that fix ... I'll check it out. > > I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text,xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used. > > Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults. > > What do you think? Is that a system that would work? > > Cheers > ira > > > > > On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote: > >> Hello Ira, >> >> Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo. >> >> I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. 
The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule. >> >> I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution. >> >> Thanks for your patience and help on this issue. >> >> Greg >> >> On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote: >> >>> Hi Greg, >>> >>> As far as I know there is nothing in my local setup that affects this. As I test I did the following; >>> >>> Checked out a clean copy of galaxy-central >>> Edited my universe_wsgi.ini as follows; >>> - Changed the port to 8300 >>> - Added an admin user >>> - Uncommented this line >>> tool_config_file = tool_conf.xml,shed_tool_conf.xml >>> >>> Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory) >>> >>> Started up galaxy >>> ./run.sh --reload >>> >>> I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. >>> I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail. >>> >>> My system is Mac OSX 10.7.2 and I'm running python 2.7.1 >>> >>> I then went to the relevant bit of code and inserted some print statements. I found that; >>> >>> On line 191 in registry.py there is; >>> >>> module = __import__( datatype_module ) >>> >>> and the value of datatype_module is galaxy.datatypes.gmap >>> >>> this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? >>> alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools. 
>>> >>> Strange that I'm the only one who gets these errors? Any ideas what it could be? >>> >>> Ira >>> >>> >>> -- paster log -- >>> >>> Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap >>> destination directory: gmap >>> requesting all changes >>> adding changesets >>> adding manifests >>> adding file changes >>> added 11 changesets with 32 changes to 17 files >>> updating to branch default >>> resolving manifests >>> getting README >>> getting gmap.xml >>> getting gmap_build.xml >>> getting gsnap.xml >>> getting iit_store.xml >>> getting lib/galaxy/datatypes/gmap.py >>> getting snpindex.xml >>> getting tool-data/datatypes_conf.xml >>> getting tool-data/gmap_indices.loc.sample >>> 9 files updated, 0 files merged, 0 files removed, 0 files unresolved >>> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" >>> resolving manifests >>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>> docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>> >>> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>> docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>> >>> docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>> >>> galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap >>> galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. >>> galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. 
>>> docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>> >>> galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. >>> galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>> galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >>> galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>> >>> >>> >>> >>> On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote: >>> >>>> Hello Ira, >>>> >>>> I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. 
The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this: >>>> >>>> <?xml version="1.0"?> >>>> <datatypes> >>>> <datatype_files> >>>> <datatype_file name="gmap.py"/> >>>> </datatype_files> >>>> <registration> >>>> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> >>>> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> >>>> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> >>>> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> >>>> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> >>>> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> >>>> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> >>>> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> >>>> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> >>>> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> >>>> </registration> >>>> <sniffers> >>>> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> >>>> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> >>>> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> >>>> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> >>>> </sniffers> >>>> </datatypes> >>>> >>>> Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. 
I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing. >>>> >>>> galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap >>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' >>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap >>>> destination directory: gmap >>>> requesting all changes >>>> adding changesets >>>> adding manifests >>>> adding file changes >>>> added 1 changesets with 10 changes to 10 files >>>> updating to branch default >>>> 10 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" >>>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>> docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>> >>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>>> docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>> >>>> docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>> >>>> galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap >>>> galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. >>>> galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. 
>>>> docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>> >>>> galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. >>>> galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>>> galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>>> >>>> >>>> >>>> >>>> If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well: >>>> >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: 
galaxy.datatypes.gmap:IntronAnnotation >>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>>> >>>> >>>> Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time. >>>> >>>> >>>> On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote: >>>> >>>>> Hi, >>>>> >>>>> We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool. >>>>> >>>>> I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this; >>>>> >>>>> The first problem is that it looks like the sniffers are never loaded. 
I get errors like this when the tool is loaded; >>>>> >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>>>> >>>>> These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed. >>>>> >>>>> I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type). >>>>> >>>>> Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see >>>>> >>>>> -- Upload.xml >>>>> <command interpreter="python"> >>>>> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile >>>>> >>>>> --Upload.py >>>>> def __main__(): >>>>> >>>>> .... 
>>>>> >>>>> registry = Registry() >>>>> registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] ) >>>>> >>>>> which looks like the upload tool is not configured to use the custom datatypes available in shed tools. >>>>> >>>>> >>>>> Would it be possible to make the following changes; >>>>> >>>>> (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific). >>>>> >>>>> (b) Change the upload tool so that it respects custom sniffers in shed tools. >>>>> >>>>> I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool. >>>>> >>>>> >>>>> Regards >>>>> Ira >>>>> ___________________________________________________________ >>>>> Please keep all replies on the list by using "reply all" >>>>> in your mail client. To manage your subscriptions to this >>>>> and other Galaxy lists, please use the interface at: >>>>> >>>>> http://lists.bx.psu.edu/ >>>> >>> >> > > ___________________________________________________________ > Please keep all replies on the list by using "reply all" > in your mail client. To manage your subscriptions to this > and other Galaxy lists, please use the interface at: > > http://lists.bx.psu.edu/
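Ira's proposal in the quoted message above (a sniffer for a subclass always takes priority over its parent class, with the order listed in datatypes_conf as the tie-breaker) could be sketched roughly as follows. This is only an illustration of the idea, not Galaxy code: the classes, the simplified sniffing heuristics, and the helpers subclass_first and guess_type are all hypothetical. The sketch orders classes by inheritance depth, which is one simple way to guarantee that every subclass precedes its ancestors; it also moves deeper unrelated classes ahead of shallower ones.

```python
class Text:
    @staticmethod
    def sniff(content):
        return True  # generic catch-all, matches anything


class Xml(Text):
    @staticmethod
    def sniff(content):
        return content.lstrip().startswith("<?xml")


class PepXml(Xml):
    @staticmethod
    def sniff(content):
        # Simplified heuristic: XML that mentions the pepXML root element.
        return Xml.sniff(content) and "msms_pipeline_analysis" in content


class Mgf(Text):
    @staticmethod
    def sniff(content):
        return "BEGIN IONS" in content


def subclass_first(sniffers):
    # A subclass always has a longer MRO than any of its ancestors, so sorting
    # by descending MRO length puts it first. sorted() is stable, so classes of
    # equal depth keep the order in which they were listed.
    return sorted(sniffers, key=lambda cls: -len(cls.__mro__))


def guess_type(content, sniff_order):
    # Return the name of the first sniffer that accepts the content.
    for cls in sniff_order:
        if cls.sniff(content):
            return cls.__name__
    return None


listed = [Text, Xml, Mgf, PepXml]  # generic sniffers listed first
pepxml = '<?xml version="1.0"?><msms_pipeline_analysis/>'

print(guess_type(pepxml, listed))
print(guess_type(pepxml, subclass_first(listed)))
```

With the listed order the catch-all Text sniffer claims the file; after reordering, the most specific sniffer gets the first chance.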
Hi Greg,
Thanks for that. I just tested and it works! For now I think that's all my toolshed issues solved. We'll try to clean things up and submit to the toolshed soon.
For reference, our display application is available here https://bitbucket.org/Andrew_Brock/proteomics-visualise and the files I provided ought to be visualisable with it (eg the pepXML) though many links will be broken because you won't have all the other files it refers to in your history.
For now the display application needs to run on the same computer as galaxy. This is because many of the files reference other files in the history and (for example) the display app wants to show raw spectra (initial input file) when viewing a protein prophet search result (many steps down the pipeline).
Best regards
Ira
On 15/02/2012, at 3:59 AM, Greg Von Kuster wrote:
Hello Ira,
this problem should be fixed in change set 6714:33c780b4c145 on our central repo. Thanks for reporting this!
Greg
On Feb 13, 2012, at 10:31 PM, Ira Cooke wrote:
Hello Greg,
Thanks very much for making these changes ... sorry it took a while for me to get back with a test. Here are the results.
- Upload.py and the sniffer for input data (mzML) seem to work just fine, which is great. If I try uploading some of the other types (eg pepxml) they come out as xml, but this problem isn't restricted to the toolshed version so it's probably just an error in my sniffer. Since these datatypes are typically produced by tools in galaxy they get assigned to the correct type as outputs of tools, so this isn't such a big problem.
- It looks like the display application functionality is almost working. I can see a link to the display application when viewing displayable data in my history, but when I click this link I get an error;
Not Found
The resource could not be found. No route for /display_application/03501d7626bd192f/127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0/pepxml
In contrast, for my working version (display applications not added via toolkit) I see the request is
/display_application/7335c1587be69e11/proteomics_pepxml/pepxml
Seems like it's almost working.
Ira
On 11/02/2012, at 6:57 AM, Greg Von Kuster wrote:
Hello Ira,
I've not been able to reproduce the 503 error you encountered, so it must be an issue in your environment. I have fixed the problems with the upload tool and setting metadata externally not properly handling proprietary datatypes in change set 6711:62fc9e053835, which is now available from our central repo. Your proprietary datatype display applications are loading into the registry as well, but I was not able to figure out how to get one to work. Please give things a try when it's convenient and let me know if you run into additional problems.
Thanks very much for helping to get this working. Access to your tools and data was invaluable.
Greg
On Feb 8, 2012, at 7:50 PM, Ira Cooke wrote:
Dear Greg,
I've just tested this and all the sniffers seem to be loaded now .. thanks!
I'm still encountering some other problems though. Here's what I did and what happened.
1. Did a fresh clone and install from the latest galaxy-central (all databases and shed_tools directories cleaned).
2. Uploaded my tool to the toolshed (both toolshed and galaxy are running on the local host).
3. Attempted to install my tool ... when I did this I got a 503 Error ( though if you look in the log below it looks like the error was actually a 500 ... so I'm not really sure what was going on. Let me know if you can't reproduce this and I'll try to track it down more. It happens every time I try to install a tool for the first time. ). Even though I got the error it seems that the tool was actually installed properly. In my paster log I had
galaxy.util.shed_util DEBUG 2012-02-09 09:33:17,640 Adding new row (or updating an existing row) for repository 'protk' in the tool_shed_repository table.
galaxy.tools DEBUG 2012-02-09 09:33:17,879 Reloading section: Proteomics
galaxy.tools DEBUG 2012-02-09 09:33:17,905 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_convert_annotate_ids_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,942 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_interprophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,974 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_mascot_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,003 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mascot_to_pepxml_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,037 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mzml_to_mgf_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,150 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_omssa_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,171 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_peptide_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,209 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_protein_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,236 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_tandem_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,276 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/xls_to_table_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,288 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,303 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_protxml/1.0.0, version: 1.0.0.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,048 Loading datatypes from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-09 09:33:21,050 Overriding conflicting datatype with extension 'xls', using datatype from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,050 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:PepXml'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Mgf'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:ProtXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,052 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Xls'
127.0.0.1 - - [09/Feb/2012:09:33:16 +1100] "POST /admin_toolshed/install_repository?tool_shed_url=http%3A%2F%2F127.0.0.1%3A9009%2F&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26%3A7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True HTTP/1.1" 500 - "http://127.0.0.1:8300/admin_toolshed/install_repository?tool_shed_url=http://127.0.0.1:9009/&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26:7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"
4. I then tried uploading a file from one of my proprietary datatypes (mzml) ... but it looks like the upload tool doesn't use my proprietary sniffers yet
5. I then set the datatype manually and galaxy now prints the correct metadata for my proprietary datatype under the dataset info
6. Datatypes are recognised correctly when the data is created as the output of one of my tools. (Not sure if this uses sniffers though?). My tools also properly filter inputs based on the datatypes.
7. Unfortunately my display application setup doesn't seem to be working. All the code for my tool is here;
https://bitbucket.org/iracooke/protk-toolshed
As far as I can tell I've followed the recommendations in the wiki to create my files ... but something isn't right as I don't see the link to my display application when I view information about my dataset. When I view details about my tool in the galaxy admin section I see a list of all my tools plus two tools at the end called "view list of" ... it looks as if my display application files have somehow been interpreted as tools. Any advice on how I could fix this would be much appreciated.
Let me know if you want me to try something specific and report back, or if you need any more info.
Ira
On 09/02/2012, at 5:50 AM, Greg Von Kuster wrote:
In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (sniffers loaded later will replace sniffers loaded previously), but the replacing sniffer is appended to the sniff order rather than placed in the position of the sniffer it replaced. This makes things cleaner and more easily understood.
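The rules described above can be pictured with a small toy model. This is only a sketch of the stated behavior, not the code in Galaxy's registry.py: the class and method names are made up, the "example.distribution:Xls" type path is a placeholder, and treating two sniffers as conflicting when they share the same type string is an assumption.

```python
class SketchRegistry:
    """Toy stand-in for a datatypes registry (hypothetical, not Galaxy's Registry)."""

    def __init__(self):
        self.datatypes_by_extension = {}  # extension -> datatype type path
        self.sniff_order = []             # sniffer type paths, tried first to last

    def load_datatype(self, extension, type_path):
        # A datatype loaded later always replaces an earlier one
        # registered under the same extension.
        self.datatypes_by_extension[extension] = type_path

    def load_sniffer(self, type_path):
        # Assumption: sniffers conflict when their type paths are equal.
        # The later sniffer replaces the earlier one but is appended to the
        # end of the sniff order instead of keeping the old position.
        if type_path in self.sniff_order:
            self.sniff_order.remove(type_path)
        self.sniff_order.append(type_path)


registry = SketchRegistry()

# Loaded first (placeholder name standing in for a distribution datatype).
registry.load_datatype("xls", "example.distribution:Xls")
registry.load_sniffer("galaxy.datatypes.proteomics:MzML")
registry.load_sniffer("galaxy.datatypes.proteomics:PepXml")

# Loaded later: same extension, and a sniffer that was already present.
registry.load_datatype("xls", "galaxy.datatypes.proteomics:Xls")
registry.load_sniffer("galaxy.datatypes.proteomics:MzML")

print(registry.datatypes_by_extension["xls"])
print(registry.sniff_order)
```

In this model the reloaded MzML sniffer ends up after PepXml, which is the "appended, not placed in the same position" part of the rule.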
On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
To clarify, proprietary datatypes currently being loaded will be ignored if they conflict with a proprietary datatype that was already loaded. Only datatypes defined in the datatypes_conf.xml file will take precedence, and override conflicts.
On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
> Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
>
> Thanks!
>
> Greg
>
>
> On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
>
>> Hi Greg,
>>
>> Thanks for that fix ... I'll check it out.
>>
>> I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text, xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
>>
>> Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
>>
>> What do you think? Is that a system that would work?
>> >> Cheers >> ira >> >> >> >> >> On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote: >> >>> Hello Ira, >>> >>> Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo. >>> >>> I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule. >>> >>> I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution. >>> >>> Thanks for your patience and help on this issue. >>> >>> Greg >>> >>> On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote: >>> >>>> Hi Greg, >>>> >>>> As far as I know there is nothing in my local setup that affects this. As I test I did the following; >>>> >>>> Checked out a clean copy of galaxy-central >>>> Edited my universe_wsgi.ini as follows; >>>> - Changed the port to 8300 >>>> - Added an admin user >>>> - Uncommented this line >>>> tool_config_file = tool_conf.xml,shed_tool_conf.xml >>>> >>>> Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory) >>>> >>>> Started up galaxy >>>> ./run.sh --reload >>>> >>>> I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. >>>> I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail. 
>>>> >>>> My system is Mac OSX 10.7.2 and I'm running python 2.7.1 >>>> >>>> I then went to the relevant bit of code and inserted some print statements. I found that; >>>> >>>> On line 191 in registry.py there is; >>>> >>>> module = __import__( datatype_module ) >>>> >>>> and the value of datatype_module is galaxy.datatypes.gmap >>>> >>>> this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? >>>> alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools. >>>> >>>> Strange that I'm the only one who gets these errors? Any ideas what it could be? >>>> >>>> Ira >>>> >>>> >>>> -- paster log -- >>>> >>>> Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap >>>> destination directory: gmap >>>> requesting all changes >>>> adding changesets >>>> adding manifests >>>> adding file changes >>>> added 11 changesets with 32 changes to 17 files >>>> updating to branch default >>>> resolving manifests >>>> getting README >>>> getting gmap.xml >>>> getting gmap_build.xml >>>> getting gsnap.xml >>>> getting iit_store.xml >>>> getting lib/galaxy/datatypes/gmap.py >>>> getting snpindex.xml >>>> getting tool-data/datatypes_conf.xml >>>> getting tool-data/gmap_indices.loc.sample >>>> 9 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" >>>> resolving manifests >>>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>> docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. 
>>>> >>>> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>>> docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>> >>>> docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>> >>>> galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap >>>> galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. >>>> galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. >>>> docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>> >>>> galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. >>>> galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>>> galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. 
>>>> galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>>> >>>> >>>> >>>> >>>> On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote: >>>> >>>>> Hello Ira, >>>>> >>>>> I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. 
The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this: >>>>> >>>>> <?xml version="1.0"?> >>>>> <datatypes> >>>>> <datatype_files> >>>>> <datatype_file name="gmap.py"/> >>>>> </datatype_files> >>>>> <registration> >>>>> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> >>>>> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> >>>>> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> >>>>> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> >>>>> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> >>>>> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> >>>>> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> >>>>> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> >>>>> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> >>>>> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> >>>>> </registration> >>>>> <sniffers> >>>>> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> >>>>> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> >>>>> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> >>>>> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> >>>>> </sniffers> >>>>> </datatypes> >>>>> >>>>> Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. 
I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing. >>>>> >>>>> galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap >>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' >>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap >>>>> destination directory: gmap >>>>> requesting all changes >>>>> adding changesets >>>>> adding manifests >>>>> adding file changes >>>>> added 1 changesets with 10 changes to 10 files >>>>> updating to branch default >>>>> 10 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" >>>>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>> docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>> >>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>>>> docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>> >>>>> docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>> >>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap >>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. >>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. 
>>>>> docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>> >>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. >>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>>>> >>>>> >>>>> >>>>> >>>>> If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well: >>>>> >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: 
galaxy.datatypes.gmap:IntronAnnotation >>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>>>> >>>>> >>>>> Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time. >>>>> >>>>> >>>>> On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool. >>>>>> >>>>>> I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this; >>>>>> >>>>>> The first problem is that it looks like the sniffers are never loaded. 
I get errors like this when the tool is loaded; >>>>>> >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>>>>> >>>>>> These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed. >>>>>> >>>>>> I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type). >>>>>> >>>>>> Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see >>>>>> >>>>>> -- Upload.xml >>>>>> <command interpreter="python"> >>>>>> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile >>>>>> >>>>>> --Upload.py >>>>>> def __main__(): >>>>>> >>>>>> .... 
>>>>>> >>>>>> registry = Registry() >>>>>> registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] ) >>>>>> >>>>>> which looks like the upload tool is not configured to use the custom datatypes available in shed tools. >>>>>> >>>>>> >>>>>> Would it be possible to make the following changes; >>>>>> >>>>>> (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific). >>>>>> >>>>>> (b) Change the upload tool so that it respects custom sniffers in shed tools. >>>>>> >>>>>> I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool. >>>>>> >>>>>> >>>>>> Regards >>>>>> Ira >>>>>> ___________________________________________________________ >>>>>> Please keep all replies on the list by using "reply all" >>>>>> in your mail client. To manage your subscriptions to this >>>>>> and other Galaxy lists, please use the interface at: >>>>>> >>>>>> http://lists.bx.psu.edu/ >>>>> >>>> >>> >> >> ___________________________________________________________ >> Please keep all replies on the list by using "reply all" >> in your mail client. To manage your subscriptions to this >> and other Galaxy lists, please use the interface at: >> >> http://lists.bx.psu.edu/ > > ___________________________________________________________ > Please keep all replies on the list by using "reply all" > in your mail client. 
To manage your subscriptions to this > and other Galaxy lists, please use the interface at: > > http://lists.bx.psu.edu/
Thanks for your patience and help in working through these issues Ira - it's been a pleasure! It's resulted in a great benefit to the Galaxy community as well.
Greg
On Feb 14, 2012, at 7:59 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that. I just tested and it works!
For now I think that's all my toolshed issues solved.
We'll try to clean things up and submit to the toolshed soon.
For reference, our display application is available here https://bitbucket.org/Andrew_Brock/proteomics-visualise and the files I provided ought to be visualisable with it (eg the pepXML) though many links will be broken because you won't have all the other files it refers to in your history.
For now the display application needs to run on the same computer as galaxy. This is because many of the files reference other files in the history and (for example) the display app wants to show raw spectra (initial input file) when viewing a protein prophet search result (many steps down the pipeline).
Best regards Ira
On 15/02/2012, at 3:59 AM, Greg Von Kuster wrote:
Hello Ira,
This problem should be fixed in change set 6714:33c780b4c145 on our central repo. Thanks for reporting this!
Greg
On Feb 13, 2012, at 10:31 PM, Ira Cooke wrote:
Hello Greg,
Thanks very much for making these changes ... sorry it took a while for me to get back with a test. Here are the results.
- Upload.py and the sniffer for input data (mzML) seem to work just fine, which is great. If I try uploading some of the other types (eg pepxml) they come out as xml, but this problem isn't restricted to the toolshed version so it's probably just an error in my sniffer. Since these datatypes are typically produced by tools in galaxy they get assigned to the correct type as outputs of tools, so this isn't such a big problem.
- It looks like the display application functionality is almost working. I can see a link to the display application when viewing displayable data in my history, but when I click this link I get an error;
Not Found
The resource could not be found. No route for /display_application/03501d7626bd192f/127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0/pepxml
In contrast, for my working version (display applications not added via toolkit), I see the request is
/display_application/7335c1587be69e11/proteomics_pepxml/pepxml
Seems like it's almost working.
Ira
On 11/02/2012, at 6:57 AM, Greg Von Kuster wrote:
Hello Ira,
I've not been able to reproduce the 503 error you encountered, so it must be an issue in your environment. I have fixed the problems with the upload tool and setting metadata externally not properly handling proprietary datatypes in change set 6711:62fc9e053835, which is now available from our central repo. Your proprietary datatype display applications are loading into the registry as well, but I was not able to figure out how to get one to work. Please give things a try when it's convenient and let me know if you run into additional problems.
Thanks very much for helping to get this working. Access to your tools and data was invaluable.
Greg
On Feb 8, 2012, at 7:50 PM, Ira Cooke wrote:
Dear Greg,
I've just tested this and all the sniffers seem to be loaded now .. thanks!
I'm still encountering some other problems though. Here's what I did and what happened.
1. Did a fresh clone and install from the latest galaxy-central (all databases and shed_tools directories cleaned).
2. Uploaded my tool to the toolshed (both toolshed and galaxy are running on the local host).
3. Attempted to install my tool ... when I did this I got a 503 Error ( though if you look in the log below it looks like the error was actually a 500 ... so I'm not really sure what was going on. Let me know if you can't reproduce this and I'll try to track it down more. It happens every time I try to install a tool for the first time. ). Even though I got the error it seems that the tool was actually installed properly. In my paster log I had
galaxy.util.shed_util DEBUG 2012-02-09 09:33:17,640 Adding new row (or updating an existing row) for repository 'protk' in the tool_shed_repository table.
galaxy.tools DEBUG 2012-02-09 09:33:17,879 Reloading section: Proteomics
galaxy.tools DEBUG 2012-02-09 09:33:17,905 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_convert_annotate_ids_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,942 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_interprophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,974 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_mascot_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,003 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mascot_to_pepxml_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,037 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mzml_to_mgf_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,150 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_omssa_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,171 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_peptide_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,209 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_protein_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,236 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_tandem_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,276 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/xls_to_table_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,288 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,303 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_protxml/1.0.0, version: 1.0.0.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,048 Loading datatypes from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-09 09:33:21,050 Overriding conflicting datatype with extension 'xls', using datatype from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,050 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:PepXml'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Mgf'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:ProtXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,052 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Xls'
127.0.0.1 - - [09/Feb/2012:09:33:16 +1100] "POST /admin_toolshed/install_repository?tool_shed_url=http%3A%2F%2F127.0.0.1%3A9009%2F&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26%3A7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True HTTP/1.1" 500 - "http://127.0.0.1:8300/admin_toolshed/install_repository?tool_shed_url=http://127.0.0.1:9009/&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26:7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"
4. I then tried uploading a file from one of my proprietary datatypes (mzml) ... but it looks like the upload tool doesn't use my proprietary sniffers yet
5. I then set the datatype manually and galaxy now prints the correct metadata for my proprietary datatype under the dataset info
6. Datatypes are recognised correctly when the data is created as the output of one of my tools. (Not sure if this uses sniffers though?). My tools also properly filter inputs based on the datatypes.
7. Unfortunately my display application setup doesn't seem to be working. All the code for my tool is here;
https://bitbucket.org/iracooke/protk-toolshed
As far as I can tell I've followed the recommendations in the wiki to create my files ... but something isn't right as I don't see the link to my display application when I view information about my dataset. When I view details about my tool in the galaxy admin section I see a list of all my tools plus two tools at the end called "view list of" ... it looks as if my display application files have somehow been interpreted as tools. Any advice on how I could fix this would be much appreciated.
Let me know if you want me to try something specific and report back, or if you need any more info.
Ira
On 09/02/2012, at 5:50 AM, Greg Von Kuster wrote:
In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (sniffers loaded later will replace sniffers loaded previously), but a replacing sniffer is appended to the sniff order rather than placed in the position of the sniffer it replaces. This makes things cleaner and more easily understood.
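The precedence rules described above can be modeled with a small sketch. This is a toy illustration only, not Galaxy's actual registry code: the class `ToyRegistry`, its method names, and the assumption that sniffers are identified by their type name are all hypothetical.

```python
class ToyRegistry:
    """Toy model of the precedence rules above (hypothetical, not Galaxy's Registry)."""

    def __init__(self):
        self.datatypes_by_extension = {}  # extension -> datatype name
        self.sniff_order = []             # sniffer names, tried first to last

    def load_datatype(self, extension, datatype):
        # A datatype loaded later always replaces a conflicting one
        # (conflicting means: same extension).
        self.datatypes_by_extension[extension] = datatype

    def load_sniffer(self, sniffer):
        # A sniffer loaded later replaces the earlier copy, but it is
        # appended to the end instead of reusing the old position.
        if sniffer in self.sniff_order:
            self.sniff_order.remove(sniffer)
        self.sniff_order.append(sniffer)


registry = ToyRegistry()
# Loaded first (for example, from one datatypes_conf.xml).
registry.load_datatype("xls", "distribution:Xls")
registry.load_sniffer("Xls")
registry.load_sniffer("Xml")
# Loaded later (for example, from an installed repository).
registry.load_datatype("xls", "proteomics:Xls")
registry.load_sniffer("Xls")

print(registry.datatypes_by_extension["xls"])
print(registry.sniff_order)
```

In this model the later "xls" datatype wins, and the replaced sniffer moves to the end of the sniff order, so any generic sniffer loaded earlier is still tried before it.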
On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
> To clarify, proprietary datatypes currently being loaded will be ignored if they conflict with a proprietary datatype that was already loaded. Only datatypes defined in the datatypes_conf.xml file will take precedence, and override conflicts.
>
> On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
>
>> Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
>>
>> Thanks!
>>
>> Greg
>>
>>
>> On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
>>
>>> Hi Greg,
>>>
>>> Thanks for that fix ... I'll check it out.
>>>
>>> I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (e.g. text, xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
>>>
>>> Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy.
I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults. >>> >>> What do you think? Is that a system that would work? >>> >>> Cheers >>> ira >>> >>> >>> >>> >>> On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote: >>> >>>> Hello Ira, >>>> >>>> Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo. >>>> >>>> I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule. >>>> >>>> I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution. >>>> >>>> Thanks for your patience and help on this issue. >>>> >>>> Greg >>>> >>>> On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote: >>>> >>>>> Hi Greg, >>>>> >>>>> As far as I know there is nothing in my local setup that affects this. 
As I test I did the following; >>>>> >>>>> Checked out a clean copy of galaxy-central >>>>> Edited my universe_wsgi.ini as follows; >>>>> - Changed the port to 8300 >>>>> - Added an admin user >>>>> - Uncommented this line >>>>> tool_config_file = tool_conf.xml,shed_tool_conf.xml >>>>> >>>>> Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory) >>>>> >>>>> Started up galaxy >>>>> ./run.sh --reload >>>>> >>>>> I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. >>>>> I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail. >>>>> >>>>> My system is Mac OSX 10.7.2 and I'm running python 2.7.1 >>>>> >>>>> I then went to the relevant bit of code and inserted some print statements. I found that; >>>>> >>>>> On line 191 in registry.py there is; >>>>> >>>>> module = __import__( datatype_module ) >>>>> >>>>> and the value of datatype_module is galaxy.datatypes.gmap >>>>> >>>>> this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? >>>>> alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools. >>>>> >>>>> Strange that I'm the only one who gets these errors? Any ideas what it could be? 
>>>>> >>>>> Ira >>>>> >>>>> >>>>> -- paster log -- >>>>> >>>>> Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap >>>>> destination directory: gmap >>>>> requesting all changes >>>>> adding changesets >>>>> adding manifests >>>>> adding file changes >>>>> added 11 changesets with 32 changes to 17 files >>>>> updating to branch default >>>>> resolving manifests >>>>> getting README >>>>> getting gmap.xml >>>>> getting gmap_build.xml >>>>> getting gsnap.xml >>>>> getting iit_store.xml >>>>> getting lib/galaxy/datatypes/gmap.py >>>>> getting snpindex.xml >>>>> getting tool-data/datatypes_conf.xml >>>>> getting tool-data/gmap_indices.loc.sample >>>>> 9 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" >>>>> resolving manifests >>>>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>> docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>> >>>>> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>>>> docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>> >>>>> docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>> >>>>> galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap >>>>> galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. >>>>> galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. 
>>>>> docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>> >>>>> galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. >>>>> galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>>>> galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >>>>> galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>>>> >>>>> >>>>> >>>>> >>>>> On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote: >>>>> >>>>>> Hello Ira, >>>>>> >>>>>> I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. 
The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this: >>>>>> >>>>>> <?xml version="1.0"?> >>>>>> <datatypes> >>>>>> <datatype_files> >>>>>> <datatype_file name="gmap.py"/> >>>>>> </datatype_files> >>>>>> <registration> >>>>>> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> >>>>>> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> >>>>>> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> >>>>>> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> >>>>>> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> >>>>>> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> >>>>>> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> >>>>>> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> >>>>>> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> >>>>>> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> >>>>>> </registration> >>>>>> <sniffers> >>>>>> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> >>>>>> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> >>>>>> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> >>>>>> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> >>>>>> </sniffers> >>>>>> </datatypes> >>>>>> >>>>>> Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. 
I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing. >>>>>> >>>>>> galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap >>>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' >>>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap >>>>>> destination directory: gmap >>>>>> requesting all changes >>>>>> adding changesets >>>>>> adding manifests >>>>>> adding file changes >>>>>> added 1 changesets with 10 changes to 10 files >>>>>> updating to branch default >>>>>> 10 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" >>>>>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>>> docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>> >>>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>>>>> docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>> >>>>>> docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>> >>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap >>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. >>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. 
>>>>>> docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>> >>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. >>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well: >>>>>> >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded 
sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>>>>> >>>>>> >>>>>> Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time. >>>>>> >>>>>> >>>>>> On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool. >>>>>>> >>>>>>> I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this; >>>>>>> >>>>>>> The first problem is that it looks like the sniffers are never loaded. 
I get errors like this when the tool is loaded; >>>>>>> >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>>>>>> >>>>>>> These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed. >>>>>>> >>>>>>> I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type). >>>>>>> >>>>>>> Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see >>>>>>> >>>>>>> -- Upload.xml >>>>>>> <command interpreter="python"> >>>>>>> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile >>>>>>> >>>>>>> --Upload.py >>>>>>> def __main__(): >>>>>>> >>>>>>> .... 
>>>>>>> >>>>>>> registry = Registry() >>>>>>> registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] ) >>>>>>> >>>>>>> which looks like the upload tool is not configured to use the custom datatypes available in shed tools. >>>>>>> >>>>>>> >>>>>>> Would it be possible to make the following changes; >>>>>>> >>>>>>> (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific). >>>>>>> >>>>>>> (b) Change the upload tool so that it respects custom sniffers in shed tools. >>>>>>> >>>>>>> I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool. >>>>>>> >>>>>>> >>>>>>> Regards >>>>>>> Ira >>>>>>> ___________________________________________________________ >>>>>>> Please keep all replies on the list by using "reply all" >>>>>>> in your mail client. To manage your subscriptions to this >>>>>>> and other Galaxy lists, please use the interface at: >>>>>>> >>>>>>> http://lists.bx.psu.edu/ >>>>>> >>>>> >>>> >>> >>> ___________________________________________________________ >>> Please keep all replies on the list by using "reply all" >>> in your mail client. To manage your subscriptions to this >>> and other Galaxy lists, please use the interface at: >>> >>> http://lists.bx.psu.edu/ >> >> ___________________________________________________________ >> Please keep all replies on the list by using "reply all" >> in your mail client. 
To manage your subscriptions to this >> and other Galaxy lists, please use the interface at: >> >> http://lists.bx.psu.edu/ >
No worries :)
Ira
On 15/02/2012, at 2:15 PM, Greg Von Kuster wrote:
Thanks for your patience and help in working through these issues, Ira - it's been a pleasure! It's resulted in a great benefit to the Galaxy community as well.
Greg
On Feb 14, 2012, at 7:59 PM, Ira Cooke wrote:
Hi Greg,
Thanks for that. I just tested and it works!
For now I think that's all my toolshed issues solved.
We'll try to clean things up and submit to the toolshed soon.
For reference, our display application is available here https://bitbucket.org/Andrew_Brock/proteomics-visualise and the files I provided ought to be visualisable with it (eg the pepXML) though many links will be broken because you won't have all the other files it refers to in your history.
For now the display application needs to run on the same computer as galaxy. This is because many of the files reference other files in the history and (for example) the display app wants to show raw spectra (initial input file) when viewing a protein prophet search result (many steps down the pipeline).
Best regards
Ira
On 15/02/2012, at 3:59 AM, Greg Von Kuster wrote:
Hello Ira,
This problem should be fixed in change set 6714:33c780b4c145 on our central repo. Thanks for reporting this!
Greg
On Feb 13, 2012, at 10:31 PM, Ira Cooke wrote:
Hello Greg,
Thanks very much for making these changes ... sorry it took a while for me to get back with a test. Here are the results.
- Upload.py and the sniffer for input data (mzML) seem to work just fine, which is great. If I try uploading some of the other types (e.g. pepxml) they come out as xml, but this problem isn't restricted to the toolshed version so it's probably just an error in my sniffer. Since these datatypes are typically produced by tools in galaxy they get assigned to the correct type as outputs of tools, so this isn't such a big problem.
- It looks like the display application functionality is almost working. I can see a link to the display application when viewing displayable data in my history, but when I click this link I get an error;
Not Found
The resource could not be found. No route for /display_application/03501d7626bd192f/127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0/pepxml
In contrast, for my working version (display applications not added via toolkit), I see the request is
/display_application/7335c1587be69e11/proteomics_pepxml/pepxml
Seems like it's almost working.
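One plausible reading of the two requests above (an assumption, since the thread does not show Galaxy's route definition or the eventual fix) is that the tool-shed-qualified application id contains slashes, so the path no longer has the three segments the working request has. The sketch below uses a hypothetical three-segment pattern to show the difference, and shows that percent-encoding the id would restore the three-segment shape.

```python
import re
from urllib.parse import quote

# Hypothetical route: /display_application/<dataset_id>/<app_name>/<link_name>.
# Galaxy's real route definition is not shown in this thread.
ROUTE = re.compile(r"^/display_application/([^/]+)/([^/]+)/([^/]+)$")


def matches_route(path):
    """Return True if the path has exactly the three expected segments."""
    return ROUTE.match(path) is not None


working = "/display_application/7335c1587be69e11/proteomics_pepxml/pepxml"
failing = ("/display_application/03501d7626bd192f/"
           "127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0/pepxml")

# Percent-encode the shed-qualified id so it occupies a single path segment.
shed_id = "127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0"
escaped = "/display_application/03501d7626bd192f/%s/pepxml" % quote(shed_id, safe="")

print(matches_route(working))
print(matches_route(failing))
print(matches_route(escaped))
```

Whether a real server decodes `%2F` before routing depends on the web stack, so this only illustrates why the extra slashes could produce a "No route" error, not how it was actually resolved.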
Ira
On 11/02/2012, at 6:57 AM, Greg Von Kuster wrote:
Hello Ira,
I've not been able to reproduce the 503 error you encountered, so it must be an issue in your environment. I have fixed the problems with the upload tool and setting metadata externally not properly handling proprietary datatypes in change set 6711:62fc9e053835, which is now available from our central repo. Your proprietary datatype display applications are loading into the registry as well, but I was not able to figure out how to get one to work. Please give things a try when it's convenient and let me know if you run into additional problems.
Thanks very much for helping to get this working. Access to your tools and data was invaluable.
Greg
On Feb 8, 2012, at 7:50 PM, Ira Cooke wrote:
Dear Greg,
I've just tested this and all the sniffers seem to be loaded now .. thanks!
I'm still encountering some other problems though. Here's what I did and what happened.
1. Did a fresh clone and install from the latest galaxy-central (all databases and shed_tools directories cleaned).
2. Uploaded my tool to the toolshed (both toolshed and galaxy are running on the local host).
3. Attempted to install my tool ... when I did this I got a 503 Error (though if you look in the log below it looks like the error was actually a 500 ... so I'm not really sure what was going on. Let me know if you can't reproduce this and I'll try to track it down more. It happens every time I try to install a tool for the first time). Even though I got the error it seems that the tool was actually installed properly. In my paster log I had
galaxy.util.shed_util DEBUG 2012-02-09 09:33:17,640 Adding new row (or updating an existing row) for repository 'protk' in the tool_shed_repository table.
galaxy.tools DEBUG 2012-02-09 09:33:17,879 Reloading section: Proteomics
galaxy.tools DEBUG 2012-02-09 09:33:17,905 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_convert_annotate_ids_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,942 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_interprophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:17,974 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_mascot_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,003 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mascot_to_pepxml_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,037 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/mzml_to_mgf_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,150 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_omssa_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,171 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_peptide_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,209 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_protein_prophet_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,236 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_search_tandem_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,276 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/xls_to_table_1/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,288 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_pepxml/1.0.0, version: 1.0.0.
galaxy.tools DEBUG 2012-02-09 09:33:18,303 Loaded tool id: 127.0.0.1:9009/repos/iracooke/protk/proteomics_protxml/1.0.0, version: 1.0.0.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,048 Loading datatypes from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-09 09:33:21,050 Overriding conflicting datatype with extension 'xls', using datatype from /Users/iracooke/Sources/shed_tools/127.0.0.1/repos/iracooke/protk/73136f00a3fd/protk/tool-data/datatypes_conf.xml.
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,050 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:PepXml'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Mgf'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:ProtXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,051 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:MzXML'
galaxy.datatypes.registry DEBUG 2012-02-09 09:33:21,052 Loaded sniffer for datatype 'galaxy.datatypes.proteomics:Xls'
127.0.0.1 - - [09/Feb/2012:09:33:16 +1100] "POST /admin_toolshed/install_repository?tool_shed_url=http%3A%2F%2F127.0.0.1%3A9009%2F&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26%3A7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True HTTP/1.1" 500 - "http://127.0.0.1:8300/admin_toolshed/install_repository?tool_shed_url=http://127.0.0.1:9009/&repo_info_dict=dc4bb7f7442f37f9321a02977bb24641470dae26:7b2270726f746b223a205b2250726f74656f6d69637320546f6f6c6b6974222c2022687474703a2f2f697261636f6f6b65403132372e302e302e313a393030392f7265706f732f697261636f6f6b652f70726f746b222c2022373331333666303061336664225d7d&includes_tools=True" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_2) AppleWebKit/535.7 (KHTML, like Gecko) Chrome/16.0.912.77 Safari/535.7"
4. I then tried uploading a file of one of my proprietary datatypes (mzml), but it looks like the upload tool doesn't use my proprietary sniffers yet.
5. I then set the datatype manually, and Galaxy now prints the correct metadata for my proprietary datatype under the dataset info.
6. Datatypes are recognised correctly when the data is created as the output of one of my tools (not sure if this uses sniffers though?). My tools also properly filter inputs based on the datatypes.
7. Unfortunately my display application setup doesn't seem to be working. All the code for my tool is here:
https://bitbucket.org/iracooke/protk-toolshed
As far as I can tell I've followed the recommendations in the wiki to create my files, but something isn't right: I don't see the link to my display application when I view information about my dataset. When I view details about my tool in the Galaxy admin section, I see a list of all my tools plus two tools at the end called "view list of". It looks as if my display application files have somehow been interpreted as tools. Any advice on how I could fix this would be much appreciated.
Let me know if you want me to try something specific and report back, or if you need any more info.
Ira
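The WARNING in the install log above ("Overriding conflicting datatype with extension 'xls'") looks consistent with the precedence rule Greg describes for revision 6695:c06d7cef9125 in the quoted message below. What follows is only a toy sketch of that rule as worded in the thread, not Galaxy's registry code. `ToyRegistry`, its method names, and the `distribution:*` type strings are hypothetical, and treating two sniffers as conflicting when they have the same type string is an assumption.

```python
class ToyRegistry:
    """Toy model of the precedence rule; NOT galaxy.datatypes.registry."""

    def __init__(self):
        self.datatypes_by_extension = {}  # extension -> datatype type string
        self.sniff_order = []             # sniffer type strings, tried in order

    def load_datatype(self, extension, type_name):
        # A datatype loaded later replaces a conflicting (same extension)
        # datatype loaded earlier. Return what was replaced, if anything.
        replaced = self.datatypes_by_extension.get(extension)
        self.datatypes_by_extension[extension] = type_name
        return replaced

    def load_sniffer(self, type_name):
        # Assumption: sniffers conflict when their type strings match.
        # The replacement goes to the END of the sniff order, not into
        # the slot of the sniffer it replaces.
        if type_name in self.sniff_order:
            self.sniff_order.remove(type_name)
        self.sniff_order.append(type_name)


registry = ToyRegistry()

# Hypothetical distribution datatype, then the shed datatype from the log.
registry.load_datatype("xls", "distribution:Xls")
replaced = registry.load_datatype("xls", "galaxy.datatypes.proteomics:Xls")

# Reloading a sniffer moves it behind everything loaded in between.
registry.load_sniffer("galaxy.datatypes.proteomics:MzML")
registry.load_sniffer("distribution:Xml")
registry.load_sniffer("galaxy.datatypes.proteomics:MzML")

print(replaced)
print(registry.datatypes_by_extension["xls"])
print(registry.sniff_order)
```

Under this reading, a replaced sniffer can end up later in the sniff order than it was before, which matters when an earlier, more generic sniffer accepts the same file.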
On 09/02/2012, at 5:50 AM, Greg Von Kuster wrote:
> In revision 6695:c06d7cef9125 I simplified the precedence rules. A datatype currently being loaded will always replace a conflicting datatype that was previously loaded. The same behavior applies to datatype sniffers (sniffers loaded later will replace sniffers loaded previously), but the replacement sniffer is appended to the sniff order, not placed in the position of the replaced sniffer. This makes things cleaner and more easily understood.
>
>
> On Feb 8, 2012, at 11:37 AM, Greg Von Kuster wrote:
>
>> To clarify, proprietary datatypes currently being loaded will be ignored if they conflict with a proprietary datatype that was already loaded. Only datatypes defined in the datatypes_conf.xml file will take precedence, and override conflicts.
>>
>> On Feb 8, 2012, at 11:34 AM, Greg Von Kuster wrote:
>>
>>> Proprietary datatypes are loaded before datatypes in the distribution as of change set revision 6693:865d998a693d, which is available in our central repo. Proprietary datatypes contained in installed repositories are loaded in order of oldest installation first, followed by next oldest installation, etc. Datatypes included in the distribution and loaded per the datatypes_conf.xml file will override any conflicting datatypes (datatypes are conflicting if they have the same extension) that were loaded into the datatypes registry from an installed tool shed repository. Please let me know if you encounter any issues.
>>>
>>> Thanks!
>>>
>>> Greg
>>>
>>>
>>> On Feb 7, 2012, at 6:16 PM, Ira Cooke wrote:
>>>
>>>> Hi Greg,
>>>>
>>>> Thanks for that fix ... I'll check it out.
>>>>
>>>> I think the issue of sniff order is a pretty important one for shed tools that require proprietary datatypes. The problem is that galaxy's default sniffers include some extremely generic sniffers (eg text,xml) which will catch pretty much anything so it's hard to imagine how a tool writer will ever have their sniffers used.
>>>>
>>>> Since the potential for conflicts with galaxy's defaults is an issue how about changing sniffer behaviour so that datatypes which subclass always have priority over their parent class ... then among those subclasses the order in which sniffers are listed in datatypes_conf is used as the second priority. I imagine that this would require that the sniff order code be changed so that all datatypes and sniffers are first loaded .. then reshuffled to get the correct order according to the class hierarchy. I believe this option makes sense since a subclass of a datatype should always override its parent .. and I think it would also avoid the potential for conflicts with galaxy's defaults.
>>>>
>>>> What do you think? Is that a system that would work?
>>>>
>>>> Cheers
>>>> ira
>>>>
>>>>
>>>>
>>>>
>>>> On 08/02/2012, at 3:25 AM, Greg Von Kuster wrote:
>>>>
>>>>> Hello Ira,
>>>>>
>>>>> Very sorry for the back-and-forth on this. The behavior you encountered was accurate - the environment in which I was testing was not pristine. I have committed a fix for this issue in revision b4ba8b20d78d, which is now available from our central repo.
>>>>>
>>>>> I started looking at changing things so that proprietary sniffers are loaded before the sniffers defined in the Galaxy distribution, and determined that this will be a bit non-trivial. Problems arise due to potential conflicts between a proprietary datatype that has the same extension as a datatype defined in the Galaxy distribution. The approach I've taken to deal with this is that datatypes in the Galaxy distribution get loaded first, and then any conflicting proprietary datatypes will be ignored. Loading proprietary datatypes first will not allow for this rule.
>>>>>
>>>>> I'll gladly listen to advice from the community on this, but in the meantime, proprietary datatypes and sniffers are loaded after those in the distribution.
>>>>>
>>>>> Thanks for your patience and help on this issue.
>>>>> >>>>> Greg >>>>> >>>>> On Feb 6, 2012, at 11:37 PM, Ira Cooke wrote: >>>>> >>>>>> Hi Greg, >>>>>> >>>>>> As far as I know there is nothing in my local setup that affects this. As I test I did the following; >>>>>> >>>>>> Checked out a clean copy of galaxy-central >>>>>> Edited my universe_wsgi.ini as follows; >>>>>> - Changed the port to 8300 >>>>>> - Added an admin user >>>>>> - Uncommented this line >>>>>> tool_config_file = tool_conf.xml,shed_tool_conf.xml >>>>>> >>>>>> Edited shed_tool_conf.xml to point to ../shed_tools_galaxy_central (which is an empty directory) >>>>>> >>>>>> Started up galaxy >>>>>> ./run.sh --reload >>>>>> >>>>>> I then went straight to admin -> "Search and browse tool sheds" and selected the main galaxy toolshed. >>>>>> I then selected the gmap tool and told it to install. My paster log is at the bottom of this mail. >>>>>> >>>>>> My system is Mac OSX 10.7.2 and I'm running python 2.7.1 >>>>>> >>>>>> I then went to the relevant bit of code and inserted some print statements. I found that; >>>>>> >>>>>> On line 191 in registry.py there is; >>>>>> >>>>>> module = __import__( datatype_module ) >>>>>> >>>>>> and the value of datatype_module is galaxy.datatypes.gmap >>>>>> >>>>>> this line seems to be causing the exception ... I'm not a python programmer so I don't really know how module importing works ... could this be related to the version of python I'm running? >>>>>> alternatively should the "imported_modules" variable be used here .. since wouldn't galaxy.datatypes.gmap suggest that the module needs to be located inside the main galaxy lib rather than in the shed_tools. >>>>>> >>>>>> Strange that I'm the only one who gets these errors? Any ideas what it could be? 
>>>>>> >>>>>> Ira >>>>>> >>>>>> >>>>>> -- paster log -- >>>>>> >>>>>> Cloning http://toolshed.g2.bx.psu.edu/repos/jjohnson/gmap >>>>>> destination directory: gmap >>>>>> requesting all changes >>>>>> adding changesets >>>>>> adding manifests >>>>>> adding file changes >>>>>> added 11 changesets with 32 changes to 17 files >>>>>> updating to branch default >>>>>> resolving manifests >>>>>> getting README >>>>>> getting gmap.xml >>>>>> getting gmap_build.xml >>>>>> getting gsnap.xml >>>>>> getting iit_store.xml >>>>>> getting lib/galaxy/datatypes/gmap.py >>>>>> getting snpindex.xml >>>>>> getting tool-data/datatypes_conf.xml >>>>>> getting tool-data/gmap_indices.loc.sample >>>>>> 9 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>>> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,100 Updating cloned repository to revision "93911bac43da" >>>>>> resolving manifests >>>>>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>>> docutils WARNING 2012-02-07 14:25:35,512 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>> >>>>>> galaxy.util.shed_util DEBUG 2012-02-07 14:25:35,589 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>>>>> docutils WARNING 2012-02-07 14:25:35,701 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>> >>>>>> docutils WARNING 2012-02-07 14:25:35,863 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>> >>>>>> galaxy.tools DEBUG 2012-02-07 14:25:35,932 Reloading section: gmap >>>>>> galaxy.tools DEBUG 2012-02-07 14:25:35,997 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap/2.0.1, version: 2.0.1. >>>>>> galaxy.tools DEBUG 2012-02-07 14:25:36,026 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_build/2.0.0, version: 2.0.0. 
>>>>>> docutils WARNING 2012-02-07 14:25:36,077 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>> >>>>>> galaxy.tools DEBUG 2012-02-07 14:25:36,103 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gsnap/2.0.1, version: 2.0.1. >>>>>> galaxy.tools DEBUG 2012-02-07 14:25:36,268 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>>>>> galaxy.tools DEBUG 2012-02-07 14:25:36,316 Loaded tool id: toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >>>>>> galaxy.datatypes.registry DEBUG 2012-02-07 14:25:38,608 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>>>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>>>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,609 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>>>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>>>>> galaxy.datatypes.registry WARNING 2012-02-07 14:25:38,610 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> On 07/02/2012, at 3:05 AM, Greg Von Kuster wrote: >>>>>> >>>>>>> Hello Ira, >>>>>>> >>>>>>> I believe proprietary datatype sniffers included in tool shed repositories are loading as expected - at least I cannot reproduce the behavior you are seeing. 
The datatypes_conf.xml file included in the latest revision of the gmap repository on the main Galaxy tool shed looks like this: >>>>>>> >>>>>>> <?xml version="1.0"?> >>>>>>> <datatypes> >>>>>>> <datatype_files> >>>>>>> <datatype_file name="gmap.py"/> >>>>>>> </datatype_files> >>>>>>> <registration> >>>>>>> <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/> >>>>>>> <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/> >>>>>>> <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/> >>>>>>> <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/> >>>>>>> <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/> >>>>>>> <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/> >>>>>>> <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/> >>>>>>> <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/> >>>>>>> <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/> >>>>>>> <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/> >>>>>>> </registration> >>>>>>> <sniffers> >>>>>>> <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/> >>>>>>> <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/> >>>>>>> <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/> >>>>>>> <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/> >>>>>>> </sniffers> >>>>>>> </datatypes> >>>>>>> >>>>>>> Here is the snippet of my paster log when I install the gmap tool shed repository - notice that the tools, datatypes and sniffers are all loaded. 
I'm installing it from a local tool shed, but I downloaded the latest version from the main Galaxy tool shed and uploaded it with no changes to my local tool shed for testing. >>>>>>> >>>>>>> galaxy.web.controllers.admin_toolshed DEBUG 2012-02-06 10:50:33,686 Loading new tool panel section: gmap >>>>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Installing repository 'gmap' >>>>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:33,687 Cloning http://test@gvk.bx.psu.edu:9009/repos/test/gmap >>>>>>> destination directory: gmap >>>>>>> requesting all changes >>>>>>> adding changesets >>>>>>> adding manifests >>>>>>> adding file changes >>>>>>> added 1 changesets with 10 changes to 10 files >>>>>>> updating to branch default >>>>>>> 10 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,062 Updating cloned repository to revision "dbcccd1e4dfd" >>>>>>> 0 files updated, 0 files merged, 0 files removed, 0 files unresolved >>>>>>> docutils WARNING 2012-02-06 10:50:34,388 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>>> >>>>>>> galaxy.util.shed_util DEBUG 2012-02-06 10:50:34,437 Adding new row (or updating an existing row) for repository 'gmap' in the tool_shed_repository table. >>>>>>> docutils WARNING 2012-02-06 10:50:34,533 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>>> >>>>>>> docutils WARNING 2012-02-06 10:50:34,657 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>>> >>>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,718 Reloading section: gmap >>>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,769 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap/2.0.1, version: 2.0.1. >>>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,813 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_build/2.0.0, version: 2.0.0. 
>>>>>>> docutils WARNING 2012-02-06 10:50:34,846 <string>:10: (WARNING/2) Explicit markup ends without a blank line; unexpected unindent. >>>>>>> >>>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,877 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gsnap/2.0.1, version: 2.0.1. >>>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,911 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_iit_store/2.0.0, version: 2.0.0. >>>>>>> galaxy.tools DEBUG 2012-02-06 10:50:34,962 Loaded tool id: gvk.bx.psu.edu:9009/repos/test/gmap/gmap_snpindex/2.0.0, version: 2.0.0. >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,147 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,159 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,160 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:50:35,161 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> If I stop and restart my Galaxy server after I've installed the gmap repository, everything loads correctly as well: >>>>>>> >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,713 Loading datatypes from ../shed_tools/gvk.bx.psu.edu/repos/test/gmap/dbcccd1e4dfd/gmap/gmap-93911bac43da/tool-data/datatypes_conf.xml >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntervalAnnotation >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,715 Loaded sniffer for datatype: galaxy.datatypes.gmap:SpliceSiteAnnotation >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 
10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:IntronAnnotation >>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:57:10,716 Loaded sniffer for datatype: galaxy.datatypes.gmap:SNPAnnotation >>>>>>> >>>>>>> >>>>>>> Have you made any changes to your local Galaxy instance that may have resulted in proprietary datatypes / sniffers not being loaded correctly? What version of the Galaxy distribution are you running? You should be at 6672:e38a9eb21336 from the central repository for the latest tool shed code. However, proprietary datatype support has not been touched in some time. >>>>>>> >>>>>>> >>>>>>> On Feb 5, 2012, at 7:04 PM, Ira Cooke wrote: >>>>>>> >>>>>>>> Hi, >>>>>>>> >>>>>>>> We're developing a suite of galaxy tools for doing proteomics. As part of that we're developing a display application ( https://bitbucket.org/Andrew_Brock/proteomics-visualise ) for viewing pepXML and protXML files and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist ( https://bitbucket.org/iracooke/galaxy-proteomics ) but would love to be able to get rid of this fork and integrate all our customisations into a shed tool. >>>>>>>> >>>>>>>> I initially tried following the instructions here http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_... and was able to successfully get my custom datatypes to load. Unfortunately though I could not get my sniffers to work. I think there are two reasons for this; >>>>>>>> >>>>>>>> The first problem is that it looks like the sniffers are never loaded. 
I get errors like this when the tool is loaded; >>>>>>>> >>>>>>>> galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml >>>>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap >>>>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap >>>>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap >>>>>>>> galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap >>>>>>>> >>>>>>>> These errors aren't just restricted to my tool ... as you can see from the above they also occur in the gmap example installed from the main galaxy toolshed. >>>>>>>> >>>>>>>> I tried hacking the code in registration.py to get this to work ... it looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to get this error message to go away, but I still don't get proper sniffer behaviour (eg when I click "Auto detect" when editing a dataset the dataset is set to a generic type). >>>>>>>> >>>>>>>> Another issue is that I would like the sniffers loaded from my shed_tool to be used by the "Upload" tool .. but if I look at the source for upload.py and upload.xml I see >>>>>>>> >>>>>>>> -- Upload.xml >>>>>>>> <command interpreter="python"> >>>>>>>> upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile >>>>>>>> >>>>>>>> --Upload.py >>>>>>>> def __main__(): >>>>>>>> >>>>>>>> .... 
>>>>>>>> >>>>>>>> registry = Registry() >>>>>>>> registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] ) >>>>>>>> >>>>>>>> which looks like the upload tool is not configured to use the custom datatypes available in shed tools. >>>>>>>> >>>>>>>> >>>>>>>> Would it be possible to make the following changes; >>>>>>>> >>>>>>>> (a) Add custom sniffers to the sniff order when shed tools are loaded. Importantly since custom datatypes are usually quite specific I would suggest that these are loaded at the top of the sniff order? Or alternatively if a sniffer is for a datatype that descends from a superclass it should have priority over the parent class (since by definition it is more specific). >>>>>>>> >>>>>>>> (b) Change the upload tool so that it respects custom sniffers in shed tools. >>>>>>>> >>>>>>>> I guess that our case is a bit unusual in that we are trying to co-opt galaxy ( a genomics tool) to do proteomics ... so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us as we could abandon our fork and have all our functionality included in a shed tool. >>>>>>>> >>>>>>>> >>>>>>>> Regards >>>>>>>> Ira >>>>>>>> ___________________________________________________________ >>>>>>>> Please keep all replies on the list by using "reply all" >>>>>>>> in your mail client. To manage your subscriptions to this >>>>>>>> and other Galaxy lists, please use the interface at: >>>>>>>> >>>>>>>> http://lists.bx.psu.edu/ >>>>>>> >>>>>> >>>>> >>>> >>>> ___________________________________________________________ >>>> Please keep all replies on the list by using "reply all" >>>> in your mail client. To manage your subscriptions to this >>>> and other Galaxy lists, please use the interface at: >>>> >>>> http://lists.bx.psu.edu/ >>> >>> ___________________________________________________________ >>> Please keep all replies on the list by using "reply all" >>> in your mail client. 
To manage your subscriptions to this >>> and other Galaxy lists, please use the interface at: >>> >>> http://lists.bx.psu.edu/ >> >
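The reordering proposed in the quoted Feb 7 message (a subclass sniffer always ahead of its parent, with the order listed in datatypes_conf as the tie-breaker) can be sketched as follows. This is a minimal illustration under stated assumptions, not Galaxy code: the classes, their `sniff` checks, and the helper names are all hypothetical, and sorting by inheritance depth is just one simple way to guarantee that a subclass precedes its ancestors.

```python
class Data:
    @classmethod
    def sniff(cls, header):
        return False


class Text(Data):
    @classmethod
    def sniff(cls, header):
        return True  # generic: accepts anything


class Xml(Text):
    @classmethod
    def sniff(cls, header):
        return header.lstrip().startswith("<?xml")


class PepXml(Xml):
    @classmethod
    def sniff(cls, header):
        return "<msms_pipeline_analysis" in header


class MzML(Xml):
    @classmethod
    def sniff(cls, header):
        return "<mzML" in header


def order_by_specificity(sniffers):
    # A subclass always has a longer MRO than any of its ancestors, so
    # sorting by MRO length (longest first) puts it ahead of them.
    # sorted() is stable, so classes of equal depth keep their listed order.
    return sorted(sniffers, key=lambda cls: len(cls.__mro__), reverse=True)


def first_match(sniffers, header):
    # Walk the sniff order and return the first datatype that accepts.
    for cls in sniffers:
        if cls.sniff(header):
            return cls.__name__
    return None


listed = [Text, Xml, PepXml, MzML]   # generic sniffers listed first
ordered = order_by_specificity(listed)
header = '<?xml version="1.0"?><mzML>'

print(first_match(listed, header))   # generic Text shadows everything
print([cls.__name__ for cls in ordered])
print(first_match(ordered, header))
```

A depth sort is stricter than the proposal needs, since it also moves unrelated deep classes ahead of shallow ones; a topological ordering over the class hierarchy would be the looser alternative.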
participants (2)
- Greg Von Kuster
- Ira Cooke