Hi, 

We're developing a suite of Galaxy tools for proteomics. As part of that we're developing a display application (https://bitbucket.org/Andrew_Brock/proteomics-visualise) for viewing pepXML and protXML files, and we rely very heavily on custom datatypes. We've got this working in a fork of galaxy-dist (https://bitbucket.org/iracooke/galaxy-proteomics) but would love to get rid of this fork and integrate all our customisations into a shed tool.

I initially tried following the instructions here: http://wiki.g2.bx.psu.edu/Tool%20Shed#Including_proprietary_data_types_that_use_class_modules_contained_in_your_repository and was able to get my custom datatypes to load. Unfortunately, I could not get my sniffers to work. I think there are two reasons for this.

The first problem is that it looks like the sniffers are never loaded. I get errors like this when the tool is loaded:

galaxy.datatypes.registry DEBUG 2012-02-06 10:26:51,317 Loading datatypes from /Users/iracooke/Sources/shed_tools_galaxy_central/toolshed.g2.bx.psu.edu/repos/jjohnson/gmap/93911bac43da/gmap/tool-data/datatypes_conf.xml
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,318 Error appending sniffer for datatype galaxy.datatypes.gmap:IntervalAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SpliceSiteAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:IntronAnnotation to sniff_order: No module named gmap
galaxy.datatypes.registry WARNING 2012-02-06 10:26:51,319 Error appending sniffer for datatype galaxy.datatypes.gmap:SNPAnnotation to sniff_order: No module named gmap

These errors aren't restricted to my tool. As you can see from the above, they also occur in the gmap example installed from the main Galaxy tool shed.

I tried hacking the code in registration.py to get this to work. It looks like the section where sniffers are loaded does not use the "imported_modules" variable. I was able to make this error message go away, but I still don't get proper sniffer behaviour (e.g. when I click "Auto detect" while editing a dataset, the dataset is set to a generic type).
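To illustrate the kind of change I mean, here is a minimal sketch. It is not the actual Galaxy code: the function names load_module_from_path and resolve_sniffer_class are hypothetical, and I am assuming that "imported_modules" can be treated as a mapping from a short module name (such as gmap) to a module already loaded from the repository. The idea is simply to consult that mapping before falling back to a normal import, which is the step that fails with "No module named gmap".

```python
import importlib
import importlib.util


def load_module_from_path(name, path):
    """Load a Python module from an explicit file path (hypothetical helper)."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def resolve_sniffer_class(type_string, imported_modules):
    """Resolve a 'package.module:ClassName' string to a class.

    imported_modules is assumed to map a short module name (e.g. 'gmap')
    to a module that was already loaded from the tool shed repository.
    It is consulted first; a regular import is only a fallback.
    """
    module_name, class_name = type_string.split(":")
    short_name = module_name.split(".")[-1]
    module = imported_modules.get(short_name)
    if module is None:
        # Only works for modules that are importable from sys.path.
        module = importlib.import_module(module_name)
    return getattr(module, class_name)
```

With something like this, a sniffer entry such as galaxy.datatypes.gmap:IntervalAnnotation would resolve to the class defined in the repository's own gmap.py, provided that module was loaded earlier and stored under the key "gmap".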

The second problem is that I would like the sniffers loaded from my shed tool to be used by the "Upload" tool, but if I look at the source for upload.py and upload.xml I see:

-- upload.xml
 <command interpreter="python">
      upload.py $GALAXY_ROOT_DIR $GALAXY_DATATYPES_CONF_FILE $paramfile

-- upload.py
def __main__():

....

    registry = Registry()
    registry.load_datatypes( root_dir=sys.argv[1], config=sys.argv[2] )

which suggests that the upload tool is not configured to use the custom datatypes available in shed tools.


Would it be possible to make the following changes?

(a) Add custom sniffers to the sniff order when shed tools are loaded. Since custom datatypes are usually quite specific, I would suggest that these are loaded at the top of the sniff order. Alternatively, if a sniffer is for a datatype that descends from a superclass, it should have priority over the parent class (since by definition it is more specific).

(b) Change the upload tool so that it respects custom sniffers in shed tools. 
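As a rough sketch of the second option in (a), the following shows one way a sniffer could be placed ahead of its parent class. This is only an illustration under my own assumptions: add_sniffer is a hypothetical function, the sniff order is modelled as a plain list of classes, and Data, Xml and PepXml are placeholder classes rather than Galaxy's real datatypes.

```python
def add_sniffer(sniff_order, datatype_class):
    """Insert datatype_class ahead of the first of its ancestors.

    If no ancestor is present in sniff_order, the class is appended.
    Existing entries keep their relative order.
    """
    if datatype_class in sniff_order:
        return
    for index, existing in enumerate(sniff_order):
        if issubclass(datatype_class, existing):
            sniff_order.insert(index, datatype_class)
            return
    sniff_order.append(datatype_class)


# Placeholder datatype hierarchy, for illustration only.
class Data(object):
    pass


class Xml(Data):
    pass


class PepXml(Xml):
    pass


sniff_order = [Xml]
add_sniffer(sniff_order, PepXml)  # more specific, so it goes before Xml
add_sniffer(sniff_order, Data)    # no ancestor in the list, so it goes last
```

Because each new class is inserted before the first of its ancestors, a more specific sniffer is always tried before the more general one, whatever order the datatypes are registered in.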

I guess that our case is a bit unusual in that we are trying to co-opt Galaxy (a genomics tool) to do proteomics, so I understand that these changes might not be a priority. Nevertheless, if this could be done it would be fantastic for us, as we could abandon our fork and have all our functionality included in a shed tool.


Regards
Ira