On Jun 2, 2011, at 1:29 PM, Nate Coraor wrote:
Peter Cock wrote:
Hi all,
Something I've not needed to do until now is define a new file format in Galaxy. I understand the basic principle of defining a subclass in Python; however, how does this work with new tools on the Tool Shed? In particular, if an output format is likely to be used by more than one tool, can we get it added to the Galaxy core?
I think people have provided the new subclass as a patch with the tool, but probably many of them, if well written, could be added to the core.
As an example, the basic functionality of the Blast2GO for pipelines tool (b2g4pipe) takes a BLAST XML input file and gives a tab-separated annotation output file. Galaxy already has 'blastxml' and 'tabular' file formats defined, so I didn't need to do anything extra. However, the tool can also take (a directory of) InterProScan XML files as input, so here a new 'interproscanxml' format would be useful. Then any wrapper using or producing InterProScan XML could take advantage of this. For example, Konrad's InterProScan wrapper could then offer the XML output as an option in addition to or instead of the tabular output.
We will certainly include support for new data formats in the Galaxy core. In case you haven't seen it, details for adding new formats are available in our wiki at https://bitbucket.org/galaxy/galaxy-central/wiki/AddingDatatypes. It's fairly straightforward. However, glancing at the wiki, it looks like there is no mention of functional tests for the new format. If we could get a patch that includes a functional test for uploading the format as new method(s) in ~/test/functional/test_get_data.py, it would be great.
Related to this example, why isn't there a generic base class for XML formats in general? https://bitbucket.org/galaxy/galaxy-central/issue/568/missing-xml-datatype-b...
It just hadn't been necessary in the past and no one had the time to write it. I agree it could be helpful, since there are other, more specific XML types.
Yes, XML formats have not yet been abstracted, and certainly can be. Just a matter of bandwidth...
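To make the idea of a generic XML base class with more specific subclasses concrete, here is a minimal standalone sketch. It does not use Galaxy's actual datatype classes or API: the class names `GenericXml` and `InterProScanXml`, the `sniff` signature, and in particular the `root_marker` string are all assumptions made for illustration, and the real InterProScan root element should be checked against actual output files.

```python
from itertools import islice


class GenericXml:
    """Hypothetical stand-in for a generic XML datatype (not Galaxy's real class)."""

    file_ext = "xml"

    def sniff(self, filename):
        # Treat the file as XML if its first non-blank line starts with
        # an XML declaration.
        with open(filename, "r", encoding="utf-8", errors="replace") as handle:
            for line in handle:
                stripped = line.strip()
                if stripped:
                    return stripped.startswith("<?xml ")
        return False


class InterProScanXml(GenericXml):
    """Hypothetical, more specific XML type built on the generic one."""

    file_ext = "interproscanxml"
    # Assumed marker for the root element; verify against real files.
    root_marker = "<interpro_matches"

    def sniff(self, filename, max_lines=20):
        # First require that the generic XML check passes.
        if not super().sniff(filename):
            return False
        # Then look for the assumed root element near the top of the file.
        with open(filename, "r", encoding="utf-8", errors="replace") as handle:
            for line in islice(handle, max_lines):
                if self.root_marker in line:
                    return True
        return False
```

With this layout, another specific XML format would only need its own `file_ext` and marker, while the shared declaration check stays in the base class.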
--nate
Regards,
Peter
___________________________________________________________
Please keep all replies on the list by using "reply all" in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at:
Greg Von Kuster Galaxy Development Team greg@bx.psu.edu