Outputting zip files from Galaxy
I have an application that needs to return a zip file.

I attempted to follow the instructions to add a "zip" datatype, with no success. It appears to me that Galaxy insists on generating the output file name and I can't get it to generate one with a ".zip" extension. The documentation on the wiki is missing a lot of details.

Can you please point me to a working example of some code returning a zip file?

Below are snippets of my code.

Thanks,
Bill Martin

================== tool_conf.xml:

<toolbox>
  <section name="test" id="test_section_1">
    <tool file="test/my_test.xml"/>
  </section>

================= tools/test/my_test.pl:

#!/usr/bin/perl
use strict;

my ($output_1) = @ARGV;

system("zip ${output_1} /home/bill/.* in.csv");

===================== tools/test/my_test.xml:

<tool id="my_test_1" name="my_test">
  <description> the descr </description>
  <command interpreter="perl"> my_test.pl $output_1 </command>
  <inputs>
    <param format="tabular" name="input" type="data" label="inp" help="Dataset missing? See TIP below."/>
  </inputs>
  <outputs>
    <data format="zip" name="output_1"/>
  </outputs>
</tool>

=================== datatypes_conf.xml:

<datatypes>
  <registration converters_path="lib/galaxy/datatypes/converters" display_path="display_applications">
    { some lines omitted here }
    <datatype extension="zip" type="galaxy.datatypes.binary:Zip" mimetype="application/zip" display_in_upload="true"/>
    { ... }
  </registration>

====================== lib/galaxy/datatypes/binary.py:

{ many lines omitted here }

class Binary( data.Data ):
    """Binary data"""
    def set_peek( self, dataset, is_multi_byte=False ):
        """Set the peek and blurb text"""
        if not dataset.dataset.purged:
            dataset.peek = 'binary data'
            dataset.blurb = 'data'
        else:
            dataset.peek = 'file does not exist'
            dataset.blurb = 'file purged from disk'
    def get_mime( self ):
        """Returns the mime type of the datatype"""
        return 'application/octet-stream'

class Zip( Binary ):
    """Zip File"""
    file_ext = "zip"
    def get_mime( self ):
        """Returns the mime type of the datatype"""
        return 'application/zip'

{ many lines omitted here }
On Nov 9, 2011, at 5:21 PM, wfmartin wrote:
I have an application that needs to return a zip file.
I attempted to follow the instructions to add a "zip" datatype, with no success. It appears to me that Galaxy insists on generating the output file name and I can't get it to generate one with a ".zip" extension. The documentation on the wiki is missing a lot of details.
Can you please point me to a working example of some code returning a zip file?
Bill,

I don't believe there are any tools in the distribution outputting zip files, although there might be other third party tools which do.

Is there a reason that the disk filename must end with the .zip extension? Galaxy names all of its datasets uniformly, and in the cases where certain tools need inputs to have a certain extension, we've worked around this with symbolic links. The goal is to keep data access as abstract as possible.

--nate
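One way to cope with an output path that does not end in ".zip" is to build the archive under a temporary name that does, and then copy the bytes into the path Galaxy supplies. The following is only a minimal sketch under assumptions: it supposes the tool wrapper could be written in Python instead of Perl, and the function name `write_zip_output` and its arguments are hypothetical, not part of Galaxy.

```python
import os
import shutil
import tempfile
import zipfile


def write_zip_output(output_path, input_paths):
    """Build a zip archive and store it at output_path, whatever its extension.

    Hypothetical helper for illustration; not part of Galaxy.
    """
    # Create the archive under a temporary name that really ends in ".zip".
    fd, tmp_zip = tempfile.mkstemp(suffix=".zip")
    os.close(fd)
    try:
        with zipfile.ZipFile(tmp_zip, "w", zipfile.ZIP_DEFLATED) as archive:
            for path in input_paths:
                # Store each file under its base name only.
                archive.write(path, arcname=os.path.basename(path))
        # Copy the archive bytes into the path chosen by the caller
        # (for a Galaxy tool, the value passed as $output_1).
        shutil.copyfile(tmp_zip, output_path)
    finally:
        # Always clean up the temporary archive.
        os.remove(tmp_zip)
```

A wrapper script could call `write_zip_output(sys.argv[1], sys.argv[2:])`, with the output path first and the files to archive after it. Python's `zipfile` module does not care about the file extension, so the temporary ".zip" name matters mainly when an external zip command is swapped in for the `zipfile` calls.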
Hi Nate:

Thanks for your response.

The zip command always creates files with the extension. Utilities that validate, examine, or extract from zip files expect the file to have a ".zip" extension.

I can understand why it was easier to architect Galaxy with a simple, uniform file naming scheme.

Thanks again,
Bill

On 11/10/2011 08:11 AM, Nate Coraor wrote:
On Nov 9, 2011, at 5:21 PM, wfmartin wrote:
I have an application that needs to return a zip file.
I attempted to follow the instructions to add a "zip" datatype, with no success. It appears to me that Galaxy insists on generating the output file name and I can't get it to generate one with a ".zip" extension. The documentation on the wiki is missing a lot of details.
Can you please point me to a working example of some code returning a zip file?

Bill,
I don't believe there are any tools in the distribution outputting zip files, although there might be other third party tools which do.
Is there a reason that the disk filename must end with the .zip extension? Galaxy names all of its datasets uniformly, and in the cases where certain tools need inputs to have a certain extension, we've worked around this with symbolic links. The goal is to keep data access as abstract as possible.
--nate
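The symbolic-link workaround mentioned in this thread could look roughly like the sketch below: a dataset file with an arbitrary name is exposed to downstream utilities through a temporary link whose name ends in ".zip". This is an illustration under assumptions, not Galaxy's own code; `zip_named_link`, `list_members`, and the link name `dataset.zip` are hypothetical, and it assumes a platform where `os.symlink` is available.

```python
import contextlib
import os
import shutil
import tempfile
import zipfile


@contextlib.contextmanager
def zip_named_link(dataset_path):
    """Yield a temporary path ending in '.zip' that points at dataset_path.

    Hypothetical helper for illustration; not part of Galaxy.
    """
    link_dir = tempfile.mkdtemp()
    link_path = os.path.join(link_dir, "dataset.zip")
    try:
        # The link carries the ".zip" name; the dataset file is untouched.
        os.symlink(os.path.abspath(dataset_path), link_path)
        yield link_path
    finally:
        # Removes the directory and the link, not the file it points to.
        shutil.rmtree(link_dir)


def list_members(dataset_path):
    """Open the dataset through its '.zip' link and list the archive members."""
    with zip_named_link(dataset_path) as zip_path:
        with zipfile.ZipFile(zip_path) as archive:
            return archive.namelist()
```

Here `zipfile` stands in for whatever utility expects the extension; an external program could be given the yielded link path in the same way.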
participants (2)
- Nate Coraor
- wfmartin