Help with adding new datatype
Hi,

I'm confused about how to add a new datatype. I've read the http://g2.trac.bx.psu.edu/wiki/AddingDatatypes wiki page, but it isn't clear what I need to do for a completely new type.

Specifically, I want to add a gzip tool to compress and uncompress files to help with uploading and downloading large files. I've added the following line to datatypes_conf.xml:

    <datatype extension="gz" type="galaxy.datatypes.images:Gzip" mimetype="application/gzip" display_in_upload="true"/>

But I am stumped as to where I should add the additional information as per step 3 on the wiki page. Can someone help, please? BTW I do code, but not in Python.

Thanks,
Chris
Hello Chris,

I'm not quite sure adding a tool to Galaxy to compress data files is the best approach. The upload tool already handles compressed files, and retrieving compressed data from UCSC is also supported.

For download, are you referring to the "save" link in the history item? This simply opens a file handle to the data, so it would have to be stored as compressed, which is currently not supported in many of the tools.

Can you provide some more details about the scenarios you are attempting to cover with this (i.e., are your users performing tasks that are not going well due to large file sizes)? Perhaps we can find a better solution for any problems you've encountered.

Thanks Chris,

Greg Von Kuster
Galaxy Development Team
Hi Greg,

Greg Von Kuster wrote:
> Hello Chris,
>
> I'm not quite sure adding a tool to Galaxy to compress data files is the best approach. The upload tool already handles compressed files, and retrieving compressed data from UCSC is also supported.

I can already upload compressed files? <checks...> So you can. That's great - it automatically uncompresses the file and recognises the file type - perfect!

> For download, are you referring to the "save" link in the history item?

Yes. That's right.

> This simply opens a file handle to the data, so it would have to be stored as compressed, which is currently not supported in many of the tools.

Ah. I am aware that most tools won't support that file type, but I'm only considering it as the penultimate step before saving it locally.

> Can you provide some more details about the scenarios you are attempting to cover with this (i.e., are your users performing tasks that are not going well due to large file sizes)? Perhaps we can find a better solution for any problems you've encountered.

This is mainly to do with Next Generation Sequencing, as I want to use Galaxy as a 'first stop' tool for analysing Solexa data. The problem is that the data files are >200 MB in size and uploading the data takes a while. The tasks themselves aren't a problem as they run acceptably fast on our cluster. It's just getting the data into Galaxy in the first place...

I think the problem is mostly solved already - with the uploads. It's less of a problem for saving as my analysis output is usually smaller, but still fairly large. Would it be possible to compress files during the save...?

Thanks very much for your help.

Chris
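The automatic handling of compressed uploads described in this message can be pictured with a short Python sketch. This is only an illustration of the general technique (check the gzip magic bytes, then stream-decompress), not Galaxy's actual upload code; the function names `is_gzipped` and `gunzip_in_place` are hypothetical.

```python
import gzip
import os
import shutil
import tempfile

# Every gzip stream starts with these two bytes (RFC 1952).
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def gunzip_in_place(path):
    """Replace a gzipped file with its uncompressed contents.

    Returns True if the file was decompressed, False if it was left alone.
    Hypothetical helper for illustration; not part of Galaxy.
    """
    if not is_gzipped(path):
        return False
    # Write to a temporary file in the same directory, then swap it in.
    target_dir = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=target_dir)
    with os.fdopen(fd, "wb") as out, gzip.open(path, "rb") as src:
        shutil.copyfileobj(src, out)  # copies in chunks, so large files are fine
    os.replace(tmp_path, path)
    return True
```

Because the copy is streamed in chunks, a sketch like this does not need to hold a >200 MB sequencing file in memory. Once the file is uncompressed, ordinary file-type detection can run on the plain contents.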
Chris Cole wrote:
> I think the problem is mostly solved already - with the uploads. It's less of a problem for saving as my analysis output is usually smaller, but still fairly large. Would it be possible to compress files during the save...?
Hi Chris,

You can compress downloads on the fly using Apache as a proxy to Galaxy:

    http://g2.trac.bx.psu.edu/wiki/HowToInstall/ApacheProxy

The relevant directives to compress downloads (assuming your Galaxy is at the Apache/VirtualHost root, /) would be:

    <LocationMatch "(/root)?/display">
        SetOutputFilter DEFLATE
    </LocationMatch>
    <LocationMatch "/datasets/\d+/display">
        SetOutputFilter DEFLATE
    </LocationMatch>

--nate
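To see which request paths the two LocationMatch patterns above would cover, they can be tried out in Python. This is a rough sketch under two assumptions: that the patterns are matched unanchored against the URL path (modelled here with `re.search`), and that the example paths resemble real Galaxy URLs (they are made up for illustration). The `decode_body` helper is likewise hypothetical; it shows what a script would do with a response whose `Content-Encoding` header is `gzip`, since mod_deflate only compresses when the client advertises support through `Accept-Encoding`.

```python
import gzip
import re

# The two regular expressions from the <LocationMatch> directives above.
COMPRESSED_LOCATIONS = [
    re.compile(r"(/root)?/display"),
    re.compile(r"/datasets/\d+/display"),
]


def would_be_compressed(url_path):
    """Return True if any pattern matches somewhere in the URL path.

    Assumes unanchored matching, which re.search models.
    """
    return any(p.search(url_path) for p in COMPRESSED_LOCATIONS)


def decode_body(body, content_encoding):
    """Undo gzip transfer compression on a response body, if it was applied."""
    if content_encoding == "gzip":
        return gzip.decompress(body)
    return body


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    for path in ["/root/display", "/datasets/42/display", "/tool_runner/index"]:
        print(path, would_be_compressed(path))
```

Browsers decompress such responses transparently, so the "save" link would still produce an uncompressed file on disk; only the transfer is smaller.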
participants (3): Chris Cole, Greg Von Kuster, Nate Coraor